Sunday 8 January 2012

Entity Framework Performance Optimization For Include Method

I find Entity Framework easier to use than NHibernate, and my favourite API in EF is the Include method. For example, if you want to load part of an object graph, you can simply chain Include calls like this:

var orders = context.Orders.Include("DeliveryAddress").Include("Customer");

On rare occasions you may want to load the entire object graph (not good practice, but sometimes convenient), and you might then run into performance problems. If you have ever looked at the SQL generated for such an entity query, you will have noticed that its length grows very quickly, almost exponentially in my experience, with the number of Include calls you chain and the depth of the object graph you load. You can end up with a query that takes minutes to run.

I recently found out that if you split the Include calls across separate queries, Entity Framework still loads the same data, but much faster:

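var orders = context.Orders.Include("DeliveryAddress").ToList();

// Run on the same context; the result is discarded, but the loaded customers
// are attached to the orders the context is already tracking.
context.Orders.Include("Customer").ToList();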

In this case, provided each query is actually executed (for example with ToList()) against the same context, Entity Framework generates two much shorter SQL queries and merges the results into the same tracked Order objects. For a large database, this loading technique can save a considerable amount of time.
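
Here is a minimal sketch of the split-query approach as a complete method. It assumes the EF 4.1+ DbContext API and uses hypothetical Order, Customer, Address and ShopContext classes, none of which appear in this post. It also assumes both queries run on the same context with change tracking enabled, because that is what lets the context fix up the navigation properties. With no-tracking queries the results would not be merged.

```csharp
using System.Collections.Generic;
using System.Data.Entity;   // assumed: EF 4.1+ DbContext API
using System.Linq;

// Hypothetical entity classes, for illustration only.
public class Customer { public int Id { get; set; } public string Name { get; set; } }
public class Address  { public int Id { get; set; } public string Street { get; set; } }

public class Order
{
    public int Id { get; set; }
    // Non-virtual on purpose: no lazy-loading proxies, so these are only
    // populated by the explicit queries below.
    public Customer Customer { get; set; }
    public Address DeliveryAddress { get; set; }
}

// Hypothetical context; Customer and Address are discovered by convention.
public class ShopContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

public static class SplitIncludeDemo
{
    public static List<Order> LoadOrders(ShopContext context)
    {
        // Query 1: orders plus delivery addresses. ToList() executes the
        // query, so the context starts tracking every returned Order.
        List<Order> orders = context.Orders.Include("DeliveryAddress").ToList();

        // Query 2: the same orders plus customers. The returned list is
        // discarded; the query is run so that the context attaches each
        // Customer to the Order instances it is already tracking.
        context.Orders.Include("Customer").ToList();

        return orders;
    }
}
```

After LoadOrders returns, each element of the list should have both DeliveryAddress and Customer populated, at the cost of two round trips instead of one. Whether that is faster than a single chained query depends on the shape of your data, so it is worth measuring both.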

2 comments:

  1. Aren't you losing your DeliveryAddress connection? From MSDN: When you call the Include method, the query path is only valid on the returned instance of the ObjectQuery. Other instances of ObjectQuery and the object context itself are not affected.

  2. I also believe that the first set should be lost when the second query is executed...
