Entity Framework Performance optimization patterns

In this post I am going to provide you with a hands-on example of how to avoid writing your LINQ to Entities queries in a way that hinders the performance of your application. The aim of this post (hopefully the first in a series of posts on performance and Entity Framework) is to highlight bad coding practices when architecting an application that uses EF as the ORM to fetch and manipulate data from the data store. I am going to point out some practices and patterns that developers use very often and that cause EF to create poor-performing T-SQL statements.

Entity Framework will always create the T-SQL for us; the thing to keep in mind is that we cannot write this T-SQL code ourselves, since it is abstracted away by EF. If the generated T-SQL is poor, then when it is passed to the SQL Server engine (through the optimiser and the creation of the execution plan) it will cause our applications to perform poorly, especially under heavy load.
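Before firing up SQL Profiler, note that you can also see the generated T-SQL from inside the application itself. This is a minimal sketch, assuming EF6 (which exposes the Database.Log hook) and the context class we create later in this post:

using (var ctx = new AdventureWorks2014Entities())
{
    // EF6: write every command EF sends to SQL Server to the Output window.
    ctx.Database.Log = s => System.Diagnostics.Debug.WriteLine(s);

    var people = ctx.People.ToList(); // the generated SELECT is now logged
}

The same information can of course be captured with SQL Profiler, which is what I will do below.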

Let me talk a bit about query optimisation and T-SQL. T-SQL is declarative by nature. When we write T-SQL statements in a query window in SSMS and execute them, we just say to SQL Server "I want these results back"; we do not provide any details on how the results should be returned. If there was nothing else between our T-SQL code and the SQL Server database engine, we simply would not get any results back. Luckily for us there is a very important component, the Query Optimizer, that generates an imperative plan. By imperative I mean detailed. This plan, called the execution plan, is what is actually executed by the relational engine. The query optimiser will not look for a perfect plan; it is a cost-based optimiser that must find an efficient plan. When deciding upon the execution plan, the optimiser takes into consideration the type of operations, statistics (which must always be up to date), indexes, hardware resources (number of CPUs, available memory), SQL Server edition, number of active concurrent connections and query hints. If the T-SQL that is generated by EF is poorly written, the optimiser will not be able to create an optimal plan, hence the problems in performance.

Entity Framework is an object-relational mapping (ORM) framework for the .NET Framework. EF addresses the problem of object-relational impedance mismatch. I will not be talking about that mismatch because it is well documented on many sites on the Internet.

Through that framework we can program against a conceptual application model instead of programming directly against a relational schema. By doing so we can decrease the amount of code we write to access a data store and thus decrease maintenance time. You can find many posts regarding Entity Framework in this blog.

1) Create an empty ASP.Net Web Forms application and give it the name EFoptimisation. I am using Visual Studio 2013 Ultimate edition.

2) Add a new web forms page to the application and leave the default name. The application I am going to build is a very simple web forms application: the user will enter a last name and will get back the first name(s) for that last name.

3) I will use the AdventureWorks2014 database (you can download it here) for this application, and more specifically the Person.Person table. I have installed SQL Server 2014 Enterprise edition on my machine.

4) I will add an ADO.Net Entity Data Model using the Database First paradigm. Follow the wizard steps, create the connection string and then import into the conceptual model only the Person.Person table, which will become an entity in the domain model (a sketch of the generated classes follows step 5 below). If you are new to EF and Database First and want to see those steps in detail, have a look at this post.

5) Add a textbox and a button to the page. The user will enter the last name in the textbox, hit enter, and the results will be printed on the page.
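For reference, here is roughly what the Database First wizard generates for us via its T4 templates. This is a trimmed sketch, not the full generated code:

public partial class AdventureWorks2014Entities : DbContext
{
    public AdventureWorks2014Entities()
        : base("name=AdventureWorks2014Entities")
    {
    }

    // One DbSet per imported table; Person.Person becomes the People set.
    public virtual DbSet<Person> People { get; set; }
}

public partial class Person
{
    public int BusinessEntityID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    // ...plus the remaining Person.Person columns
}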

This is the code inside the Page_Load event handling routine.

protected void Page_Load(object sender, EventArgs e)
{
    using (var ctx = new AdventureWorks2014Entities())
    {
        string LastName = TextBox1.Text;

        // No Where clause: EF will fetch every row of Person.Person.
        var query = from person in ctx.People
                    select person;

        // The filtering happens here on the client, after all the rows
        // have already travelled over the network.
        foreach (var p in query)
        {
            if (p.LastName == LastName)
            {
                Response.Write(p.FirstName);
                Response.Write("<br/>");
            }
        }
    }
}

The code above is pretty straightforward.

6) Now we are ready to run the application. Before we do that, I will launch SSMS and connect to my local instance of SQL Server. Then I will launch SQL Profiler and create a new trace. The trace will listen only for the SQL:BatchStarting event. I activate the trace so that it is running.

7) I build and run my web app. When I type "Yuan" as the last name I get back 92 first names.

8) Let me see what the trace recorded in my SQL Profiler and the T-SQL that was generated.



9) If I copy and paste the T-SQL into SSMS and execute the query (with Actual Execution Plan and Client Statistics enabled) I will get the following results: 19972 rows in total. Have a look at the picture below.

Now let's have a look at the execution plan created. Have a look at the picture below.

We have a Clustered Index Scan. A Clustered Index Scan means that SQL Server started at the very top of the table and scanned every row until it reached the bottom of the table. Not a very good scenario to have in terms of performance. Now let's have a look at the Client Statistics tab. Have a look at the picture below.




As you can see, the bytes transferred from SQL Server amount to almost 26 MB. That is a huge amount of data to be transferred through the network back to the client to satisfy a single query.

10) We need to refactor our code so that EF creates more efficient T-SQL.

protected void Page_Load(object sender, EventArgs e)
{
    using (var ctx = new AdventureWorks2014Entities())
    {
        string LastName = TextBox1.Text;

        // The Where clause is now part of the LINQ query, so EF translates
        // it into a parameterised T-SQL WHERE clause and SQL Server does
        // the filtering.
        var query = from person in ctx.People
                    where person.LastName.Equals(LastName)
                    select person;

        foreach (var p in query)
        {
            Response.Write(p.FirstName);
            Response.Write("<br/>");
        }
    }
}

As you can see from the code above, I am now doing the filtering on the server.

Have a look below to see what the Profiler's trace output was.

The T-SQL now has a WHERE clause. It is a parameterised query. If I place this query in SSMS and execute it, I will get back only 92 rows and my execution plan will look like this.

This is a far more optimal execution plan (Index Seek & Key Lookup) than the Clustered Index Scan we had before.

If I look at the Client Statistics tab (Bytes received from the server), I have only 145 KB of data, compared with the 26 MB I had previously.

11) Now we can use projection to retrieve only the column that we are interested in (FirstName) and discard the others.

I go back to the Page_Load routine and instead of

var query = from person in ctx.People
where person.LastName.Equals(LastName)
select person;

I rewrite my code to utilize projection

var query = from person in ctx.People
where person.LastName.Equals(LastName)
select new { person.FirstName };
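The same query in method syntax, for reference; this is purely a stylistic alternative and produces the same T-SQL:

var query = ctx.People
               .Where(person => person.LastName.Equals(LastName))
               .Select(person => new { person.FirstName });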


Then I build and run the application with the Profiler running.

This is the T-SQL from the Profiler when I type "Yuan" in the textbox.

exec sp_executesql N'SELECT
1 AS [C1],
[Extent1].[FirstName] AS [FirstName]
FROM [Person].[Person] AS [Extent1]
WHERE [Extent1].[LastName] = @p__linq__0',N'@p__linq__0 nvarchar(4000)',@p__linq__0=N'Yuan'

As you can see from the T-SQL above, I get only the FirstName column. If I place the T-SQL code in an SSMS query window (with Client Statistics and Actual Execution Plan enabled) I get back 92 rows of data, and the following picture shows the actual execution plan. Now we see a Non-Clustered Index Seek. We are seeking and not scanning, which is a great thing.

Let's have a look at the client statistics. As you can see from the picture below, the bytes received from the server are just 9183 bytes (about 9 KB). This is a huge reduction compared to the 145 KB and the 26 MB.



To recap: when writing apps that use EF as the data access layer, make sure that you do all the filtering on SQL Server, and use projection when you do not need all the table columns in your application. It is always wrong to filter on the client when SQL Server has all the power to do that for us.
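One more option worth mentioning for read-only pages like this one: EF6's AsNoTracking() tells the context not to set up change tracking for the entities it returns, which saves memory and CPU on the client. A minimal sketch (AsNoTracking is my addition here, not part of the steps above); note that results projected into an anonymous type, as in step 11, are never tracked anyway, so this matters when you select full entities:

var query = ctx.People
               .AsNoTracking()
               .Where(person => person.LastName.Equals(LastName));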


Hope it helps!!!

Handling batch operations with Entity Framework and EF Utilities

In this post I am going to provide you with a hands-on example of how to handle batch operations efficiently with Entity Framework and the EF Utilities library.

Entity Framework is an object-relational mapping (ORM) framework for the .NET Framework. EF addresses the problem of object-relational impedance mismatch. I will not be talking about that mismatch because it is well documented on many sites on the Internet.

Through that framework we can program against a conceptual application model instead of programming directly against a relational schema. By doing so we can decrease the amount of code we write to access a data store and thus decrease maintenance time. You can find many posts regarding Entity Framework in this blog.

It is easy to implement basic CRUD (create, read, update, delete) functionality using EF, but there are performance issues when performing batch operations on thousands of domain objects (records).

In this post I will do a batch update on a table with traditional EF techniques and then with another library called EF Utilities. Performance can be significantly improved with the EF Utilities batch CUD (create, update, delete) operations.

EF Utilities (EntityFramework.Utilities) was created by Mikael Eliasson (http://mikee.se/ - @MikaelEliasson).

1) I will create an empty ASP.Net Web Forms application and give it the name EFUtilitiesUpdate. I am using Visual Studio 2013 Ultimate edition.

2) I add a new web forms page to the application and leave the default name.

3) I will use the AdventureWorks2014 database (you can download it here) for this application, and more specifically the Person.Person table. I have installed SQL Server 2014 Enterprise edition on my machine; the SQL Express edition will also work fine.

4) I will add an ADO.Net Entity Data Model using Database First. Follow the wizard steps, create the connection string and then import into the conceptual model the Person.Person table, which will become an entity in the domain model.

Have a look at the picture below to see what you must have in Visual Studio.

 

5) I will write a SQL statement to see which rows from the Person.Person table qualify. Connect to your SQL Server instance and type the following query.

SELECT [EmailPromotion]
FROM [AdventureWorks2014].[Person].[Person]
WHERE EmailPromotion = 0

This query returns 11158 rows. We will create a batch update operation through our ASP.Net web forms application and measure its performance; the operation will affect all 11158 rows. If we did the batch update in T-SQL, we would use the following statement.

UPDATE [AdventureWorks2014].[Person].[Person]
SET EmailPromotion = 10
WHERE EmailPromotion = 0

6) In our web form, in the Page_Load event, we write the following code: for every row of the Person entity set with EmailPromotion equal to 0, we add 10 to the EmailPromotion column.

protected void Page_Load(object sender, EventArgs e)
{
    using (AdventureWorks2014Entities ctx = new AdventureWorks2014Entities())
    {
        // Load every Person row with EmailPromotion == 0 into the context.
        IQueryable<Person> persons = ctx.People.Where(x => x.EmailPromotion == 0);

        // Modify each tracked entity in memory.
        foreach (Person person in persons)
        {
            person.EmailPromotion = person.EmailPromotion + 10;
        }

        // SaveChanges issues one UPDATE statement per modified row.
        ctx.SaveChanges();
    }
}

Make sure you build your application but do not run it just yet.

7) I will use the Entity Framework Profiler to gain valuable insight into the batch operation and record metrics regarding its performance. You can download it here: http://www.hibernatingrhinos.com/products/efprof. You can acquire a trial license very easily. Make sure you download and install this fantastic EF profiler.

Inside the folder where you downloaded the Entity Framework Profiler, locate the HibernatingRhinos.Profiler.Appender DLL. We will add a reference to this DLL in our web application. Make sure you do that.

In the Global.asax file we need to add the following code to the Application_Start() event handling routine.

      HibernatingRhinos.Profiler.Appender.EntityFramework.EntityFrameworkProfiler.Initialize();

Launch the EF Profiler and then build and run the application. The application will update the 11158 rows.

The results I got on my machine can be seen in the pictures below. You can see that the database has been hit 11158 times and the average duration of each hit is 8.41 ms.

You can also see the total duration of the whole batch operation: we needed 111 seconds (almost 2 minutes) to update 11158 rows. As you can understand this is not acceptable, and we need to find a more efficient way to do the batch update.
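As an aside, a common first mitigation in plain EF when touching many tracked entities is to switch off automatic change detection during the loop. A sketch, assuming the same context as in step 6:

using (AdventureWorks2014Entities ctx = new AdventureWorks2014Entities())
{
    // DetectChanges is expensive when thousands of entities are tracked;
    // turn the automatic calls off while we loop.
    ctx.Configuration.AutoDetectChangesEnabled = false;

    foreach (Person person in ctx.People.Where(x => x.EmailPromotion == 0))
    {
        person.EmailPromotion = person.EmailPromotion + 10;
    }

    // With auto-detection off we must detect the changes explicitly, once.
    ctx.ChangeTracker.DetectChanges();
    ctx.SaveChanges();
}

This trims the change-tracking overhead, but SaveChanges still issues one UPDATE per modified row, so the round-trip problem remains; hence the EF Utilities library below.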

8) Now we must install the EFUtilities library. We will do that through the NuGet Package Manager. Launch NuGet from VS and install EFUtilities. Have a look at the picture below.

Now you must add a using directive for the EntityFramework.Utilities namespace to your application. Inside WebForm1.aspx.cs add the following:

using EntityFramework.Utilities;

9) Comment out everything in the Application_Start() event handling routine in the Global.asax file. Comment out everything in the web form's Page_Load event handling routine and type only the following inside it.

protected void Page_Load(object sender, EventArgs e)
{
    using (AdventureWorks2014Entities ctx = new AdventureWorks2014Entities())
    {
        // Find the rows with EmailPromotion = 10 (the 11158 rows we just
        // updated) and add 10 to each one, in a single server-side batch.
        EFBatchOperation.For(ctx, ctx.People)
            .Where(b => b.EmailPromotion == 10)
            .Update(b => b.EmailPromotion, b => b.EmailPromotion + 10);
    }
}

10) I will use SQL Server Profiler to get the metrics for the batch operation using the EntityFramework.Utilities library. I launch SQL Server Profiler, connect to the SQL Server Enterprise edition instance on my machine, and then create a trace with the SQL:BatchCompleted and SQL:BatchStarting events and a filter on the Database column.

Have a look at the picture below

11) Build and run your application and look at the SQL Server Profiler trace. The Duration column indicates that the batch operation needed only 56 ms to update 11158 rows.

Have a look at the picture below.

The results are amazing. The reason for such a vast performance increase is that with traditional EF a separate UPDATE statement runs for each row in the table, while the batch update operation provided by EF Utilities runs a single set-based UPDATE statement, utilizing the SqlBulkCopy class. Inserts and deletes also work directly against the database.
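Batch insert and delete follow the same EFBatchOperation pattern. A sketch of the corresponding calls, based on the library's documented API (newPeople below is a hypothetical List&lt;Person&gt;):

// Batch insert: the entities are streamed in with SqlBulkCopy.
EFBatchOperation.For(ctx, ctx.People).InsertAll(newPeople);

// Batch delete: a single DELETE statement, no entities are loaded.
EFBatchOperation.For(ctx, ctx.People)
    .Where(b => b.EmailPromotion == 20)
    .Delete();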

My laptop is a Toshiba Qosmio with an Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz, 32 GB of RAM and a 256 GB SSD drive.

Batch Update Operation with EF (11158 rows affected): 111 seconds

Batch Update Operation with EF Utilities (11158 rows affected): 56 ms - that means it is roughly 1982 times faster

Hope it helps!!!
