A Funny Thing Happened While Practicing My Batch Update Demo...

Wednesday, May 25, 2005

Last month I did a presentation on ADO.NET 2.0 for the Israel Visual Basic User Group. Being a dilligent presentor, I made sure to run through my demos a few days before. Many of these demos I had use for previous presentations on ADO.NET 2.0. Of course, since I was now running on a different version of the Framework and of VS, there were a few changes - no big deal.

However, there was one shocker. One of the coolest (if not simplest) enhancements in 2.0 is that you can specify a batch size for the DataAdapters updates. In 1.x, when you call DataAdapter.Update (), it will execute the appropriate commands to update the database, for each modified row in a DataTable. The problem is that each update command is executed separately - meaning a separate round-trip to the server for each modified row.

So the ADO team added the BatchSize property to the DataAdapter class to specify how many records should be sent in a single batch to the server (no, bigger is not neccesarily better - there is a point of diminishing returns, where the batch size is so large that the setup and tear-down time for the batch seriously reduces the benefits of sending in a batch).

So far, so good. But how to you prove this to a (presumably cynical) crowd of developers ? What I had down was run the application with BatchSize =1 and look in the SQL Server Profiler to see the individual commands being sent to the server, and then run it again with BatchSize = 5 and see only 1 command being sent to the server.

When I did this in previous presentations, the above worked as advertised and everyone was impressed. However, in my trial run at home last month, I suddenly saw 5 commands being sent to the server, even with BatchSize = 5 !!!! Imagine my surprise and horror....

To make a long story short, I figured either (1) I had some gross misunderstanding of something that was going on, or (2) this feature was just broken in that CTP version of VS/.NET. But then I remembered a conversation I had a few months earlier with Pablo Castro of the ADO team and realized that the correct answer was (3) it was by design ! I called Pablo (in a panic...) and he confirmed my suspicion and help prove to me that there really was fewer network round-trips, even though multiple commands were arriving at the SQL Server (which is all SQL Profiler can measure).

What had happened was that the whole batch update mechanism was rewritten, mainly to address 2 problems that existed with the earlier Beta implementation: (1) When sending a single all of the updates in a batch as a single command, you would easily hit the limit of about 2K parameters per command, and (2) creating batches of different commands every time would wreak havoc on the query plan cache.

Pablo recently posted a great blog entry that clearly explains all of this. Check it out.

And always practice your demos.

No Comments