Speed Matters: Subquery vs. Table Variable vs. Temporary Table

This has probably been written about quite a number of times by T-SQL gurus out there, but earlier I was presented with some T-SQL code that prompted me to recheck my notions on subqueries, table variables, and temporary tables. The code in question is built dynamically and depends heavily on user input. The issue presented was performance: the code used to run acceptably until recently, when things became annoyingly slow.

Upon seeing the code, I said to myself, problem solved. I thought I had instantly spotted what was causing the performance issue: the code was written using T-SQL's temporary tables. Based on experience, I would automatically shoot down any attempt to use temporary tables in a T-SQL batch. And though Microsoft has introduced table variables as a better alternative to temporary tables, better in terms of technicality and programmability, reaching for either option would always be my last resort; in most cases, I'd rather tackle difficult scenarios that require handling intermediate data using subqueries. In fact, I immediately ruled a rewrite of the code to minimize the use of temporary tables, or at least replace them with table variables, which were supposed to be a lot better. Upon further investigation, I ruled a rewrite anyway, for reasons unrelated to the topic at hand, and meanwhile tried to figure out how things could be converted to subqueries, which I would primarily prefer.

In my investigation, though, something interesting came up, and I thought I'd better time the executions of my experiment while I put together templates for how things would be done with subqueries. The results now intrigue me a bit, and I will probably take a second look at the code I was shown earlier; I might be looking in the wrong place. Nevertheless, the results of my experiment might interest others out there. So here it is:

The Experiment 

I always love to use one of our company's products, the library system, to test anything to do with T-SQL, as its database is diverse in design and test data is available in great abundance. I picked one small sample database for my experiment, and this is how it went:

  • Test 1 is intended to check how the different approaches fare on small sets of data. I query all library materials written in ‘Filipino’ (974 rows out of the 895,701 rows in a single table). From that result, I search for the string pattern “bala” in all of the materials’ titles using the LIKE operator, which eventually yields 7 rows.
  • Test 2 is basically the same query against the same environment, with different conditions, to see how the three approaches fare on large sets of data. Here I query all library materials written in ‘English’ (43,896 rows out of the 895,701), then search for the string pattern “life” in the titles using the LIKE operator, which eventually yields 440 rows.

I wrote three queries, each using a specific approach to produce the desired output for this experiment.

The first query uses SUBQUERIES:
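Something along these lines; the exact queries aren't reproduced here, so the table and column names (Materials, MaterialID, Title, [Language]) are stand-ins for the library schema:

    -- Sketch of the subquery approach: filter by language in a derived
    -- table, then apply the LIKE pattern over that intermediate result.
    SELECT m.MaterialID, m.Title
    FROM (SELECT MaterialID, Title
          FROM Materials
          WHERE [Language] = 'Filipino') AS m
    WHERE m.Title LIKE '%bala%';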

The second query uses TABLE VARIABLES:
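The same filter, staged through a table variable this time (same assumed names):

    -- Sketch of the table variable approach: materialize the intermediate
    -- rows in a table variable, then run the LIKE search against it.
    DECLARE @Materials TABLE (MaterialID INT, Title NVARCHAR(255));

    INSERT INTO @Materials (MaterialID, Title)
    SELECT MaterialID, Title
    FROM Materials
    WHERE [Language] = 'Filipino';

    SELECT MaterialID, Title
    FROM @Materials
    WHERE Title LIKE '%bala%';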

The third and last query uses TEMPORARY TABLES:
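Again a sketch with the same assumed names:

    -- Sketch of the temporary table approach: the same staging step, but
    -- the intermediate rows land in a #table in tempdb.
    CREATE TABLE #Materials (MaterialID INT, Title NVARCHAR(255));

    INSERT INTO #Materials (MaterialID, Title)
    SELECT MaterialID, Title
    FROM Materials
    WHERE [Language] = 'Filipino';

    SELECT MaterialID, Title
    FROM #Materials
    WHERE Title LIKE '%bala%';

    DROP TABLE #Materials;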

I ran all three queries (3x each) after a service restart to clean up tempdb, and I got the following results. Figures are in seconds:

             Total Rows     Intermediate Rows   Final Result   Subqueries   Table Variables   Temporary Tables
    Test 1   895,701 rows   974 rows            7 rows         2 secs       4 secs            2 secs
    Test 2   895,701 rows   43,896 rows         440 rows       3 secs       23 secs           4 secs
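For anyone who wants to reproduce this kind of measurement, a simple pattern is to bracket each batch with time checks on a test server where the caches can be cleared; a minimal sketch:

    -- Timing harness for a test server only: DBCC DROPCLEANBUFFERS needs
    -- elevated permissions and flushes the buffer cache server-wide.
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;  -- start each run from a cold buffer cache

    DECLARE @start DATETIME;
    SET @start = GETDATE();

    -- ... one of the three queries goes here ...

    SELECT DATEDIFF(ms, @start, GETDATE()) AS ElapsedMs;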


The results proved pretty interesting. Based on the queries I wrote, I had expected temporary tables to be the slowest of the three approaches, if only slightly. The execution plans for the table variable and the temporary table are pretty much the same, so I was a bit surprised that the table variable's execution time was almost twice the temporary table's on the small data set, and that the temporary table beats the table variable by a large margin on the large data set. I was expecting only a slight difference: though the two differ somewhat in scoping, technicality, and to some extent purpose, both use tempdb when handling data (contrary to the notion that table variables live purely in memory). What is notable, though, is that disk I/O is costlier with table variables. I am not sure at this point what causes this cost; I will try to dig deeper when I have more time.
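A quick way to see the tempdb connection for yourself: declare a table variable and, in the same batch, peek at tempdb's catalog. A system-named work table (its name starts with #) shows up:

    -- Demonstrates that a table variable is backed by a real object in
    -- tempdb rather than living purely in memory.
    DECLARE @t TABLE (id INT);

    SELECT TOP (1) name, create_date
    FROM tempdb.sys.tables
    ORDER BY create_date DESC;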

Nevertheless, this particular experiment does not support a general conclusion that one of the three approaches is better than the others. I still subscribe to the general notions:

  • That subqueries are neat and will perform better in most, if not all, cases.
  • That temporary tables will be slower in environments with heavy concurrency, since SQL Server has to handle more locks, especially when they are used within a transaction. I have not tested that scenario here, though.
  • That temporary tables may be the faster option when constraints or indexes are needed between queries, especially when a large data set is expected: statistics are not created on a table variable's columns, and a table variable can't have indexes added to it (see the sketch after this list). Surprisingly, though, this experiment shows table variables performing the worst of the three, with the large data set being their slowest case.
  • That table variables are a good option technically and programmatically.
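To make the indexing point concrete, here is a sketch of the difference (same assumed names, SQL Server 2005/2008 era):

    -- A temporary table can take indexes after it is created...
    CREATE TABLE #Materials (MaterialID INT, Title NVARCHAR(255));
    CREATE NONCLUSTERED INDEX IX_Title ON #Materials (Title);
    DROP TABLE #Materials;

    -- ...while a table variable only gets the indexes implied by inline
    -- constraints; CREATE INDEX against it is not allowed, and SQL Server
    -- keeps no statistics on its columns.
    DECLARE @Materials TABLE (
        MaterialID INT PRIMARY KEY,  -- implicit index via the constraint
        Title NVARCHAR(255)
    );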

You be the judge; it would be prudent to conduct your own tests. I can't claim my experiment here is conclusive, but from here on, I will avoid using table variables whenever I can. For very simple queries like those in these tests, they performed poorly; I can't imagine how things would go with larger data sets and more complex queries.

I am on to my next experiment!

**************************************************


Why Am I Still Stuck with T-SQL?

Not a day passes without something new coming out from software companies like Microsoft, and it has been a challenge for application developers like me to keep up. I happened to start my all-Microsoft-stack effort during the heyday of Visual Basic 3 and Access. Prior to that, I was a Borland kind of kid, mesmerized at how neat my code looked when printed on reams of continuous paper.

I was really fast then at absorbing new software technologies for application development and related products, but I am no longer the kid I used to be. I am now a slow, slumbering oldie when it comes to absorbing software development technologies. I actually envy the guys who are doing and using technologies fresh out of the baking oven. I feel they are so smart to figure out new things so fast. Isn't that great?

And every time I get to mingle with these developers, they often wonder why I am still stuck with Transact-SQL (T-SQL). Every time... and it never fails. It now makes me wonder myself why I am stuck with T-SQL when there are so many new alternatives, and it begs the question of whether I am still relevant with the times. Well, let's see.

Am I In? Or Out?

My first serious brush with T-SQL came when I was contracted to develop a school library system. I thought of using VB3 + Access, but being a fast kid then, I opted for the latest and ended up using VB4 and SQL Server 6.5. I was mostly writing VB code, using DAO/ADO to connect to my databases, and still doing client-side cursors when going through my tables here and there. With my Clipper-heavy and Access background, I had trouble adapting to SQL's set-based processing and mindset. But in no time, I absorbed T-SQL and began moving some of my data-manipulation code into stored procedures.

When VB5 came out, I decided to upgrade the school library application with an improved database structure and all data manipulation in stored procedures. This time, no more non-T-SQL code for my data. I was even able to reuse some T-SQL code from the app's previous version.

When VB6 came out, Microsoft touted a better data access component in RDO. Around that time, I was getting more libraries to use my system, so I virtually kept up with anything new from Microsoft. I upgraded the front-end portion of my library system while reusing all my stored procedures.

Shortly after, the Web's irresistible force dawned on me, and I took up ASP and VBScript and migrated a portion of my library application to face the web. During this time, I also upgraded to SQL Server 7.0. I had some inline SQL code that was prone to SQL injection, but I was able to retain all my stored procedures.

When .NET came out, I had my library system upgraded to an entirely new schema, platform, and language (ASP.NET, Windows Forms, ADO.NET, SQL Server 2000/2005). This time, I hired somebody else to code it for me.

Never Obsolete

With all the changes I made to the library system, the only technology that remained constant was T-SQL. In most cases, I was able to reuse code. In all cases, I was able to take advantage of reusing my T-SQL experience... all this while managing to bulk up my knowledge of T-SQL.

VB4 is gone; my T-SQL is still here. VB5 is gone; my T-SQL is still here. VB6 is gone; my T-SQL is still here. ASP is gone; still my T-SQL is here. DAO, ADO, and RDO are gone, but my T-SQL remains. I moved from VB to C#, yet I am still using T-SQL.

Today we have ADO.NET; we have LINQ. Soon they too will be gone (I have heard LINQ to SQL is now deprecated). Tomorrow I could probably move from ASP.NET to Silverlight or something else new, or from ASP.NET to PHP, but I have the feeling I would still be using T-SQL. Microsoft is even going back to something as basic as tables and queues with Azure, but it can't ignore T-SQL; thus we have SQL Azure.

Betting My Future

I am inclined to think that my investment in T-SQL has already paid me back immensely, and I am still reaping its benefits. With the advent of the relatively new cloud computing, and with various players offering various cloud technologies and services, I can't help the urge to identify which parts of the cloud computing stack will survive the onslaught of constant change and manage to stay relatively stable. I am afraid some of our current investments in other technologies won't be as useful in the cloud, but I am betting my future once again on SQL.

What about you?

**************************************************
Toto Gamboa is a consultant specializing in databases, Microsoft SQL Server, and software development, operating in the Philippines. He is currently a member and one of the leaders of the Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter, and is one of Microsoft's MVPs for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com