A very common challenge in T-SQL development is filtering a result so it only shows the last row in each group (partition, in this context). Typically, you’ll see these types of queries for SCD 2 dimension tables, where you only want the most recent version for each dimension member. With the introduction of windowed functions in SQL Server, there are a number of ways to do this, and you’ll see that performance can vary considerably.
The SQL Server query optimizer can find interesting ways to tackle seemingly simple operations that can be hard to optimize. Consider the following query on a table with two indexes, one on (a), the other on (b):
SELECT a, b FROM #data WHERE a<=10 OR b<=10000;
The basic problem is that we would really want to use both indexes in a single query.
In this post, we’re going to take a look at a few examples of how this type of query would be optimized, as well as how statistics can affect the query plan, and finally, we’ll take a look at a slightly rare plan operator called “Merge Join (Concatenation)”.
Among the three different types of join operators used by SQL Server, the HASH JOIN does some of the hardest work. It scales pretty well and is very suitable for parallel processing. As such, it can be very powerful in many applications, but hash joins can potentially consume quite a bit of memory, so seeing on in your query plan could be an indicator of a performance tuning issue in your query or data.
I took some time today to read up on Dale Burnett’s series on statistics in SQL Server. In seven detailed posts, Dale goes through different aspects of how statistics work in SQL Server, and how to use them to your advantage.
If you’re into query optimization, and you’ve got your glasses on, chances are this article is for you.
Knowing how to read a query plan is absolutely key to optimizing SQL Server query performance. The query plan tells you how SQL Server goes about running your query, including what indexes are used (and how), what join strategies are applied and a lot of other information. If you can read the query plan, you can make the appropriate changes to indexes, query hints, join conditions, etc to tune your workload for optimum performance.