A very common challenge in T-SQL development is filtering a result so it only shows the last row in each group (partition, in this context). Typically, you’ll see these types of queries for SCD 2 dimension tables, where you only want the most recent version for each dimension member. With the introduction of windowed functions in SQL Server, there are a number of ways to do this, and you’ll see that performance can vary considerably.
The SQL Server query optimizer can find interesting ways to tackle seemingly simple operations that can be hard to optimize. Consider the following query on a table with two indexes, one on (a), the other on (b):
SELECT a, b FROM #data WHERE a<=10 OR b<=10000;
The basic problem is that we would really want to use both indexes in a single query.
In this post, we’re going to take a look at a few examples of how this type of query would be optimized, as well as how statistics can affect the query plan, and finally, we’ll take a look at a slightly rare plan operator called “Merge Join (Concatenation)”.
Here’s a quick tip: When you’re evaluating query strategies, you may want to consider how your query will scale when the volume in your database goes up. This does not neccessarily mean that you have to start filling your tables with gigabytes on gigabytes of data.
Back in 2014 I wrote a blog post on how to calculate a median value using the NTILE window function. It’s far from the best performing solution there is, but it worked on SQL Server 2008, before the introduction of OFFSET-FETCH i SQL Server. In this post, I’m going to look at creating a generalized function that calculates the median (or any percentile) of a series of values.
It’s been absolutely ages since I last wrote a blog post, mostly because I’ve been busy getting my shiny new own consultancy up to speed, but I’ll admit that writer’s block has also been a factor.
But here’s something to write home about. This year’s annual Swedish SQL Server usergroup challenge was as interesting as ever, and it marks my third stab at this prestigious competition. In this post, I’ll go through my contribution, highlighting some of the techniques that I’ve applied to make it go really fast.
Sunday came early this week. In this longer-than-usual post, I’m going to show you the details of my contribution to the 2014 edition of the annual Swedish SQL Server usergroup challenge.
Transactions are great. They keep your data together atomically, so you’re not in for any nasty surprises. But even a novice knows better than to leave transactions open, waiting for user interaction. If you do, lock waits and probably deadlocks will pile up in no time.
So how do you book a flight without blocking all the other users or losing your seat to somebody else while you make up your mind?
Talk to SQL Server developers or DBAs about histograms, and they’ll inevitably think of index statistics. However, a task you may encounter some day is to calculate the distribution of numbers in a table. And although there’s no quick built-in function to do this, it’s not as difficult as you may think.
The MERGE statement is a very powerful way to combine INSERT, UPDATE and/or DELETE in a single statement. But there’s more than meets the eye. There are situations where you could use a MERGE statement to perform just one of those tasks.
Some database operations can be performed in distinctly different manners, with different impacts on query performance. One important example of such an operation is calculating an aggregate. In this article, we’ll take a look at how aggregates can be “blocking” or “non-blocking”, how it affects memory allocation, and ultimately, what impact this has on your query.