You can get into a situation where you have two tables with values associated with date ranges. What’s worse, those date ranges don’t necessarily align, which can make joining them seem like a complex task. It turns out to be surprisingly simple once you learn how to think about overlapping date ranges and apply a relatively simple T-SQL join pattern.
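As a rough illustration of what such a join can look like, here is a minimal sketch. It is not necessarily the pattern the post itself arrives at. The table variables `@prices` and `@discounts`, their columns and the sample rows are all made up for the example, and the ranges are assumed to be half-open (`from_date` inclusive, `to_date` exclusive). Two ranges overlap when each one starts before the other one ends.

```sql
-- Hypothetical sample data; names and values are illustrative only.
DECLARE @prices TABLE (price_id int, from_date date, to_date date);
DECLARE @discounts TABLE (discount_id int, from_date date, to_date date);

INSERT INTO @prices VALUES (1, '2024-01-01', '2024-04-01'),
                           (2, '2024-04-01', '2024-09-01');
INSERT INTO @discounts VALUES (10, '2024-03-01', '2024-05-01'),
                              (20, '2024-08-15', '2024-12-01');

SELECT p.price_id,
       d.discount_id,
       -- The overlap starts at the later of the two start dates...
       CASE WHEN p.from_date > d.from_date THEN p.from_date ELSE d.from_date END AS overlap_from,
       -- ...and ends at the earlier of the two end dates.
       CASE WHEN p.to_date < d.to_date THEN p.to_date ELSE d.to_date END AS overlap_to
FROM @prices AS p
INNER JOIN @discounts AS d
    ON p.from_date < d.to_date      -- p starts before d ends
   AND d.from_date < p.to_date;     -- d starts before p ends
```

With inclusive end dates, the two comparisons in the join condition would need `<=` instead of `<`.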
Category: Intermediate
The quirky and wonderful self-join optimization
This blog post started as a “what if” contemplation in my head: Suppose you have a reasonably large table with a clustered index and a number of non-clustered indexes. If your WHERE clause filters by multiple columns covered by those non-clustered indexes, could it potentially be faster to rewrite that WHERE clause to use those non-clustered indexes?
The answer might surprise you.
A quick look at SQL Server UTF-8 collations
A client asked me about SQL Server collations, and whether they should consider the new UTF-8 collations (new since SQL Server 2019). I tried to hide my blank stare of ignorance, and promised them I’d look it up and get back to them.
Not gonna lie, I think UTF and Unicode can be pretty confusing at times, so I did some googling and some testing, and here’s what I found.
Secure your temporal table history
You may have already discovered a relatively new feature in SQL Server called system-versioned temporal tables. You can have SQL Server set up a history table that keeps track of all the changes made to a table, a bit similar to what business intelligence people would call a “slowly changing dimension”.
-- CREATE SCHEMA needs to run in its own batch.
CREATE SCHEMA App;
GO

CREATE TABLE App.Customers (
    Company_ID      int IDENTITY(1, 1) NOT NULL,
    CompanyName     nvarchar(250) NOT NULL,
    Email           varchar(250) NOT NULL,
    -- Period columns, maintained by SQL Server:
    Valid_From      datetime2(7) GENERATED ALWAYS AS ROW START NOT NULL,
    Valid_To        datetime2(7) GENERATED ALWAYS AS ROW END NOT NULL,
    CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (Company_ID),
    PERIOD FOR SYSTEM_TIME (Valid_From, Valid_To)
) WITH (SYSTEM_VERSIONING=ON);
What happens behind the scenes is that SQL Server creates a separate table that keeps track of previous versions of row changes, along with “from” and “to” timestamps. That way, you can view the contents of the table as it was at any given point in time.
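To make the "point in time" part concrete, here is a minimal sketch of such a query against the App.Customers table defined above. The timestamp is an arbitrary, made-up value; note that SQL Server records the period columns in UTC.

```sql
-- Returns the rows as they looked at the given (hypothetical) point in time.
-- SQL Server reads from the current table and the history table as needed.
SELECT Company_ID, CompanyName, Email, Valid_From, Valid_To
FROM App.Customers
    FOR SYSTEM_TIME AS OF '2024-01-01T00:00:00'
ORDER BY Company_ID;
```

Without the FOR SYSTEM_TIME clause, the same query returns only the current rows.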
But how do you version the contents of a table, while hiding things like deleted records from prying eyes?
Querying a single table can use multiple indexes
Can SQL Server piece together two different indexes in a single-table query, rather than just giving up and scanning a suboptimal clustered index? The short answer is: yes, in a fairly narrow band of conditions.
Watch out for Merge Interval with date range Index Seeks
In my last post, I found that DATEDIFF, DATEADD and the other date functions in SQL Server are not as datatype agnostic as the documentation would have you believe. Those functions would perform an implicit datatype conversion to either datetimeoffset or datetime (!), which would noticeably affect the CPU time of a query.
Well, today I was building a query on an indexed date range, and the execution plan contained a Merge Interval operator. Turns out, this operator brings a few unexpected surprises to your query performance. The good news is, it’s a relatively simple fix.
DATEDIFF performs implicit conversions
As I was performance tuning a query, I found that a number of date calculation functions in SQL Server appear to be forcing a conversion of their date parameters to a specific datatype, adding computational work to a query that uses them. In programming terms, it seems that these functions do not have “overloads”, i.e. different code paths depending on the incoming datatype.
So let’s take a closer look at how this manifests itself.
Computing the modulus from very large numbers
… and what all of this has to do with IBAN numbers.
The modulus is the remainder of a division of two integers*. Divide 12 by 4, and the result is 3. But divide 11 by 4, and the result is 2.75. This could also be expressed by saying that 11/4 is 2 with a remainder of 3. Computing that 3 is the work of the modulo operator, which in T-SQL is represented by the % operator.
Let’s explore how to compute the modulus of large numbers in SQL Server, and how this is useful in the real world.
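As a small illustration, the sketch below first shows the % operator on the numbers from the example above, then one common way to take the modulus of a number that is too large for any integer type: process the digits in chunks and carry the remainder forward. This is an assumption about how the problem can be approached, not necessarily the method the post ends up using. The digit string and the variable names are made up for the example, and the divisor 97 is chosen because it is the one used in IBAN checks.

```sql
-- The % operator on regular integers:
SELECT 12 % 4 AS twelve_mod_four,   -- 0
       11 % 4 AS eleven_mod_four;   -- 3

-- Piecewise modulus of a number stored as a string of digits (illustrative value).
DECLARE @digits varchar(100) = '3214282912345698765432161182';
DECLARE @remainder int = 0;
DECLARE @pos int = 1;

WHILE @pos <= LEN(@digits)
BEGIN
    -- Prefix the next chunk of up to 7 digits with the previous remainder.
    -- The remainder is below 97 (at most 2 digits), so the combined
    -- value has at most 9 digits and always fits in an int.
    SET @remainder = CAST(CAST(@remainder AS varchar(2))
                          + SUBSTRING(@digits, @pos, 7) AS int) % 97;
    SET @pos += 7;
END;

SELECT @remainder AS modulus_97;
```

This works because the remainder of the whole number only depends on the remainder of the digits read so far, so the leading part can be reduced before the next digits are appended.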
Turn your list into human-readable intervals
If you’ve worked with reporting, you’ve probably come across the following problem. You have a list of values, say “A, B, C, D, K, L, M, N, R, S, T, U, Z” that you want to display in a more user-friendly, condensed manner, “A-D, K-N, R-U, Z”.
Today, we’re going to look at how you can accomplish this in T-SQL, and what this has to do with window functions and gaps and islands.
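For a taste of the idea, here is a minimal gaps-and-islands sketch using the example list above. It assumes single uppercase letters and SQL Server 2017 or later (for STRING_AGG); the CTE and column names are made up. Consecutive letters share the same difference between their character code and their row number, which is what identifies each island.

```sql
WITH letters AS (
    SELECT val
    FROM (VALUES ('A'), ('B'), ('C'), ('D'), ('K'), ('L'), ('M'),
                 ('N'), ('R'), ('S'), ('T'), ('U'), ('Z')) AS v(val)
),
grouped AS (
    -- Within a run of consecutive letters, ASCII code minus row number is constant.
    SELECT val,
           ASCII(val) - ROW_NUMBER() OVER (ORDER BY val) AS grp
    FROM letters
),
islands AS (
    SELECT MIN(val) AS first_val, MAX(val) AS last_val
    FROM grouped
    GROUP BY grp
)
SELECT STRING_AGG(
           CASE WHEN first_val = last_val THEN first_val
                ELSE first_val + '-' + last_val END,
           ', ') WITHIN GROUP (ORDER BY first_val) AS intervals
FROM islands;
-- intervals: A-D, K-N, R-U, Z
```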
STRING_SPLIT(), but for quoted names
Can you apply gaps and islands logic on a string? Sure you can.