JSON indexes in SQL Server: First impressions

At the November announcement of SQL Server 2025, a ton of cool new features were unveiled. It’s obviously very heavy on AI and Fabric, but those are the times we live in. If you squint a little, tough, you can make out quite a few really nice developer-centric features. And if you know anything about me, I’m a big fan of developer features and performance improvements.

Starting today, the public preview of SQL Server 2025 is available to download!

One really interesting new feature that got my attention was the addition of JSON indexes. I’m a big fan of everything that makes working with JSON easier, since JSON blobs are so much easier to work with than table variables when you’re moving data from point A to point B. This is especially true when you’re working with complex, relational data.

T-SQL math patterns

As part of spending waaaaaay to much time trying to solve the 2023 Advent of Code challenges, I came across multiple instances where I had to dust off some old math that I hadn’t paid attention to since I went to school back in the 90ies.

So for my own convenience, and yours, I’ve built functions for some common math that you might perhaps encounter at some point. I found this whole experience to be a great way to familiarize myself with a lot of the new functionality in SQL Server 2022, including GENERATE_SERIES(), LEAST(), GREATEST() and more. The Github repo contains a SQL Server 2019 version where I’ve built drop-in versions of the 2022 functions, but they probably won’t perform as well as the built-in stuff.

If there are any other functions you’d like to add, feel free to add a comment to this post, or, by all means, a pull request to the repo.

How to join overlapping date ranges

You can get into a situation where you have two tables with values associated with date ranges. What’s worse, those date ranges don’t necessarily have to align, which can make joining them a seemingly complex task, but it is surprisingly simple when you learn how to think of overlapping date ranges, along with this relatively simple T-SQL join pattern.

Indexing for substring search

A question from a client got me thinking. Relational databases (at least the ones I know and love) can’t really index for queries that use LIKE queries for a substring of a column value. If you want to search for strings beginning with a given string, a regular rowstore index will have you covered. If you’re looking for entire words or sentences, a full text index might be a good call. But because of the very way indexes work, you’ll never get great performance searching for just arbitrary parts of a string.

So today I’ll put on my lab coat and do a little rocket surgery, just to prove to the world that it can be done.

Professional driver on closed roads, always wear protection. Your mileage may vary.

The quirky and wonderful self-join optimization

This blog post started as a “what if” contemplation in my head: Suppose you have a reasonably large table with a clustered index and a number of non-clustered indexes. If your WHERE clause filters by multiple columns covered by those non-clustered indexes, could it potentially be faster to rewrite that WHERE clause to use those non-clustered indexes?

The answer might surprise you.

A quick look at SQL Server UTF-8 collations

A client asked me about SQL Server collations, and if they should consider the new UTF8 collations (new since SQL Server 2019). I tried to hide my blank stare of ignorance, and promised them I’d look it up and get back to them.

Not gonna lie, I think UTF and Unicode can be pretty confusing at times, so I did some googling and some testing, and here’s what I found.

I ❤️ QUOTENAME()

An underrated, and perhaps less well-known T-SQL function is QUOTENAME(). If you haven’t heard of it, chances are that it could do wonders for your dynamic SQL scripts.

To quickly recap quoting, consider the following script:

SELECT N'DROP PROCEDURE '+OBJECT_SCHEMA_NAME([object_id])+N'.'+[name]+N';'
FROM sys.procedures
WHERE [name] LIKE N'%test';

What happens if one of your object names contains a space, a quote, an apostrophe, a square bracket, etc? You’ll end up with a syntax error, or even worse, a SQL injection attack (pretty elaborate, but still quite possible). To solve for this, we quote the object names. In SQL Server, you can surround schema and object names with double quotes (if you’ve set QUOTED_IDENTIFIER) or square brackets.

Simple, right?

SELECT N'DROP PROCEDURE ['+OBJECT_SCHEMA_NAME([object_id])+N'].['+[name]+N'];'
FROM sys.procedures
WHERE [name] LIKE N'%test';

But just adding a [ before and a ] after won’t work if your evil user as embedded square brackets or a semicolon in the object name. What if your object name is “Testing [quoting]; test”?

Secure your temporal table history

You may have already discovered a relatively new feature in SQL Server called system-versioned temporal tables. You can have SQL Server set up a history table that keeps track of all the changes made to a table, a bit similar to what business intelligence people would call a “slowly changing dimension”.

CREATE SCHEMA App;

CREATE TABLE App.Customers (
    Company_ID      int IDENTITY(1, 1) NOT NULL,
    CompanyName     nvarchar(250) NOT NULL,
	Email           varchar(250) NOT NULL,
	Valid_From      datetime2(7) GENERATED ALWAYS AS ROW START NOT NULL,
	Valid_To        datetime2(7) GENERATED ALWAYS AS ROW END NOT NULL,
    CONSTRAINT PK_Customers PRIMARY KEY CLUSTERED (Company_ID),
	PERIOD FOR SYSTEM_TIME (Valid_From, Valid_To)
) WITH (SYSTEM_VERSIONING=ON);

What happens behind the scenes is that SQL Server creates a separate table that keeps track of previous versions of row changes, along with “from” and “to” timestamps. That way, you can view the contents of the table as it was at any given point in time.

But how to you version the contents of a table, while hiding things like deleted records from prying eyes?

Watch out for Merge Interval with date range Index Seeks

In my last post, I found that DATEDIFF, DATEADD and the other date functions in SQL Server are not as datatype agnostic as the documentation would have you believe. Those functions would perform an implicit datatype conversion to either datetimeoffset or datetime (!), which would noticeably affect the CPU time of a query.

Well, today I was building a query on an indexed date range, and the execution plan contained a Merge Interval operator. Turns out, this operator brings a few unexpected surprises to your query performance. The good news is, it’s a relatively simple fix.

DATEDIFF performs implicit conversions

As I was performance tuning a query, I found that a number of date calculation functions in SQL Server appear to be forcing a conversion of their date parameters to a specific datatype, adding computational work to a query that uses them. In programming terms, it seems that these functions do not have “overloads”, i.e. different code paths depending on the incoming datatype.

So let’s take a closer look at how this manifests itself.