I recently worked with a large set of accounting transactions. I needed to split those rows into multiple logical batches, but each batch had to be logically consistent – among other things, those batches had to be properly balanced, because accounting people are kind of fussy like that.
So I designed a little T-SQL logic that would split all of those transactions into evenly sized batches, without violating their logical groupings.
Safety glasses on. Let’s dive in.
Metadata changes, like modifying a clustered index, or many types of column changes, will create locks in SQL Server that will block users from working with that table until the change is completed. In many cases, those locks will extend to the system objects, so you won’t even be able to expand the “Tables” or “Views” nodes in Management Studio.
I want to show you how you can perform those changes using a copy of the table, then instantly switching the table with the copy. The secret is partition switching, and contrary to popular belief, you won’t need Enterprise Edition, or even partitions, to do it.
This insanely cool new performance-related update is one of nicest features in SQL Server 2019, and certainly one I’ve been looking forward to for a long time.
If you’ve done any work around performance tuning and user-defined scalar functions, I’m pretty sure you’ll love this.
Typically, the advice on fill factor is “if it ain’t broke, don’t fix it”. But occasionally, you’ll find a database or even a server with a crazy default setting that just fills your disk and buffer pool without any real benefit. Here’s a nifty script to rebuild any tables and indexes to a fill factor of 100%.
We need to talk about the nullable columns in your database. Specifically, because of how NULL values are compared, they can dramatically affect how some lookup operations perform.
Scalar functions can be a real headache when you’re performance tuning. For one, they don’t parallelize. In fact, if you use a scalar function in a computed column, it will prevent any query that uses that table from going parallel – even if you don’t reference that column at all!
I just remembered a pretty common data challenge the other day. Suppose you have a number of tables, all with similar information in them. You want to union their contents, but you need to prioritize them, so you want to choose all the rows from table A, then rows from table B that are not included in A, then rows from C that are not included in A or B, and so on.
This is a pretty common use case in data cleansing or data warehousing applications. There are a few different ways to go about this, some more obvious than others.
I’m an outspoken advocate of always using a clustered index on each and every table you create as a matter of best practice. But even I will agree that there’s a case for using the odd heap now and then.
For practically every piece of code you develop, there will be trade-offs. Sometimes, you can combine the best of two worlds, other times it comes down to some hard choices. For T-SQL developers, it typically boils down to a few key questions:
- How much time can you spend perfecting code instead of just shipping?
- Can we just fix it when it becomes a problem?
- Is buying more hardware cheaper than paying for developers to tune their code?
- Is better code harder to read, and will a junior developer be able to work with it?
In an attempt to try a different approach, here’s a three-minute video explanation of how the different physical join operators in SQL Server work and why you would choose one over the other.
I’ve written a few blog posts on join operators befores, so if this video wet your appetite, here’s some recommended reading:
I’d love to hear what you think of the short video format! Please leave feedback in the comments below or on Twitter.