Calculating median using the NTILE function

A while back, I was shown an absolutely gorgeous median calculation using the OFFSET/FETCH functionality introduced in SQL Server 2012. But it got me thinking: how do you calculate the median on an older version of SQL Server? Here's one way to do it, using the NTILE() function.
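
In broad strokes, the trick is to split the ordered rows into two halves with NTILE(2) and look at the values on either side of the boundary. Here's a minimal sketch; the dbo.Sales table and its Amount column are hypothetical stand-ins for your own data.

```sql
WITH tiles AS (
    SELECT Amount,
           NTILE(2) OVER (ORDER BY Amount) AS tile,
           COUNT(*) OVER () AS row_count
    FROM dbo.Sales
)

SELECT CASE
           --- Odd row count: NTILE puts the extra row in the first tile,
           --- so the median is the highest value of that tile.
           WHEN MAX(row_count) % 2 = 1
               THEN 1.0 * MAX(CASE WHEN tile = 1 THEN Amount END)
           --- Even row count: the median is the average of the two
           --- values on either side of the tile boundary.
           ELSE 0.5 * (MAX(CASE WHEN tile = 1 THEN Amount END) +
                       MIN(CASE WHEN tile = 2 THEN Amount END))
       END AS Median
FROM tiles;
```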

Aggregate string concatenations

A common reporting scenario is having to concatenate string values from multiple rows into a single delimited string. Many other database platforms even have built-in aggregate functions that will do this for you (like LISTAGG() on Oracle). In this post, we'll take a look at how you can achieve the same result in T-SQL, using the APPLY operator and the XML datatype.
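
As a teaser, here's roughly what the pattern looks like. The dbo.Categories and dbo.Products tables are hypothetical; the correlated FOR XML PATH('') subquery flattens the product names into one string, and STUFF() trims off the leading delimiter.

```sql
SELECT cat.CategoryName, x.ProductList
FROM dbo.Categories AS cat
CROSS APPLY (
    SELECT STUFF((
        --- Build ", name" fragments and collapse them into one value:
        SELECT ', ' + p.ProductName
        FROM dbo.Products AS p
        WHERE p.CategoryID = cat.CategoryID
        ORDER BY p.ProductName
        FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'),   --- extract entity-safe text from the XML
    1, 2, '') AS ProductList         --- strip the leading ", "
) AS x;
```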

Traversing parent-child relations

In this short tutorial, we’ll look at traversing parent-child structures using recursive common table expressions, and turning the data into human-readable lists. This is a great way to represent paths and hierarchy-based data in reports and end-user outputs.
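
Here's a minimal sketch of the pattern, assuming a hypothetical dbo.Employees table where ManagerID points to each row's parent.

```sql
WITH hierarchy AS (
    --- Anchor: the root rows (no parent).
    SELECT EmployeeID, ManagerID,
           CAST(EmployeeName AS varchar(max)) AS DisplayPath
    FROM dbo.Employees
    WHERE ManagerID IS NULL

    UNION ALL

    --- Recursion: append each child to its parent's path.
    SELECT e.EmployeeID, e.ManagerID,
           CAST(h.DisplayPath + ' > ' + e.EmployeeName AS varchar(max))
    FROM hierarchy AS h
    INNER JOIN dbo.Employees AS e ON e.ManagerID = h.EmployeeID
)

SELECT DisplayPath
FROM hierarchy
ORDER BY DisplayPath;
```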

Calculating business days and holidays

A common scenario you may have encountered is the need to calculate the first business day after a given date. There are quite a few ugly ways to solve this, including cursors, but there are also some pretty neat ones, and as a bonus, you'll learn about recursion, the LEAD() and LAG() functions introduced in SQL Server 2012, and running aggregates in T-SQL.
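
As a taste of the recursive approach, here's a minimal sketch that finds the first weekday after a given date. A real solution would also check a holiday table; the @fromDate variable is just illustrative, and note that DATENAME() output depends on the session language.

```sql
DECLARE @fromDate date = '2024-12-20';

WITH days AS (
    --- Anchor: the day after the given date.
    SELECT DATEADD(day, 1, @fromDate) AS theDate
    UNION ALL
    --- Keep stepping forward one day as long as we're on a weekend.
    SELECT DATEADD(day, 1, theDate)
    FROM days
    WHERE DATENAME(weekday, theDate) IN ('Saturday', 'Sunday')
)

--- The last generated day is the first one that isn't a weekend.
SELECT MAX(theDate) AS firstBusinessDay
FROM days;
```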

Rebuilding a transaction log file

The database’s transaction log file contains, as the name implies, a log of all the transactions happening in the database. If your database uses the “simple” recovery model, the log is truncated (its space marked for reuse) at every checkpoint, which happens more or less all the time. In the “bulk logged” or “full” recovery models, you’ll have to truncate the log yourself by performing transaction log backups.

Because of this, the log file has to stay in sync with the rest of the database at all times. But what do you do if your log file is lost or corrupted? Luckily, there’s a fairly simple way to rebuild it.
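
One common approach, assuming the database was shut down cleanly and the data (.mdf) file is intact, is to re-attach the data file and let SQL Server build a fresh log. The database name and file path below are hypothetical; adjust them to your environment.

```sql
--- Attach the data file without a log file; SQL Server creates a new,
--- empty transaction log for the database.
CREATE DATABASE MyDatabase
ON (FILENAME = 'D:\Data\MyDatabase.mdf')
FOR ATTACH_REBUILD_LOG;
```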

Efficient data, part 6: Versioning changes

This installment in the series on efficient data is about versioning changes in a table. The article is a re-post of a post I wrote in September on compressing slowly changing dimensions, although the concept isn’t limited to dimensions; it can be applied to pretty much any data that changes over time.

The idea is to “compress” a versioned table: instead of keeping one dated row for every version, you collapse multiple sequential, identical versions into a single row with a “from” date and a “to” date. This can significantly reduce the size of the table.
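
As a sketch of the idea, here's a classic gaps-and-islands approach against a hypothetical dbo.PriceHistory table with one dated row per version. Note that LAG() and the windowed SUM() used here require SQL Server 2012 or newer.

```sql
WITH marked AS (
    --- Flag each row that starts a new version (the price changed).
    SELECT ProductID, PriceDate, Price,
           CASE WHEN Price = LAG(Price) OVER (
                    PARTITION BY ProductID ORDER BY PriceDate)
                THEN 0 ELSE 1 END AS isNewVersion
    FROM dbo.PriceHistory
),

grouped AS (
    --- A running total of the flags numbers each unbroken run
    --- of identical versions.
    SELECT ProductID, PriceDate, Price,
           SUM(isNewVersion) OVER (
               PARTITION BY ProductID ORDER BY PriceDate
               ROWS UNBOUNDED PRECEDING) AS versionGroup
    FROM marked
)

--- Collapse each run into a single row with a from/to range.
--- (Here, validTo is the last date the version was observed.)
SELECT ProductID, Price,
       MIN(PriceDate) AS validFrom,
       MAX(PriceDate) AS validTo
FROM grouped
GROUP BY ProductID, versionGroup, Price;
```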

Efficient data, part 1: Normalization

We’ve talked a lot about optimizing queries and query performance, but we haven’t really touched much on the storage and data modelling aspects. In this series of posts, I’ll run through some basic tips on how you can model and store your data more efficiently. These come in particularly handy when you’re working with large databases and high transaction volumes, but a lot of it is simply good design sense in smaller databases, too.

In this first article, we’ll cover the normalized data model.
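
As a hypothetical before-and-after: rather than repeating a customer's name and address on every order row, the normalized model stores each customer exactly once and references it by key.

```sql
--- Each customer is stored once...
CREATE TABLE dbo.Customers (
    CustomerID      int IDENTITY(1, 1) PRIMARY KEY,
    CustomerName    nvarchar(100) NOT NULL,
    CustomerAddress nvarchar(200) NOT NULL
);

--- ... and every order references the customer by key instead of
--- repeating the name and address.
CREATE TABLE dbo.Orders (
    OrderID    int IDENTITY(1, 1) PRIMARY KEY,
    CustomerID int NOT NULL
        REFERENCES dbo.Customers (CustomerID),
    OrderDate  date NOT NULL
);
```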

Working with dependencies

Working with dependencies, particularly recursive dependencies, isn’t always intuitive, but it can be critical knowledge in your database development work. This article focuses primarily on different ways of visualizing dependencies and how to loop through them using recursive common table expressions.
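
As a taste, here's a minimal sketch that walks the dependency chain of a hypothetical stored procedure, dbo.usp_Report, using the sys.sql_expression_dependencies catalog view.

```sql
WITH deps AS (
    --- Anchor: everything the procedure references directly.
    SELECT d.referenced_id, 1 AS lvl
    FROM sys.sql_expression_dependencies AS d
    WHERE d.referencing_id = OBJECT_ID('dbo.usp_Report')
      AND d.referenced_id IS NOT NULL

    UNION ALL

    --- Recursion: everything those objects reference, and so on.
    SELECT d.referenced_id, deps.lvl + 1
    FROM deps
    INNER JOIN sys.sql_expression_dependencies AS d
        ON d.referencing_id = deps.referenced_id
    WHERE d.referenced_id IS NOT NULL
)

SELECT DISTINCT lvl,
       OBJECT_SCHEMA_NAME(referenced_id) + '.' +
       OBJECT_NAME(referenced_id) AS referencedObject
FROM deps
ORDER BY lvl
--- Caps the recursion depth in case of circular dependencies:
OPTION (MAXRECURSION 100);
```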