Windowed DISTINCT aggregates

2016-04-26 / Daniel Hutmacher / 5 Comments

You may have discovered that the use of DISTINCT is not supported in windowed functions. A query that uses a distinct aggregate in a windowed function,

SELECT COUNT(DISTINCT something) OVER (PARTITION BY other)
FROM somewhere;

will generate the following error message:

Msg 10759, Level 15, State 1, Line 1
Use of DISTINCT is not allowed with the OVER clause.

There are, however, a few relatively simple workarounds that are surprisingly efficient.
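As a rough illustration, here is one commonly used workaround, which is not necessarily the one the full post settles on. It adds an ascending and a descending DENSE_RANK() over the same partition and subtracts one. The table variable @somewhere and its sample rows are made up for this sketch, and the column is declared NOT NULL because DENSE_RANK() would count NULL as a value, whereas COUNT(DISTINCT ...) ignores it.

```sql
-- Hypothetical sample data, only for illustration.
DECLARE @somewhere TABLE (
    other     int NOT NULL,
    something int NOT NULL   -- NOT NULL: DENSE_RANK() would treat NULL as a value
);

INSERT INTO @somewhere (other, something)
VALUES (1, 10), (1, 10), (1, 20),
       (2, 30), (2, 30);

-- Ascending rank + descending rank - 1 equals the number of
-- distinct values of "something" within each "other" partition.
SELECT other,
       something,
       DENSE_RANK() OVER (PARTITION BY other ORDER BY something)
     + DENSE_RANK() OVER (PARTITION BY other ORDER BY something DESC)
     - 1 AS distinct_count
FROM @somewhere;
```

With this sample data, every row where other is 1 gets a distinct_count of 2, and every row where other is 2 gets 1.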

Last row per group

2016-04-11, 2016-04-09 / Daniel Hutmacher / 6 Comments

A very common challenge in T-SQL development is filtering a result so it only shows the last row in each group (partition, in this context). Typically, you’ll see these types of queries for SCD 2 dimension tables, where you only want the most recent version for each dimension member. With the introduction of windowed functions in SQL Server, there are a number of ways to do this, and you’ll see that performance can vary considerably.
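A minimal sketch of one such approach, assuming a hypothetical SCD 2 table variable @dim_customer with a valid_from column (neither name comes from the post): number the rows in each partition from newest to oldest with ROW_NUMBER(), then keep only row number 1.

```sql
-- Hypothetical SCD 2 dimension: one row per member and version.
DECLARE @dim_customer TABLE (
    customer_id int         NOT NULL,
    valid_from  date        NOT NULL,
    city        varchar(50) NOT NULL,
    PRIMARY KEY (customer_id, valid_from)
);

INSERT INTO @dim_customer (customer_id, valid_from, city)
VALUES (1, '2015-01-01', 'Oslo'),
       (1, '2016-03-01', 'Bergen'),
       (2, '2014-06-01', 'Malmo');

-- Number each member's versions, newest first, and keep the first one.
WITH ranked AS (
    SELECT customer_id, valid_from, city,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY valid_from DESC) AS rn
    FROM @dim_customer
)
SELECT customer_id, valid_from, city
FROM ranked
WHERE rn = 1;
```

This is only one of several possible formulations, and which one performs best will depend on the indexes and the data.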

Segment and Sequence Project

2014-12-07, 2014-12-03 / Daniel Hutmacher / Leave a comment

For windowed functions, SQL Server introduces two new operators in the execution plan: Segment and Sequence Project. If you’ve tried looking them up in the documentation, you’ll know that it’s not exactly obvious how they work. Here’s my stab at clarifying what they actually do.

An indented representation of a parent-child hierarchy

2014-03-30, 2018-07-15 / Daniel Hutmacher / 1 Comment

When you’re designing reports, they are often based on hierarchies represented by “nodes” in a parent-child setup. To the end user, the parent-child representation isn’t very readable, so you need to output this information in a human-readable form, for instance in a table where the names/titles are indented.

Efficient data, part 6: Versioning changes

2013-12-22, 2013-12-05 / Daniel Hutmacher / Leave a comment

This installment in the series on efficient data is on versioning changes in a table. The article is a re-post of a post I wrote in September on compressing slowly changing dimensions, although the concept does not apply only to dimensions: it can be used on pretty much any data that changes over time.

The idea is to “compress” a versioned table, so instead of just adding a date column for each version, you merge multiple, sequential versions into a single row with a “from” date and a “to” date. This can significantly reduce the size of the table.

Using DELETE on subqueries and common table expressions

2013-10-27, 2013-09-30 / Daniel Hutmacher / Leave a comment

Here’s an interesting feature I found in a colleague’s code the other day. A common task in T-SQL is eliminating duplicate records. This became a lot easier with the introduction of windowed functions, such as ROW_NUMBER(), way back in SQL Server 2005, but it turns out I’ve still been missing out on a really simple and cool solution.
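A small sketch of what this can look like, assuming a hypothetical temp table #contacts in which rows with the same name and email count as duplicates: the common table expression numbers the rows within each group of duplicates, and the DELETE targets the CTE directly, which removes the underlying rows.

```sql
-- Hypothetical table with duplicate rows, only for illustration.
CREATE TABLE #contacts (
    name  varchar(50)  NOT NULL,
    email varchar(100) NOT NULL
);

INSERT INTO #contacts (name, email)
VALUES ('Anna', 'anna@example.com'),
       ('Anna', 'anna@example.com'),
       ('Bert', 'bert@example.com');

-- Number the rows within each group of duplicates. ORDER BY (SELECT NULL)
-- means we don't care which copy gets to be row number 1.
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY name, email
                              ORDER BY (SELECT NULL)) AS rn
    FROM #contacts
)
DELETE FROM numbered   -- deletes from #contacts through the CTE
WHERE rn > 1;

SELECT name, email FROM #contacts;   -- one row left per (name, email)

DROP TABLE #contacts;
```

If it matters which copy survives, replace ORDER BY (SELECT NULL) with a real ordering column.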

Slowly changing dimensions (part 2)

2013-09-15, 2013-09-13 / Daniel Hutmacher / 1 Comment

In this second installment of the Slowly Changing Dimensions series (see part one), we’ll take a look at how to create a slowly changing dimension table in practice using T-SQL.

Ben-Gan on virtual auxiliary table of numbers

2013-05-07, 2018-07-15 / Daniel Hutmacher / Leave a comment

Check out this interesting article from SQL Server superstar Itzik Ben-Gan on Virtual Auxiliary Table of Numbers.

An introduction to windowed functions

2013-03-31, 2013-03-22 / Daniel Hutmacher / 11 Comments

Windowed functions are a powerful feature of T-SQL, allowing you to perform advanced aggregates. They provide a very efficient way of doing this once you get the hang of the OVER() clause.
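To give a feel for the OVER() clause, here is a small sketch using a made-up table variable @sales (the names and values are assumptions for this example, not taken from the post). Each detail row is kept as it is, while the windowed functions add a per-region total and a per-region row number alongside it.

```sql
-- Hypothetical sample data, only for illustration.
DECLARE @sales TABLE (
    region    varchar(10) NOT NULL,
    sale_date date        NOT NULL,
    amount    money       NOT NULL
);

INSERT INTO @sales (region, sale_date, amount)
VALUES ('North', '2013-01-05', 100),
       ('North', '2013-01-09', 250),
       ('South', '2013-01-07',  80);

SELECT region,
       sale_date,
       amount,
       -- Aggregate over all rows in the same region, without a GROUP BY.
       SUM(amount)  OVER (PARTITION BY region)                    AS region_total,
       -- Sequential number within the region, ordered by date.
       ROW_NUMBER() OVER (PARTITION BY region ORDER BY sale_date) AS row_in_region
FROM @sales;
```

Unlike GROUP BY, the query still returns one row per sale, so the detail and the aggregate can be compared side by side.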

sqlsunday.com

T-SQL tips and tricks, best practices and query plans from the field.

row_number