aggregate – sqlsunday.com

Aggregating date and time values

Date and time values are not entirely intuitive to aggregate into averages in T-SQL, although the business case does arguably exist. Suppose, for instance, that you have a production log with a “duration” column (in the “time” datatype), and you want to find the totalt or average duration for a certain group of items.

It’s possible, but I would still call it a workaround.

Grouping dates without blocking operators

It’s not entirely uncommon to want to group by a computed expression in an aggregation query. The trouble is, whenever you group by a computed expression, SQL Server considers the ordering of the data to be lost, and this will turn your buttery-smooth Stream Aggregate operation into a Hash Match (aggregate) or create a corrective Sort operation, both of which are blocking.

Is there anything we can do about this? Yes, sometimes, like when those computed expressions are YEAR() and MONTH(), there is. But you should probably get your nerd on for this one.

Windowed DISTINCT aggregates

You may have discovered that the use of DISTINCT is not supported in windowed functions. A query that uses a distinct aggregate in a windowed function,

SELECT COUNT(DISTINCT something) OVER (PARTITION BY other)
FROM somewhere;

will generate the following error message:

Msg 10759, Level 15, State 1, Line 1
Use of DISTINCT is not allowed with the OVER clause.

There are, however, a few relatively simple workarounds that are suprisingly efficient.

PIVOT, UNPIVOT and performance

Few things deserve the attention of a long rant as much as unneccessarily complicated syntaxes. When you want to achieve something that is clearly defined and supported, but you have to look up the the syntax. PIVOT and UNPIVOT are examples of such features, and in this case, I’ll even show you a more well-performing alternative.

Accumulating values in a parent-child hierarchy

Last week, we looked at how to construct a visual representation of a hierarchy stored as a parent-child table. The obvious next step is to accumulate values stored on those nodes using this hierarchy.

Using TOP n WITH TIES

Here’s an interesting feature to that may greatly simplifie a few queries. TOP n WITH TIES works pretty much like TOP, except it orders the top records “densely” to include tied values.

Creating a top-10 list and aggregating the remainder

This article will show you how to create a “top 10” list of data, as well as providing an aggregate on the remainder of the data. There are a few different approaches to this.

Indexed views

Indexed views are views where the computed view data is already calculated and stored in the database, which make them super-fast for querying. And if you’re using the expensive Enterprise Edition of SQL Server, you’ll enjoy performance increases in other queries that can make use of the indexed view as well.