Working with intervals

At one point or another, you’re going to come across intervals when working in SQL Server. You could say that an interval is where you don’t have a single value, but actually a range of values, commonly delimited within a start and an end value. This range could be a group of accounts, versions of dimension members (in an SCD) or date/time intervals.

In this article, we’ll be looking at intervals using a fictional client booking calendar, where each appointment has a start and an end time, and the range is the time in between the two. The logic, however, is fairly general and can be applied to a multitude of scenarios.

Setting up some test data

In this article, we’ll be working with an example table called “Appointments”, containing booked client meetings. We’ll need information about what employee is booked, the start and end time of the appointment, as well as the client.

CREATE TABLE dbo.Appointments (
	EmployeeID	int NOT NULL,
	StartTime	datetime NOT NULL,
	EndTime		datetime NOT NULL,
	ClientID	int NOT NULL,
	CONSTRAINT PK_Appointments PRIMARY KEY CLUSTERED (EmployeeID, StartTime)
);

INSERT INTO dbo.Appointments (EmployeeID, StartTime, EndTime, ClientID)
SELECT 100, {ts '2013-06-03 09:00:00.000'}, {ts '2013-06-03 10:00:00.000'}, 200 UNION ALL
SELECT 100, {ts '2013-06-03 10:00:00.000'}, {ts '2013-06-03 11:00:00.000'}, 210 UNION ALL
SELECT 100, {ts '2013-06-03 13:00:00.000'}, {ts '2013-06-03 14:30:00.000'}, 208 UNION ALL
SELECT 100, {ts '2013-06-03 14:45:00.000'}, {ts '2013-06-03 16:30:00.000'}, 214 UNION ALL
SELECT 101, {ts '2013-06-03 08:30:00.000'}, {ts '2013-06-03 09:15:00.000'}, 264 UNION ALL
SELECT 101, {ts '2013-06-03 09:15:00.000'}, {ts '2013-06-03 11:35:00.000'}, 223 UNION ALL
SELECT 101, {ts '2013-06-03 12:45:00.000'}, {ts '2013-06-03 14:30:00.000'}, 216 UNION ALL
SELECT 102, {ts '2013-06-03 13:30:00.000'}, {ts '2013-06-03 13:55:00.000'}, 275 UNION ALL
SELECT 102, {ts '2013-06-03 14:00:00.000'}, {ts '2013-06-03 15:30:00.000'}, 223 UNION ALL
SELECT 102, {ts '2013-06-03 15:00:00.000'}, {ts '2013-06-03 16:00:00.000'}, 211 UNION ALL
SELECT 102, {ts '2013-06-03 16:00:00.000'}, {ts '2013-06-03 16:55:00.000'}, 262 UNION ALL
SELECT 100, {ts '2013-06-04 09:45:00.000'}, {ts '2013-06-04 12:00:00.000'}, 237 UNION ALL
SELECT 100, {ts '2013-06-04 12:45:00.000'}, {ts '2013-06-04 13:25:00.000'}, 237 UNION ALL
SELECT 100, {ts '2013-06-04 15:30:00.000'}, {ts '2013-06-04 17:00:00.000'}, 237 UNION ALL
SELECT 101, {ts '2013-06-04 15:30:00.000'}, {ts '2013-06-04 17:00:00.000'}, 297 UNION ALL
SELECT 102, {ts '2013-06-04 09:00:00.000'}, {ts '2013-06-04 11:00:00.000'}, 219 UNION ALL
SELECT 103, {ts '2013-06-03 00:00:00.000'}, {ts '2013-06-07 23:59:59.000'}, -1;

For the sake of argument, let’s say that the last row of the example, that spans across multiple days, is “out of office” or something similar.

For the example queries, we can also use a list of workdays in a table.

CREATE TABLE dbo.Workdays (
    [Date]       date NOT NULL,
    IsWorkday    bit NOT NULL,
    StartsAt    datetime NULL,
    EndsAt    datetime NULL,
    CONSTRAINT PK_Workdays PRIMARY KEY CLUSTERED ([Date])
);

INSERT INTO dbo.Workdays ([Date], IsWorkday, StartsAt, EndsAt)
SELECT {d '2013-06-01'}, 0, NULL, NULL UNION ALL
SELECT {d '2013-06-02'}, 0, NULL, NULL UNION ALL
SELECT {d '2013-06-03'}, 1, {ts '2013-06-03 08:30:00.000'}, {ts '2013-06-03 17:00:00.000'} UNION ALL
SELECT {d '2013-06-04'}, 1, {ts '2013-06-04 08:30:00.000'}, {ts '2013-06-04 17:00:00.000'} UNION ALL
SELECT {d '2013-06-05'}, 1, {ts '2013-06-05 08:30:00.000'}, {ts '2013-06-05 17:00:00.000'} UNION ALL
SELECT {d '2013-06-06'}, 0, NULL, NULL UNION ALL
SELECT {d '2013-06-07'}, 1, {ts '2013-06-07 08:30:00.000'}, {ts '2013-06-07 17:00:00.000'};

Basic example of an interval

A fundamental concept to understand is that JOINs between tables can be performed pretty much with any valid SQL condition. It does not neccessary need to be the “equal to” operator that defines a relationship between two tables. It can for instance be “less than” or “larger than”, or any other operator you could think of. In fact, the JOIN criteria does not really have to indicate an actual relation between the two tables at all – it only has to evaluate to true or false.

In this case, let’s say that we want to measure how much time each of our employees spend in client meetings each day. Note how we join the two tables together:

SELECT d.[Date], apt.EmployeeID, SUM(
     DATEDIFF(mi, apt.StartTime, apt.EndTime)) AS SumMinutes
FROM dbo.Appointments AS apt
INNER JOIN dbo.Workdays AS d ON
    --- Appointment ends on or after workday begins:
    d.[Date]<=apt.EndTime AND
    --- Appointment starts before the start of the next day
    DATEADD(dd, 1, d.[Date])>apt.StartTime
GROUP BY d.[Date], apt.EmployeeID
ORDER BY d.[Date], apt.EmployeeID;

So, what the query above does is join workdays and appointments when any part of the appointment overlaps or is contained inside the workday. This is true when

  • the end of the appointment is on or after the workday begins, and
  • the beginning of the appointment is before the beginning of the next day

You could visually represent this with a timeline:

Interval, joined within one day

If you’ve tried the query, you’ll notice that there is one more thing to correct. Look what happens to EmployeeID 103 (the appointment spans multiple days). The DATEDIFF function that calculates the number of minutes will return the entire length of the appointment, and not just how much time is within each day. The correct solution actually looks more like this:

SELECT d.[Date], apt.EmployeeID,
    SUM(DATEDIFF(mi,
        --- Start of day or appointment, which ever's last:
        (CASE WHEN apt.StartTime<d.[Date]
            THEN d.[date]
            ELSE apt.StartTime END),
        --- End of day or appointment, which ever's first.
        (CASE WHEN apt.EndTime>DATEADD(dd, 1, d.[Date])
            THEN DATEADD(dd, 1, d.[Date])
            ELSE apt.EndTime END))) AS SumMinutes
FROM dbo.Appointments AS apt
INNER JOIN dbo.Workdays AS d ON
    --- Appointment ends on or after workday begins:
    d.[Date]<=apt.EndTime AND
    --- Appointment starts before the start of the next day
    DATEADD(dd, 1, d.[Date])>apt.StartTime
GROUP BY d.[Date], apt.EmployeeID
ORDER BY d.[Date], apt.EmployeeID;

A graphical timeline representation of the query with an appointment that spans multiple days would look something like this. What happens is that we evaluate if apt.StartDate is less than d.Date, and if it is, we use d.Date instead, so as to show “one day at a time”, in a manner of speaking.

Interval, joined over multiple days

But the key take-away from this example is the JOIN condition – there’s no “equal” operator. Instead, there’s another condition (in this case, “less than” and “greater than”) that binds the two tables together.

Overlapping intervals

Using this scenario, you might also be interested in finding overlaps (i.e. double-bookings). This may seem a tricky process at first, but it’s actually fairly simple in practice.

If we break it down, here’s what we want to do:

  • Find two bookings, a and b, that
  • share the same employee, where
  • a ends after b starts, and
  • a begins before b ends.

Interval overlap

In T-SQL, you’d write these conditions (JOIN conditions, really) like this:

SELECT a.EmployeeID,
    b.StartTime AS OverlapStartTime,
    a.EndTime AS OverlapEndTime,
    a.ClientID AS ClientID1,
    b.ClientID AS ClientID2
FROM dbo.Appointments AS a
INNER JOIN dbo.Appointments AS b ON
    --- a and b share the same employee
    a.EmployeeID=b.EmployeeID AND
    --- a ends after b starts
    a.EndTime>b.StartTime AND
    --- a starts before b ends
    a.StartTime<b.EndTime
--- Eliminate duplicate representations of overlaps:
WHERE a.StartTime<b.StartTime;

Note the WHERE clause at the end. The reason for this is that every overlap can be represented in two ways – a overlaps b and b overlaps a. So, in order to eliminate those duplicates in the output and make the list more readable, we’ll using this WHERE condition to always select which one of the two representations that gets chosen, namely the one where a is the first appointment and b is the second one.

Finding gaps

Here’s a tricky one. Let’s say you want to find a free slot to book a new client consultation. Where would you fit it, who’s available, and for how many minutes?

First of all, I’ll start by constructing a common table expression, “apts”, that contains all the neccessary information from the Appointments table, along with an ordinal number (appointment of the day, starting at 1 for each employee and and day). We’ll also construct a column called “Date” which contains only the date part of StartTime, without the time, by converting it to a float, rounding down, and converting it back to datetime.

WITH apts (EmployeeID, [Date], StartTime, EndTime, AptOfDay)
AS (
    SELECT EmployeeID,
        CAST(FLOOR(CAST(StartTime AS float))
            AS datetime) AS [Date],
        StartTime, EndTime,
        ROW_NUMBER() OVER (
            PARTITION BY EmployeeID,
                DATEPART(yy, StartTime),
                DATEPART(mm, StartTime),
                DATEPART(dd, StartTime)
            ORDER BY StartTime) AS AptOfDay
    FROM dbo.Appointments)

Because we have our appointment ordinal (AptOfDay), we can now join one appointment (a) with the next one (b). Using the starting and ending times of these appointments, we can deduce if they’re back-to-back or if there’s a free gap.

Interval gaps

Here’s the simple version:

(using the CTE "apts" from above)

SELECT a.EmployeeID,
    --- The end of the previous appointment is also the
    --- the start of this gap:
    a.EndTime AS GapStartTime,
    --- The start of the next appointment is where this
    --- gap ends:
    b.StartTime AS GapEndTime,
    --- The difference between the two is the length
    --- of the gap:
    DATEDIFF(mi, a.EndTime, b.StartTime) AS GapMinutes
FROM apts AS a
INNER JOIN apts AS b ON
    a.EmployeeID=b.EmployeeID AND   --- Same employee
    a.[Date]=b.[Date] AND           --- Same day
    a.AptOfDay+1=b.AptOfDay         --- Next appointment
ORDER BY 1, 2;

Finding gaps (advanced)

The keen reader will have spotted the obvious flaw – the INNER JOIN means that we only see gaps between two appointments, not the gaps that are before the day’s first appointment, or after the day’s last one.

To solve this, we can change the INNER JOIN to a FULL JOIN (kind of a LEFT and RIGHT JOIN at the same time), allowing the previous or next appointment of the day to be non-existent. This is how it could be written:

(using the CTE "apts" from above)

SELECT
    ISNULL(a.EmployeeID, b.EmployeeID) AS EmployeeID,
    --- The end of the previous appointment is also the
    --- the start of this gap:
    a.EndTime AS GapStartTime,
    --- The start of the next appointment is where this
    --- gap ends:
    b.StartTime AS GapEndTime,
    --- The difference between the two is the length
    --- of the gap:
    DATEDIFF(mi, a.EndTime, b.StartTime) AS GapMinutes
FROM apts AS a
FULL JOIN apts AS b ON
    a.EmployeeID=b.EmployeeID AND   --- Same employee
    a.[Date]=b.[Date] AND           --- Same day
    a.AptOfDay+1=b.AptOfDay         --- Next appointment
ORDER BY 1, ISNULL(b.StartTime, a.EndTime);

To accommodate the possibility of “a” being empty, I’ve applied the ISNULL() construct on the EmployeeID column.

Finally, let’s add the “Workdays” table to this query, so we’ll know when your employee comes to work in the morning and leaves work in the afternoon – after all, if the first appointment starts at 08:30, it doesn’t automatically mean that you can book another appointment before it..

(using the CTE "apts" from above)

SELECT
    ISNULL(a.EmployeeID, b.EmployeeID) AS EmployeeID,
    --- End of previous appointment, or start of workday:
    ISNULL(a.EndTime, d.StartsAt) AS GapStartTime,
    --- Start of next appointment, or end of workday:
    ISNULL(b.StartTime, d.EndsAt) AS GapEndTime,
    --- Minutes between GapStartTime and GapEndTime:
    DATEDIFF(mi,
        ISNULL(a.EndTime, d.StartsAt),
        ISNULL(b.StartTime, d.EndsAt)) AS GapMinutes
FROM apts AS a
FULL JOIN apts AS b ON
    a.EmployeeID=b.EmployeeID AND   --- Same employee
    a.[Date]=b.[Date] AND           --- Same day
    a.AptOfDay+1=b.AptOfDay         --- Next appointment
INNER JOIN dbo.Workdays AS d ON
    ISNULL(a.[Date], b.[Date])=d.[Date] AND
    d.IsWorkday=1
ORDER BY 1, 2;

2 thoughts on “Working with intervals

  1. Pingback: Distributing values with overlapping ranges « Sunday morning T-SQL

  2. Pingback: Joining two SCD2 tables « Sunday morning T-SQL

Let me hear your thoughts!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s