Calculating business days and holidays

A common scenario you may have encountered is the need to calculate the first business day after a given date. There are quite a few ugly ways to solve this, including cursors, but there are also some pretty neat ways to approach the problem, and as a bonus, you’ll learn about recursion and the new LEAD(), LAG() functions and accumulation in T-SQL.

In this brief tutorial, we’re going to look at two ways to calculate the first business day starting with a given date. A starting point is the notion that if the “current” day is a bank holiday, we’ll just increment the date by one day. However, if it’s a saturday, you would end up on a sunday, so we’re going to have to build a recursive function that allows for a large number of bank holidays in sequence, such as for instance christmas or other major holidays.

Prerequisites

Because SQL Server does not automatically know which dates are bank holidays or business days, any solution you build has to be built around a table with knowledge of those dates. I’ve chosen to build a table called simply BankHolidays, that contains one record with a single date column for each date that isn’t a business day.

--- BankHoliday table:
CREATE TABLE dbo.BankHolidays (
    [date]        date NOT NULL,
    CONSTRAINT PK_BankHolidays PRIMARY KEY CLUSTERED ([date])
);

--- Swedish bank holidays for december 2013 and january 2014:
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-01'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-07'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-08'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-14'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-15'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-21'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-22'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-24'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-25'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-26'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-28'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-29'});
INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-31'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-01'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-04'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-05'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-06'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-11'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-12'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-18'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-19'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-25'});
INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-26'});

An iterating scalar function

Here’s a very simple solution in the form of a scalar function. The input date is assigned to a variable, which is then incremented by one day until it doesn’t match a row in the bank holiday table.

CREATE FUNCTION dbo.fn_firstBusinessDay(@date date)
RETURNS date
AS

BEGIN;
    --- The iteration/output variable:
    DECLARE @businessDay date;
    SELECT @businessDay=@date;

    --- Increment the value by one day if this
    --- is a bank holiday:
    WHILE (EXISTS (SELECT [date]
                   FROM dbo.BankHolidays
                   WHERE [date]=@businessDay))
        SET @businessDay=DATEADD(dd, 1, @businessDay);

    --- Done:
    RETURN @businessDay;
END;

Obviously, for every iteration, this function will perform an index seek in the BankHolidays table, which makes it a less-than-optimal choice from a performance standpoint. It becomes particularly bad if you’re applying it to a large batch of records, in which case it will hit the BankHolidays table with a seek operation once for every record and iteration.

Using a recursive CTE

A better idea that comes to mind, given that this is a recursive problem, is to use a recursive common table expression.

ALTER FUNCTION dbo.fn_firstBusinessDay(@date date)
RETURNS date
AS

BEGIN;
    --- The output variable:
    DECLARE @businessDay date;

    --- Recursive common table expression:
    WITH rcte ([date], nextBusinessDay)
    AS (
        --- The anchor is the input date (if it's a bank holiday)
        SELECT [date], DATEADD(dd, 1, [date]) AS nextBusinessDay
        FROM dbo.BankHolidays
        WHERE [date]=@date
        UNION ALL
        --- ... and the recursion adds a day to the
        --- "nextBusinessDay" column as long as it's
        --- still a bank holiday:
        SELECT h1.[date],
            DATEADD(dd, 1, h1.nextBusinessDay) AS nextBusinessDay
        FROM rcte AS h1
        INNER JOIN dbo.BankHolidays AS h2 ON
            h1.nextBusinessDay=h2.[date]
        )

    --- Finally, the result of the CTE will contain one record for
    --- each iteration, so we'll filter out the "last" one, which
    --- is the one with the latest "nextBusinessDay", hence the
    --- MAX() function:
    SELECT @businessDay=MAX(nextBusinessDay)
    FROM rcte
    WHERE [date]=@date

    --- But if @date wasn't a bank holiday from the start, we'll
    --- just return the original date:
    IF (@businessDay IS NULL)
        SET @businessDay=@date;

    RETURN @businessDay;
END;

The use of a recursive CTE instead of iterating SELECT statements on the BankHolidays table allows SQL Server to just scan the table once, and then work with this data using a table spool, which is a kind of internal temp table, to radically improve performance. But what if you could eliminate the recursions/iterations all together?

Using the LEAD() function

The LEAD() and LAG() functions are some of the new additions to T-SQL as of SQL Server 2012, and they allow you to look at the previous or next column value in a recordset. Take a look at the following query:

SELECT [date] AS bankHoliday,
       LEAD([date], 1)
           OVER (ORDER BY [date]) AS nextHoliday
FROM dbo.BankHolidays;

This query returns the date of each row in the BankHolidays table, as well as the next bank holiday. This is useful, because all of the days between this bank holiday and the next are business days (although some rows will have zero days between the two).

If we package the above query in a subquery and isolate only bank holidays where the next day is a business day, we can produce a list of “work weeks”. In this context, a “work week” is a contiguous sequence of one or more business days.

SELECT
    LAG(nextHoliday, 1, {d '1900-01-01'})
        OVER (ORDER BY bankHoliday) AS lastHoliday,
    DATEADD(dd, 1, bankHoliday) AS startBusinessDay,
    DATEADD(dd, -1, nextHoliday) AS endBusinessDay
FROM (
    SELECT [date] AS bankHoliday,
        LEAD([date], 1, {d '2099-12-31'})
            OVER (ORDER BY [date]) AS nextHoliday
    FROM dbo.BankHolidays) AS sub
WHERE DATEADD(dd, 1, bankHoliday)<nextHoliday;

The first column is the day after the previous “week”, the second and third columns are the first and last work days of the “week”. You’ll notice that the day after “endBusinessDay” is the same as “lastHoliday” on the next row.

Here’s the scalar function again, where this time, this data is calculated in a common table expression called “weeks”:

ALTER FUNCTION dbo.fn_firstBusinessDay(@date date)
RETURNS date
AS

BEGIN;
    DECLARE @businessDay date;

    --- This CTE contains "weeks" where each week starts with
    --- with one or several bank holidays and ends on the day
    --- before the next bank holiday.
    WITH weeks (lastHoliday, startBusinessDay, endBusinessDay)
    AS (
        SELECT
            --- The first bank holiday of the previous week:
            LAG(nextHoliday, 1, {d '1900-01-01'})
                OVER (ORDER BY bankHoliday) AS lastHoliday,
            --- The first business day of this week:
            DATEADD(dd, 1, bankHoliday) AS startBusinessDay,
            --- The last business day of this week:
            DATEADD(dd, -1, nextHoliday) AS endBusinessDay
        FROM (
            SELECT [date] AS bankHoliday,
                LEAD([date], 1, {d '2099-12-31'})
                    OVER (ORDER BY [date]) AS nextHoliday
            FROM dbo.BankHolidays) AS sub
        --- Skip sequential holidays:
        WHERE DATEADD(dd, 1, bankHoliday)<nextHoliday
        )

    --- The first business day is the first business day
    --- of a week that includes @date between "lastHoliday"
    --- and "endBusinessDay".
    --- If @date is before the first business day, we'll
    --- use startBusinessDay, otherwise @date is fine.
    SELECT @businessDay=(CASE WHEN @date<startBusinessDay
        THEN startBusinessDay
        ELSE @date END)
    FROM weeks
    WHERE lastHoliday<=@date AND @date<=endBusinessDay;

    RETURN @businessDay;
END;

So far, we’ve eliminated iterations and recursive queries, but there’s still one major performance eye-sore left: Because fn_firstBusinessDay is a scalar function (i.e. it returns a single value for each call), applying it to a large dataset is going to be costly, because it’ll query the dbo.BankHolidays table once for every record.

A set-based solution

The solution is to extract the “weeks” table that we’re using in the LEAD/LAG solution, and join this table directly to the recordset that we want to apply it to. In this example, I’m creating an inline table function, but you could just as easily put this query in a view.

CREATE FUNCTION dbo.fn_workWeeks()
RETURNS TABLE
AS

    RETURN (
        SELECT
            --- The first bank holiday of the previous week:
            LAG(nextHoliday, 1, {d '1900-01-01'})
                OVER (ORDER BY bankHoliday) AS lastHoliday,
            --- The first business day of this week:
            DATEADD(dd, 1, bankHoliday) AS startBusinessDay,
            --- The last business day of this week:
            DATEADD(dd, -1, nextHoliday) AS endBusinessDay
        FROM (
            SELECT [date] AS bankHoliday,
            LEAD([date], 1, {d '2099-12-31'})
                OVER (ORDER BY [date]) AS nextHoliday
            FROM dbo.BankHolidays) AS sub
        --- Skip sequential holidays:
        WHERE DATEADD(dd, 1, bankHoliday)<nextHoliday
        );

All that remains now is to JOIN this dataset to your table and apply the CASE condition, just like we did in the scalar function:

SELECT *, (CASE WHEN someDates.[date]<startBusinessDay
        THEN startBusinessDay
        ELSE someDates.[date] END)
FROM someDates
INNER JOIN dbo.fn_workWeeks() AS w ON
    lastHoliday<=someDates.[date] AND
    someDates.[date]<=endBusinessDay;

The cumulative count approach

Here’s yet another approach to the problem. Let’s start simple, by creating a list of all the dates. We’ll use the LEAD function to produce a CTE called “holidays” that contains the date of each bank holiday, as well as the date of the next one. You’ll recognize this approach from before:

WITH holidays ([date], nextDate)
AS (
    SELECT [date],
        LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate
    FROM dbo.BankHolidays)

SELECT [date], nextDate
FROM holidays
ORDER BY 1;

To this, we can add a recursive CTE consisting of integers between 0 and, say, 10. Adding the number of days in the integer to the “date” column of the “holidays” CTE will give us a complete calendar, with all the dates.

WITH holidays ([date], nextDate)
AS (
    SELECT [date],
        LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate
    FROM dbo.BankHolidays),

     iterator (i)
AS (
    SELECT 0 AS i UNION ALL
    SELECT i+1 FROM iterator WHERE i<10)

SELECT DATEADD(dd, i.i, h.[date]) AS [date]
FROM holidays AS h
INNER JOIN iterator AS i ON
    i.i<DATEDIFF(dd, h.[date], h.nextDate)
ORDER BY 1;

Now, from here on, let’s LEFT JOIN “holidays” again, to see which days are bank holidays and which ones are not:

WITH holidays ([date], nextDate)
AS (
    SELECT [date],
        LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate
    FROM dbo.BankHolidays),

     iterator (i)
AS (
    SELECT 0 AS i UNION ALL
    SELECT i+1 FROM iterator WHERE i<10)

SELECT
    DATEADD(dd, i.i, h.[date]) AS [date],
    (CASE WHEN bh.[date] IS NULL THEN 1 ELSE 0 END) AS isBusinessDay
FROM holidays AS h
INNER JOIN iterator AS i ON
    i.i<DATEDIFF(dd, h.[date], h.nextDate)
LEFT JOIN holidays AS bh ON
    DATEADD(dd, i.i, h.[date])=bh.[date]
ORDER BY 1;

So, with the CASE block, we’re evaluating if this specific date is a business day or not. Now, all we have to do is to calculate a running total on the “isBusinessDay” column. Because this column consists just of 1:s and 0:s, a running total will be like a ROW_NUMBER(), except it only increments for rows that are business days.

Running totals are calculated with a SUM() windowed function, with the ROWS UNBOUNDED PRECEDING keywords, like follows:

WITH iterator (i)
AS (
    SELECT 0 AS i UNION ALL
    SELECT i+1 FROM iterator WHERE i<10),

     holidays ([date], nextDate)
AS (
    SELECT [date], LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate
    FROM dbo.BankHolidays)

SELECT
    DATEADD(dd, i.i, h.[date]) AS [date],
    (CASE WHEN bh.[date] IS NULL THEN 1 ELSE 0 END) AS isBusinessDay,
    --- Accumulated "businessDayNumber" over dates:
    SUM((CASE WHEN bh.[date] IS NULL THEN 1 ELSE 0 END)) OVER (
        ORDER BY DATEADD(dd, i.i, h.[date])
        ROWS UNBOUNDED PRECEDING) AS businessDayNumber
FROM holidays AS h
INNER JOIN iterator AS i ON
    i.i<DATEDIFF(dd, h.[date], h.nextDate)
LEFT JOIN holidays AS bh ON
    DATEADD(dd, i.i, h.[date])=bh.[date]
ORDER BY 1;

The resulting “businessDayNumber” column is a “business day sequence number”, which we can use in our calculation. First off, let’s package the code above in a handy inline table function:

CREATE FUNCTION dbo.fn_businessDays()
RETURNS TABLE
AS

    RETURN (
        WITH iterator (i)
        AS (
            SELECT 0 AS i UNION ALL
            SELECT i+1 FROM iterator WHERE i<10),

             holidays ([date], nextDate)
        AS (
            SELECT [date],
                LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate
            FROM dbo.BankHolidays)

        SELECT
            DATEADD(dd, i.i, h.[date]) AS [date],
            (CASE WHEN bh.[date] IS NULL
                  THEN 1 ELSE 0 END) AS isBusinessDay,
            --- Accumulated "businessDayNumber" over dates:
            SUM((CASE WHEN bh.[date] IS NULL
                THEN 1 ELSE 0 END)) OVER (
                    ORDER BY DATEADD(dd, i.i, h.[date])
                    ROWS UNBOUNDED PRECEDING) AS businessDayNumber
        FROM holidays AS h
        INNER JOIN iterator AS i ON
            i.i<DATEDIFF(dd, h.[date], h.nextDate)
        LEFT JOIN holidays AS bh ON
            DATEADD(dd, i.i, h.[date])=bh.[date]);

Now, we can use this function to calculate the number of business days between two given dates. Note that with this method, we can calculate multiple days, not just the next business day!

--- Four business days from 2013-12-23:
SELECT a.[date] AS fromDate,
    b.businessDayNumber-a.businessDayNumber AS businessDays,
    b.[date] AS toDate
FROM dbo.fn_businessDays() AS a
INNER JOIN dbo.fn_businessDays() AS b ON
    b.isBusinessDay=1
WHERE a.[date]={d '2013-12-23'} AND
    b.businessDayNumber-a.businessDayNumber=4;

--- Three business days before 2014-01-05:
SELECT a.[date] AS fromDate,
    b.businessDayNumber-a.businessDayNumber AS businessDays,
    b.[date] AS toDate
FROM dbo.fn_businessDays() AS a
INNER JOIN dbo.fn_businessDays() AS b ON
    a.isBusinessDay=1
WHERE b.[date]={d '2014-01-05'} AND
    b.businessDayNumber-a.businessDayNumber=3;

--- How many business days from 2013-12-13 to 2014-01-02?
SELECT a.[date] AS fromDate,
    b.businessDayNumber-a.businessDayNumber AS businessDays,
    b.[date] AS toDate
FROM dbo.fn_businessDays() AS a
CROSS JOIN dbo.fn_businessDays() AS b
WHERE a.[date]={d '2013-12-13'} AND
    b.[date]={d '2014-01-02'};

That’s it for this post! Let me know if there’s any particular subject you want more info on, or if I’ve left anything out.

7 thoughts on “Calculating business days and holidays

  1. Congratulations for this outstanding article. I just would avoided to store saturday and sunday dates. Thanks for the great inspiration!

  2. Code looks very elegant, just wanted to know how to add the business days to given date where Saturday and Sunday is not inserted in BankHolidays.

      • Sorry, I was not able to put my requirement properly. As the holidays are saved in BankHolidays but if we does not include the Saturday and Sunday as holiday in this table. How to get the certain number of business days from Particulate date.
        e.g.
        Four business days from 2013-12-23:

    • I suppose you could replace the reference to dbo.BankHolidays with a view that unions every saturday and sunday with the contents of dbo.BankHoliday?

      For example:

      CREATE VIEW dbo.BankHolidaysAndWeekends
      AS
      
      SELECT [date]
      FROM dbo.BankHolidays
      
      UNION
      
      SELECT DATEADD(day, iterator.i*7, x.[date]) AS [date]
      FROM (
      	VALUES ({d '2014-01-04'}), ({d '2014-01-05'})
      	) AS x([date])
      CROSS APPLY (
      	SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
      	FROM sys.columns --- or some number table
      	) AS iterator
      
  3. Hi,

    I tried to use following script

    — How many business days from 2013-12-13 to 2014-01-02?
    SELECT a.[date] AS fromDate,
    b.businessDayNumber-a.businessDayNumber AS businessDays,
    b.[date] AS toDate
    FROM dbo.fn_businessDays() AS a
    CROSS JOIN dbo.fn_businessDays() AS b
    WHERE a.[date]={d ‘2013-12-13’} AND
    b.[date]={d ‘2014-01-02’};

    but it was taking more time to execute has compare to the one of below

    CREATE FUNCTION dbo.fn_WorkDays
    (
    @StartDate DATETIME,
    @EndDate DATETIME = NULL –@EndDate replaced by @StartDate when DEFAULTed
    )
    RETURNS INT
    AS
    BEGIN
    DECLARE @Swap DATETIME
    IF @StartDate IS NULL
    RETURN NULL
    IF @EndDate IS NULL
    SELECT @EndDate = @StartDate
    –Strip the time element from both dates (just to be safe) by converting to whole days and back to a date.
    –Usually faster than CONVERT.
    –0 is a date (01/01/1900 00:00:00.000)
    SELECT @StartDate = DATEADD(dd,DATEDIFF(dd,0,@StartDate), 1),
    @EndDate = DATEADD(dd,DATEDIFF(dd,0,@EndDate) , 0)

    –If the inputs are in the wrong order, reverse them.
    IF @StartDate > @EndDate
    SELECT @Swap = @EndDate,
    @EndDate = @StartDate,
    @StartDate = @Swap

    –Calculate and return the number of workdays using the input parameters.
    –This is the meat of the function.
    –This is really just one formula with a couple of parts that are listed on separate lines for documentation purposes.
    RETURN (
    SELECT
    –Start with total number of days including weekends
    (DATEDIFF(dd,@StartDate, @EndDate)+1)
    –Subtact 2 days for each full weekend
    -(DATEDIFF(wk,@StartDate, @EndDate)*2)
    –If StartDate is a Sunday, Subtract 1
    -(CASE WHEN DATENAME(dw, @StartDate) = ‘Sunday’
    THEN 1
    ELSE 0
    END)
    –If EndDate is a Saturday, Subtract 1
    -(CASE WHEN DATENAME(dw, @EndDate) = ‘Saturday’
    THEN 1
    ELSE 0
    END)
    )
    END

    Is there any glitch of using this one?

Let me hear your thoughts!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s