A common scenario you may have encountered is the need to calculate the first business day after a given date. There are quite a few ugly ways to solve this, including cursors, but there are also some pretty neat ways to approach the problem, and as a bonus, you’ll learn about recursion and the new LEAD(), LAG() functions and accumulation in T-SQL.
In this brief tutorial, we’re going to look at two ways to calculate the first business day starting with a given date. A starting point is the notion that if the “current” day is a bank holiday, we’ll just increment the date by one day. However, if it’s a saturday, you would end up on a sunday, so we’re going to have to build a recursive function that allows for a large number of bank holidays in sequence, such as for instance christmas or other major holidays.
Prerequisites
Because SQL Server does not automatically know which dates are bank holidays or business days, any solution you build has to be built around a table with knowledge of those dates. I’ve chosen to build a table called simply BankHolidays, that contains one record with a single date column for each date that isn’t a business day.
--- BankHoliday table: CREATE TABLE dbo.BankHolidays ( [date] date NOT NULL, CONSTRAINT PK_BankHolidays PRIMARY KEY CLUSTERED ([date]) ); --- Swedish bank holidays for december 2013 and january 2014: INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-01'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-07'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-08'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-14'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-15'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-21'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-22'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-24'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-25'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-26'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-28'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-29'}); INSERT INTO dbo.BankHolidays VALUES ({d '2013-12-31'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-01'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-04'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-05'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-06'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-11'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-12'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-18'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-19'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-25'}); INSERT INTO dbo.BankHolidays VALUES ({d '2014-01-26'});
An iterating scalar function
Here’s a very simple solution in the form of a scalar function. The input date is assigned to a variable, which is then incremented by one day until it doesn’t match a row in the bank holiday table.
CREATE FUNCTION dbo.fn_firstBusinessDay(@date date) RETURNS date AS BEGIN; --- The iteration/output variable: DECLARE @businessDay date; SELECT @businessDay=@date; --- Increment the value by one day if this --- is a bank holiday: WHILE (EXISTS (SELECT [date] FROM dbo.BankHolidays WHERE [date]=@businessDay)) SET @businessDay=DATEADD(dd, 1, @businessDay); --- Done: RETURN @businessDay; END;
Obviously, for every iteration, this function will perform an index seek in the BankHolidays table, which makes it a less-than-optimal choice from a performance standpoint. It becomes particularly bad if you’re applying it to a large batch of records, in which case it will hit the BankHolidays table with a seek operation once for every record and iteration.
Using a recursive CTE
A better idea that comes to mind, given that this is a recursive problem, is to use a recursive common table expression.
ALTER FUNCTION dbo.fn_firstBusinessDay(@date date) RETURNS date AS BEGIN; --- The output variable: DECLARE @businessDay date; --- Recursive common table expression: WITH rcte ([date], nextBusinessDay) AS ( --- The anchor is the input date (if it's a bank holiday) SELECT [date], DATEADD(dd, 1, [date]) AS nextBusinessDay FROM dbo.BankHolidays WHERE [date]=@date UNION ALL --- ... and the recursion adds a day to the --- "nextBusinessDay" column as long as it's --- still a bank holiday: SELECT h1.[date], DATEADD(dd, 1, h1.nextBusinessDay) AS nextBusinessDay FROM rcte AS h1 INNER JOIN dbo.BankHolidays AS h2 ON h1.nextBusinessDay=h2.[date] ) --- Finally, the result of the CTE will contain one record for --- each iteration, so we'll filter out the "last" one, which --- is the one with the latest "nextBusinessDay", hence the --- MAX() function: SELECT @businessDay=MAX(nextBusinessDay) FROM rcte WHERE [date]=@date --- But if @date wasn't a bank holiday from the start, we'll --- just return the original date: IF (@businessDay IS NULL) SET @businessDay=@date; RETURN @businessDay; END;
The use of a recursive CTE instead of iterating SELECT statements on the BankHolidays table allows SQL Server to just scan the table once, and then work with this data using a table spool, which is a kind of internal temp table, to radically improve performance. But what if you could eliminate the recursions/iterations all together?
Using the LEAD() function
The LEAD() and LAG() functions are some of the new additions to T-SQL as of SQL Server 2012, and they allow you to look at the previous or next column value in a recordset. Take a look at the following query:
SELECT [date] AS bankHoliday, LEAD([date], 1) OVER (ORDER BY [date]) AS nextHoliday FROM dbo.BankHolidays;
This query returns the date of each row in the BankHolidays table, as well as the next bank holiday. This is useful, because all of the days between this bank holiday and the next are business days (although some rows will have zero days between the two).
If we package the above query in a subquery and isolate only bank holidays where the next day is a business day, we can produce a list of “work weeks”. In this context, a “work week” is a contiguous sequence of one or more business days.
SELECT LAG(nextHoliday, 1, {d '1900-01-01'}) OVER (ORDER BY bankHoliday) AS lastHoliday, DATEADD(dd, 1, bankHoliday) AS startBusinessDay, DATEADD(dd, -1, nextHoliday) AS endBusinessDay FROM ( SELECT [date] AS bankHoliday, LEAD([date], 1, {d '2099-12-31'}) OVER (ORDER BY [date]) AS nextHoliday FROM dbo.BankHolidays) AS sub WHERE DATEADD(dd, 1, bankHoliday)<nextHoliday;
The first column is the day after the previous “week”, the second and third columns are the first and last work days of the “week”. You’ll notice that the day after “endBusinessDay” is the same as “lastHoliday” on the next row.
Here’s the scalar function again, where this time, this data is calculated in a common table expression called “weeks”:
ALTER FUNCTION dbo.fn_firstBusinessDay(@date date) RETURNS date AS BEGIN; DECLARE @businessDay date; --- This CTE contains "weeks" where each week starts with --- with one or several bank holidays and ends on the day --- before the next bank holiday. WITH weeks (lastHoliday, startBusinessDay, endBusinessDay) AS ( SELECT --- The first bank holiday of the previous week: LAG(nextHoliday, 1, {d '1900-01-01'}) OVER (ORDER BY bankHoliday) AS lastHoliday, --- The first business day of this week: DATEADD(dd, 1, bankHoliday) AS startBusinessDay, --- The last business day of this week: DATEADD(dd, -1, nextHoliday) AS endBusinessDay FROM ( SELECT [date] AS bankHoliday, LEAD([date], 1, {d '2099-12-31'}) OVER (ORDER BY [date]) AS nextHoliday FROM dbo.BankHolidays) AS sub --- Skip sequential holidays: WHERE DATEADD(dd, 1, bankHoliday)<nextHoliday ) --- The first business day is the first business day --- of a week that includes @date between "lastHoliday" --- and "endBusinessDay". --- If @date is before the first business day, we'll --- use startBusinessDay, otherwise @date is fine. SELECT @businessDay=(CASE WHEN @date<startBusinessDay THEN startBusinessDay ELSE @date END) FROM weeks WHERE lastHoliday<=@date AND @date<=endBusinessDay; RETURN @businessDay; END;
So far, we’ve eliminated iterations and recursive queries, but there’s still one major performance eye-sore left: Because fn_firstBusinessDay is a scalar function (i.e. it returns a single value for each call), applying it to a large dataset is going to be costly, because it’ll query the dbo.BankHolidays table once for every record.
A set-based solution
The solution is to extract the “weeks” table that we’re using in the LEAD/LAG solution, and join this table directly to the recordset that we want to apply it to. In this example, I’m creating an inline table function, but you could just as easily put this query in a view.
CREATE FUNCTION dbo.fn_workWeeks() RETURNS TABLE AS RETURN ( SELECT --- The first bank holiday of the previous week: LAG(nextHoliday, 1, {d '1900-01-01'}) OVER (ORDER BY bankHoliday) AS lastHoliday, --- The first business day of this week: DATEADD(dd, 1, bankHoliday) AS startBusinessDay, --- The last business day of this week: DATEADD(dd, -1, nextHoliday) AS endBusinessDay FROM ( SELECT [date] AS bankHoliday, LEAD([date], 1, {d '2099-12-31'}) OVER (ORDER BY [date]) AS nextHoliday FROM dbo.BankHolidays) AS sub --- Skip sequential holidays: WHERE DATEADD(dd, 1, bankHoliday)<nextHoliday );
All that remains now is to JOIN this dataset to your table and apply the CASE condition, just like we did in the scalar function:
SELECT *, (CASE WHEN someDates.[date]<startBusinessDay THEN startBusinessDay ELSE someDates.[date] END) FROM someDates INNER JOIN dbo.fn_workWeeks() AS w ON lastHoliday<=someDates.[date] AND someDates.[date]<=endBusinessDay;
The cumulative count approach
Here’s yet another approach to the problem. Let’s start simple, by creating a list of all the dates. We’ll use the LEAD function to produce a CTE called “holidays” that contains the date of each bank holiday, as well as the date of the next one. You’ll recognize this approach from before:
WITH holidays ([date], nextDate) AS ( SELECT [date], LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate FROM dbo.BankHolidays) SELECT [date], nextDate FROM holidays ORDER BY 1;
To this, we can add a recursive CTE consisting of integers between 0 and, say, 10. Adding the number of days in the integer to the “date” column of the “holidays” CTE will give us a complete calendar, with all the dates.
WITH holidays ([date], nextDate) AS ( SELECT [date], LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate FROM dbo.BankHolidays), iterator (i) AS ( SELECT 0 AS i UNION ALL SELECT i+1 FROM iterator WHERE i<10) SELECT DATEADD(dd, i.i, h.[date]) AS [date] FROM holidays AS h INNER JOIN iterator AS i ON i.i<DATEDIFF(dd, h.[date], h.nextDate) ORDER BY 1;
Now, from here on, let’s LEFT JOIN “holidays” again, to see which days are bank holidays and which ones are not:
WITH holidays ([date], nextDate) AS ( SELECT [date], LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate FROM dbo.BankHolidays), iterator (i) AS ( SELECT 0 AS i UNION ALL SELECT i+1 FROM iterator WHERE i<10) SELECT DATEADD(dd, i.i, h.[date]) AS [date], (CASE WHEN bh.[date] IS NULL THEN 1 ELSE 0 END) AS isBusinessDay FROM holidays AS h INNER JOIN iterator AS i ON i.i<DATEDIFF(dd, h.[date], h.nextDate) LEFT JOIN holidays AS bh ON DATEADD(dd, i.i, h.[date])=bh.[date] ORDER BY 1;
So, with the CASE block, we’re evaluating if this specific date is a business day or not. Now, all we have to do is to calculate a running total on the “isBusinessDay” column. Because this column consists just of 1:s and 0:s, a running total will be like a ROW_NUMBER(), except it only increments for rows that are business days.
Running totals are calculated with a SUM() windowed function, with the ROWS UNBOUNDED PRECEDING keywords, like follows:
WITH iterator (i)
AS (
SELECT 0 AS i UNION ALL
SELECT i+1 FROM iterator WHERE i<10),
holidays ([date], nextDate)
AS (
SELECT [date], LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate
FROM dbo.BankHolidays)
SELECT
DATEADD(dd, i.i, h.[date]) AS [date],
(CASE WHEN bh.[date] IS NULL THEN 1 ELSE 0 END) AS isBusinessDay,
--- Accumulated "businessDayNumber" over dates:
SUM((CASE WHEN bh.[date] IS NULL THEN 1 ELSE 0 END)) OVER (
ORDER BY DATEADD(dd, i.i, h.[date])
ROWS UNBOUNDED PRECEDING) AS businessDayNumber
FROM holidays AS h
INNER JOIN iterator AS i ON
i.i<DATEDIFF(dd, h.[date], h.nextDate)
LEFT JOIN holidays AS bh ON
DATEADD(dd, i.i, h.[date])=bh.[date]
ORDER BY 1;
The resulting “businessDayNumber” column is a “business day sequence number”, which we can use in our calculation. First off, let’s package the code above in a handy inline table function:
CREATE FUNCTION dbo.fn_businessDays()
RETURNS TABLE
AS
RETURN (
WITH iterator (i)
AS (
SELECT 0 AS i UNION ALL
SELECT i+1 FROM iterator WHERE i<10),
holidays ([date], nextDate)
AS (
SELECT [date],
LEAD([date], 1) OVER (ORDER BY [date]) AS nextDate
FROM dbo.BankHolidays)
SELECT
DATEADD(dd, i.i, h.[date]) AS [date],
(CASE WHEN bh.[date] IS NULL
THEN 1 ELSE 0 END) AS isBusinessDay,
--- Accumulated "businessDayNumber" over dates:
SUM((CASE WHEN bh.[date] IS NULL
THEN 1 ELSE 0 END)) OVER (
ORDER BY DATEADD(dd, i.i, h.[date])
ROWS UNBOUNDED PRECEDING) AS businessDayNumber
FROM holidays AS h
INNER JOIN iterator AS i ON
i.i<DATEDIFF(dd, h.[date], h.nextDate)
LEFT JOIN holidays AS bh ON
DATEADD(dd, i.i, h.[date])=bh.[date]);
Now, we can use this function to calculate the number of business days between two given dates. Note that with this method, we can calculate multiple days, not just the next business day!
--- Four business days from 2013-12-23: SELECT a.[date] AS fromDate, b.businessDayNumber-a.businessDayNumber AS businessDays, b.[date] AS toDate FROM dbo.fn_businessDays() AS a INNER JOIN dbo.fn_businessDays() AS b ON b.isBusinessDay=1 WHERE a.[date]={d '2013-12-23'} AND b.businessDayNumber-a.businessDayNumber=4; --- Three business days before 2014-01-05: SELECT a.[date] AS fromDate, b.businessDayNumber-a.businessDayNumber AS businessDays, b.[date] AS toDate FROM dbo.fn_businessDays() AS a INNER JOIN dbo.fn_businessDays() AS b ON a.isBusinessDay=1 WHERE b.[date]={d '2014-01-05'} AND b.businessDayNumber-a.businessDayNumber=3; --- How many business days from 2013-12-13 to 2014-01-02? SELECT a.[date] AS fromDate, b.businessDayNumber-a.businessDayNumber AS businessDays, b.[date] AS toDate FROM dbo.fn_businessDays() AS a CROSS JOIN dbo.fn_businessDays() AS b WHERE a.[date]={d '2013-12-13'} AND b.[date]={d '2014-01-02'};
That’s it for this post! Let me know if there’s any particular subject you want more info on, or if I’ve left anything out.
Congratulations for this outstanding article. I just would avoided to store saturday and sunday dates. Thanks for the great inspiration!
Code looks very elegant, just wanted to know how to add the business days to given date where Saturday and Sunday is not inserted in BankHolidays.
Not sure I understand your question. 🙂
Sorry, I was not able to put my requirement properly. As the holidays are saved in BankHolidays but if we does not include the Saturday and Sunday as holiday in this table. How to get the certain number of business days from Particulate date.
e.g.
Four business days from 2013-12-23:
I suppose you could replace the reference to dbo.BankHolidays with a view that unions every saturday and sunday with the contents of dbo.BankHoliday?
For example:
Thanks a lot Daniel, it helped me to work on my requirement.
Hi,
I tried to use following script
— How many business days from 2013-12-13 to 2014-01-02?
SELECT a.[date] AS fromDate,
b.businessDayNumber-a.businessDayNumber AS businessDays,
b.[date] AS toDate
FROM dbo.fn_businessDays() AS a
CROSS JOIN dbo.fn_businessDays() AS b
WHERE a.[date]={d ‘2013-12-13’} AND
b.[date]={d ‘2014-01-02’};
but it was taking more time to execute has compare to the one of below
CREATE FUNCTION dbo.fn_WorkDays
(
@StartDate DATETIME,
@EndDate DATETIME = NULL –@EndDate replaced by @StartDate when DEFAULTed
)
RETURNS INT
AS
BEGIN
DECLARE @Swap DATETIME
IF @StartDate IS NULL
RETURN NULL
IF @EndDate IS NULL
SELECT @EndDate = @StartDate
–Strip the time element from both dates (just to be safe) by converting to whole days and back to a date.
–Usually faster than CONVERT.
–0 is a date (01/01/1900 00:00:00.000)
SELECT @StartDate = DATEADD(dd,DATEDIFF(dd,0,@StartDate), 1),
@EndDate = DATEADD(dd,DATEDIFF(dd,0,@EndDate) , 0)
–If the inputs are in the wrong order, reverse them.
IF @StartDate > @EndDate
SELECT @Swap = @EndDate,
@EndDate = @StartDate,
@StartDate = @Swap
–Calculate and return the number of workdays using the input parameters.
–This is the meat of the function.
–This is really just one formula with a couple of parts that are listed on separate lines for documentation purposes.
RETURN (
SELECT
–Start with total number of days including weekends
(DATEDIFF(dd,@StartDate, @EndDate)+1)
–Subtact 2 days for each full weekend
-(DATEDIFF(wk,@StartDate, @EndDate)*2)
–If StartDate is a Sunday, Subtract 1
-(CASE WHEN DATENAME(dw, @StartDate) = ‘Sunday’
THEN 1
ELSE 0
END)
–If EndDate is a Saturday, Subtract 1
-(CASE WHEN DATENAME(dw, @EndDate) = ‘Saturday’
THEN 1
ELSE 0
END)
)
END
Is there any glitch of using this one?