This week’s post is a requirement that I see very regularly as a developer. You get a plaintext string containing one or more ranges. Each range is comma delimited, and the start and end values of the range are separated by a dash. The string could look something like this, for example: 100-120,121-499,510,520,790-999.
Wouldn’t it be practical if we could construct a table value function that returns one row for each range, with columns for the start and end of each range?
The first challenge is understanding how to split a string using a delimiter of some sort. In our case, there are two delimiters: the comma (between ranges) and the dash (between each range’s start and end value). For this purpose, I find the CHARINDEX() function the most useful.
DECLARE @string varchar(100)='abcdef,ghijkl'; SELECT CHARINDEX(',', @string);
If your delimiter is a wildcard, you’d use the PATINDEX() function instead of CHARINDEX(), but the rest is the same.
The above query returns the integer value 7, because the 7th character in the string is the first comma. Now, if we combine this with the LEFT() and the SUBSTRING() functions, we can retrieve anything to the left of the comma, and anything after it.
DECLARE @string varchar(100)='abcdef,ghijkl'; SELECT --- The first 6 characters: LEFT(@string, CHARINDEX(',', @string)-1), --- Remaining characters, starting at offset 8: SUBSTRING(@string, CHARINDEX(',', @string)+1, LEN(@string));
But if the string doesn’t contain a comma, the CHARINDEX() function will return 0, in which case the LEFT() call will result in an error, because you can’t pass -1 as the second argument to the LEFT() function.
DECLARE @string varchar(100)='abcdefghijkl'; SELECT --- Characters before first comma: LEFT(@string, CHARINDEX(',', @string)-1), --- Characters after first comma: SUBSTRING(@string, CHARINDEX(',', @string)+1, LEN(@string));
This query will return
Msg 537, Level 16, State 2, Line 3 Invalid length parameter passed to the LEFT or SUBSTRING function.
To remedy this possibility, we’re going to add a comma of our own to the string in the CHARINDEX() function call, making sure it will always return a working value.
DECLARE @string varchar(100)='abcdefghijkl'; SELECT --- Characters before first comma: LEFT(@string, CHARINDEX(',', @string+',')-1), --- Characters after first comma: SUBSTRING(@string, CHARINDEX(',', @string+',')+1, LEN(@string));
Using this logic, we can now iterate through a comma-delimited string using a recursive common table expression:
DECLARE @string varchar(100)='100-120,121-499,510,520,790-999'; WITH ranges ([range], remain) AS ( --- This is the anchor of the recursive CTE: SELECT CAST(NULL AS varchar(max)) AS [range], @string AS remain UNION ALL SELECT --- This is the first remaining range: CAST(LEFT(remain, CHARINDEX(',', remain+',')-1) AS varchar(max)) AS [range], --- This is the remainder of the string: SUBSTRING(remain, CHARINDEX(',', remain+',')+1, LEN(remain)) AS remain FROM ranges WHERE remain!='') SELECT * FROM ranges;
This query will now return the following result set:
range remain NULL 100-120,121-499,510,520,790-999 100-120 121-499,510,520,790-999 121-499 510,520,790-999 510 520,790-999 520 790-999 790-999
Now all we have to do is split the “range” column once more, this time into a start and an end value, using the same logic that we applied to the comma separated list above, except we’re using the dash as the separator character, and we can safely assume that there will never be more than one dash in one range.
SELECT --- Anything to the left of the first dash (if any) LEFT([range], CHARINDEX('-', [range]+'-')-1), --- .. and anything after: SUBSTRING([range], CHARINDEX('-', [range])+1, LEN([range])) FROM ranges WHERE [range]!='';
Finally, we’ll package all of this code into an easy-to-use table value function:
CREATE FUNCTION dbo.fn_parseRanges(@range varchar(max)) RETURNS @tbl TABLE ( rangeStart varchar(max) NOT NULL, rangeEnd varchar(max) NOT NULL, PRIMARY KEY CLUSTERED (rangeStart, rangeEnd) ) AS BEGIN; WITH ranges ([range], remain) AS ( --- This is the anchor of the recursive CTE: SELECT CAST(NULL AS varchar(max)) AS [range], @range AS remain UNION ALL SELECT --- This is the first remaining range: CAST(LEFT(remain, CHARINDEX(',', remain+',')-1) AS varchar(max)) AS [range], --- This is the remainder of the string: SUBSTRING(remain, CHARINDEX(',', remain+',')+1, LEN(remain)) AS remain FROM ranges WHERE remain!='') --- Output each range into @tbl: INSERT INTO @tbl (rangeStart, rangeEnd) SELECT DISTINCT --- The start of the range: LEFT([range], CHARINDEX('-', [range]+'-')-1), --- ... and the end of the range: SUBSTRING([range], CHARINDEX('-', [range])+1, LEN([range])) FROM ranges WHERE [range]!=''; RETURN; END;
… which can be called like this:
SELECT * FROM dbo.fn_parseRanges('100-120,121-499,510,520,790-999');
That’s it for this week. Let me know if there’s anything I’ve left out, and don’t forget to like the Facebook page to receive an update whenever there’s a new post!