Turning semicolon delimited data into a table

Here’s a handy conversion tool that I built for myself in order to convert semi-colon delimited data (very common in files, etc) into a table. It comes in the form of a scalar function, that converts a varchar(max) to an XML value.

The reason I’ve chosen XML as the return format is because this allows for a dynamic number of columns, even on a row-for-row basis. I’ve named the elements in the return xml so that they will construct an HTML table (TABLE, TR and TD), but you could, of course, call them anything you like with a simple modification of the code.

First off, the CREATE FUNCTION statement and some variable declarations that we’re going to need:

CREATE FUNCTION dbo.sdv2xml(@txt varchar(max))
RETURNS xml
AS

BEGIN;
    --- Declarations:
    DECLARE
        @row_sep    varchar(10)=CHAR(13)+CHAR(10),
        @col_sep    varchar(10)=';',
        @xml        xml;

Note that you can change the @col_sep and @row_sep constants so you, in theory, could have the function work with any kind of delimited file. For example, you could set @col_sep to CHAR(9) in order to work with tab separated files instead.

Next up, I’m building a recursive common table expression that splits the text string into one record per line (using @row_sep).

    --- [rows] is a recursive common table expression that
    --- splits the @sdv variable into rows, where each row
    --- in the data set represents a row in @sdv.
    -----------------------------------------------------------

    WITH [rows] ([row], row_string, remain)
    AS (
        --- Anchor: "Row 0" is the anchor row of the
        --- recursive CTE.
        SELECT 0 AS [row],
            CAST(NULL AS varchar(max)) AS row_string,
            @txt AS remain

        UNION ALL

        --- Recursion: Iterate through the "remain" column of
        --- the [rows] common table expression. Each recursion
        --- adds another row, and increments the [row] counter
        --- column by 1.
        SELECT [row]+1 AS [row],
            CAST(LEFT(remain,
                CHARINDEX(@row_sep, remain+@row_sep)-1)
                AS varchar(max)) AS row_string,
            CAST(SUBSTRING(remain,
                CHARINDEX(@row_sep, remain+@row_sep)+
                    LEN(@row_sep),
                LEN(remain)) AS varchar(max)) AS remain
        FROM [rows]
        WHERE NULLIF(remain, '') IS NOT NULL),

Now that we have a recordset, “rows”, with one record per row in the string, here’s another recursive common table expression that splits these rows into columns, using more or less the same method. The resulting recordset will contain one record for each row and column of the original string:

    --- cols is a recursive common table expression, based on
    --- the [rows] cte, that splits each row in [rows] into
    --- columns.
    -----------------------------------------------------------

         cols ([row], col, [value], remain)
    AS (
        --- Anchor: "Column 0" of each row is the anchor of
        --- the recursive CTE.
        SELECT [row],
            0 AS col,
            CAST(NULL AS varchar(max)) AS [value],
            row_string AS remain
        FROM [rows]
        WHERE [row]>0    --- Excluding the blank anchor.

        UNION ALL

        --- Recursion: Iterate through each column of the
        --- "remain" cte variable.
        SELECT [row],
            col+1 AS col,
            CAST(LEFT(remain,
                CHARINDEX(@col_sep, remain+@col_sep)-1)
                    AS varchar(max)) AS [value],
            CAST(SUBSTRING(remain,
                CHARINDEX(@col_sep, remain+@col_sep)+
                    LEN(@col_sep),
                LEN(remain)) AS varchar(max)) AS remain
        FROM cols
        WHERE NULLIF(remain, '') IS NOT NULL)

And finally, here’s the SELECT query to piece it all together into an XML document. If you’re new to FOR XML PATH(), check out my other article on the subject.

    --- Using the [rows] and cols common table expressions,
    --- create a single XML variable.
    -----------------------------------------------------------

    SELECT @xml=(
        --- A TABLE element surrounds the entire set:
        SELECT (

            --- Rows in a TR elements:
            SELECT (

                --- Columns in TD elements:
                SELECT [value] AS TD
                FROM cols
                WHERE cols.[row]=[rows].[row]
                ORDER BY [row]
                FOR XML PATH (''), TYPE
                --- End of TD

                ) AS [TR]
            FROM [rows]
            ORDER BY [row]
            FOR XML PATH (''), TYPE
            --- End of TR

            ) AS [TABLE]
        FOR XML PATH (''), TYPE
        --- End of TABLE
        );

Finally, returning @xml value to the caller.

    --- Return the result
    -----------------------------------------------------------

    RETURN @xml;

END;

That’s it. Here’s a test query:

DECLARE @x xml;

SET @x=dbo.sdv2xml(
'a1;bb1;ccc1;dddd1;eeeee1;ffffff1
a2;bb2;ccc2;dddd2;eeeee2;ffffff2
a3;bb3;ccc3;dddd3;eeeee3;ffffff3
a4;bb4;ccc4;dddd4;eeeee4;ffffff4');

SELECT @x;

Comments, ideas, questions – leave them in the comment field below!

One thought on “Turning semicolon delimited data into a table

Let me hear your thoughts!

This site uses Akismet to reduce spam. Learn how your comment data is processed.