Creating XML data from a recordset

SQL Server has a wide range of XML features natively built into the platform, and this includes methods to create XML from tabular data, working with XML data and, of course, turning XML data back into a recordset. This primer will go through some common methods of creating XML from a regular SELECT query.

The XML datatype

SQL Server has a native datatype for XML data called “xml”. It can be used like any other datatype – it can be declared for variables or set on table columns.

DECLARE @x xml;
SET @x='<test><element/></test>';
SELECT @x;

When selecting a recordset with XML columns in SQL Server Management Studio, an XML column will contain a clickable link. When clicked, the link will open the contained XML document in a new window or tab, where you can inspect it and expand/collapse nodes.

As you would expect, an XML variable or column will validate anything you insert into it, so if you try to insert for instance an unclosed element, you’ll get an error message.

However, more commonly needed and more useful than constructing your XML blocks manually from string variables, you can create XML structures directly from recordsets by appending a FOR XML clause to more or less any SELECT statement.

There are four different methods to do this.

FOR XML AUTO

The AUTO method will use table names (or aliases) and column/expression names/aliases to construct the elements of the XML document. This is probably the easiest way of creating XML structures, but does not really leave you that much freedom in designing the hierarchy.

Example:

SELECT TOP 10 BusinessEntityID, FirstName, LastName
FROM Person.Person AS Entity
FOR XML AUTO;

.. will return:

<Entity BusinessEntityID="285" FirstName="Syed" LastName="Abbas" />
<Entity BusinessEntityID="293" FirstName="Catherine" LastName="Abel" />
<Entity BusinessEntityID="295" FirstName="Kim" LastName="Abercrombie" />
<Entity BusinessEntityID="2170" FirstName="Kim" LastName="Abercrombie" />
<Entity BusinessEntityID="38" FirstName="Kim" LastName="Abercrombie" />
<Entity BusinessEntityID="211" FirstName="Hazem" LastName="Abolrous" />
<Entity BusinessEntityID="2357" FirstName="Sam" LastName="Abolrous" />
<Entity BusinessEntityID="297" FirstName="Humberto" LastName="Acevedo" />
<Entity BusinessEntityID="291" FirstName="Gustavo" LastName="Achong" />
<Entity BusinessEntityID="299" FirstName="Pilar" LastName="Ackerman" />

Note how the name of the element, “Entity” is the table alias, and the properties are named after the columns in the SELECT statement. If you add multiple tables, the XML will be created with nested elements.

FOR XML RAW

FOR XML RAW will do the same thing as FOR XML AUTO, except the element name will be “row” instead of the table alias.

FOR XML EXPLICIT

The EXPLICIT mode has nothing to do with profane lyrics, but is a very cumbersome way of constructing an XML document, but with a lot more creative freedom. The format, however, is not really practical to work with unless you have only a very small number of elements and don’t mind writing queries with a lot of UNIONs, so I won’t go into more detail from here.

If you really want to work with FOR XML EXPLICIT, you can read more on MSDN.

FOR XML PATH

My personal favourite of the four methods, FOR XML PATH, sports a very nice mix of creative freedom and readability. Here’s an example of FOR XML PATH:

SELECT TOP 10 FirstName AS [Entity/@FirstName],
    LastName AS [Entity/@LastName],
    BusinessEntityID AS Entity
FROM Person.Person
FOR XML PATH('');

This query will return a number of “Entity” elements with the attributes”FirstName” and “LastName” (attributes are designated with @ characters in the code). Here’s an example output:

<Entity FirstName="Syed" LastName="Abbas">285</Entity>
<Entity FirstName="Catherine" LastName="Abel">293</Entity>
<Entity FirstName="Kim" LastName="Abercrombie">295</Entity>
<Entity FirstName="Kim" LastName="Abercrombie">2170</Entity>
<Entity FirstName="Kim" LastName="Abercrombie">38</Entity>
<Entity FirstName="Hazem" LastName="Abolrous">211</Entity>
<Entity FirstName="Sam" LastName="Abolrous">2357</Entity>
<Entity FirstName="Humberto" LastName="Acevedo">297</Entity>
<Entity FirstName="Gustavo" LastName="Achong">291</Entity>
<Entity FirstName="Pilar" LastName="Ackerman">299</Entity>

The argument to PATH() is the “base” XPath. In this case, we left it blank, but it allows you to nest your elements in a parent element for each row of the recordset, if you want to.

Assigning FOR XML results to a variable

Because FOR XML turns an entire recordset into a single row with a single column (i.e. a value, as opposed to a recordset), the XML result can be placed in a variable or in an XML column.

DECLARE @people xml;

SET @people=(
    SELECT TOP 10 FirstName AS [Entity/@FirstName],
        LastName AS [Entity/@LastName],
        BusinessEntityID AS Entity
    FROM Person.Person
    FOR XML PATH(''));

SELECT @people;

Nesting XML documents to create trees

Because XML data is contained in a single value, instead of in a recordset with rows and columns, FOR XML queries can be nested within each other. Here’s an example:

To start out with, let’s make a fairly flat XML structure with only distinct last names:

SELECT l.LastName AS [LastName/@LastName]
FROM Person.Person AS l
GROUP BY l.LastName
ORDER BY l.LastName
FOR XML PATH('');

Note: XML data cannot be used with a DISTINCT clause, so I’ll stick to GROUP BY for the same purposes. The results of the query will look something like this:..
<LastName LastName=”Chambers” />
<LastName LastName=”Champion” />
..

Now, let’s add a correlated subquery as a new column:

SELECT l.LastName AS [LastName/@LastName],

    (SELECT BusinessEntityID AS [FirstName/@ID], FirstName
     FROM Person.Person AS f
     WHERE f.LastName=l.LastName
     ORDER BY FirstName
     FOR XML PATH(''), TYPE) AS [LastName/FirstNames]

FROM Person.Person AS l
GROUP BY l.LastName
ORDER BY l.LastName
FOR XML PATH('');

Note the TYPE directive in the subquery! Without it, the returning XML would actually be interpreted as a string and thus encoded, which would make it useless for our example. The TYPE directive tells SQL Server that the returned data should be in the native XML format: This means that it integrates with the “parent” XML without any encoding taking place.

Anyway, back to the query: For each row of the outer query (LastName), we’re running the inner query (FirstName) and including the results as a new column, which will become a new element with the same name, FirstName, in the LastName element. Here’s an excerpt of the result:

..
<LastName>Chambers
  <FirstNames>
    <FirstName ID="667">Pat</FirstName>
  </FirstNames>
</LastName>
<LastName>Champion
  <FirstNames>
    <FirstName ID="17242">Terry</FirstName>
    <FirstName ID="4219">Tommy</FirstName>
  </FirstNames>
</LastName>
..

That’s it for this time, but check back next week for more on working with XML in SQL Server!

2 comments

Leave a comment

Your email address will not be published. Required fields are marked *