Introduction
This is the second post in a series of five on Sending Multiple Rows of Data to SQL Server 2008(R2).
In this post we’ll look at how to use the OpenXML Rowset Provider to shred an XML document into a rowset which can then be used in Transact-SQL statements in which rowset providers such as a table, view, or the OPENROWSET function can appear. We’ll also look at how .Net Developers can use OpenXML to reduce the number of SQL Server calls required to handle data modifications.
For details of other posts in this series, please refer to Sending Multiple Rows of Data to SQL Server 2008(R2).
Brief History
The OpenXML Rowest Provider was first introduced in SQL Server 2000 where it could be used to shred an XML string into columns of a rowset by evaluating XPath 1.0 expressions. In SQL Server 2005 Microsoft enhanced OpenXML so that it could accept the newly introduced XML Data Type in sp_xml_preparedocument, and to generate XML and new SQL type columns in a rowset.
Syntax
The basic OpenXML syntax is:
OPENXML( idoc int [ in] , rowpattern nvarchar [ in ] , [ flags byte [ in ] ] )
[ WITH ( SchemaDeclaration | TableName ) ]
Now that all the boring stuff is out of the way, let’s look at a very simple example of shredding an XML document into a rowest and then using that rowset to update records in our database.
Generating XML with .Net
For the purpose of this example, a Passenger is the primary person on a cruise booking and PAX are additional persons traveling with this passenger. Now, I know you’re probably already thinking “Why has he got FirstName and LastName in both tables?”. I’ve done this to make the example easier to follow.
Figure 1 represents an ADO.NET Dataset called dsPassengers which contains two tables (Passenger and PAX) and it has been created based on two tables in our SQL Server database called Passenger and PAX. As this is an internal use dataset only, I have blanked out the Namespace property for this dataset (I will elaborate on my reasons for doing this later).
The PersonID, PAXID, FirstName and LastName columns are all defined as System.String, the Newsletter column has been defined as System.Boolean, and the PAXOrder column has been defined as a System.Byte.

Figure 1
Let’s assume that I have a .Net application running in the background which has populated this dataset with the following records:
Passenger
PersonID: 7a7c18a0-ad47-e011-9c71-00155d991421
FirstName: Jeff
LastName: Wharton
Newsletter: False
PAX
PAXID: 868b8fde-068a-4468-b9b6-9660166e630d
FirstName: Lily
LastName: Wharton
PAXOrder: 3
PAXID: bb2770e0-b0e0-479f-8c03-1330ce87cb64
FirstName: Ben
LastName: Wharton
PAXOrder: 2
This gives me a total of 3 records (rows) that I’d like to send to SQL Server.
Given this situation, most people would probably set up a loop to traverse the tables in the dataset, set up another loop to traverse the records in each table, and send each row to the database one at a time resulting in 3 calls being made. This is where the power of XML comes into play.
If you call the GetXML() method of an ADO.NET Dataset, it returns an XML representation of the data held within it. This is one of the benefits of using ADO.NET Datasets; it looks after all the messy work involved in creating an XML representation of your data. So, for our dataset, the XML would look like:
<dsPassengers>
<Passenger>
<Newsletter>false</Newsletter>
<PersonID>7a7c18a0-ad47-e011-9c71-00155d991421</PersonID>
<FirstName>Jeff</FirstName>
<LastName>Wharton</LastName>
</Passenger>
<PAX>
<PAXID>868b8fde-068a-4468-b9b6-9660166e630d</PAXID>
<FirstName>Lily</FirstName>
<LastName>Wharton</LastName>
<PaxOrder>3</PaxOrder>
</PAX>
<PAX>
<PAXID>bb2770e0-b0e0-479f-8c03-1330ce87cb64</PAXID>
<FirstName>Ben</FirstName>
<LastName>Wharton</LastName>
<PaxOrder>2</PaxOrder>
</PAX>
</dsPassengers>
You can see from the returned XML that the Passenger and PAX tables are nested within dsPassenger and that there is 1 Passenger record (row) and two PAX records (rows).
Now that I’ve got this XML, I can start work on creating a stored procedure to process the XML (We’ll come back to the .Net side of things a bit later on).
Using OpenXML
The first step is to create two variables in SQL Server Query Analyser; one to hold the XML returned above and another to be used with OpenXML:
DECLARE @idoc int
DECLARE @doc XML = '
<dsPassengers>
<Passenger>
<Newsletter>false</Newsletter>
<PersonID>7a7c18a0-ad47-e011-9c71-00155d991421</PersonID>
<FirstName>Jeff</FirstName>
<LastName>Wharton</LastName>
</Passenger>
<PAX>
<PAXID>868b8fde-068a-4468-b9b6-9660166e630d</PAXID>
<FirstName>Lily</FirstName>
<LastName>Wharton</LastName>
<PaxOrder>3</PaxOrder>
</PAX>
<PAX>
<PAXID>bb2770e0-b0e0-479f-8c03-1330ce87cb64</PAXID>
<FirstName>Ben</FirstName>
<LastName>Wharton</LastName>
<PaxOrder>2</PaxOrder>
</PAX>
</dsPassengers>'
Now I need to write the code that reads the XML text provided as input, parses the text by using the MSXML parser (Msxmlsql.dll), and provides the parsed document in a state ready for consumption.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
sp_xml_preparedocument returns a handle (assigned to the @idoc variable I created earlier) that can be used to access the newly created internal representation of the XML document.
Now I need to write the code that takes this handle and passes it to the OpenXML Rowset Provider:
SELECT PersonID, FirstName, LastName, Newsletter
FROM OPENXML (@idoc, '/dsPassengers/Passenger',2)
WITH (PersonID varchar(50),
FirstName varchar(20),
LastName varchar(20),
Newsletter BIT)
SELECT PersonID, PAXID, FirstName, LastName
FROM OPENXML (@idoc, '/dsPassengers/PAX',2)
WITH (PAXID varchar(50),
FirstName varchar(20),
LastName varchar(20),
PaxOrder int,
-- If we use the optional ColPattern option, we can
-- access data anywhere within the XML
PersonID varchar(50) '../Passenger/PersonID')
ORDER BY PaxOrder
I then need to write the code that removes the internal representation of the XML document specified by the document handle and invalidates the document handle.
EXEC sp_xml_removedocument @idoc
It is very important that you execute sp_xml_removedocument as soon as you can as a parsed document is stored in the internal cache of SQL Server and the MSXML parser (Msxmlsql.dll) uses one-eighth of the total memory available for SQL Server so we don’t want these little buggers hanging around for too long!
That’s it. If I execute the above code, it will return the following:
Because OpenXML has turned my XML data into a rowset, I can use the results in Transact-SQL statements in which rowset providers such as a table, view, or the OPENROWSET function can appear. For example:
SELECT CruiseCode
FROM OPENXML (@idoc, '/dsPassengers/Passenger',2)
WITH (PersonID varchar(50),
FirstName varchar(20),
LastName varchar(20),
Newsletter BIT) XMLData
INNER JOIN Booking.Actual BA ON XMLData.PersonID = BA.PersonID
UPDATE Person.Detail
SET FirstName = XMLData.FirstName, LastName = XMLData.LastName,
Newsletter = XMLData.Newsletter
FROM OPENXML (@idoc, '/dsPassengers/Passenger',2)
WITH (PersonID varchar(50),
FirstName varchar(20),
LastName varchar(20),
Newsletter BIT) XMLData
WHERE Person.Detail.PersonID = XMLData.PersonID
Namespaces
Earlier on I stated that I was blanking out the Namespace property of dsPassenger. Here’s the reason why.
If I add a Namespace to our dataset (I’ve used the default assigned by ADO.NET), the XML returned from the GetXML() method looks this:
<dsPassengers xmlns="http://tempuri.org/dsPassengers.xsd">
<Passenger>
<Newsletter>false</Newsletter>
<PersonID>7a7c18a0-ad47-e011-9c71-00155d991421</PersonID>
<FirstName>Jeff</FirstName>
<LastName>Wharton</LastName>
</Passenger>
</dsPassengers>
I now need to account for this Namespace when I code our sp_xml_preparedocument statement
EXEC sp_xml_preparedocument @hdoc OUTPUT, @PassengerDetail,
'<dsPassengers xmlns:a="http://tempuri.org/dsPassengers.xsd"/>'
and also in our OpenXML statement
SELECT [a:PersonID] PersonID, [a:FirstName] FirstName,
[a:LastName] LastName, [a:Newsletter] Newsletter
FROM OPENXML(@hDoc, '//a:Passenger', 2)
WITH ([a:PersonID] varchar(50),
[a:FirstName] varchar(20),
[a:LastName] varchar(20),
[a:Newsletter] BIT)
Now, for the XML purists out there, this is great. For someone whose not sharing their XML outside of their organisation and just wants to shred-and-store, all this extra work is a pain in the backside so I never use a Namespace.
Stored Procedure
So now that I know my code is working and I’ve removed all the ‘”Namespace” changes, I’ll turn it into a Stored Procedure.
CREATE PROC [Person].[usp_UpdatePassengerDetail]
(
@PassengerDetail XML
)
AS
SET NOCOUNT ON
DECLARE @idoc int
EXEC sp_xml_preparedocument @idoc OUTPUT, @PassengerDetail
UPDATE Person.Detail
SET FirstName = XMLData.FirstName, LastName = XMLData.LastName,
Newsletter = XMLData.Newsletter
FROM OPENXML (@idoc, '/dsPassengers/Passenger',2)
WITH (PersonID varchar(50),
FirstName varchar(20),
LastName varchar(20),
Newsletter BIT) XMLData
WHERE Person.Detail.PersonID = XMLData.PersonID
UPDATE Person.PAX
SET PersonID = XMLData.PersonID, FirstName = XMLData.FirstName,
LastName = XMLData.LastName
FROM OPENXML (@idoc, '/dsPassengers/PAX',2)
WITH (PAXID varchar(50),
FirstName varchar(20),
LastName varchar(20),
PaxOrder int,
PersonID varchar(50) '../Passenger/PersonID') XMLData
WHERE Person.PAX.PAXID = XMLData.PAXID
EXEC sp_xml_removedocument @idoc
IF @@ERROR <> 0
-- Return 99 to the calling program to indicate failure.
RETURN 99
ELSE
-- Return 0 to the calling program to indicate success.
RETURN 0
SET NOCOUNT OFF
I now have a stored procedure which accepts 1 parameter of type XML so let’s take a quick look at how to call it from a .Net application.
Putting It All Together
To call the stored procedure from .Net and pass my our XML data is very straight forward.
I’m using the Microsoft Enterprise Library 5.0 as it handles the majority of the database code required to connect to SQL Server and execute the stored procedure.
using Microsoft.Practices.EnterpriseLibrary.Data;
using Microsoft.Practices.EnterpriseLibrary.Data.Sql;
// Custom Dataset Schema Class
//****************************
using Ozcruising.Schemas;
//****************************
using System.Data.Common;
using System;
using System.Data;
class OpenXML
{
public int UpdatePassengerPaxDetails(dsPassengers ds)
{
SqlDatabase db = (SqlDatabase)DatabaseFactory.CreateDatabase();
// This is the stored procedure we created earlier
DbCommand dbCommand =
db.GetStoredProcCommand("[Person].[usp_UpdatePassengerDetail]");
// Add a paramter of type XML and assign the
// XML from the dsPassengers dataset
db.AddInParameter(dbCommand, "@PassengerDetail", DbType.Xml, ds.GetXml());
// Add a ReturnValue paramter of type INT for error checking
db.AddParameter(dbCommand, "@RETURN_VALUE", DbType.Int16, 0,
ParameterDirection.ReturnValue, false, 0, 0, "",
DataRowVersion.Current, 0);
// Execute the stored procedure
db.ExecuteNonQuery(dbCommand);
//Get error indicator
int erroCode =
Convert.ToInt32(db.GetParameterValue(dbCommand, "@RETURN_VALUE"));
return erroCode;
}
}
How simple is that! I’ve replaced 3 calls to the database with one.
Conclusion
As you can see from the above example, using XML to send multiple rows of data to SQL Server is very simple to implement. That said, there are a number of tips and tricks that can be used to limit the impact shredding XML data has on SQL Server and I’ll discuss these in Part 5 – OpenXML and XQuery Optimisation Tips.