Monday, 23 November 2009

System.Data Serialisation Bug - Scenario 1


System.Data contains a bug which can cause it to throw an exception during serialisation or deserialisation of a DataSet which contains a column that uses System.Object as its .NET type.

The bug manifests itself in two known scenarios:

  • Serialising a DataSet containing a column of type System.Object using a System.Xml.XmlTextWriter instance created via the static Create method; this throws a System.ArgumentException.
  • Deserialising a DataSet containing a column of type System.Object after having successfully passed it across the wire via Windows Communication Foundation (WCF) using netTcp binding; this throws a System.Data.DataException.

I reported the bug to Microsoft back in 2007 whilst using .NET 2.0 and 3.0. At the time I was told "it's definitely in our system for a Post Orcas release". The bug still exists in .NET Framework 3.5 SP1, although I haven't checked any of the .NET 4.0 betas yet.

This post covers the first scenario you might come across – simply serialising the DataSet with an XmlTextWriter. I'll do a follow-up next week covering the second scenario.

Reproducing the Bug

To demonstrate this behaviour we're going to need a DataSet with a System.Object column, and we're going to need to populate it with some data.

static System.Data.DataSet CreateAndPopulateDataSet()
{
  // create a DataSet containing two columns: a string (Key) and an object (Value)
  System.Data.DataSet dataSet = new System.Data.DataSet();
  System.Data.DataTable dataTable = dataSet.Tables.Add();
  dataTable.Columns.Add("Key", typeof(string));
  dataTable.Columns.Add("Value", typeof(object));

  // add two rows, one containing an integer Value, and one containing a string Value
  dataTable.LoadDataRow(new object[] { "foo", 42 }, true);
  dataTable.LoadDataRow(new object[] { "bar", "Hello World" }, true);

  // return the DataSet
  return dataSet;
}

We can serialise this to XML easily using a System.Xml.XmlTextWriter:

static void SerializeDataSetUsingXmlTextWriter(System.Data.DataSet dataSet)
{
  // serialise the DataSet into a memoryStream using an XmlTextWriter
  System.IO.MemoryStream memoryStream = new System.IO.MemoryStream();
  System.Xml.XmlWriter xmlWriter = System.Xml.XmlTextWriter.Create(memoryStream);
  dataSet.WriteXml(xmlWriter);
  xmlWriter.Flush();

  // write the contents of the MemoryStream to the console
  System.Console.WriteLine(System.Text.Encoding.UTF8.GetString(memoryStream.ToArray()));
}

We can plug the two methods above into a console application, together with the following entry-point, to test the serialisation.

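static void Main(string[] args)
{
  try
  {
    // create and populate the DataSet we'll use for the test
    System.Data.DataSet dataSet = CreateAndPopulateDataSet();
    // serialise the DataSet
    SerializeDataSetUsingXmlTextWriter(dataSet);
  }
  catch (System.Exception ex)
  {
    System.Console.WriteLine(ex);
  }
  // pause
  System.Console.ReadLine();
}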

Doing so yields the following result:

System.ArgumentException: Invalid name character in 'xmlns:xs'. The ':' characte
r, hexadecimal value 0x3A, cannot be included in a name.
   at System.Xml.XmlWellFormedWriter.CheckNCName(String ncname)
   at System.Xml.XmlWellFormedWriter.WriteStartAttribute(String prefix, String l
ocalName, String namespaceName)
   at System.Data.DataTextWriter.WriteStartAttribute(String prefix, String local
Name, String ns)
   at System.Xml.XmlWriter.WriteAttributeString(String localName, String value)
   at System.Data.XmlDataTreeWriter.XmlDataRowWriter(DataRow row, String encoded
   at System.Data.XmlDataTreeWriter.Save(XmlWriter xw, Boolean writeSchema)
   at System.Data.DataSet.WriteXml(XmlWriter writer, XmlWriteMode mode)
   at System.Data.DataSet.WriteXml(XmlWriter writer)
   at SystemDataSerialisationBugScenario1.Program.SerializeDataSetUsingXmlTextWr
iter(DataSet dataSet) in C:\Users\Ian.Picknell\Documents\Blog\System.Data Serial
isation Bug\SystemDataSerialisationBugScenario1\Program.cs:line 31
   at SystemDataSerialisationBugScenario1.Program.Main(String[] args) in C:\User
s\Ian.Picknell\Documents\Blog\System.Data Serialisation Bug\SystemDataSerialisat
ionBugScenario1\Program.cs:line 45

Understanding the Bug

So, what's going on here? Well, the exception occurred during System.Xml.XmlWellFormedWriter.CheckNCName which is clearly upset at finding a ':' character within the name 'xmlns:xs'. But 'xmlns:xs' isn't actually a name – the name is 'xs' which is within the namespace identified by the prefix 'xmlns'. Looking down the stack trace we can see a call to System.Xml.XmlWriter.WriteAttributeString(String localName, String value). We don't know what was being passed to this method, but by looking at the documentation for System.Xml.XmlWriter.WriteAttributeString we can see that it has several overloads:

WriteAttributeString(String, String): When overridden in a derived class, writes out the attribute with the specified local name and value.
WriteAttributeString(String, String, String): When overridden in a derived class, writes an attribute with the specified local name, namespace URI, and value.
WriteAttributeString(String, String, String, String): When overridden in a derived class, writes out the attribute with the specified prefix, local name, namespace URI, and value.

Note that the overload which takes two strings (the one being called) is designed to accept a local name and a value. We could have guessed that from the parameter names localName and value in the stack trace, but it's nice to have this backed up by the documentation. Yet 'xmlns:xs' is being passed around somewhere, and it isn't a local name: it's a namespace prefix, a colon and a local name.

Let's fire up .NET Reflector and look for calls to System.Xml.XmlWriter.WriteAttributeString from within System.Data.XmlDataTreeWriter.XmlDataRowWriter, which is where the stack trace tells us the call originates. There are actually quite a few such calls, so I won't list them all. They all look perfectly reasonable and innocent until we get to the last one:

this._xmlw.WriteAttributeString("xmlns:xs", "");

Now, that certainly looks wrong. As it's the two-string overload which is being called, the first parameter is expected to be a local name and it's clearly not. This is the bug.
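To see the difference between the overloads in isolation, here is a minimal sketch that doesn't involve System.Data at all. The class name, method names and the 'urn:example' namespace URI are all made up for illustration; only the XmlWriter calls themselves come from the framework. It makes the same shape of call as the one found in Reflector against a writer obtained from the static Create method, then declares the same prefix using the four-string overload (prefix, local name, namespace URI, value), which is how I'd expect a namespace declaration to be written.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NamespaceAttributeSketch
{
    // hypothetical namespace URI, used purely for illustration
    public const string ExampleNamespace = "urn:example";

    // returns true if the two-string overload rejects a qualified name
    // when the writer was obtained from the static Create method
    public static bool TwoStringOverloadRejectsQualifiedName()
    {
        XmlWriter writer = XmlWriter.Create(new StringWriter());
        writer.WriteStartElement("Value");
        try
        {
            // same shape as the call found in XmlDataRowWriter:
            // a prefixed name passed where a local name is expected
            writer.WriteAttributeString("xmlns:xs", ExampleNamespace);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // declares the xs prefix using the four-string overload instead
    public static string DeclareNamespaceWithFourStringOverload()
    {
        StringWriter output = new StringWriter();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("Value");
            // prefix "xmlns", local name "xs", no namespace URI argument, value = the URI being declared
            writer.WriteAttributeString("xmlns", "xs", null, ExampleNamespace);
            writer.WriteString("42");
            writer.WriteEndElement();
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine("Two-string overload rejected 'xmlns:xs': " + TwoStringOverloadRejectsQualifiedName());
        Console.WriteLine(DeclareNamespaceWithFourStringOverload());
    }
}
```

The first method should hit the same System.ArgumentException as the stack trace above, while the second should produce a Value element carrying an xmlns:xs declaration. This is only a sketch of what the call inside System.Data could look like; I have no way of knowing how Microsoft would actually choose to fix it.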

Working Around the Bug

The solution is actually quite simple. In the SerializeDataSetUsingXmlTextWriter method we used to reproduce the bug, we created our System.Xml.XmlWriter instance using the static Create method of System.Xml.XmlTextWriter, as has been the recommended practice since .NET 2.0. But we can still create a System.Xml.XmlWriter using one of System.Xml.XmlTextWriter's instance constructors. So we simply replace this line in SerializeDataSetUsingXmlTextWriter:

System.Xml.XmlWriter xmlWriter = System.Xml.XmlTextWriter.Create(memoryStream);

with this one:

System.Xml.XmlWriter xmlWriter = new System.Xml.XmlTextWriter(memoryStream, null);

Running the program with this single change now produces the following output:

<NewDataSet><Table1><Key>foo</Key><Value xsi:type="xs:int" xmlns:xs="http://www." xmlns:xsi="">42<
/Value></Table1><Table1><Key>bar</Key><Value xsi:type="xs:string" xmlns:xs="http
://" xmlns:xsi="
ce">Hello World</Value></Table1></NewDataSet>

That's better, isn't it? Note that System.Data.XmlDataTreeWriter.XmlDataRowWriter is still calling the wrong overload of System.Xml.XmlWriter.WriteAttributeString; it's just that a System.Xml.XmlTextWriter created via its instance constructor is more forgiving than the writer returned by the static Create method, which (as the stack trace shows) is a System.Xml.XmlWellFormedWriter that checks each name it is given.
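The leniency can be seen without a DataSet, too. The following is a minimal sketch, again using a made-up class name and a hypothetical 'urn:example' namespace URI, which passes a prefixed name to the two-string overload on a System.Xml.XmlTextWriter created via an instance constructor. As far as I can tell the writer simply emits the name as given rather than validating it.

```csharp
using System;
using System.IO;
using System.Xml;

public static class LenientWriterSketch
{
    // hypothetical namespace URI, used purely for illustration
    public const string ExampleNamespace = "urn:example";

    // passes a prefixed name to the two-string overload on an
    // XmlTextWriter created via an instance constructor
    public static string WriteQualifiedNameWithXmlTextWriter()
    {
        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);
        writer.WriteStartElement("Value");
        // the same shape of call that System.Data makes
        writer.WriteAttributeString("xmlns:xs", ExampleNamespace);
        writer.WriteString("42");
        writer.WriteEndElement();
        writer.Flush();
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteQualifiedNameWithXmlTextWriter());
    }
}
```

No exception is expected here, which is consistent with the DataSet output shown above: the attribute text reaches the output even though the wrong overload was used to write it.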

In my next post I'll describe a second scenario where this bug can manifest itself (using the netTcp binding within WCF) where the workaround provided above cannot be used.

1 comment:

  1. Thanks for this. Appears to still be the same bug in .net 4