Thursday, 9 April 2009

Using ShouldSerialize for conditional omission of properties in the XmlSerializer

Wow

I've just found out about the undocumented ShouldSerialize technique for XmlSerializers in .NET. Possibly the most useful System.Xml discovery I've made in years. Basically, while it's not very OO (but when is codegen particularly OO?), it gives you programmatic control over whether a property is serialized to the XML result stream by the generated XmlSerializer for that type.




Here's a link to the only MSDN article I've seen on the subject: http://msdn.microsoft.com/en-us/library/53b8022e(VS.71).aspx

Why

You have to comply with an existing XML consumer (probably bespoke and third party) that isn't tolerant of fully schema-bound XML, and you want to use the built-in XmlSerializer in .NET. And why wouldn't you? It's extremely powerful and configurable, and there's a huge sense of satisfaction to be gained from getting it working with some esoteric consumer. Done well, your code is clean, readable and functional, and far better than messing around with StringBuilders and XmlWriters.




Now I've gotten quite good at manipulating the XmlSerializer over the years, so it tends to be my first port of call when I need to generate some XML. Who wouldn't want to replace gobs and gobs of string concatenation code with a three-line call to a serializer? It's clear what you're doing and far, far easier to maintain. The point is that XML is structured data; it isn't just text merely because it can be represented that way.
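To make the "three lines" concrete, here's a minimal sketch of a serializer call. The Greeting type and the ToXml helper are made up purely for illustration; only XmlSerializer, StringWriter and the XmlAttribute attribute are real framework types.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, used only to have something to serialize.
public class Greeting
{
    [XmlAttribute("To")]
    public string To { get; set; }
}

public static class SerializerSketch
{
    // Hypothetical helper: the serializer call itself is the three lines in the middle.
    public static string ToXml(Greeting greeting)
    {
        var serializer = new XmlSerializer(typeof(Greeting));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, greeting);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Writes the generated XML document for the Greeting instance.
        Console.WriteLine(ToXml(new Greeting { To = "World" }));
    }
}
```

Compare that with building the same element by hand and escaping the attribute value yourself.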

The Scenario

I'm trying to automatically serialize some CAML to send to SharePoint. Specifically, a Query which looks something like this:

<Query xmlns="http://schemas.microsoft.com/sharepoint/soap/">
  <OrderBy>
    <FieldRef Name="Title" />
  </OrderBy>
</Query>

(1) Required XML from the serializer

I have a Query class which contains a List<FieldRef> called OrderBy. So far so good. I also added a subclass of List<FieldRef> called GroupBy, adding the extra property Collapse, which the schema requires. Now consider how I get the XML above from the serializer. My API looks something like this:


Query query = new Query();
query.OrderBy.Add(new FieldRef("Title"));

(2) API to build the XML in (1) above.

So I default the GroupBy and OrderBy properties of the Query to new instances, but since I haven't added anything to GroupBy, when I run it I get this XML back.

<Query xmlns="http://schemas.microsoft.com/sharepoint/soap/">
  <OrderBy>
    <FieldRef Name="Title" />
  </OrderBy>
  <GroupBy />
</Query>

(3) XML generated by the serializer under normal circumstances.


See the extra GroupBy? That's no good. So how do we get rid of this empty element? Ok, I could annotate the GroupBy property with [DefaultValue(null)] and have the GroupBy property instantiate lazily. That's all well and good, and would work... until we want to remove a FieldRef from the list, leaving it empty. Same problem: the list isn't null, just empty, so it serializes and we get the XML (3) above.




The problem is that DefaultValue doesn't allow conditional evaluation. What we need is something that behaves like the DefaultValueAttribute, which tells the serializer to skip the property when it has that value, but that does allow conditional evaluation. Of course we could perform some weird hacks, checking for an empty list in the property getter and returning null... that would keep the serializer happy, but it would break the API, forcing us to explicitly create the list from client code, and that's something easily forgotten, leading to potential errors.
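For the sake of argument, here's a rough sketch of that getter hack and how it bites the caller. HackedQuery is a hypothetical stand-in (using List<string> instead of the real FieldRef list to keep it short), not code from my actual project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class showing the "return null when empty" hack.
public class HackedQuery
{
    private List<string> groupBy = new List<string>();

    public List<string> GroupBy
    {
        // The hack: an empty list is reported as null so a serializer would skip it.
        get { return groupBy.Count == 0 ? null : groupBy; }
        set { groupBy = value ?? new List<string>(); }
    }
}

public static class HackSketch
{
    public static void Main()
    {
        var query = new HackedQuery();
        try
        {
            // The natural API call fails, because GroupBy is null while the list is empty.
            query.GroupBy.Add("Title");
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("caller must create the list first");
        }
    }
}
```

Every caller now has to remember to assign a populated list before using the property, which is exactly the kind of thing that gets forgotten.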



Enter ShouldSerialize. It turns out that creating a public boolean method called ShouldSerialize[PropertyName] in the serializable class tells the generated XmlSerializer to call that method to determine whether it should serialize the property or not. So, I create a method called ShouldSerializeGroupBy in my Query class, which checks for null or empty, returning false in that case, and BAM! Tests pass, and I am happy. So my Query class now looks like this:


using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Xml.Serialization;

[
    XmlRoot("Query", Namespace = Namespaces.SharePointSoap),
    XmlType("Query", Namespace = Namespaces.SharePointSoap),
    Serializable
]
public class Query
{
    private List<FieldRef> orderBy = new List<FieldRef>();
    private GroupBy groupBy = new GroupBy();

    [XmlArray("OrderBy")]
    public List<FieldRef> OrderBy
    {
        get { return orderBy; }
        set { orderBy = value; }
    }

    // Called by the generated XmlSerializer; GroupBy is written only when this returns true.
    public bool ShouldSerializeGroupBy()
    {
        return groupBy != null && groupBy.Count > 0;
    }

    [XmlArray("GroupBy"), DefaultValue(null)]
    public GroupBy GroupBy
    {
        get { return groupBy; }
        set { groupBy = value; }
    }
}
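
If you want to try the technique in isolation, here's a cut-down, self-contained sketch. It makes a few assumptions that differ from my real code: FieldRef is a minimal stand-in with just a Name attribute, GroupBy is a plain List<FieldRef> rather than my subclass with Collapse, the SharePoint namespace is left out, and the ToXml helper is invented for the demo.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the real FieldRef type.
public class FieldRef
{
    public FieldRef() { }   // XmlSerializer needs a public parameterless constructor
    public FieldRef(string name) { Name = name; }

    [XmlAttribute("Name")]
    public string Name { get; set; }
}

[XmlRoot("Query")]
public class Query
{
    private List<FieldRef> orderBy = new List<FieldRef>();
    private List<FieldRef> groupBy = new List<FieldRef>();

    [XmlArray("OrderBy")]
    public List<FieldRef> OrderBy { get { return orderBy; } set { orderBy = value; } }

    [XmlArray("GroupBy")]
    public List<FieldRef> GroupBy { get { return groupBy; } set { groupBy = value; } }

    // Picked up by name: the serializer skips GroupBy whenever this returns false.
    public bool ShouldSerializeGroupBy()
    {
        return groupBy != null && groupBy.Count > 0;
    }
}

public static class Demo
{
    // Hypothetical helper that serializes a Query to a string.
    public static string ToXml(Query query)
    {
        var serializer = new XmlSerializer(typeof(Query));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, query);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var query = new Query();
        query.OrderBy.Add(new FieldRef("Title"));
        Console.WriteLine(ToXml(query).Contains("GroupBy"));   // empty list: element omitted

        query.GroupBy.Add(new FieldRef("Author"));
        Console.WriteLine(ToXml(query).Contains("GroupBy"));   // populated list: element written
    }
}
```

Run as-is, this should print False and then True: the GroupBy element only shows up once the list has something in it, and the client code never has to create the list itself.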


Hope this helps you. It's saved me a lot of trouble. TTFN :)