About a month ago, there was another chapter in the long discussion about DataSets and web services. An old cohort Scott Hanselman fired off the first salvo, which was taken up by Ted Neward. Doug Purdy responds that Purchase Orders are root of all evil and lots of other chimed in. I'm just catching up…

This comes up again as I work with SQL Server 2005 web services. They too produce “SqlRowset == Schema + Any" in WSDL. Both XML and object afficianados tend to dislike this, but it exactly reflects the relational resultset, rowset, whatever you'd like to call it.

Relational results are columns and rows, where each row contains the same number of columns and each column contains the same data type for each row. Been around for a while now. The resultset is a "generic container class" or special 2-dimensional array if you want to think in those terms, in which each instance can contain different types, but the shape is rectangular. Resultsets, looking at an Open Data Services program which makes the protocol explicit, are preceeded by a sequence of describe packets that contain the name, data type, and other information about each column. So "Schema + Any" means exactly that. *Instance* schema, followed by the instance it describes. Corresponds quite nicely with relational databases, where the majority of corporate data is stored.

The problems stem from the fact that lots of web services toolkits, that are doing XML-to-Object mapping, don't recognize this paradigm. They are looking for an individual (static) schema, not an instance (dynamic) schema. In addition, the XML Schema spec states that how you find a schema from an instance of an XML document is implementation dependent. It mentions a hint (xsi:schemaLocation) and some location strategies for schemas with namespaces that are URLs, but says processors don't have to support the hint. Most WSDL that uses "Schema + Any" doesn't provide the hint. With Microsoft XML processors, this may be because "Schema + Any" is part of the implementation, an "implementation dependent" location strategy.

There's a few ways to resolve this, if you're looking for strongly typed objects at compile time. One way is strongly-typed DataSets, which fetch out the information at compile time, make a DataSet wrapper class, and, for web services, write WSDL that amounts to "Any from a specific namespace", and import the namespace schema, which includes the strong type. Another way is to write your own WSDL for each resultset.

If you write a stored procedure in SQL Server that looks like this:

SELECT au_id, au_fname, au_lname FROM authors

you know exactly what that resultset will look like. Unless the DBA changes the column type, in which case you'll have other problems. Even if you use the "SELECT * FROM …" concept (which is not usually recommended) this works unless the DBA adds or deletes a column, or changes the order of definition. I haven't been able to find a metadata table in SQL Server (or in the ANSI/SQL spec) that lists the shape of all resultsets returned by stored procedures (there is one for *parameter* information) , so you have to use your knowledge of the resultset to handcraft the WSDL. In ADO.NET, you can use the DataSet's "WriteXmlSchema" method as a starter. Or use "SET FMTONLY ON;SET NO_BROWSETABLE ON;EXECUTE…". SQL Server 2005 web services that map stored procedure invocation to web service methods will let you specify a custom WSDL-generating stored procedure. Statically store the metadata (instance schemas) in the database or dynamic generate them at WSDL generation time.

BTW, this mapping of resultsets to XML schemas amounts to relational-to-XML mapping. So web service toolkits can accomplish their XML-to-object mapping. I'll return to this theme in future…