Haven't been blogging as much recently, as I've been teaching and travelling quite a bit. Last week though, I received an interesting query on XQuery query plans (that's more uses of the word query in one sentence than I've seen before). We were looking at SQL query plans when XQuery is involved. I'd remembered that:

1. The primary XML index builds a node table over an XML column
2. If there is no XML index, the node table must be built at query time

This seems to quite a big effect on number of estimated rows. Take a 1-row table with an untyped XML column. The row contains a fairly simple document from the W3C XQuery use cases. Let's do a simple query (like /BookStore/Book) over the document.

Without the primary XML index, the execution plan contains 3 UDX expressions, two of them have a large number of estimated rows, 1000 and 10000 estimated rows. The plan step that estimates 1000 rows returns 3 actual rows; the 10000 estimate step returns 114 rows. On the other hand, once the primary index is added, these two UDX expressions are replaced by clustered index seeks, with a fairly close estimate of rows vs the actual rows.

Looks like, if you're going to be doing any non-trivial amount of queries against the XML data type a primary XML index is pretty close to being a requirement. Building it as an index bodes much better than building it at execution time.