I’ve been doing some investigation of the new cardinality estimator in SQL Server 2014. I’ve always thought the best way to see how the estimation process worked was through the query_optimizer_estimate_cardinality XEvent. I wrote about this event in an earlier CTP1 post. Joe Sack used this event to show how the new estimator uses techniques to improve the query plan, I’d encourage you to read his series of posts. If you don’t know what a cardinality estimator is or why you’d care, Ben Nevarez’ article is a nice intro. And, as Joe says and I agree, documentation on the XEvent is sparse.

Lately, I thought it would be nice to demonstrate how the cardinality information emitted by the XEvent shows up in the query plan, and have been studying a number of plans v. event traces to find a pattern by brute force. But it was right in front of my face… While setting up the XEvent trace (again), I happened upon this description one of the event’s fields.

stats_collection_id – “ID of the stats collection generated for the input. This same ID will also appear in ShowplanXML as an attribute of the RelOp element when the event is enabled.”

Ohhh. So, you need to enable the event (i.e. turn on the XEvent trace), then run the query with the query plan turned on. Turning on either an estimated plan or actual plan will work, but actual plan is the best because you can see the estimated v. actual rows affected, and see not only whether the estimate was good, but how the cardinality estimator arrived at the estimate.

And when you do this, each plan iterator contains a new property, StatsCollectionId. Which corresponds to stats_collection_id in your XEvent trace. You need to show the properties window (click on the iterator to select it, then hit F4) to see this property. Or look in the XML directly (and we know how much fun that is…). So, graphic showplan and F4 is your friend.

The first time I tried this, with a really simple query, I was really confused. The plan consisted of a single ClusteredIndexScan, and that iterator had a StatsCollectionId of 1. The XEvent trace had 5 events with stats_collection_id of 2,2,3,3 and 5. Hmmm…

Then I tried a more complex plan, figuring that I had more chances to see some correlation. More iterators to try. Sure enough. There are no cardinality estimate rows for a scan. Because they don’t estimate cardinality for a scan, they know how many rows there are in each table or index. So you won’t find correlation between XEvent and scan iterator, or I just haven’t seen it so far. On every other iterator in the plan, there’s nice correlation. OK, now that we have correlation, let’s extract the information from the event.

There’s three main fields, defined in the event description like this:
1. input_relation – “The input relation on which cardinality is being updated”. This is, which plan logical operator (e.g. LogOp_GbAgg) is being estimated, what statistics were used as input.
2. calculator – “The strategy used for the estimate”. Which calculator was used to calculated cardinality. BTW, it is perfectly valid for this fields to be empty (empty Calculator element), guess that means they didn’t need a cardinality estimation strategy.
3. stats_collection – “The stats collection generated for the input”. What type of stats were used and which stats were loaded.

So each record in the trace represents Input -> Calculator -> Output …And you can read your way through a whole plan/events pair this way.

This post is getting kind of long (there will be more to come) and I’ll end with the observation that there are lots more XEvents in the trace than there are iterators in the plan. What’s the extra ones for? At this point I’m thinking that, since this information is gathered as the plan is being produced, these events represent intermediate steps in the estimation process, and only one step is tied to the plan iterator. But that’s a guess for now.

For your enjoyment, I’ve included an example with a SQL query, query plan, XEvent trace, and SQL/XML query to take apart the XEvent trace and look at what I thought were relevent subelements and attributes of the event fields. There are also some information and samples in my previous post, and in Joe’s posts, to get you started.

Cheers, @bobbeauch

FiveTableJoin_query.zip