SQL Server 2012 introduces the new inaccurate_cardinality_estimate extended event, and I had a few miscellaneous thoughts about it that I wanted to discuss on this post.

Background reading

If you’re unfamiliar with this event, there isn’t anything as of today in Books Online, but there are two blog references / discussions:

How the query execution engine calculates the threshold it uses to detect inaccurate cardinality estimation and fire the inaccurate_cardinality_estimate extended event? – By my friend and former PFE colleague Nacho Alonso Portillo (blog)

Extended Events – inaccurate_cardinality_estimate – David Ballantyne’s (blog | @davebally)

Give these both a read before digging in further on this event.  The event output of “actual_rows” and “estimated_rows” can be a bit confusing at first, so Nacho walks through the algorithm for when this event fires.

When should (or could) you use this event?

The primary tool I use for detecting cardinality estimates is in using the Actual (not estimated) query execution plan.  Pulling a plan from cache isn’t helpful, because it is just the estimated plan, so the best scenario for troubleshooting suspected cardinality estimate issues is to generate the actual plan.

And yet, there are scenarios where it isn’t practical or possible.  For example – what if the suspected cardinality estimate is related to a long running data modification statement?  And what if the issue only occurs in production?  You may not be able to actually capture the Actual plan in a contained way in SQL Server Management Studio without polluting production data. And yes, you could capture the plan via “Showplan XML Statistics Profile” or “Showplan Statistics Profile” – but at what cost to your production activity (remember the “observer overhead”).  Those are events that you should be very careful about capturing in a production environment.

But this leads us to a similar cost/benefit discussion of the inaccurate_cardinality_estimate event.  The description from Microsoft of this event is very clear about it’s overhead:

“Occurs when an operator outputs significantly more rows than estimated by the Query Optimizer. Use this event to identify queries that may be using sub-optimal plans due to cardinality estimate inaccuracy. Using this event can have a significant performance overhead so it should only be used when troubleshooting or monitoring specific problems for brief periods of time.”

This event can help us spot cardinality estimate issues for live activity – and even map to the nodes in the query plan tree (more on this later) – but Microsoft is very clear that there can be “significant” performance overhead.  This is similar to the warning about pulling the execution plan too, by the way: “Occurs after a SQL statement is executed. This event returns an XML representation of the actual query plan. Using this event can have a significant performance overhead so it should only be used when troubleshooting or monitoring specific problems for brief periods of time.”.

So if you’re going to use this new inaccurate_cardinality_estimate event or other execution plan related events, don’t keep this running for a long period of time – and think carefully before enabling this on very high throughput systems or those systems running near to capacity.  What is the overhead percentage?  I haven’t tested this specifically using the inaccurate_cardinality_estimate extended in isolation, but I may do so eventually.  It isn’t urgent, as Jonathan Kehayias (blog | @sqlpoolboy) has already done plenty of work in this area (see Measuring “Observer Overhead” of SQL Trace vs. Extended Events”) to establish overhead with even some of the more standard events.

Where is inaccurate_cardinality_estimate in the GUI?  I don’t see it.

David Ballantyne (blog | @davebally) created a Connect item on this subject, and Microsoft noted that this was a debug event, requiring the debug channel to be enabled (checked off) in order to see it.

snaghtmlfbfd75 thumb Thinking about the inaccurate cardinality estimate Event

And of course, you can skip the New Session Wizard and just reference it directly in your CREATE EVENT SESSION definition.  For example:

CREATE EVENT SESSION [Track_CE_Issues] ON SERVER
ADD EVENT sqlserver.inaccurate_cardinality_estimate
ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE=ON);
GO

What’s this “nodes” thing you mentioned earlier?

I’ll demonstrate with the following query in the Credit database that has a significant enough cardinality estimate skew to trigger the inaccurate_cardinality_estimate event:

USE [Credit];
GO

DECLARE @Column INT = 2,
    @Value INT = 10;

SELECT  [member].[member_no],
        [member].[street],
        [member].[city],
        [charge].[charge_no],
        [charge].[provider_no],
        [charge].[category_no],
        [charge].[charge_dt],
        [charge].[charge_amt],
        [charge].[charge_code]
FROM    [dbo].[charge]
INNER JOIN [dbo].[member]
        ON [member].[member_no] = [charge].[member_no]
WHERE   CHOOSE(@Column, [charge].[provider_no], [charge].[category_no]) = @Value;
GO

SQL Sentry Plan Explorer shows the following:

image thumb Thinking about the inaccurate cardinality estimate Event

I added in Node ID in the Plan Tree tab for cross reference.

As for the captured inaccurate_cardinality_estimate event data, we see the following:

image thumb Thinking about the inaccurate cardinality estimate Event

7 event rows were captured with seven pieces of information, including the estimated vs. actual rows (and again, read Nacho’s blog for the reason why actual_rows doesn’t match the final “actual rows” value).  We also see the node_id and thread_id associated with each event. (As an aside – if you’re wondering about the estimated_rows (4532 vs. 36257, for example) – note that this was a parallel execution plan which used 8 threads.)

You see that the operators associated with the firing include node_id 5 (Clustered Index Scan on [charge].[ChargePK]) and node_id 1 (Hash Match).  Also notice that while there are skews for the parallelism-related operators, they didn’t fire inaccurate_cardinality_estimate, which seems acceptable to me because the skew I really care about originates at node_id 5.

image thumb Thinking about the inaccurate cardinality estimate Event

Anything else?

This post was just a mix of thoughts about this new event.  Time will tell if it is practical (or safe) enough to use with any significant frequency.