Non-Join Cross-Table Predicate Correlation Changes

This post is a continuation of the SQL Server 2014 Cardinality Estimator enhancements exploration series:

In previous posts for this series, I discussed how the assumption of independence with regards to multiple predicates against the same table, in absence of multi-column stats, is blunted a bit with the new Cardinality Estimator for SQL Server 2014. So your estimates may increase for this scenario.

On the flip-side, when it comes to joins between two tables, you may see a reduction in join estimate values for scenarios where there are non-join filter predicates on the tables being joined.

Take the following example, using pre-SQL Server 2014 CE behavior:

USE [master]
GO

ALTER DATABASE [Credit] SET COMPATIBILITY_LEVEL = 110
GO

USE [Credit];
GO

SELECT  [m].[member_no] ,
[m].[lastname] ,
[p].[payment_no] ,
[p].[payment_dt] ,
[p].[payment_amt]
FROM    dbo.[member] AS [m]
INNER JOIN dbo.[payment] AS [p]
ON      [m].[member_no] = p.[member_no]
WHERE   [m].[region_no] = 2
AND [p].[payment_dt] = '1999-09-02 00:00:00.000'
OPTION  ( RECOMPILE );
GO

The SQL Sentry Plan Explorer plan tree view is as follows:

We see that the estimates are spot-on for the Clustered Index Scan and Table Scan, but we have an over-estimate for the Hash Match operation (1,767 estimated vs. 1,003 actual).

Now if I set the database compatibility level to 120 and re-execute the query, here is what we see instead:

We still have identical estimate vs. actual values for the Clustered Index Scan and Table Scan, and now our over-estimate for the Hash Match is less pronounced (1,140 rows estimated instead of the 1,767 rows previously estimated).

For the pre-SQL Server 2014 cardinality estimation process, the assumption is that the non-join predicates for the two tables are somehow correlated (in our example, region_no = 2 and payment_dt = ‘1999-09-02 00:00:00.000’). This is called “Simple Containment”. For the new Cardinality Estimator, these non-join predicates are assumed to be independent (called “Base Containment”), and so this can translate into a reduced row estimate for the join.

Other articles

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 2″

Today, Pluralsight published my latest course, “SQL Server: Common Query Tuning Problems and Solutions – Part 2“. Part two is 2 hours and 17 minutes long

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 1”

Pluralsight published my latest course, “SQL Server: Common Query Tuning Problems and Solutions – Part 1“. Part 1 is 2 hours and 31 minutes and

SQLIntersection Post-Conference Session

I am really happy to announce that Kimberly Tripp and I will be delivering a post-conference in November’s SQLIntersection conference at the MGM Grand. The

Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator

I’ve been working on writing a new Microsoft white paper since January covering the main changes made in the SQL Server 2014 Cardinality Estimator, and

Finding Timeout-Prone Code with max_elapsed_time

Consider a scenario where your application timeout setting is 30 seconds and you’ve been asked to proactively start tuning workloads that are at risk of

Deck from “Practical SQL Server Cardinality Estimation”

I uploaded the deck from my session for SQLSaturday #287 Madison 2014 and you can download it here. It was an excellent event! Great organizers,

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.