This post is a continuation of the SQL Server 2014 Cardinality Estimator enhancements exploration series:

In previous posts for this series, I discussed how the assumption of independence with regards to multiple predicates against the same table, in absence of multi-column stats, is blunted a bit with the new Cardinality Estimator for SQL Server 2014.  So your estimates may increase for this scenario.

On the flip-side, when it comes to joins between two tables, you may see a reduction in join estimate values for scenarios where there are non-join filter predicates on the tables being joined.

Take the following example, using pre-SQL Server 2014 CE behavior:

USE [master]


USE [Credit];

SELECT  [m].[member_no] ,
[m].[lastname] ,
[p].[payment_no] ,
[p].[payment_dt] ,
FROM    dbo.[member] AS [m]
INNER JOIN dbo.[payment] AS [p]
ON      [m].[member_no] = p.[member_no]
WHERE   [m].[region_no] = 2
AND [p].[payment_dt] = '1999-09-02 00:00:00.000'

The SQL Sentry Plan Explorer plan tree view is as follows:

image thumb Non Join Cross Table Predicate Correlation Changes

We see that the estimates are spot-on for the Clustered Index Scan and Table Scan, but we have an over-estimate for the Hash Match operation (1,767 estimated vs. 1,003 actual).

Now if I set the database compatibility level to 120 and re-execute the query, here is what we see instead:

image thumb1 Non Join Cross Table Predicate Correlation Changes

We still have identical estimate vs. actual values for the Clustered Index Scan and Table Scan, and now our over-estimate for the Hash Match is less pronounced (1,140 rows estimated instead of the 1,767 rows previously estimated).

For the pre-SQL Server 2014 cardinality estimation process, the assumption is that the non-join predicates for the two tables are somehow correlated (in our example, region_no = 2 and payment_dt = ’1999-09-02 00:00:00.000′).  This is called “Simple Containment”. For the new Cardinality Estimator, these non-join predicates are assumed to be independent (called “Base Containment”), and so this can translate into a reduced row estimate for the join.