sqlskills-logo-2015-white.png

Non-Join Cross-Table Predicate Correlation Changes

This post is a continuation of the SQL Server 2014 Cardinality Estimator enhancements exploration series:

In previous posts for this series, I discussed how the assumption of independence with regards to multiple predicates against the same table, in absence of multi-column stats, is blunted a bit with the new Cardinality Estimator for SQL Server 2014.  So your estimates may increase for this scenario.

On the flip-side, when it comes to joins between two tables, you may see a reduction in join estimate values for scenarios where there are non-join filter predicates on the tables being joined.

Take the following example, using pre-SQL Server 2014 CE behavior:

USE [master]
GO

ALTER DATABASE [Credit] SET COMPATIBILITY_LEVEL = 110
GO

USE [Credit];
GO

SELECT  [m].[member_no] ,
[m].[lastname] ,
[p].[payment_no] ,
[p].[payment_dt] ,
[p].[payment_amt]
FROM    dbo.[member] AS [m]
INNER JOIN dbo.[payment] AS [p]
ON      [m].[member_no] = p.[member_no]
WHERE   [m].[region_no] = 2
AND [p].[payment_dt] = '1999-09-02 00:00:00.000'
OPTION  ( RECOMPILE );
GO

The SQL Sentry Plan Explorer plan tree view is as follows:

image

We see that the estimates are spot-on for the Clustered Index Scan and Table Scan, but we have an over-estimate for the Hash Match operation (1,767 estimated vs. 1,003 actual).

Now if I set the database compatibility level to 120 and re-execute the query, here is what we see instead:

image

We still have identical estimate vs. actual values for the Clustered Index Scan and Table Scan, and now our over-estimate for the Hash Match is less pronounced (1,140 rows estimated instead of the 1,767 rows previously estimated).

For the pre-SQL Server 2014 cardinality estimation process, the assumption is that the non-join predicates for the two tables are somehow correlated (in our example, region_no = 2 and payment_dt = ‘1999-09-02 00:00:00.000’).  This is called “Simple Containment”. For the new Cardinality Estimator, these non-join predicates are assumed to be independent (called “Base Containment”), and so this can translate into a reduced row estimate for the join.

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.