This post is a continuation of the SQL Server 2014 Cardinality Estimator enhancements exploration series:
- A first look at the query_optimizer_estimate_cardinality XE event
- “CSelCalcCombineFilters_ExponentialBackoff” Calculator
- “CSelCalcCombineFilters_ExponentialBackoff” Calculator– Part II
- The CSelCalcAscendingKeyFilter Calculator
- Cardinality Estimation Model Version
- Comparing Root-Level Skews in the new Cardinality Estimator
- Using Legacy Methods to Lessen SQL Server 2014 Cardinality Estimator Skews
Continuing the subject of exponential backoffs (from the 2nd and 3rd posts in this series), let’s restore the Credit sample database back to the baseline version and execute the following script:
USE [master]; GO ALTER DATABASE [Credit] SET COMPATIBILITY_LEVEL = 120; GO USE [Credit]; GO -- Add four new columns ALTER TABLE [dbo].[member] ADD [arbitrary_1] BIGINT NULL; ALTER TABLE [dbo].[member] ADD [arbitrary_2] BIGINT NULL; ALTER TABLE [dbo].[member] ADD [arbitrary_3] BIGINT NULL; ALTER TABLE [dbo].[member] ADD [arbitrary_4] BIGINT NULL;
I changed the database to the latest version so we use the new CE and then added four new columns.
Next, let’s update the values of the four new columns using different distributions:
;WITH CTE_NTILE AS ( SELECT [member_no] , NTILE(10) OVER ( ORDER BY [member_no] DESC ) AS [arbitrary_1] , NTILE(2) OVER ( ORDER BY [member_no] DESC ) AS [arbitrary_2] , NTILE(4) OVER ( ORDER BY [member_no] DESC ) AS [arbitrary_3] , NTILE(250) OVER ( ORDER BY [member_no] DESC ) AS [arbitrary_4] FROM [dbo].[member] ) UPDATE [dbo].[member] SET [arbitrary_1] = [c].[arbitrary_1] , [arbitrary_2] = [c].[arbitrary_2] , [arbitrary_3] = [c].[arbitrary_3] , [arbitrary_4] = [c].[arbitrary_4] FROM [dbo].[member] AS [m] INNER JOIN CTE_NTILE AS [c] ON [c].[member_no] = [m].[member_no]; GO
Looking at the estimates for single-predicate queries, if I execute the following, I’ll get an estimate of 1,000 rows:
SELECT [member_no] FROM [dbo].[member] WHERE [arbitrary_1] = 1 OPTION ( RECOMPILE );
For this next query I’ll get an estimate of 5,000 rows:
SELECT [member_no] FROM [dbo].[member] WHERE [arbitrary_2] = 1 OPTION ( RECOMPILE );
And for this next query, an estimate of 2,500 rows:
SELECT [member_no] FROM [dbo].[member] WHERE [arbitrary_3] = 1 OPTION ( RECOMPILE );
And lastly (for single-predicate examples anyhow), an estimate of 40 rows:
SELECT [member_no] FROM [dbo].[member] WHERE [arbitrary_4] = 1 OPTION ( RECOMPILE );
Now let’s start adding multiple predicates per statement. The first example with multiple predicates uses two predicates – one with a selectivity of 0.1 and one of 0.5:
SELECT [member_no] FROM [dbo].[member] WHERE [arbitrary_1] = 1 AND -- 0.1 selectivity [arbitrary_2] = 1 -- 0.5 selectivity OPTION ( RECOMPILE );
The estimate for this query is 707.107 with the new CE, which we can derive using the POWER function in T-SQL as follows (I used Excel last time to do this, so see the previous posts for the background information on this calculation):
SELECT 10000 *0.10 * POWER(0.500000, 0.50);
That returned 707.107.
Now what about a query with three predicates, with selectivities of 0.1, 0.5 and 0.25?
SELECT [member_no] FROM [dbo].[member] WHERE [arbitrary_1] = 1 AND -- .1 selectivity [arbitrary_2] = 1 AND -- .5 selectivity [arbitrary_3] = 1 -- .25 selectivity OPTION ( RECOMPILE );
The estimate for this was 420.448, and we can derive this via the following expression (and notice the order of selectivities goes from smallest to highest):
-- Notice the selectivity order (0.10, 0.25, .50) SELECT 10000 * 0.10 * POWER(0.250000,0.50) * POWER(0.500000, 0.25);
Now let’s reference all four columns (with selectivities of 0.1, 0.5, 0.25 and 0.004):
SELECT [member_no] FROM [dbo].[member] WHERE [arbitrary_1] = 1 AND -- .1 selectivity [arbitrary_2] = 1 AND -- .5 selectivity [arbitrary_3] = 1 AND -- .25 selectivity [arbitrary_4] = 1 -- 0.004 selectivity OPTION ( RECOMPILE );
The estimate is 8.20193 and we can derive this via the following:
SELECT 10000 * 0.004* POWER(0.1000000, 0.50) * POWER(0.2500000, 0.25) * POWER(0.5000000, 0.125);
The selectivities are ordered from most selective to least selective, and the the less selective values get the “back offs” in order of none, 1/2, 1/4, and 1/8.