One more post for the day, just while I have the blogging bug\u2026<\/p>\n
Per my last post<\/a>, we saw that the query_optimizer_estimate_cardinality event can provide some interesting information in SQL Server 2014 around how cardinality estimates get calculated.<\/p>\n I have a setup I\u2019ve used in the past to demonstrate column correlation challenges (see my post, Exploring Column Correlation and Cardinality Estimates<\/a>, for the full example).\u00a0 I\u2019ll use that setup for this post as well, using the Credit database on SQL Server 2014 with the database compatibility level set to 120:<\/p>\n So let\u2019s look at the cardinality estimate for a specific operator in the following query (using RECOMPILE to get fresh query_optimizer_estimate_cardinality events on each execution):<\/p>\n Below is the abridged plan and the operator and estimates I\u2019m interested in:<\/p>\n In SQL Server 2014 CTP2 \u2013 we estimate 316.228 rows for the Clustered Index Scan against the member table \u2013 and the actual number of rows is 1,000.<\/p>\n Now if I flip my database compatibility mode to pre-120, I\u2019ll see the following actual vs. estimated instead:<\/p>\n Before SQL Server 2014, the estimate for\u00a0 “[city] = ‘Minneapolis’ AND\u00a0 [state_prov] = ‘MN'” is 100 rows \u2013 and if we added the supporting statistics or index on city and state_prov, we\u2019ll get a better estimate (but there are limitations to how useful this can be \u2013 another topic altogether, so see this post<\/a>).\u00a0 We know that, in this example, the two columns are correlated.\u00a0 But we would need to help SQL Server if we didn\u2019t want it to assume each predicate was independent.<\/p>\n Anyhow \u2013 the original issue pre SQL Server 2014 was that our city and state_prov columns were seen as independent, and so we saw our estimates reflect this (10% of row estimate for city multiplied by 10% of row estimate for state_prov).\u00a0 So we wind up with an underestimate.<\/p>\n Now in SQL Server 2014 CTP2, with the Credit database compatibility mode put back to 120, we see the skew still exists, but the gap isn\u2019t as extreme \u2013 showing 316.228 instead of 100 as an estimate.\u00a0 Still incorrect, but less so.<\/p>\n Now putting this in Excel, we can see the following:<\/p>\n Earlier we updated 10% of the 10,000 rows for various city\/state combinations, so what\u2019s up with the 32% for the state predicate selectivity?<\/p>\n That 32% is, using Excel formula style, =POWER((10%),1\/2).<\/p>\n And if we look at the query_optimizer_estimate_cardinality for the 2014 CTP2 version, we\u2019ll see the following calculator reference (abridged output):<\/p>\n And here is the full calculator blurb for the CSelCalcCombineFilters_ExponentialBackoff:<\/p>\n <CalculatorList> We see the selectivity of state_prov and city are both 0.100.\u00a0 And yet unlike the pre-2014 version that multiples the two selectivity percentages, we\u2019re not doing that in 2014 CTP2. The calculator name references \u201cExponentialBackoff\u201d \u2013 which according to Wikipedia<\/a> (yeah I know), is defined as \u201can algorithm that uses feedback to multiplicatively decrease the rate of some process, in order to gradually find an acceptable rate\u201d.<\/p>\n What does this mean in this specific case?<\/p>\n In absence of multi-column statistics for city and state_prov, our estimated rows is not underestimated nearly as much as it used to be pre-2014.<\/p>\n Homework assignment \u2013 add another member column to the mix that has a 0.100 selectivity (for example \u2013 county) and compare the estimates using a pre-2014 and 2014 CTP2 compatibility mode.<\/p>\n","protected":false},"excerpt":{"rendered":" One more post for the day, just while I have the blogging bug\u2026 Per my last post, we saw that the query_optimizer_estimate_cardinality event can provide some interesting information in SQL Server 2014 around how cardinality estimates get calculated. I have a setup I\u2019ve used in the past to demonstrate column correlation challenges (see my post, […]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[42,28],"tags":[],"class_list":["post-1084","post","type-post","status-publish","format-standard","hentry","category-cardinality-estimation","category-performance"],"yoast_head":"\n\r\nUSE [Credit];\r\nGO\r\n\r\n-- Modifying the Credit data to set up the DBA's scenario\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'Minneapolis',\r\n [state_prov] = 'MN'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 0;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'New York',\r\n[state_prov] = 'NY'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 1;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'Chicago',\r\n[state_prov] = 'IL'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 2;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'Houston',\r\n[state_prov] = 'TX'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 3;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'Philadelphia',\r\n[state_prov] = 'PA'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 4;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'Phoenix',\r\n[state_prov] = 'AZ'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 5;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'San Antonio',\r\n[state_prov] = 'TX'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 6;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'San Diego',\r\n[state_prov] = 'CA'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 7;\r\n\r\nUPDATE\u00a0 [dbo].[member]\r\nSET\u00a0\u00a0\u00a0\u00a0 [city] = 'Dallas',\r\n[state_prov] = 'TX'\r\nWHERE\u00a0\u00a0 [member_no] % 10 = 8;\r\nGO\r\n<\/pre>\n
\r\nSELECT\u00a0 m.[lastname],\r\nm.[firstname],\r\nSUM(c.[charge_amt]) AS [Total_Charge_amt]\r\nFROM\u00a0\u00a0\u00a0 [dbo].[member] AS [m]\r\nINNER JOIN [dbo].[charge] AS [c] ON\r\nm.[member_no] = c.[member_no]\r\nWHERE\u00a0\u00a0 [city] = 'Minneapolis' AND\r\n[state_prov] = 'MN'\r\nGROUP BY m.[lastname],\r\nm.[firstname]\r\nOPTION (RECOMPILE);\r\nGO\r\n<\/pre>\n
<\/a><\/p>\n
<\/a><\/p>\n
<\/a><\/p>\n
<\/a><\/p>\n
<\/a><\/p>\n
\n<FilterCalculator CalculatorName=”CSelCalcCombineFilters_ExponentialBackoff<\/strong>” Conjunction=”true”>
\n<SubCalculator>
\n<FilterCalculator CalculatorName=”CSelCalcColumnInInterval” Selectivity=”0.100″ TableName=”[m]” ColumnName=”state_prov” StatId=”5″ \/>
\n<FilterCalculator CalculatorName=”CSelCalcColumnInInterval” Selectivity=”0.100″ TableName=”[m]” ColumnName=”city” StatId=”4″ \/>
\n<\/SubCalculator>
\n<\/FilterCalculator>
\n<\/CalculatorList><\/p><\/blockquote>\n