{"id":2850,"date":"2020-01-19T18:43:23","date_gmt":"2020-01-20T02:43:23","guid":{"rendered":"http:\/\/3.209.169.194\/blogs\/kimberly\/?p=2850"},"modified":"2020-01-21T12:44:55","modified_gmt":"2020-01-21T20:44:55","slug":"sql101_whystatisticsareimportant","status":"publish","type":"post","link":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/sql101_whystatisticsareimportant\/","title":{"rendered":"SQLskills SQL101: Why are Statistics so Important?"},"content":{"rendered":"<p>In my years working with SQL Server, I\u2019ve found there are a few topics that are often ignored. Ignored because people fear them; they think they\u2019re harder than they actually are OR they think they\u2019re not important. Sometimes they even think \u2013 I don\u2019t need to know that because SQL Server \u201cdoes it for me.\u201d I\u2019ve heard this about indexes. I\u2019ve heard this about statistics.<\/p>\n<p>So, let\u2019s talk about why statistics are so important and why a minimal amount of knowledge of their importance can make a HUGE difference to your query\u2019s performance.<\/p>\n<p>There are SOME aspects to statistics that ARE handled automatically:<\/p>\n<ul>\n<li>SQL Server automatically creates statistics for indexes<\/li>\n<li>SQL Server automatically creates statistics for columns \u2013 when SQL Server NEEDS more information to optimize your query\n<ul>\n<li>IMPORTANT: This only occurs when the database option: auto_create_statistics is ON; this option is on by default in ALL versions\/editions of SQL Server. However, there are some products\/guidance out there that recommends that you turn this off. I completely DISAGREE with this. 
<\/li>\n<\/ul>\n<\/li>\n<li>SQL Server automatically updates the statistical information kept for indexes\/columns for a statistic that\u2019s been \u201cinvalidated\u201d \u2013 when SQL Server NEEDS it to optimize your query\n<ul>\n<li>IMPORTANT: This only occurs when the database option: auto_update_statistics is ON; this option is on by default in ALL versions\/editions of SQL Server. However, there are some products\/guidance out there that recommends that you turn this off. I mostly DISAGREE with this. If you want to read a bit more about this \u2013 check out Erin\u2019s SQL101 post: <a href=\"https:\/\/www.sqlskills.com\/blogs\/erin\/sqlskills-sql101-updating-sql-server-statistics-part-i-automatic-updates\/\">SQLskills SQL101: Updating SQL Server Statistics Part I \u2013 Automatic Updates<\/a> and why it\u2019s generally better to update statistics manually (but through a more selective\/automated process) here: <a href=\"https:\/\/www.sqlskills.com\/blogs\/erin\/sqlskills-sql101-updating-sql-server-statistics-part-ii-scheduled-updates\/\">SQLskills SQL101: Updating SQL Server Statistics Part II \u2013 Scheduled Updates<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>However, it\u2019s important to know that while some aspects of statistics ARE handled automatically, they\u2019re not perfect. And, SQL Server often needs some help with certain data patterns and data sets \u2013 to really optimize effectively. To understand this, I want to talk a bit about accessing data and the process of optimization.<\/p>\n<h3>Accessing Data<\/h3>\n<p>When you submit a request to SQL Server to access data, you usually write a Transact-SQL statement \u2013 in the form of an actual SELECT \u2013 or possibly execute a stored procedure (and, yes, there are other options). However, the main point is that you\u2019re requesting a data set \u2013 not HOW that data set should be found\/processed. 
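<\/p>\n<p>As a quick example (a minimal sketch, using the same WideWorldImporters tables as the demos later in this post), this query declares only WHAT you want \u2013 nothing about join order, indexes, or algorithms:<\/p>\n<p>SELECT [o].[OrderID], [o].[CustomerID]<br \/>\nFROM [sales].[Orders] AS [o]<br \/>\nWHERE [o].[CustomerID] = 832;<br \/>\nGO<\/p>\n<p>It\u2019s entirely up to SQL Server to decide how to find those rows. 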
So, how does SQL Server \u201cget\u201d to the data?<\/p>\n<h3>Processing Data<\/h3>\n<p>Statistics help SQL Server with the gathering \/ processing of the data. They help by giving SQL Server insight into <strong>how much data will be accessed<\/strong>. If a small amount of data is going to be accessed then the processing might be easier (and require a different process) than if the query were to process millions of rows.<\/p>\n<p>Specifically, SQL Server uses a cost-based optimizer. There are other options for optimization but cost-based optimizers tend to be the most commonly used today. Why? Cost-based optimizers use specific information about the data \u2013 to come up with more effective, more optimal, and more focused plans \u2013 specific to the data. Generally, this process is ideal. However, with plans that are saved for subsequent executions (cached plans), it might not <em>always<\/em> be ideal. However, most other options for optimization can suffer from even worse problems.<\/p>\n<p>IMPORTANT: I\u2019m not talking about cached plans here\u2026 I\u2019m talking about the initial process of optimization and how SQL Server determines how much data will be accessed. The subsequent execution of a cached plan brings additional potential problems with it (known as parameter sniffing problems or parameter-sensitive plan problems). I talk a lot about that in other posts.<\/p>\n<p>To explain cost-based optimization, let me cover other possible forms of optimization; this will help to explain a lot of the benefits of cost-based optimization.<\/p>\n<h2><strong>Syntax-based Optimization <\/strong><\/h2>\n<p>SQL Server could use just your syntax to process your query. Instead of taking time to determine the best possible processing order for your tables, an optimizer <em>could<\/em> just join your tables in the order they\u2019re listed in your FROM clause. 
While it might be fast to START this process, the actual gathering of the data might not be ideal. In general, joining larger tables to smaller tables is a lot less optimal than joining smaller tables to larger. To give you a quick view of this, check out these two statements:<\/p>\n<p>USE [WideWorldImporters];<br \/>\nGO<\/p>\n<p>SET STATISTICS IO ON;<br \/>\nGO<\/p>\n<p>SELECT [so].*, [li].*<br \/>\nFROM [sales].[Orders] AS [so]<br \/>\nJOIN [sales].[OrderLines] AS [li]<br \/>\nON [so].[OrderID] = [li].[OrderID]<br \/>\nWHERE [so].[CustomerID] = 832 AND [so].[SalespersonPersonID] = 2<br \/>\nOPTION (FORCE ORDER);<br \/>\nGO<\/p>\n<p>SELECT [so].*, [li].*<br \/>\nFROM [sales].[OrderLines] AS [li]<br \/>\nJOIN [sales].[Orders] AS [so]<br \/>\nON [so].[OrderID] = [li].[OrderID]<br \/>\nWHERE [so].[CustomerID] = 832 AND [so].[SalespersonPersonID] = 2<br \/>\nOPTION (FORCE ORDER);<br \/>\nGO<\/p>\n<p>Review the cost of the two plans &#8211; side-by-side, you can see that the second is more expensive.<\/p>\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"691\" src=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120statistics-1-1024x691.png\" alt=\"\" class=\"wp-image-2853\" srcset=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120statistics-1-1024x691.png 1024w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120statistics-1-300x202.png 300w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120statistics-1-768x518.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Costs of the same query with FORCE ORDER and only the two tables in the FROM clause reversed between the 1st and 2nd execution<\/figcaption><\/figure>\n<p>Yes, this is an oversimplification of join optimization techniques but the point is that it\u2019s unlikely that the order you\u2019ve listed your 
tables in the FROM clause is the most optimal. The good news though \u2013 if you\u2019re having performance problems, you CAN force syntax-based optimization techniques such as joining in the order your tables are listed (as in the example). And, there are quite a few other possible \u201chints\u201d that you can use:<\/p>\n<ul>\n<li><a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/t-sql\/queries\/hints-transact-sql-query?view=sql-server-ver15\">QUERY hints<\/a> (forcing level of parallelism [MAXDOP], forcing SQL Server to optimize for the FIRST row \u2013 not the overall set with FAST <em>n<\/em>, etc\u2026)<\/li>\n<li><a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/t-sql\/queries\/hints-transact-sql-table?view=sql-server-ver15\">TABLE hints<\/a> (forcing an index to be used [INDEX], forcing seeks, forcing an index join [index intersection])<\/li>\n<li><a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/t-sql\/queries\/hints-transact-sql-join?view=sql-server-ver15\">JOIN hints<\/a> (LOOP \/ MERGE \/ HASH) <\/li>\n<\/ul>\n<p><strong>Having said that, these should ALWAYS be a last resort IMO.<\/strong> I would try quite a few other things before I\u2019d FORCE anything in production code. By forcing one (or, more) of these hints, you don\u2019t allow SQL Server to further optimize your queries as you add\/update indexes, as you add\/update statistics, or as you update\/upgrade SQL Server. Unless, of course, you\u2019ve documented these hints and you test to make sure they\u2019re still the best option for your query after maintenance routines, routinely after data modifications, and after updates\/upgrades to SQL Server. To be honest, I don\u2019t see this done as often as it should be. 
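<\/p>\n<p>For what it\u2019s worth, here\u2019s a sketch of what documenting a hint might look like (the query and the comment convention are hypothetical suggestions \u2013 not a standard):<\/p>\n<p>SELECT [so].[OrderID]<br \/>\nFROM [sales].[Orders] AS [so]<br \/>\nWHERE [so].[CustomerID] = 832<br \/>\nOPTION (MAXDOP 1); -- HINT (added 2020-01): re-test after index\/statistics changes and after SQL Server updates\/upgrades<br \/>\nGO<\/p>\n<p>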
Most hints are added as if they\u2019re perfect and left until later problems come up \u2013 even when they\u2019re possibly floundering inefficiently for days, weeks, months, etc\u2026<\/p>\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" src=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20180717_1811-klr_6789-1024x640.jpg\" alt=\"\" class=\"wp-image-2858\" srcset=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20180717_1811-klr_6789-1024x640.jpg 1024w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20180717_1811-klr_6789-300x188.jpg 300w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20180717_1811-klr_6789-768x480.jpg 768w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20180717_1811-klr_6789.jpg 2016w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Understand statistics and stop letting your queries flounder!<br \/>(sorry, I had to&#8230; it was my photo post for Sunday &#8211; so perfect!)<\/figcaption><\/figure>\n<p>It\u2019s always important to see if there\u2019s something ELSE that you can do to get better performance BEFORE FORCING a query to do ONE THING. Yes, forcing might be easy NOW but a small amount of time checking some of these other things might be well worth it in the long run. <\/p>\n<p>For plan problems that are specifically from THIS query and THESE values (not from the execution of a cached plan), some things I might try are:<\/p>\n<ul>\n<li>Are the statistics accurate \/ up-to-date? Does updating the statistics fix the issue?<\/li>\n<li>Are the statistics based on a sampling? Does FULLSCAN fix the issue?<\/li>\n<li>Can you re-write the query and get a better plan?\n<ul>\n<li>Are there any search arguments that aren\u2019t well-formed? 
Columns should always be isolated to one side of an expression\u2026\n<ul>\n<li><strong>WRITE:<\/strong> MonthlySalary &gt; expression \/ 12<\/li>\n<li><strong>NOT:<\/strong> MonthlySalary * 12 &gt; expression<\/li>\n<\/ul>\n<\/li>\n<li>Are there any transitivity issues you can help SQL Server with? This is an interesting one and becomes more problematic across more and more tables. However, adding a seemingly redundant condition can help the optimizer to do something it hadn\u2019t considered with the original query (see the added condition below):<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>FROM table1 AS t1<br \/>JOIN table2 AS t2<br \/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ON t1.colX = t2.colX<br \/> WHERE t1.colX = 12 <strong>AND t2.colX = 12<\/strong><\/p>\n<ul>\n<li>Sometimes just changing from a join to a subquery or a subquery to a join \u2013 fixes the issue (no, one isn\u2019t specifically\/always better than the other but sometimes the re-write can help the optimizer to see something it didn\u2019t see with the original version of the query)<\/li>\n<li>Sometimes using derived tables (a join of specific tables in the FROM clause AS J1) can help the optimizer to more optimally join tables<\/li>\n<li>Are there any OR conditions? Can you re-write to UNION or UNION ALL to get the same result (this is the most important) AND get a better plan? Be careful though, these are semantically NOT the same query. 
You\u2019ll need to understand the differences between all of these before you move to a different query.\n<ul>\n<li>OR removes duplicate rows (based on the row ID)<\/li>\n<li>UNION removes duplicates based on JUST the SELECT LIST<\/li>\n<li>UNION ALL concatenates sets (which can be A LOT faster than having to remove duplicates) but this might \/ might not be an issue:\n<ul>\n<li>Sometimes there are NO duplicates (KNOW your data)<\/li>\n<li>Or, sometimes it might be OK to return duplicates (KNOW your user\/audience\/application)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p>This is a short list but might help you get better performance WITHOUT using hints\/forcing. That means, as subsequent changes occur (to the data, to the indexes\/statistics, to the SQL Server version) you\u2019ll also be able to benefit from those changes!<\/p>\n<p>But, the coolest part is that these hints are there \u2013 IF we do have to use them! And, there ARE cases where the optimizer might not be able to come up with the most efficient plan; for those cases, you can force the process. So, yes, I love that they\u2019re there. I just don\u2019t want you to use them when you really haven\u2019t determined what the actual root cause of the problem is\u2026 <\/p>\n<p style=\"text-align:center\" class=\"has-text-color has-background has-medium-font-size has-very-dark-gray-color has-luminous-vivid-amber-background-color\"><strong>So, yes, you can achieve syntax-based optimization\u2026IFF you NEED it.<\/strong><\/p>\n<h2><strong>Rules-based Optimization (Heuristics)<\/strong><\/h2>\n<p>I mentioned that cost-based optimization requires statistics. But what if you don\u2019t have statistics? <\/p>\n<p>Or, maybe a different way to look at it \u2013 why can\u2019t SQL Server just use a \u201cbunch of rules\u201d to more-quickly optimize your queries without having to look at \/ analyze specific information about your data? Wouldn\u2019t that be faster? 
SQL Server can do this \u2013 but, it\u2019s often FAR from ideal. The best way to show you this is to NOT allow SQL Server to use statistics to process a query. I can show you a case where it actually works well and a MUCH MORE likely case where it doesn\u2019t.<\/p>\n<p>Heuristics are rules. Simple, STATIC \/ FIXED calculations. The fact that they\u2019re simple is their benefit. No data to look at. Just a quick estimate based on the predicate. For example, less than and greater than have an internal rule of 30%. Simply put, when you run a query with a less than or greater than predicate and NO INFORMATION is known about the data in that column \/ predicate, SQL Server is going to use a RULE that 30% of the data will match. It\u2019ll use this in its estimate \/ calculations and come up with a plan tied to this rule.<\/p>\n<p>To get this to &#8220;work,&#8221; I have to first disable auto_create_statistics and check to see if there&#8217;s anything there (already) that could possibly help my query:<\/p>\n<p>USE [WideWorldImporters];<br \/>\nGO<\/p>\n<p>ALTER DATABASE [WideWorldImporters]<br \/>\nSET AUTO_CREATE_STATISTICS OFF;<br \/>\nGO<\/p>\n<p>EXEC sp_helpindex &#8216;[sales].[Customers]&#8217;;<br \/>\nEXEC sp_helpstats &#8216;[sales].[Customers]&#8217;, &#8216;all&#8217;;<br \/>\nGO<\/p>\n<p>Check the output from sp_helpindex and sp_helpstats. By default, WideWorldImporters does not have any indexes or statistics that lead with DeliveryPostalCode (or even have it as a member). If you&#8217;ve added one (or, SQL Server has auto-created one), you must drop it before running these next code samples.<\/p>\n<p>For this first query, we&#8217;ll supply a zip code of 90248. The query is using a less than predicate. 
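<\/p>\n<p>Here\u2019s the query itself \u2013 identical to the second query shown later, except for the predicate\u2019s value:<\/p>\n<p>SELECT [c1].[CustomerID],<br \/>\n      [c1].[CustomerName],<br \/>\n      [c1].[PostalCityID],<br \/>\n      [c1].[DeliveryPostalCode]<br \/>\n FROM [sales].[Customers] AS [c1]<br \/>\n WHERE [c1].[DeliveryPostalCode] &lt; &#8216;90248&#8217;;<\/p>\n<p>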
Without any statistics and without the ability to auto-create any, what does SQL Server estimate for the number of rows?<\/p>\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"703\" src=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicsright-1024x703.png\" alt=\"\" class=\"wp-image-2855\" srcset=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicsright-1024x703.png 1024w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicsright-300x206.png 300w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicsright-768x527.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Columns without statistics use heuristics&#8230; <br \/>Without engineering a &#8220;PERFECT&#8221; value, they&#8217;re going to be WRONG&#8230; most of the time!<\/figcaption><\/figure>\n<p>For this first query, the estimate (30% of 663 = 198.9) works well, as the actual number of rows for the query is 197. One additional item to notice: there&#8217;s a warning symbol next to the table (Customers) as well as the left-most SELECT operator. These also tell you that something&#8217;s wrong. But, here &#8211; the estimate is &#8220;correct.&#8221;<\/p>\n<p>For this second query, we&#8217;ll supply a zip code of 90003. The query is the exact same except for the value of the zip code. 
Without any statistics and without the ability to auto-create any, what does SQL Server estimate for the number of rows?<\/p>\n<p>SELECT [c1].[CustomerID],<br \/>\n      [c1].[CustomerName],<br \/>\n      [c1].[PostalCityID],<br \/>\n      [c1].[DeliveryPostalCode]<br \/>\n FROM [sales].[Customers] AS [c1]<br \/>\n WHERE [c1].[DeliveryPostalCode] &lt; &#8216;90003&#8217;;<\/p>\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"702\" src=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicswrong-1024x702.png\" alt=\"\" class=\"wp-image-2856\" srcset=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicswrong-1024x702.png 1024w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicswrong-300x206.png 300w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2020\/01\/20200120hueristicswrong-768x526.png 768w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Columns without statistics use heuristics (simple rules). Often, they are very incorrect!<\/figcaption><\/figure>\n<p>For the second query, the estimate is again 198.9 but the actual is only 1. Why? Because without statistics, the heuristic for less than (and, greater than) is 30%. Thirty percent of 663 is 198.9. This will change as the data changes but the percent for this predicate is fixed at 30%. <\/p>\n<p>And, if this query were more complicated \u2013 with joins and\/or additional predicates \u2013 having incorrect information is ALREADY a problem for the subsequent steps of optimization. Yes, you might occasionally get lucky and use a value that loosely matches the heuristic but MORE LIKELY, you won\u2019t. And, while the heuristic for less than and greater than is 30%, it\u2019s different for BETWEEN and different again for equality. 
In fact, some even change based on the cardinality estimation model you&#8217;re using (for example, equality (=) has changed). Do I even care? Actually, not really! I don\u2019t really ever WANT to use them. <\/p>\n<p style=\"text-align:center\" class=\"has-background has-medium-font-size has-luminous-vivid-amber-background-color\">So, yes, SQL Server can use rules-based optimization\u2026but, only when it doesn\u2019t have better information.<\/p>\n<p><strong>I don\u2019t want heuristics; I want statistics! <\/strong><\/p>\n<p>Statistics are one of the FEW places in SQL Server where having MORE of them can be beneficial. No, I\u2019m not saying to create one for every column of your table but there are some cases where I might pre-create statistics. OK, that\u2019s a post for another day! <\/p>\n<p>So, WHY are statistics important for cost-based optimization?<\/p>\n<h2><strong>Cost-based Optimization<\/strong> <\/h2>\n<p>What does cost-based optimization actually do? Simply put, SQL Server QUICKLY gets a rough estimate of how much data will be processed. Then, using this information, it estimates the costs of different possible algorithms that could be used to access the data. Then, based on the \u201ccosts\u201d of these algorithms, SQL Server chooses the one that it\u2019s calculated to be the least expensive. Then, it compiles it and executes it.<\/p>\n<p>This <em>sounds<\/em> great, but there are a lot of factors as to why this may or may not work well.<\/p>\n<p>Most importantly, the base information used to perform the estimations (statistics) can be flawed.<\/p>\n<ul>\n<li>They could be out of date<\/li>\n<li>They could be less accurate due to sampling<\/li>\n<li>They could be less accurate because of the table size and the limitations of the statistics \u201cblob\u201d that SQL Server creates<\/li>\n<\/ul>\n<p>Some people ask me \u2013 couldn\u2019t SQL Server have more detailed statistics (larger histograms, etc.)? Yes, they could. 
But, then the process of reading\/accessing these larger and larger statistics would start to get more and more expensive (and take more time, more cache, etc.). Which would \u2013 in turn \u2013 make the process of optimization more and more expensive. It\u2019s a difficult problem; there are pros and cons \/ trade-offs everywhere. Really, it\u2019s not quite as simple as just \u201clarger histograms.\u201d<\/p>\n<p>Finally, the process of optimization itself can\u2019t possibly consider EVERY combination for execution \/ processing, can it? No. That would make the sheer process of optimization so expensive that it would be prohibitive! <\/p>\n<h2><strong>In Summary:<\/strong><\/h2>\n<p class=\"has-background has-luminous-vivid-amber-background-color\"><strong>The best way to think about the process of optimization is that SQL Server MUST find a \u201cgood plan fast.\u201d <\/strong>Otherwise, the process of optimization would become so complicated and take so long, it would defeat its own purpose!<\/p>\n<p>So, why are statistics soooooo important:<\/p>\n<ul>\n<li>They feed into the whole process of cost-based optimization (and, you WANT cost-based optimization)<\/li>\n<li>They need to exist for optimization; otherwise you\u2019ll be forced to use heuristics (yuck)\n<ul>\n<li>IMO \u2013 generally, I HIGHLY RECOMMEND auto_create_statistics to be ON, if you\u2019ve turned it off<\/li>\n<\/ul>\n<\/li>\n<li>They need to be reasonably accurate to give better estimates (keeping them up to date is important but some statistics might need more \u201chelp\u201d than others staying up to date)<\/li>\n<li>They might need even further help (with HOW they\u2019re updated, WHEN they\u2019re updated, or even with ADDITIONAL statistics)<\/li>\n<\/ul>\n<p style=\"text-align:center\" class=\"has-text-color has-background has-medium-font-size has-very-dark-gray-color has-luminous-vivid-amber-background-color\"><strong>STATISTICS ARE THE KEY TO BETTER OPTIMIZATION AND THEREFORE BETTER PERFORMANCE!<\/strong><br \/><em>(but, 
they\u2019re far from perfect!)<\/em><\/p>\n<p>I hope that motivates you to consider learning a bit more about statistics. They\u2019re actually easier than you might think AND (obviously) they\u2019re really, really important!<\/p>\n<h3><strong>Where can you get more information?<\/strong><\/h3>\n<p><strong>Whitepaper:<\/strong> There\u2019s a VERY dated whitepaper on Statistics \u2013 one that still has some great concepts \u2013 here: <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/dd535534.aspx\" target=\"_blank\" rel=\"noreferrer noopener\">Statistics Used by the Query Optimizer in Microsoft SQL Server 2008<\/a><\/p>\n<p><strong>Online Training with <a href=\"https:\/\/www.pluralsight.com\/\">Pluralsight<\/a>: <\/strong>I don\u2019t [yet] have a course just on statistics but \u2013 out of necessity \u2013 I talk about them in some of my other courses. This course: <a href=\"https:\/\/www.pluralsight.com\/courses\/sqlserver-optimizing-adhoc-statement-performance\">SQL Server: Optimizing Ad Hoc Statement Performance<\/a> has the best discussion. Specifically, be sure to review Module 3: Estimates and Selectivity.<\/p>\n<blockquote class=\"wp-block-quote\">\n<p>TIP: If you\u2019re not a Pluralsight subscriber, shoot an email to Paul at SQLskills dot com and tell him I sent you there for a PS code. 
That will give you 30 days to check out our <a href=\"https:\/\/www.sqlskills.com\/sql-server-training\/online-training\/\">SQLskills content on Pluralsight<\/a>!<\/p>\n<\/blockquote>\n<p><strong>In-Person Events: <\/strong>And, we have some in-person events coming up where you can certainly learn a lot more about SQL Server, and specifically statistics:<\/p>\n<h4><a href=\"https:\/\/sqlbits.com\/\">SQLBits<\/a> in London \u2013 March 31-April 4, 2020<\/h4>\n<p><strong>My FULL-DAY workshop: <\/strong><a href=\"https:\/\/sqlbits.com\/information\/event20\/Statistics_for_Performance_Internals_Analysis_and_Problem\/trainingdetails\">Statistics for Performance: Internals, Analysis, Problem Solving<\/a><\/p>\n<h4><a href=\"https:\/\/www.SQLintersection.com\/\">SQLintersection<\/a> in Orlando, FL \u2013 April 5-10, 2020<\/h4>\n<p><strong>My 60 minute conference session: <\/strong><a href=\"https:\/\/sqlintersection.com\/#!\/session\/Statistics:%20Internals,%20Analysis,%20and%20Solutions\/4356\">Statistics: Internals, Analysis, and Solutions<\/a><\/p>\n<h4><a href=\"https:\/\/www.sqlskills.com\/sql-server-training\/\">SQLskills Immersion Events<\/a><\/h4>\n<p><a href=\"https:\/\/www.sqlskills.com\/sql-server-training\/iepto1\/\">IEPTO1<\/a> has a few modules (2+ DAYS) related to indexing and statistics!<\/p>\n<p style=\"text-align:center\" class=\"has-background has-medium-font-size has-luminous-vivid-amber-background-color\"><strong>I HIGHLY recommend that you #NeverStopLearning and that you access as many different mediums \/ learning paths as possible. This stuff isn\u2019t simple and it takes time to really digest it and LEARN it. <\/strong><\/p>\n<p>Thanks for reading!<br \/>\n-k<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In my years working with SQL Server, I\u2019ve found there are a few topics that are often ignored. Ignored because people fear them; they think they\u2019re harder than they actually are OR they think they\u2019re not important. 
Sometimes they even think \u2013 I don\u2019t need to know that because SQL Server \u201cdoes it for me.\u201d [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[97,94,23,98,72,78],"tags":[],"class_list":["post-2850","post","type-post","status-publish","format-standard","hentry","category-cardinality-estimator","category-database-development","category-database-maintenance","category-sql101","category-statistics","category-tips"],"_links":{"self":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/posts\/2850","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/comments?post=2850"}],"version-history":[{"count":0,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/posts\/2850\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/media?parent=2850"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/categories?post=2850"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/tags?post=2850"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}