{"id":2818,"date":"2017-03-13T19:53:38","date_gmt":"2017-03-14T02:53:38","guid":{"rendered":"http:\/\/3.209.169.194\/blogs\/kimberly\/?p=2818"},"modified":"2018-03-27T09:23:38","modified_gmt":"2018-03-27T16:23:38","slug":"sqlskills-sql101-indexing","status":"publish","type":"post","link":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/sqlskills-sql101-indexing\/","title":{"rendered":"SQLskills SQL101: Indexing Basics"},"content":{"rendered":"<p>SQLskills introduced our new SQL101 recently and well&#8230;\u00a0indexing is something that everyone needs to get right. But, it&#8217;s\u00a0not a simple task. And, as I start to sit down to write a SQL101 post on indexing, I suspect I&#8217;m going to struggle keeping it simple? However, there are some core points on which I will focus and I&#8217;ll be sure to list a bunch of additional resources to get you more information from here! Remember, the point of each of our SQL101 posts is to make sure that everyone&#8217;s on the same page and has the same basic idea \/ background about a topic. And, for indexing, that&#8217;s incredibly important (and, often, misunderstood).<\/p>\n<h2>What is\u00a0an Index?<\/h2>\n<p>Simply, it\u2019s a structure that&#8217;s applied to a set of [or, subset of] data to enforce ordering &#8211; either to quickly check uniqueness or to aid in accessing data quickly. Simply, that&#8217;s why you create an index. You&#8217;re either wanting to enforce data integrity (such as uniqueness)\u00a0or you&#8217;re trying to improve performance in some way.<\/p>\n<h2>How Does an Index Enforce Uniqueness?<\/h2>\n<p>If you want to maintain uniqueness over a column (or a combination of columns), SQL Server takes the required data\u00a0(and very likely more data than you specifically chose)\u00a0and sorts that data in an index. By\u00a0storing the data in sorted order, SQL Server is able to quickly determine if a value exists (by efficiently navigating the index structure). For this intro post, it doesn&#8217;t entirely matter exactly what&#8217;s in the index but it does matter which index you choose and for what purpose.<\/p>\n<h2>Relational Rules Enforced by Indexing<\/h2>\n<p>In a relational database, relational rules rule the world. And, many learn some of the rules rather quickly. One of these rules is that relational theory says that every table <em>must<\/em> have a primary key. A primary key can consist of multiple columns; however, none of the columns can allow NULLs and the combination of those columns must be unique. While I agree that every table <em>should<\/em> have a primary key, what&#8217;s chosen AS the primary key can be more complex than the relational rules allow. Behind the scenes, a primary key is enforced by an index (to enforce and check for uniqueness). The type of index that is used depends on whether or not you explicitly state the index type or not. If you do not explicitly state an index type, SQL Server will default to trying to enforcing your primary key constraint with a unique clustered index. If a clustered index already exists, SQL Server will create a nonclustered index instead.<\/p>\n<p>And, this is where things get tricky&#8230; a clustered index is a very important index to define. Internally, the clustering key defines how the entire data set is initially ordered. If not well chosen then SQL Server might end up with a structure that&#8217;s not as efficient as it could be. There&#8217;s quite a bit that goes into choosing a good clustering and I&#8217;ll stress that I think it&#8217;s one of the most important decisions to be made for your tables. And, it also needs to be made early as later changes to your clustering key can be difficult at best (often requiring downtime and complex coordinated scripting after already suffering poor performance before you make the decision to change).<\/p>\n<p>So, let&#8217;s keep this simple&#8230; you&#8217;ll want to choose the clustering key wisely and early. And, you&#8217;ll want to get a good understanding on the things that depend on the clustering key. I did a discussion and demo in my Pluralsight course: <a href=\"https:\/\/www.pluralsight.com\/courses\/sqlserver-why-physical-db-design-matters\" target=\"_blank\" rel=\"noopener noreferrer\">Why Physical Database Design Matters<\/a> and I&#8217;ve discussed this quite a bit in my blog category: <a href=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/category\/clustering-key\/\" target=\"_blank\" rel=\"noopener noreferrer\">Clustering key<\/a>.<\/p>\n<p>The key point is that the primary does NOT have to be enforced with a clustered index. Sometimes your primary key is not an ideal choice as the clustering key. Some great clustering key choices are:<\/p>\n<ul>\n<li><strong>Composite key:<\/strong> EntryDate, RowID where\u00a0EntryDate is an ever-increasing date value that follows the insert pattern of your data. For example, OrderDate for a table that stores Orders. RowID should be something that helps to uniquely identify the rows (something like an identity column is useful). Key points: choose the smallest (but reasonable) data types for both the date and the ID. Ideally, use DATETIME2(p) where p is the level of precision desired. And, for an identity column &#8211; choose INT if you know you&#8217;ll never get anywhere near 2 billion rows. However, if you even think you&#8217;ll have &#8220;hundreds of millions&#8221; of rows, I&#8217;d probably go straight for BIGINT so that you never have to deal with the problems that you&#8217;ll have if you run out.<\/li>\n<li><strong>Identity column:<\/strong> When I don&#8217;t have a good composite key like that above, I&#8217;ll often consider an identity column for clustering &#8211; even if my queries aren&#8217;t specifically using this value. Even if you&#8217;re not <em>explicitly<\/em> using this value, SQL Server is using it behind the scenes in its nonclustered indexes. Please note that this is both a good thing and a bad thing. If your clustering key is very narrow then you&#8217;re not unnecessarily widening your nonclustered indexes. That isn&#8217;t to say that you won&#8217;t have a few wider nonclustered indexes but choosing a wide clustering key makes all of your nonclustered indexes wide when they might not need to be.<\/li>\n<\/ul>\n<p>OK, I feel like I&#8217;ve started to open a car of worms with this one. But, the key points are:<\/p>\n<ol>\n<li>The\u00a0primary key does NOT have to be clustered (and sometimes it&#8217;s better not to be)<\/li>\n<li>The clustering key needs to be chosen early and based on many factors &#8211; there&#8217;s no single right answer ALL the time&#8230; for example, if you\u00a0don&#8217;t need any nonclustered indexes then the width of the clustering key becomes less of an issue.<\/li>\n<\/ol>\n<p>At the end of this post, I&#8217;ll point you to more resources to help you to make a better decision.<\/p>\n<h2>What about Indexing for Performance?<\/h2>\n<p>In addition to enforcing uniqueness (and, allowing SQL Server to quickly determine whether or not a value already exists), indexes are used to help performance.\u00a0And here&#8217;s where there are some very simple yet important things to understand. There are two types of performance tuning methods that I want to describe here: <strong>query tuning<\/strong> and\u00a0<strong>database tuning<\/strong>. What&#8217;s often done most is query tuning. While that might be [temporarily] good for that query, it&#8217;s NOT a good long-term strategy for the server.\u00a0I always START with query tuning but that&#8217;s not an ideal strategy to implement directly on your production server.<\/p>\n<h3>Query Tuning<\/h3>\n<p><figure id=\"attachment_2828\" aria-describedby=\"caption-attachment-2828\" style=\"width: 1024px\" class=\"wp-caption alignright\"><a href=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2017\/03\/GoFasterButtonDTA.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-2828 size-large\" src=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2017\/03\/GoFasterButtonDTA-1024x425.jpg\" alt=\"Database Tuning Advisor for &quot;query tuning&quot; from SSMS\" width=\"1024\" height=\"425\" srcset=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2017\/03\/GoFasterButtonDTA-1024x425.jpg 1024w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2017\/03\/GoFasterButtonDTA-300x125.jpg 300w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2017\/03\/GoFasterButtonDTA-900x374.jpg 900w, https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-content\/uploads\/2017\/03\/GoFasterButtonDTA.jpg 1219w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><figcaption id=\"caption-attachment-2828\" class=\"wp-caption-text\">Indexing for performance using the DTA [Database Tuning Advisor] for &#8220;query tuning&#8221; from SSMS<\/figcaption><\/figure>Query tuning is where you focus on nothing but the query to come up with a better indexing strategy. You might use the Database Engine Tuning Advisor on the specific query right from SQL Server Management Studio (see image). Or, you might use the &#8220;green hint&#8221; found in the show actual executing plan window. While these might\u00a0significantly help you with that query, they may not be ideal to implement in your database (where other users are executing, where other indexes exist, and where resource access\/usage are at a premium).<\/p>\n<p>Don&#8217;t get me wrong, query tuning is a MUST. But, it&#8217;s just a starting point. I always start my tuning process by determining the best indexes for a query. But, you can&#8217;t stop there. You MUST do &#8220;server tuning&#8221; if you want your production database to truly scale.<\/p>\n<h3>Server Tuning<\/h3>\n<p>Before you create a desired index in production (or, while you&#8217;re doing testing \/ analysis in development \/\u00a0QA [quality assurance]) you really want to check to see if this index is going to be good for production.<\/p>\n<h4 style=\"padding-left: 30px;\">Are there other similar indexes?<\/h4>\n<p style=\"padding-left: 30px;\">Maybe you can consolidate some of these indexes into one. Yes, this consolidated index might not be the best for the individual queries but by creating one index instead of three, you&#8217;ll be reducing the cost of this index\u00a0for data modifications, maintenance, storage, etc.<\/p>\n<h4 style=\"padding-left: 30px;\">Are there any suggested missing indexes?<\/h4>\n<p style=\"padding-left: 30px;\">Again, before I create a new index, I want to see if I can get more uses out of it. Can I consolidate this new index with existing and\/or missing recommendations? If I can then I&#8217;ll get more uses out of this index.<\/p>\n<h4 style=\"padding-left: 30px;\">Are there good maintenance strategies in place?<\/h4>\n<p style=\"padding-left: 30px;\">Before you go and create more indexes, make sure that your existing indexes are being maintained. You should also check that the indexes being maintained are actually being used. To be honest, you should do that BEFORE you do any tuning at all.<\/p>\n<h2>SUMMARY: Steps for Server Tuning and Scalability<\/h2>\n<ol>\n<li>Get rid of the dead weight. Clean up unused indexes. Consolidate similar indexes.<\/li>\n<li>Make sure your index maintenance strategy is in place. There&#8217;s no point in adding indexes if you&#8217;re not cleaning up fragmentation and reducing splits.<\/li>\n<li>Then, you can consider adding indexes BUT only after you&#8217;ve done the following:\n<ol>\n<li>Query tuning<\/li>\n<li>Existing index consolidation<\/li>\n<li>Missing index consolidation<\/li>\n<li>And, of course, TESTING!<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>Well&#8230; that was much longer than I had hoped. But, there are a lot of good concepts here. <span style=\"color: #ff0000;\"><strong>Unfortunately, indexing for performance is just NEVER just a simple discussion.<\/strong> <\/span>You can&#8217;t just put an index on every column and expect things to work well. Conversely, some of these tools seem helpful but they\u00a0mostly do <strong>query tuning<\/strong> and not server tuning. If you really want to\u00a0get better performance diving into indexing is a fantastic way to do this! So, if you&#8217;re motivated &#8211; here are a ton of resources to consider!<\/p>\n<h2>Learning more about Indexing<\/h2>\n<p>If you want to learn more about index structures, check out this older (but still <a href=\"https:\/\/technet.microsoft.com\/en-us\/dn912438#mcm-readiness\" target=\"_blank\" rel=\"noopener noreferrer\">useful page of videos<\/a>). On it, watch them in this order.<\/p>\n<ol>\n<li><strong>Index Internals<\/strong><\/li>\n<li><strong>Index Internals Demo<\/strong><\/li>\n<li><strong>Index Fragmentation <\/strong><\/li>\n<li><strong>Index Fragmentation Demo<\/strong><\/li>\n<li>Even better &#8211; skip 3 and 4 and go to Paul&#8217;s Pluralsight course on\u00a0<a href=\"https:\/\/www.pluralsight.com\/courses\/sqlserver-index-fragmentation-internals-analysis-solutions\" target=\"_blank\" rel=\"noopener noreferrer\">SQL Server: Index Fragmentation Internals, Analysis, and Solutions<\/a><\/li>\n<li><strong>Index Strategies <\/strong><\/li>\n<li><strong>Index Strategies Demonstration<\/strong><\/li>\n<\/ol>\n<p>Also, check out these blog posts:<\/p>\n<ul>\n<li>First, setup the latest version of sp_SQLskills_helpindex from this <a href=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/sp_helpindex-v20170228\/\" target=\"_blank\" rel=\"noopener noreferrer\">post<\/a><\/li>\n<li><a href=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/nonclustered-indexes-lookup-key-btree\/\">Nonclustered indexes require the \u201clookup\u201d key in the b-tree when?<\/a><\/li>\n<li><a href=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/explicitly-naming-cl-key-columns-in-nc-indexes-when-and-why\/\">Explicitly naming CL key columns in NC indexes \u2013 when and why<\/a><\/li>\n<li><a href=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/how-can-you-tell-if-an-index-is-really-a-duplicate\/\" target=\"_blank\" rel=\"noopener noreferrer\">How can you tell if an index is REALLY a duplicate?<\/a><\/li>\n<li><a href=\"https:\/\/www.sqlskills.com\/blogs\/kimberly\/removing-duplicate-indexes\/\" target=\"_blank\" rel=\"noopener noreferrer\">Removing duplicate indexes<\/a>\u00a0(this was created for 2008 but the code still works on higher versions&#8230; but, yes, this is due an update! :))<\/li>\n<\/ul>\n<p>I&#8217;m also working on a much more extensive course on Indexing for Pluralsight, this should be available within the next few months.<\/p>\n<p>UPDATE: Check out my Pluralsight course &#8211;\u00a0<a href=\"https:\/\/www.pluralsight.com\/courses\/sqlserver-indexing-for-performance\">SQL Server: Indexing for Performance<\/a> for more details! Enjoy!!<\/p>\n<p><strong>Thanks for reading!<\/strong><br \/>\nk<\/p>\n","protected":false},"excerpt":{"rendered":"<p>SQLskills introduced our new SQL101 recently and well&#8230;\u00a0indexing is something that everyone needs to get right. But, it&#8217;s\u00a0not a simple task. And, as I start to sit down to write a SQL101 post on indexing, I suspect I&#8217;m going to struggle keeping it simple? However, there are some core points on which I will focus [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16,17,21,36,98],"tags":[],"class_list":["post-2818","post","type-post","status-publish","format-standard","hentry","category-clustered-index","category-clustering-key","category-consolidation","category-indexes","category-sql101"],"_links":{"self":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/posts\/2818","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/comments?post=2818"}],"version-history":[{"count":0,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/posts\/2818\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/media?parent=2818"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/categories?post=2818"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/kimberly\/wp-json\/wp\/v2\/tags?post=2818"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}