This month the SQLskills team is presenting a series of blog posts aimed at helping Accidental/Junior DBAs ‘keep the SQL Server lights on’. It’s a little taster to let you know what we cover in our Immersion Event for The Accidental/Junior DBA, which we present several times each year. If you know someone who would benefit from this class, refer them and earn a $50 Amazon gift card – see class pages for details. You can find all the other posts in this series at http://www.SQLskills.com/help/AccidentalDBA. Enjoy!

As an accidental DBA you are constantly wearing many hats. You’ve heard that indexes are critical for performance (and that’s absolutely true) but it’s not just any indexes – it’s the RIGHT indexes. The unfortunate thing about indexing is that there’s both a science and an art to it. The science of it is that EVERY single query can be tuned and almost any SINGLE index can have a positive effect on certain/specific scenarios. What I mean by this is that I can show an index that’s ABSOLUTELY fantastic in one scenario but yet it can be horrible for others. In other words, there’s no short cut or one-size-fits-all answer to indexing.

This is where the art comes in – indexing for performance is really about finding the right balance between too many and too few indexes, as well as trying to get more from the indexes that you do keep.

Having said that, it’s way beyond the scope of what we can talk about in a short post. Unfortunately, a full discussion about these things would take a lot of time but I do have a few links that might help if you’re interested in digging in a bit deeper:

  • How do indexes work? (Index Internals video and demo)
  • What’s the clustered index? (Clustered Index Debate video)
  • What are good/general strategies for indexing? (Indexing Strategies video and demo)
  • When should columns be put in the key (and in what order should they be defined) versus when should they be in the INCLUDE list?
  • Would we benefit from a filtered index?

These are difficult discussions. And really, if you’re only meant to “maintain” the system and keep the lights on – these are probably beyond what you can effectively do in your already-packed day.

So, who is handling this job, to create and define the database’s indexes? It’s probably the application developers. These might be in-house developers (which is good – you’ll have someone with whom to consult when you do your analysis) or they might be vendor developers (which might be both good and bad). The good side is that some vendors are open to discussions on their customer support lines and may help you through the issues that you’re seeing. The bad news is that some vendors are not open to discussions and they might not support you if you make any indexing changes.

So, first and foremost – make sure you thoroughly understand the environment you’re about to analyze and do not change anything without verifying that you can.

Introduction

The good news is that it’s NOT all doom and gloom. There are some very helpful DMVs and resources that you can use to analyze your environment and see where some of your indexing issues are. Again, you might not be able to change them (immediately) but, you will be armed with information to help you discuss what you are seeing.

When I’m analyzing a new-to-me system, I tend to break down my index analysis into three parts:

  1. Are there any indexes just lying around not doing anything useful… time to get rid of the dead weight!
  2. Are there any indexes that are bloated and unhealthy – costing me time and space… time to analyze the health of my existing (and useful) indexes
  3. Then, and only then do I feel like you can add more indexes.

Part I: Getting rid of the dead weight

Fully duplicate indexes

SQL Server lets you create redundant/duplicate indexes. This is annoying but it’s always been the case. It certainly begs the question about why SQL Server lets you do this and I wrote up an answer to this in an article on SQL Server Magazine here: Why SQL Server Lets You Create Redundant Indexes (http://sqlmag.com/blog/why-sql-server-lets-you-create-redundant-indexes). Regardless of why, you still need to remove them. And, even if you don’t remember seeing duplicate indexes, you might be surprised. Without knowing index internals, it might be harder to recognize duplicates than you think. It’s not always as easy as Index1 on col1 and Index2 on col1. Internally, SQL Server adds columns to your index and most commands (like sp_helpindex) do not show these internally added columns. The good news is that I have a version of sp_helpindex that does show you the entire structure. And, tied to that updated version of sp_helpindex, I built a script for finding duplicate indexes at either the table-level or database-wide. Check out these links:

But, you could BREAK the application if they’ve used index hints. So, beware! Generally, it might be best to disable an index for a while before you just drop it.

Unused Indexes

Almost as expensive as a duplicate index is one that never gets used. However, this is a lot more challenging to determine. There is a fantastic DMV (sys.dm_db_index_usage_stats) that gives you information about index usage but, it’s not perfect. And, some of the behaviors have changed in some releases (sigh). If you’re really wanting to best understand your index usage patterns you’ll have to persist this information over a business cycle and be sure to persist it prior to index maintenance (see this connect item: Rebuilding an index clears stats from sys.dm_db_index_usage_stats). https://connect.microsoft.com/SQLServer/feedback/details/739566/rebuilding-an-index-clears-stats-from-sys-dm-db-index-usage-stats Note: this is only an issue is SQL Server 2012.

Bug, again, even the information tracked in this DMV isn’t perfect. One of my biggest frustrations is the user_updates only tracks the number of STATEMENTS, not the number of ROWS modified. For example, if I execute this statement (without a WHERE clause) UPDATE Table SET ColumnX = Value and it affects 10,000 rows, then the user_updates column will be incremented by 1 for BOTH the table and any indexes that include ColumnX. So, you might have an even higher (possibly MUCH higher) value for updates.

And, there’s more to it than that. Instead of duplicating this information, I’ll link to a FANTASTIC post by Microsoft PFE Ignacio [Nacho] Alonso’s FAQ around sys.dm_db_index_usage_stats. http://blogs.msdn.com/b/ialonso/archive/2012/10/08/faq-around-sys-dm-db-index-usage-stats.aspx.

Finally, both Paul and I have written about this DMV as well as how to persist it. Check out these posts:

Similar or semi-redundant indexes

You might have some indexes that are good candidates for consolidation:

  • Indexes that have the same key (but possibly different INCLUDEd columns)
    • Index1: Key = LastName
    • Index2: Key = LastName, INCLUDE = FirstName
    • In this case you don’t “NEED” Index1. There’s NOTHING that Index1 does that Index2 cannot also do. However, Index2 is wider. So, a query that solely wants the following will have more I/Os to do because of the wider index:
      • SELECT LastName, count(*) FROM table GROUP BY LastName
    • But, the argument is – how critical is that query? How often is that index really used? Remember, you can use sys.dm_db_index_usage_stats to help you determine how often it’s used.
  • Indexes that have left-based subsets of other index keys
    • Index1: Key = LastName, FirstName, MiddleInitial
    • Index2: Key = LastName INCLUDE = SSN
    • Index3: Key = LastName, FirstName INCLUDE = phone
    • In this case each index does provide some specific (and unique) uses. However, you have a lot of redundancy there.
    • What if you created a new Index: LastName, FirstName, MiddleInitial INCLUDE (SSN, phone)
    • Again, this new index is wider than any of the prior 3 but this new index has even more uses and it has less overall overhead (only one index to maintain, only one index on disk, only one index in cache [and, it’s more likely to stay in cache]). But, you still have to determine how critical each of the queries are that were using the narrower indexes? As well as how much more expensive they are with the new index.

So, this is where the “art” of indexing comes into play. Index consolidation is a critical step in reducing waste and table bloat but there isn’t a simple answer to every consolidation option. This is another “it depends” case.

Part II: Analyze the health of your existing indexes

This topic has been talked about in many places. And, we’ve even chatted about it in our Accidental DBA series here: The Accidental DBA (Day 14 of 30): Index Maintenance.

In the content of this post, I want to make sure that after I’ve cleaned up the dead weight, my existing and useful indexes are healthy. And, you might want to review your index maintenance strategies and see if they’re “lighter” and take less time. And, be sure that they don’t miss anything. A couple of key reminders:

  • Make sure your index maintenance routines at indexes on tables AND views
  • Make sure your index routines use a LIMITED scan if you’re only analyzing avg_fragmentation_in_percent

And, here are a few other resources that you might find handy on this topic:

Part III: Adding more indexes

This is a tricky one. There are lots of good/bad practices around adding indexes. One of the worst is that most folks just add indexes without really fully analyzing (and CORRECTLY analyzing) their existing indexes. The reason I say correctly analyzing their existing indexes is that the tools (like sp_helpindex and SSMS) hide some of the information about columns that might have been added to your indexes. So, unless you really know what your indexes look like you won’t be able to correctly add new indexes while consolidating your existing indexes.

The primary tool that I want to discuss here is the “user impact” aspect of the missing index DMV queries that exist out there (and, there are some great examples of using the missing index DMVs). And, while I STRONGLY encourage you to use them as a GUIDE, I do want you to remember that they’re not perfect. Here are my main issues/concerns/gripes:

  • The missing index DMVs (and therefore the “index/green hint” that shows up in showplan) only tune the plan that was executed. If the plan performed a hash join then the index is going to help the hash join. But, it’s unlikely that the join type will change. And, it might be the case that a different index would perform a different join type and the query would be even faster. If you’re about to trust the missing index DMVs recommendations(or, the green hint), then consider reverse-engineering the queries that are being tuned by these recommendations (see Jon’s post on how to do this) and then (if possible) run these queries through DTA (the Database Engine Tuning Advisor). DTA has capabilities that the missing index DMVs do not in that DTA can “hypothesize” about alternate strategies. This makes the index recommendations even better!
  • The missing index DMVs only think about the BEST index for EACH query. And, that does make sense (from a QUERY tuning perspective) but, you need to do SYSTEM tuning. You can’t just create individual indexes for each and every query that needs one. You definitely want to consider the indexes that have the highest user impact but you also don’t want to forget about consolidation.
  • The missing index DMVs can show indexes that you already have. Missing index DMVs bug that could cost your sanity…

Here are a few Missing Index DMV queries/resources:

Summary

These are the primary things that I’m looking for when I want to see how the already implemented indexing strategies are working as well as the order in which I begin to analyze and change indexes. But, beware: you can negatively impact the environment so it’s important that adequate testing is done to make sure that what you’re doing has a net-positive effect.

Finally, I also did a video summarizing these things that I’m describing here – you might want to check this out as well: http://technet.microsoft.com/en-US/sqlserver/gg545020.aspx.

Thanks for reading!
kt