First and foremost, happy spring! I truly hope we’re on the path to summer (although who would know it here – we’re in Florida for SQL Connections and the weather is a bit chilly and it’s been raining off/on today – I hope this is short term (no, I don’t want to look at the forecast as I don’t want to jinx it :)). But, wherever you are – I hope you’re on your way to nicer weather and minimal cold weather (ok, I guess I only have warm wishes for the northern hem… for you southies - I hope your fall is lovely!!).

But, for everyone – now’s a good time to start thinking about cleaning out the [non-literal index] closet… and getting rid of some of those dusty indexes?

Why/when should you get rid of some of your indexes?

  1. It’s possible that some of them aren’t being used at all…
  2. Especially when they’re not being used but even when they’re “redundant” (or minimally useful) they’re still costing you in many ways:
    • Wasting space on disk
    • Wasting space in memory (well, if they are being used then they’re cluttering up your cache)
    • Wasting space/time in your maintenance routines (so, here they’re cluttering up your cache for sure!)
    • Wasting space in your backups
  3. You might be able to reduce your overall indexes with index consolidation…

So, for this post, I’m going to target #1: are there any indexes that just aren’t being used at all…

First, how do you know if your indexes are being used?

In SQL Server 2005 and higher, there’s a DMV (dynamic management view) called sys.dm_db_index_usage_stats and it’s there to track index usage patterns. However, it’s not persisted since the beginning of time and as a result, if you look at this and believe that it’s telling you ALL of the indexes that have been used in your database – then you might be mistaken. The index usage stats DMV is cleared when SQL Server is restarted as well as when you detach/attach the database or when you backup/restore the database.

Therefore, you don’t want to just run the following query and drop all of the indexes that aren’t being used. A better way to “trust” this information is to periodically persist the data from the DMV in your own table and then query it after you’ve completed a business cycle’s worth of activity – logging all of the usage stats. Then you can trust this much more. Again, here are a couple of negatives:

  1. It’s not persisted
  2. It only keeps the database_id and the index_id (which could change over time). You’re right in thinking it probably shouldn’t change but, a nonclustered index’s ID is not permanent so, it’s better to track the index name in addition to the index_id. And, when you run your queries to determine what to delete, you can easily verify indexes against the current indexes because your comparison is within the same db (more on this below).

If you want to persist this, then you have two ways to do this:

  1. Store the index usage patterns in a table within the specific database you’re tracking:
    • Pro: it goes with the database when you back it up, etc. and, it’s easier to reverse engineer which actual indexes you’re referring to (grabbing the names and not just the IDs).
    • Con: it’s a bit more complex of a query to run and you’ll need to run it for all of your databases (ok, it’s really not all that bad – but, using something like sp_msForEachDB will really help)
  2. Store the index usage patterns from all databases in a table within master or your own “performance database”:
    • Pro: you only need one job to handle all the index usage info AND object_name *does* support TWO parameters (object_id and database_id) so, as long as you trap the name at the time of insert then you’ll be good.
    • Con: it doesn’t go with the database (e.g. backup/restore – and if you’re restoring to a test system and you want to see what the usage patterns were then you’ll need to get this information as well…)

Here’s a simple query that you can run that shows all the indexes used right now – and adds the databasename/objectname into the results – in a persisted table you’ll also want to add the runtime:

SELECT getdate() AS RunTime
, DB_NAME(i.database_id) as DatabaseName
, OBJECT_NAME(i.object_id, i.database_id) as ObjectName
, *
FROM sys.dm_db_index_usage_stats AS i
WHERE object_id > 100

And, if you want to get a few more insghts into how to persist this on Paul’s blog here.

OK, so, you have a few options to think about and I have a few more parts to post!
kt