Whenever I’m discussing index maintenance, and specifically fragmentation, I always make a point of saying ‘Make sure the index is being used before doing anything about fragmentation’.
If an index isn’t being used very much, but has very low page density (lots of free space in the index pages), then it will be occupying a lot more disk space than it could do and it might be worth compacting (with a rebuild or a reorg) to get that disk space back. However, usually there’s not much point spending resources to remove any kind of fragmentation when an index isn’t being used. This is especially true of those people who rebuild all indexes every night or every week.
You could even go so far as to say if a non-clustered index isn’t being used, why is it there at all? Extra non-clustered indexes drag down performance in a number of ways. Consider a non-clustered index (not filtered) called IX_MyNCIndex on the table MyTable
- Any time a record is inserted into MyTable, a matching record is inserted into IX_MyNCIndex. This is a bunch of extra I/Os, extra log records, plus maybe even a page-split.
- Any time a record is deleted from MyTable, the matching record in IX_MyNCIndex must be deleted. Extra I/Os and log records again.
- Any time a record in MyTable is updated:
- If MyTable has a clustered index, and the clustered index key value changes, then the matching record in IX_MyNCIndex must be updated. Extra I/Os and log records again.
- If any of the non-clustered index key values changes, or any of the INCLUDE’d column values changes, then the matching record in IX_MyNCIndex must be updated. Extra I/Os and log records again.
- If a clustered index is created on MyTable, then IX_MyNCIndex has to be rebuilt to include the logical RIDs rather than the physical heap RIDs (see this post for an explanation). Lot of extra I/Os and log records again.
That’s a significant amount of extra IOs and log records to maintain each extraneous non-clustered index.
So, how can you tell if an index is being used?
- In SQL Server 2000 there is no easy way to do it
- From SQL Server 2005 onward there are a few different ways and the one I want to discuss in this post is the sys.dm_db_index_usage_stats DMV.
This DMV exposes the information that is tracked about index usage (as the name suggests). It does not generate any information itself; it just returns info from a cache inside SQL Server. This cache is empty when the server instance starts, and is not persisted across instance restarts. All cache entries for indexes in a database are removed when that database is closed. So, the cache tracks usage information about which indexes have been used since the database they are part of was last opened (either manually or as part of instance start-up).
This continues to confuse people so I’ll call it out: if the output from the DMV does not have an entry for the index you’re interested in, it has not been used since the last database startup.
The cache tracks the following info for each index (for user queries and system queries):
- The number of times it was used in a seek operation (either looking up a single row, or doing a range scan) along with the time of the last seek.
- The number of times it was used in a scan operation (e.g. a select * operation) along with the time of the last scan
- The number of times it was used in a lookup operation (this means a bookmark lookup – where a non-clustered index does not fully cover a query and additional columns must be retrieved from the base table row) along with the time of the last lookup.
- The number of times it was used in an update operation (this counts inserts, updates, and deletes) along with the time of the last update.
Let’s have a look at its use.
SELECT * FROM sys.dm_db_index_usage_stats; GO
The output is too wide for a single image so I’ve split it in two (I won’t post any more output from the DMV – I’ll just talk about it):
Unless you’ve just re-started your instance, you’ll see a bunch of output from this, representing all index activity since the instance/databases started. If you’re interested in whether an index is being used, you can filter the output. Let’s focus in on a particular table: AdventureWorks.Person.Address
SELECT * FROM sys.dm_db_index_usage_stats WHERE [database_id] = DB_ID (N'AdventureWorks') AND [OBJECT_ID] = OBJECT_ID (N'AdventureWorks.Person.Address'); GO
You’ll probably see nothing in the output, unless you’ve been playing around with that table. Let’s force the clustered index on that table to be used, and look at the DMV output again.
SELECT * FROM [AdventureWorks].[Person].[Address]; GO SELECT * FROM sys.dm_db_index_usage_stats WHERE [database_id] = DB_ID (N'AdventureWorks') AND [OBJECT_ID] = OBJECT_ID (N'AdventureWorks.Person.Address'); GO
Now there’s a single row, showing a scan on the clustered index. Let’s do something else.
SELECT [StateProvinceID] FROM [AdventureWorks].[Person].[Address] WHERE [StateProvinceID] > 4 AND [StateProvinceId] < 15; GO SELECT * FROM sys.dm_db_index_usage_stats WHERE [database_id] = DB_ID (N'AdventureWorks') AND [OBJECT_ID] = OBJECT_ID (N'AdventureWorks.Person.Address'); GO
And there’s another row, showing a seek in one of the table’s non-clustered indexes.
So, its easy to look at the index usage for particular tables and indexes. But how can you monitor this over time? This is easy too – let’s see how.
First we need to create our own table to store snapshots of the DMV output.
IF OBJECTPROPERTY (OBJECT_ID (N'master.dbo.MyIndexUsageStats'),'IsUserTable') = 1 DROP TABLE [master].[dbo].[MyIndexUsageStats]; GO SELECT GETDATE () AS [ExecutionTime], * INTO [master].[dbo].[MyIndexUsageStats] FROM sys.dm_db_index_usage_stats WHERE [database_id] = 0; GO
Next we need to take a baseline snapshot of the DMV output.
INSERT [master].[dbo].[MyIndexUsageStats] SELECT GETDATE (), * FROM sys.dm_db_index_usage_stats; GO
And now simulate a few operations and take another snapshot of the DMV:
SELECT * FROM [AdventureWorks].[Person].[Address]; GO SELECT * FROM [AdventureWorks].[Person].[Address]; GO SELECT [StateProvinceID] FROM [AdventureWorks].[Person].[Address] WHERE [StateProvinceID] > 4 AND [StateProvinceId] < 15; GO INSERT [master].[dbo].[MyIndexUsageStats] SELECT GETDATE (), * FROM sys.dm_db_index_usage_stats; GO
And look at the filtered contents of our snapshot table:
SELECT * FROM [master].[dbo].[MyIndexUsageStats] WHERE [database_id] = DB_ID (N'AdventureWorks') AND [OBJECT_ID] = OBJECT_ID (N'AdventureWorks.Person.Address'); GO
You should see four rows – two from the baseline snapshot and two from the final snapshot. If you ran just the statements above, you’ll see that the user_scans count for the clustered index has increased by two, and the user_seeks count for the non-clustered index has increased by one.
So this is a pretty simple example of how you can track index usage.
Here’s the catch: you need to consider an entire business cycle before deciding whether an index can be removed or not. An index may only be used once a month for reporting, or an ‘executive’ query so make sure you cover all possible times that an index may be used. And even if an index isn’t used, make sure it’s not enforcing a uniqueness constraint, as the query optimizer may need the index to exist.
Don’t just blindly drop all the indexes that don’t show up in the output – make sure they *really* haven’t been used.
9 thoughts on “Indexes From Every Angle: How can you tell if an index is being used?”
I’m amazed noone has posted this yet. Here’s something awesome you can do with this: Get a list of all your indexes and how used / unused they are.
I’ve got a db that is heavily used with db_id = 5. This query shows me all the indexes that are in that database, and sorts the list by usage of those indexes. This shows you heaps (index_name = null) and unused indexes (where user_* and system_* are 0 or very small) or indexes that are just a tax on the system (ie, never queried against, only updated). This doesn’t mean that those indexes should be deleted – they might just have little use – but it’s a good start, and gets better the longer your db is up. Anyways, on with the query!
SELECT object_name(b.object_id) as obj_name, b.name as index_name, a.*
FROM sys.dm_db_index_usage_stats a
right outer join sys.indexes b on a.object_id = b.object_id and a.index_id = b.index_id and a.database_id = 5
where b.object_id > 100
order by user_seeks + user_scans
Hope others find that useful :)
This code is very helpful. I am working on Index maintainence strategy for a DB where indexes have never been rebuild/re-organised. I was looking at fragmentation levels of current indexes where I came across this topic and makes lot of sense to get rid of non-used indexes or atleast keep them out of your indexing strategy since its not been used.
This is awesome!
Thanks Paul, You mentioned there are many ways besides this. please mention those too :)