Indexes From Every Angle: What happens to non-clustered indexes when the table structure is changed?

I’d like to kick off the Indexes From Every Angle series this evening by re-posting some articles from my old blog. Both of these topics continue to crop up on newsgroups and forums.

What happens to non-clustered indexes when changes are made to the underlying table? Are they always rebuilt or not? Well – it depends! (my favorite answer :-)).

Before we get into that discussion, I’ll give you a little background.

One way to describe a non-clustered index is whether it is a covering index or not. A covering index is one which has all the table columns necessary to satisfy a query, and so there is no need to go back to the underlying table (which I’ll call the base table) to fetch additional columns. Most non-clustered indexes are not covering indexes and so it is necessary for each non-clustered index record (even in covering indexes) to include a link back to the corresponding base table record. This link takes various forms, depending on the structure of the base table:

  • Heap: A heap does not impose any logical ordering on the records within it, so the only way to find a particular record within a heap is to do a full table scan or to know the physical location of the record (i.e. which record slot on which data page) – called the physical record ID or physical RID. Clearly doing a full table scan every time a query needs to go from a non-clustered index record back to a heap is not feasible, so each non-clustered index record stores the physical RID of the corresponding heap record.
  • Clustered index: A clustered index does impose a logical ordering on the records within it and the ordering exactly matches the definition of the clustered index keys – known as the cluster key. The fastest way to find a particular record in a clustered index is to use the cluster key to navigate through the clustered index b-tree to find the exact record. So, non-clustered indexes over a clustered index include the cluster key of the base table record in each non-clustered index record. This is known as the logical RID. There is a twist though, depending on whether the clustered index is unique or not.
  • Non-unique clustered index: Every record in a clustered index HAS to be unique, otherwise there would be no way to deterministically navigate to a particular record using the index b-tree. In a non-unique clustered index, SQL Server has to add a special column, called the uniquifier, to each record, so that if multiple records have the same cluster key values, the uniquifier column will be the tie-breaker. This uniquifier column is added as part of the cluster key, and so it is also present in all non-clustered index records as part of the logical RID.
  • Unique clustered index: Every record in a unique clustered index is already unique and so no extra column is required in the cluster key.

Ok – background out of the way. Now let’s discuss what happens to non-clustered indexes when various actions are performed to the base table.

SQL Server 2000

  • Going from a heap to a clustered index: Changing the fundamental structure of a table from a heap to a clustered index means you’re changing the way that records can be looked up in the table, from physical to logical. This means that all the physical RIDs that are stored in the non-clustered index records need to be changed to logical RIDs. So – in this case, all non-clustered indexes are rebuilt after the clustered index has been built.
  • Going from a clustered index to a heap: This is the opposite of the case above – you’re changing the way records are looked up from logical to physical. This means that all the logical RIDs that are stored in the non-clustered index records need to be changed to physical RIDs. So – in this case, all non-clustered indexes are rebuilt after the clustered index has been dropped.
  • Rebuilding a unique clustered index: This operation has no effect on the cluster key, so there is no need for the non-clustered indexes to be rebuilt. Various early Service Packs of SQL Server 2000 had bugs that caused this behavior to be broken – and this is the cause of much of the confusion around this behavior.
  • Rebuilding a non-unique clustered index: In SQL Server 2000, when a non-unique clustered index (which contains uniquifier columns) is rebuilt, all the uniquifier values are regenerated. This essentially means that the cluster keys have changed and so all the non-clustered indexes are rebuilt after the clustered index rebuild has completed.
  • Changing the clustered index schema: This is simple – any change to the clustered index schema that changes the cluster keys will cause all the non-clustered indexes to be rebuilt.

SQL Server 2005 and later

  • Going from a heap to a clustered index: This is the same as SQL Server 2000 – all non-clustered indexes are rebuilt, regardless of whether the clustered index was created online or offline.
  • Going from a clustered index to a heap: Again, all non-clustered indexes are rebuilt, regardless of whether the clustered index is dropped online or offline.
  • Rebuilding a unique clustered index: This is the same as SQL Server 2000 – the cluster keys aren’t changing and so the non-clustered indexes are not rebuilt.
  • Rebuilding a non-unique clustered index: Aha! This is different from SQL Server 2000. SQL Server 2005 will RE-USE the old uniquifier values so the cluster keys don’t change. This means that non-clustered indexes are NOT rebuilt in this case – that’s very cool!
  • Changing the clustered index schema:
    • Changing the cluster key: This behavior has to be the same as SQL Server 2000 – any change to the cluster keys forces all the non-clustered indexes to be rebuilt after the new clustered index has been created.
    • Changing anything else (e.g. partitioning it or moving it):This is one of the cases that confuses people in SQL Server 2005. Moving it to a different filegroup doesn’t change the cluster key at all – so there’s no need to rebuild the non-clustered indexes, but applying a partitioning scheme DOES cause the nonclustered indexes to be rebuilt.

So the rule of thumb to remember is: if the physical or logical RID contained in the non-clustered index records needs to change then all the non-clustered indexes need to be rebuilt. It’s as simple as that.

Hopefully this will help to clear up some confusion – let me know of any questions!

13 thoughts on “Indexes From Every Angle: What happens to non-clustered indexes when the table structure is changed?

  1. Great Expalanation.
    Would it ever be appropriate to issue alter index all…rebuild with the goal being to rebuild the NC indexes?

  2. Hello,

    When you said: “The fastest way to find a particular record in a clustered index is to use the cluster key to navigate through the clustered index b-tree to find the exact record.” I get confused. I mean, it is true for most of dml statements, for sure, but when we are thinking about non-clustered indexes, it would not be faster to use the physical RID and go directly to page/slot instead to have to travel all the b-tree again in the clustered index? And what about the space we will save in the non-clustered index if we store the physical RID instead of the logical one?

    Regards, and thanks for all your time dedicated to spread SQL Server knowledge.

    1. Then you’re talking about a heap. A clustered index, by definition, only allows the looking up of records within it using the cluster keys, regardless of whether it is technically possible to use a direct physical RID.

  3. Hi Paul.
    Firstly thanks for posting, I happend across this article when doing some research…
    You mention “Applying a partitioning scheme to a clustered index, or moving it to a different filegroup, doesn’t change the cluster key at all – so there’s no need to rebuild the non-clustered indexes.”
    However AFAIK SQL does not actually have a SQL method that supports this capability?
    Ex. Using CREATE INDEX WITH DROP_EXISTING and assigning a partition scheme to an existing Clustered Index will rebuild the NC indexes once.
    …or am I missing some SQL command here than can do this and save the NC build cost?
    Thanks

      1. Thanks for clarifying Paul, BTW your post is very much still relevant today :) — The behavior seems unchanged in SQL 2014. To me it seems like a lot of extra work SQL does under the covers on the NC indexes when the cluster key itself is unchanged after applying the partition scheme. I guess it is how it is.

  4. Interesting post, thanks for this.
    I am interested if the behaviour of the Versions beyond 2005 have changed? Would it make sense to update the article including those?

  5. Hi Paul. One word of warning about an issue that I have just discovered. Say, you have a heap and NC indexes on it. You DISABLE the NC indexes and decide you want to make the table into a cluster. So you set off to build a cluster index on the table. All good and well until suddenly you realize that your NC indexes have all been rebuilt as well as a result of this and—what’s very important—they have all been ENABLED! Now, this is a serious issue. If you had NC indexes disabled on the table in the first place, they will get enabled regardless. So, in a word, you cannot decide ahead of the cluster creation whether some indexes should or should not be rebuilt—they will all get rebuilt. The only power you have over the process is when you actually DROP all the indexes but even then you will have to go through the process of rebuilding every single one of them, even the ones that should be disabled… I could understand the rebuilding of the enabled indexes but the disabled ones? Why??? After all, if you want to enable an index you do have to use alter index ix_your_index on your_table REBUILD, don’t you?

    1. This looks like an oversight in the logic around clustered <-> heap transitions, and I agree with you that the disabled indexes should not be re-enabled except through an explicit REBUILD statement. Have you entered a Connect item to suggest a change to Microsoft?

  6. Hi Paul,

    Going from a heap to a clustered index, is this behavior changed in later versions. Micorsoft docs explicitly states “nonclustered indexes must all be recreated whenever the clustered index is changed”.

    Can you please share your thoughts. Appreciate your kind help here to clear the air.

    Reference link below

    https://docs.microsoft.com/en-us/sql/relational-databases/indexes/heaps-tables-without-clustered-indexes

    “Warning

    Creating or dropping clustered indexes requires rewriting the entire table. If the table has nonclustered indexes, all the nonclustered indexes must all be recreated whenever the clustered index is changed. Therefore, changing from a heap to a clustered index structure or back can take a lot of time and require disk space for reordering data in tempdb.”

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.