Search Engine Q&A #19: Misconceptions around index rebuilds (allocation, BULK_LOGGED mode, locking)

Over the last few weeks I've seen (and helped correct) quite a few myths and misconceptions about index rebuild operations. There's enough now to make it worthwhile doing a blog post (and it's too hot here in Orlando for us to go sit by the pool so we're both sitting here blogging)…

Myth 1:  index rebuild pre-allocates the necessary space

This myth has two variations:

  1. The space for the new copy of the index is pre-allocated
  2. The space for the sort portion of the rebuild is pre-allocated

Neither of these are true. Index rebuild (whether online or offline, and at least as far back as 7.0) will create a new copy of the index before dropping the old copy. The pages and extents required to do this will always be allocated as needed, as with any other operation in SQL Server. The sort phase of an index rebuild, if required (in certain cases it is skipped in 2005), will adhere to the same allocation behavior.

Myth 2: indexes are rebuilt within a single file in a multi-file filegroup

This is a new one that I just heard yesterday – (paraphrasing) "In a two-file filegroup, an index in file 1 will be rebuilt into file 2. The next time it is rebuilt, it will be built in file 1. And so on".

This is untrue. Any time any allocations are done in a multi-file filegroup, the allocations are spread amongst all the files using the allocation system's proportional fill algorithm. In a nutshell, this says that space will be allocated more frequently from larger files with more free space than from smaller files with less free space. There is no concept in SQL Server of limiting allocations to a particular file in a multi-file filegroup.

Myth 3: non-clustered indexes are always rebuilt when a clustered index is rebuilt

This is untrue. The rules are a little complex here but can be summed up as follows:

  • In 2005+, rebuilding a unique or non-unique clustered index (without changing its definition) will NOT rebuild the non-clustered indexes
  • In 2000:
    • Rebuilding a non-unique clustered index WILL rebuild the non-clustered indexes
    • Rebuilding a unique clustered index will NOT rebuild the non-clustered indexes

The first few service packs of 2000 had bugs that changed the behavior of rebuilding unique clustered indexes back and forth – this is the source of much of the confusion around this myth.

For a much more detailed discussion of this, see my blog post from last Fall – Indexes From Every Angle: What happens to non-clustered indexes when the table structure is changed?.

Myth 4: BULK_LOGGED recovery mode decreases the size of the transaction log and log backups for an index rebuild

This myth is partly true.

Switching to the BULK_LOGGED recovery mode while doing an index rebuild operation WILL reduce the amount of transaction log generated, which is very useful for limiting the size of the transaction log file (note I say 'file', not 'files' – you only need one log file).

Switching to the BULK_LOGGED recovery mode while doing an index rebuild will NOT reduce the size of the transaction log BACKUP. Although the operation will be minimally-logged, the next transaction log backup will read all the transaction log since the last backup plus all the extents that were changed by the minimally-logged index rebuild. This will result in a log backup that's almost exactly the same size as for a fully-logged index rebuild. The ONLY time a log backup will contain data extents is when a minimally-logged operation has taken place since the last log backup – see here on MSDN for more info.

If you're considering using the BULK_LOGGED recovery mode, beware that you lose the ability to do point-in-time recovery to ANY point covered by a transaction log backup that contains even a single minimally-logged operation. Make sure that there's nothing else happening in the database that you may need to effectively roll-back with P.I.T. recovery. The operations you should perform if you're going to do this are:

  • In FULL recovery mode, take log backup immediately before switching to BULK_LOGGED
  • Switch to BULK_LOGGED and do the index rebuild
  • Switch back to FULL and immediately take a log backup

This limits the time period in which you can't do P.I.T. recovery.

Myth 5: online index rebuild doesn't take any locks

This myth is untrue. The 'online' in 'online index operations' is a bit of a misnomer.  Online index operations need to take two very short-term table locks. An S (Shared) table lock at the start of the operation to force all write plans that could touch the index to recompile, and a SCH-M (Schema-Modification – think of it as an Exclusive) table lock at the end of operation to force all read and write plans that could touch the index to recompile.

The most recent time this came up on the forums was someone noticing insert queries timing out after an online index rebuild operation had just started. The problem is that the  table lock that online index rebuild needs has to be entered into the grant queue in the lock manager until it can be acquired – and it will stay there until existing transactions that are holding conflicting locks either commit or roll-back. Any transaction that requires a conflicting lock AFTER the index rebuild lock has been queued but not acquired (and then released) will wait behind it in the lock grant queue. If the query timeout is reached before the transaction can get it's lock, it will timeout.

This is still much better than the table lock being held for the entire duration of the index rebuild operation. For more info, checkout this whitepaper on Online Index Operations in SQL Server 2005.

14 thoughts on “Search Engine Q&A #19: Misconceptions around index rebuilds (allocation, BULK_LOGGED mode, locking)

  1. Thanks for the clarifications – one question:

    You wrote "Although the operation will be minimally-logged, the next transaction log backup will read all the transaction log since the last backup plus all the extents that were changed by the minimally-logged index rebuild. This will result in a log backup that’s almost exactly the same size as for a fully-logged index rebuild."

    Are there other cases, besides this one, where log backups contain data file extents as well as log records?

  2. Interesting link to the White Paper on Online Index ops. We use Online Index for rebuilds on 2005 EE. The job actually runs on the command-line using OSQL, and recently we had a problem with blocking by Spid -2, which I believe is DTC-related. This caused serious knock-on effects to users. We saw blocking processes running commands of the type

    ALTER INDEX 0x020000000094151D949F6B9F14EED399417BEE7CB327610D
    insert [dbo].[Table] select *, %%bmk%% from [dbo].[Table]

    which I am sure is the batch insert section of the Online Index build. We had to use KILL with the UOW (unit of work) GUID to kill the blocking -2 SPID. I am hoping switching to using an SP direct on the server (as opposed to using OSQL or SQLCMD) will help with this. Just a heads up that online index ops aren’t always trouble-free. The White Paper really helped me understand the process much better though, thank you for the link and the excellent blog. When are you and KT going to come back to the UK?

  3. Thanks for your continued demythologizing efforts — and for the great presentations in Orlando.

    I do have a simple question for clarification. You say that index rebuild will "will create a new copy of the index before dropping the old copy." We have a consultant here right now who says that, at least for a unique clustered index, the rebuild is done "in place." It seems to me that his claim is inconsistent with what you say — is that right?

    Also, digressing a bit: I’m wondering if you can direct me to anywhere that there is a good explanation of the relationship between a table and its clustered index at the physical level. As I understand it, the clustered index really IS the table, and so there can be no table fragmentation other than the fragmentation which is actually present in the clustered index.

    Thanks very much…

  4. Hi Dave,

    The consultant is wrong, period. An index rebuild *always* creates a new index. Rebuilding a unique clustered index in 2005 will not cause *non-clustered indexes* to be rebuilt (myth 3 in that blog post) – maybe that’s what’s confused him/her. See (and point the consultant to)
    https://www.sqlskills.com/blogs/paul/post/Indexes-From-Every-Angle-What-happens-to-non-clustered-indexes-when-the-table-structure-is-changed.aspx.

    You can easily prove it by looking at the pages comprising the unique clustered index before and after the rebuild…

    Yes, you’re correct about the clustered index – it is the table, just that the table rows are ordered according to the clustered index key. I haven’t
    blogged about this yet, you could look in Kalen’s Inside SQL Server books or see one of Kimberly’s recent posts –
    http://www.sqlskills.com/blogs/kimberly/post/Indexes-in-SQL-Server-20052008-Part-2-Internals-.aspx

    Hope this helps!

  5. In my organization, we use sql server 2012 standard edition and when I try to reorganize or rebuild table indexes it didn’t work. Any advise please? Thanks in advance. I tried both GUI and The Alter index statement.

    1. Index reorganize and rebuild are both supported in Standard Edition. Can you reply with the T-SQL statement you’re using and the error you get? Just saying it doesn’t work isn’t enough for a diagnosis.

  6. Hi Paul,

    I was one of those that believed that online rebuild of indexes doesn’t produce table locks…
    I was curious to know if the same logic, as the white paper that you referenced, applies to the reorganization of indexes, which is considered to always be online?

    Thank you Great Article

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.