Over the last few weeks I've seen (and helped correct) quite a few myths and misconceptions about index rebuild operations. There's enough now to make it worthwhile doing a blog post (and it's too hot here in Orlando for us to go sit by the pool so we're both sitting here blogging)…
Myth 1: index rebuild pre-allocates the necessary space
This myth has two variations:
- The space for the new copy of the index is pre-allocated
- The space for the sort portion of the rebuild is pre-allocated
Neither of these are true. Index rebuild (whether online or offline, and at least as far back as 7.0) will create a new copy of the index before dropping the old copy. The pages and extents required to do this will always be allocated as needed, as with any other operation in SQL Server. The sort phase of an index rebuild, if required (in certain cases it is skipped in 2005), will adhere to the same allocation behavior.
Myth 2: indexes are rebuilt within a single file in a multi-file filegroup
This is a new one that I just heard yesterday – (paraphrasing) "In a two-file filegroup, an index in file 1 will be rebuilt into file 2. The next time it is rebuilt, it will be built in file 1. And so on".
This is untrue. Any time any allocations are done in a multi-file filegroup, the allocations are spread amongst all the files using the allocation system's proportional fill algorithm. In a nutshell, this says that space will be allocated more frequently from larger files with more free space than from smaller files with less free space. There is no concept in SQL Server of limiting allocations to a particular file in a multi-file filegroup.
Myth 3: non-clustered indexes are always rebuilt when a clustered index is rebuilt
This is untrue. The rules are a little complex here but can be summed up as follows:
- In 2005+, rebuilding a unique or non-unique clustered index (without changing its definition) will NOT rebuild the non-clustered indexes
- In 2000:
- Rebuilding a non-unique clustered index WILL rebuild the non-clustered indexes
- Rebuilding a unique clustered index will NOT rebuild the non-clustered indexes
The first few service packs of 2000 had bugs that changed the behavior of rebuilding unique clustered indexes back and forth – this is the source of much of the confusion around this myth.
For a much more detailed discussion of this, see my blog post from last Fall – Indexes From Every Angle: What happens to non-clustered indexes when the table structure is changed?.
Myth 4: BULK_LOGGED recovery mode decreases the size of the transaction log and log backups for an index rebuild
This myth is partly true.
Switching to the BULK_LOGGED recovery mode while doing an index rebuild operation WILL reduce the amount of transaction log generated, which is very useful for limiting the size of the transaction log file (note I say 'file', not 'files' – you only need one log file).
Switching to the BULK_LOGGED recovery mode while doing an index rebuild will NOT reduce the size of the transaction log BACKUP. Although the operation will be minimally-logged, the next transaction log backup will read all the transaction log since the last backup plus all the extents that were changed by the minimally-logged index rebuild. This will result in a log backup that's almost exactly the same size as for a fully-logged index rebuild. The ONLY time a log backup will contain data extents is when a minimally-logged operation has taken place since the last log backup – see here on MSDN for more info.
If you're considering using the BULK_LOGGED recovery mode, beware that you lose the ability to do point-in-time recovery to ANY point covered by a transaction log backup that contains even a single minimally-logged operation. Make sure that there's nothing else happening in the database that you may need to effectively roll-back with P.I.T. recovery. The operations you should perform if you're going to do this are:
- In FULL recovery mode, take log backup immediately before switching to BULK_LOGGED
- Switch to BULK_LOGGED and do the index rebuild
- Switch back to FULL and immediately take a log backup
This limits the time period in which you can't do P.I.T. recovery.
Myth 5: online index rebuild doesn't take any locks
This myth is untrue. The 'online' in 'online index operations' is a bit of a misnomer. Online index operations need to take two very short-term table locks. An S (Shared) table lock at the start of the operation to force all write plans that could touch the index to recompile, and a SCH-M (Schema-Modification – think of it as an Exclusive) table lock at the end of operation to force all read and write plans that could touch the index to recompile.
The most recent time this came up on the forums was someone noticing insert queries timing out after an online index rebuild operation had just started. The problem is that the table lock that online index rebuild needs has to be entered into the grant queue in the lock manager until it can be acquired – and it will stay there until existing transactions that are holding conflicting locks either commit or roll-back. Any transaction that requires a conflicting lock AFTER the index rebuild lock has been queued but not acquired (and then released) will wait behind it in the lock grant queue. If the query timeout is reached before the transaction can get it's lock, it will timeout.
This is still much better than the table lock being held for the entire duration of the index rebuild operation. For more info, checkout this whitepaper on Online Index Operations in SQL Server 2005.