TechEd demo: nonclustered index corruption

This blog post describes the demo "2 – NC Indexes" from my Corruption Survival Techniques conference session from various conferences in 2008. The links to the scripts and databases to use are in this blog post.

Edit 6/4/2012: Be aware that in versions of SQL Server from 2008 onwards, you may only be able to fix the corruption by doing a DROP + CREATE index, as all other methods will try to read the old index to build them. You can avoid any constraints (e.g. unique) being invalidated in between the DROP and the CREATE by wrapping the whole operation in an explicit transaction.

The aim of this demo is to show that sometimes its just redundant data (i.e. nonclustered indexes) that get corrupted, and so you don't have to do anything that takes the actual data offline – like restoring from a full backup or running one of the repair options (both of which require the database to be in single-user mode).

Let's look at an example. Extract and restore the DemoNCIndex database, and the NCIndexCorruption.sql script. What do we get from running DBCC CHECKDB on the DemoNCIndex database (lines 39-42)?

DBCC CHECKDB (DemoNCIndex) WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

Msg 8951, Level 16, State 1, Line 1
Table error: table 'Customers' (ID 453576654). Data row does not have a matching index row in the index 'CustomerName' (ID 2). Possible missing or invalid keys for the index row matching:
Msg 8955, Level 16, State 1, Line 1
Data row (1:45:28) identified by (CustomerID = 29) with index values 'LastName = 'Adams' and CustomerID = 29'.
Msg 8951, Level 16, State 1, Line 1
Table error: table 'Customers' (ID 453576654). Data row does not have a matching index row in the index 'CustomerName' (ID 2). Possible missing or invalid keys for the index row matching:
Msg 8955, Level 16, State 1, Line 1
Data row (1:180:164) identified by (CustomerID = 2118) with index values 'LastName = 'Adams' and CustomerID = 2118'.

<snip – removed for brevity>

Msg 8952, Level 16, State 1, Line 1
Table error: table 'Customers' (ID 453576654). Index row in index 'CustomerName' (ID 2) does not match any data row. Possible extra or invalid keys for:
Msg 8956, Level 16, State 1, Line 1
Index row (1:24482:16) with values (LastName = 'Andersen' and CustomerID = 18718) pointing to the data row identified by (CustomerID = 18718).
Msg 8952, Level 16, State 1, Line 1
Table error: table 'Customers' (ID 453576654). Index row in index 'CustomerName' (ID 2) does not match any data row. Possible extra or invalid keys for:
Msg 8956, Level 16, State 1, Line 1
Index row (1:24482:127) with values (LastName = 'Arthur' and CustomerID = 9758) pointing to the data row identified by (CustomerID = 9758).
CHECKDB found 0 allocation errors and 26 consistency errors in table 'Customers' (object ID 453576654).
CHECKDB found 0 allocation errors and 26 consistency errors in database 'DemoNCIndex'
repair_rebuild is the minimum repair level for the errors found by DBCC CHECKDB (DemoNCIndex).
 

Lots of errors. Now, in this example there are only 26 errors, but in cases where there are hundreds of errors it can be really hard to tell whether all the corruptions are in nonclustered indexes (i.e. indexes with IDs > 1). Luckily, there's an undocumented option to all the DBCC CHECK* commands – WITH TABLERESULTS. The option is undocumented because the output can change from release to release, but basically this converts the DBCC output into tabular form. Try running lines 48-50 in the script and you'll see what I mean. One of the columns in the output is IndexId – so you can easily see whether all the errors are in nonclustered indexes. In this case, they are, and all in one index of the Customers table.

You could run lines 55-57 of the script to prove to yourself that repairs can't be run online, and then realize that we can address the problem without having to run repair or restore. First off we need to figure out the name of the index to fix – index ID 2 of the Customers table. Lines 77-80 run sp_HelpIndex on the table (although I should really be using Kimberly's sp_HelpIndex2):

USE DemoNCIndex
GO
EXEC sp_HelpIndex 'Customers'
;
GO

index_name     index_description                                   index_keys
————– ————————————————— ————
CustomerName   nonclustered located on PRIMARY                     LastName
CustomerPK     clustered, unique, primary key located on PRIMARY   CustomerID

The nonclustered index is called CustomerName. Plug in the index name to line 82, then try fixing the index by doing an online index rebuild, and run DBCC CHECKDB afterwards (lines 82-89). The corruption hasn't been fixed! Online index rebuild reads the old index to build the new one so the new index has the same missing rows as the old one. We need to do an offline index rebuild – with lines 110-115. After the last DBCC CHECKDB, the index is fixed up. Now, on SQL Server 2008, you may or may not get a query plan for the index rebuild that doesn't use the old index, because the query optimizer has some more plan choices available to it – so on SQL Server 2008 you may need to do an actual drop and create of the broken index (carefully, if its enforcing a constraint).

So – just because DBCC CHECKDB reports a ton of errors, that doesn't necessarily mean that the database needs to be taken (essentially) offline to repair it – check through the errors to see if it's just nonclustered indexes that are affected.

10 thoughts on “TechEd demo: nonclustered index corruption

  1. Dear Paul,

    If I understand BOL correct, with SQL Server 2008 SP3 all error messages are displayed by default when you run DBCC CHECKDB. So would you still advise us to use ALL_ERRORMSGS on the DBCC CHECKDB?

    Thank you for all your good advise to all of us.

    Aron Pedersen

  2. Due to our SAN down, we were not able to connect to on our PROD DB.When I ran DBCC on effected DB we ran into exact same issues mentioned in this article and I was able resolve with steps mentioned in article. Thanks for the help Paul.

  3. Thank you! I was being driven crazy by why a disable and rebuild did not fix corruption. (I’m still a bit frustrated – it seems to me that the disable should have rendered the index contents as untrustworthy, and forced a re-scan of the table data.

    1. As an addendum: SQL Server will not just re-read “the” index on a rebuild – it will reuse any index that covers the index-to-rebuild.

      In the case that brought me here, I saw two indexes that differed in the order of columns (One was Col1, Col2, Col3, and the other was Col1, Col3, Col2), and both were corrupt.

      To fix them, we had to drop both indexes – otherwise, SQL Server would use the second index to rebuild the first (re-creating the corruption) and then use the first index to rebuild the second (ditto).

      However: the transaction, drop both, create both, worked like a charm.

  4. Hi Paul,
    So online index rebuild does not fix corruption? Only offline index rebuild fixes corruption?

        1. No – that’s really an online *build* being done using a rebuild after a disable. Disabling the index removes everything but its definition, and doing a rebuild builds a new index by scanning the base table. Doing the online rebuild without disabling it first will scan the existing index, not removing the corruption.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.