Wednesday, August 27, 2008

Every so often I'll see posts on the various data corruption forums discussing causes of corruption. In this post I want to debunk some of the myths around what can cause corruption. There are really two types of corruption to deal with, physical corruption and logical corruption.

Physical corruption

This is where something has altered the contents of a data or log file sector with no regard for what is being stored there. Possible causes of physical corruption are:

  • Problem with the I/O subsystem (99.8% of all cases I've ever seen - only 3 nines as I'd estimate I've seen around about a thousand corruption cases). Remember the I/O subsystem is everything underneath SQL Server in the I/O stack - including the OS, 3rd-party file system filter drivers, device drivers, RAID controllers, SAN controllers, network hardware, drives themselves, and so on. Millions of lines of code and lots of moving parts spinning very fast, very close to very fragile pieces of metal oxide (I once heard Jim Gray liken a disk drive head to a 747 jumbo jet flying at 500 mph at a height of 1/4 inch from the ground...)
  • Problem with the host machine hardware (0.1% of cases). Most of the time this is a memory error.
  • SQL Server bugs (0.1% of cases). Yes, there have been corruption bugs. Every piece of software has bugs. There are KB articles describing bugs.
  • Deliberate introduction of corruption using a hex editor or other means.

Physical corruption is what DBCC CHECKDB usually reports and the majority of cases are caused by a physical failures of some kind, with the minority caused by humans - software bugs.

Logical corruption

This is where something has altered some data so that a data relationship is broken. Possible causes of logical corruption are:

  • Humans

:-) Okay...

  • Application bug. The application deletes one part of an inherent data relationship but not the other. Or the application designer doesn't implement a constraint properly. Or the application designer doesn't cope with a transaction roll-back properly. You get the idea.
  • Accidental update/delete. Someone deletes or updates some data incorrectly.
  • SQL Server bug. See above.
  • DBCC CHECKDB when using the REPAIR_ALLOW_DATA_LOSS option. As is documented in Books Online, and I've blogged about and mentioned when lecturing, if you run repair, it doesn't take into account any inherent or explicit constraints on the data.

The point here is that a physical failure of a component does not cause logical corruption, it causes physical corruption. Conversely, application errors cause logical corruption, not physical corruption. DBCC CHECKDB errors are about physical corruption (okay, with the inclusion of DBCC CHECKCATALOG code in 2005, it will find cases where the DBA has manually altered the system tables, causing logical corruption) and applications cannot cause physical corruption as they can only manipulate data through SQL Server. If an application hits a SQL Server bug which causes physical corruption, that's still not the application causing physical corruption, it's SQL Server.

So - on to the myths.

  • Can an application cause physical corruption? No.
  • Can stopping a shrink operation cause corruption of any kind? No.
  • Can stopping an index rebuild cause corruption of any kind? No.
  • Can running DBCC CHECKDB without repair cause corruption of any kind? No.
  • Can creating a database snapshot cause corruption of any kind? No.

Hope this helps.

Wednesday, August 27, 2008 9:11:27 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, June 11, 2008

Today I presented my brand new session Surviving Corruption: From Detection to Recovery at TechEd. I had a lot of fun putting together the demos, presenting the session, and talking to people afterwards. During the session, I promised to blog each of the demos so that everyone can run through them - here's the first one.

On SQL 2000, it was pretty easy to get into the system tables and manually change them - all you had to do was:

EXEC sp_configure 'allow updates', 1;
GO
RECONFIGURE WITH OVERRIDE;
GO

And then you could insert, update, and delete whatever you wanted in the all the system tables, including the critical three - sysindexes, sysobjects, and syscolumns. The problem was that sometimes people actually did this and messed things up - for instance, by manually deleting an object from sysobjects, but leaving around all the other info about the object - such as indexes and columns. DBCC CHECKCATALOG in SQL 2000 would find this, but DBCC CHECKDB would not - as it didn't run the DBCC CHECKCATALOG code - any most people do not run DBCC CHECKCATALOG at all. Many times now, I've seen databases upgraded to 2005 and suddenly DBCC CHECKDB is reporting metadata corruption errors - all because someone had manually changed the system tables on 2000, and I changed DBCC CHECKDB in 2005 to include the DBCC CHECKCATALOG checks.

This demo is all about that. I created a 2000 database, manually deleted a row in sysobjects and then upgraded the database to 2005. The corrupt database is available in a zip file - DemoCorruptMetadata.zip. If you unzip it into a folder C:\SQLskills then you can attach it using:

RESTORE DATABASE DemoCorruptMetadata FROM DISK = 'C:\SQLskills\DemoCorruptMetadata.bak'
   
WITH MOVE 'DemoCorruptMetadata' TO 'C:\SQLskills\DemoCorruptMetadata.mdf',
   
MOVE 'DemoCorruptMetadata_log' TO 'C:\SQLskills\DemoCorruptMetadata_log.ldf',
   
REPLACE;
GO

So what does the corruption look like on 2005?

DBCC CHECKDB (DemoCorruptMetadata) WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
Msg 8992, Level 16, State 1, Line 1
Check Catalog Msg 3853, State 1: Attribute (object_id=1977058079) of row (object_id=1977058079,column_id=1) in sys.columns does not have a matching row (object_id=1977058079) in sys.objects.
Msg 8992, Level 16, State 1, Line 1
Check Catalog Msg 3853, State 1: Attribute (object_id=1977058079) of row (object_id=1977058079,column_id=2) in sys.columns does not have a matching row (object_id=1977058079) in sys.objects.
CHECKDB found 0 allocation errors and 2 consistency errors not associated with any single object.
CHECKDB found 0 allocation errors and 2 consistency errors in database 'DemoCorruptMetadata'.

This is what we expect. Notice that there's no recommended repair level at the end of the output - this is because CHECKDB can't repair metadata corruptions. We can't fix this with a backup - unless we have a backup from 2000 from before the manual delete in the system tables. To fix this we'd need to go back to 2000, fix the corruption, and then upgrade again - usually not feasible.

Instead, we're going to fix it by manually altering the system tables in 2005 - something that's purportedly not possible. First let's see what tables there are that could include column information (remembering that the system catalogs were completely rewritten between 2000 and 2005):

SELECT [name] FROM DemoCorruptMetadata.sys.objects WHERE [name] LIKE '%col%';
GO

name
------------------
sysrowsetcolumns
syshobtcolumns
syscolpars
sysiscols

I know that sysrowsetcolumns and syshobtcolumns are involved at low-levels of the Storage Engine and don't contain relational metadata, so let's try syscolpars. I want to see what columns there are to see if one of the looks like an object ID, and another looks like a column ID. This query will just return the table columns, with no rows (because the condition 1=0 is always false:

SELECT * FROM DemoCorruptMetadata.sys.syscolpars WHERE 1 = 0;
GO

Msg 208, Level 16, State 1, Line 1
Invalid object name 'DemoCorruptMetadata.sys.syscolpars'.

I can't bind to internal system tables in 2005. But - I can bind to internal system tables using the Dedicated Admind Connection (or DAC for short). This is documented in Books Online at http://msdn.microsoft.com/en-us/library/ms179503.aspx. You can get to the DAC through SQLCMD using the /A switch. So - assuming I'm now connected through the DAC, I'll try that command again:

C:\Documents and Settings\paul>sqlcmd /A
1> USE DemoCorruptMetadata;
2> GO
Changed database context to 'DemoCorruptMetadata'.
1> SELECT * FROM sys.syscolpars WHERE 1=0;
2> GO
id          number colid       name

xtype utype       length prec scale collationid status      maxinrow xmlns
 dflt        chk         idtval

----------- ------ ----------- -------------------------------------------------
-------------------------------------------------------------------------------
----- ----------- ------ ---- ----- ----------- ----------- -------- -----------
 ----------- ----------- -------------------------------------------------------
-----------

(0 rows affected)
1>

This looks like the table. Now I'll query against it using the object ID from the original corruption message:

1> SELECT colid, name FROM sys.syscolpars WHERE id = 1977058079;
2> GO
colid       name
----------- --------------------------------------------------------------------
------------------------------------------------------------
          1 SalesID
         
2 CustomerID
(2 rows affected)
1>

Cool. So I'll try deleting the orphaned columns:

1> DELETE FROM sys.syscolpars WHERE id = 1977058079;
2> GO
Msg 259, Level 16, State 1, Server ROADRUNNERPR, Line 1
Ad hoc updates to system catalogs are not allowed.
1>

Hmm. And it doesn't help if I set 'allow updates' to 1, or try putting the database into single-user mode.

There IS a way though. You can put the SERVER into single-user mode, then connect with the DAC and you can then update the system tables. This particular twist on using the DAC isn't documented anywhere except in an MSDN forum thread answered by someone from Microsoft (see here).

BEWARE (if I could put little flashing lights around this too then I would...) that this is undocumented and unsupported - misuse will lead to unrepairable corruption of your databases.

The sequence of events to follow is:

  • make a backup of the database just in case something goes wrong
  • shutdown the server
  • go to the binaries directory (e.g. C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Binn) and start the server in single-user mode using 'sqlservr -m'
  • connect back in using SQLCMD /A, and run the deleta again. This time it will work, but will give an error about metadata cache consistency:

C:\Documents and Settings\paul>sqlcmd /A
1> USE DemoCorruptMetadata;
2> GO
Changed database context to 'DemoCorruptMetadata'.
1> DELETE FROM sys.syscolpars WHERE id = 1977058079;
2> GO

(2 rows affected)
Warning: System table ID 41 has been updated directly in database ID 12 and cache coherence may not have been maintained. SQL Server should be restarted.
1>

  • The system table has been updated, but the in-memory cache of metadata is now out-of-sync with the system tables. So, shutdown the server again as the message suggests and restart it normally
  • run CHECKDB again and you'll see the corruption has been fixed.

Hope this helps some of you. Watch this space for the next demo from TechEd of repairing corruption when no backup is available.

Wednesday, June 11, 2008 5:42:23 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 
Monday, June 09, 2008

(I'm actually on-stage here at TechEd doing the  DAT track pre-con with Kimberly - she's on now until lunch so I'm catching up on forum problems...)

Here's a question that came up on of the SQLServerCentral.com corruption forums I monitor that I think is worth blogging about. To paraphrase:

I have a bunch of corruptions in a database, that look like they've been there for a while. Repair is my only option - it works but I'd like to know what data is being deleted. How can I do that? Here are some of the errors:

Server: Msg 8928, Level 16, State 1, Line 2
Object ID 645577338, index ID 0: Page (1:168576) could not be processed. See other errors for details.
Server: Msg 8928, Level 16, State 1, Line 2
Object ID 645577338, index ID 0: Page (1:168577) could not be processed. See other errors for details.
Server: Msg 8928, Level 16, State 1, Line 2
Object ID 645577338, index ID 0: Page (1:168578) could not be processed. See other errors for details.
Server: Msg 8928, Level 16, State 1, Line 2
Object ID 645577338, index ID 0: Page (1:168579) could not be processed. See other errors for details.
Server: Msg 8928, Level 16, State 1, Line 2
Object ID 645577338, index ID 0: Page (1:168580) could not be processed. See other errors for details.
Server: Msg 8928, Level 16, State 1, Line 2
Object ID 645577338, index ID 0: Page (1:168581) could not be processed. See other errors for details.
Server: Msg 8928, Level 16, State 1, Line 2
Object ID 645577338, index ID 0: Page (1:168582) could not be processed. See other errors for details.
Server: Msg 8976, Level 16, State 1, Line 2
Table error: Object ID 645577338, index ID 1. Page (1:168576) was not seen in the scan although its parent (1:165809) and previous (1:168575) refer to it. Check any previous errors.
Server: Msg 8978, Level 16, State 1, Line 2
Table error: Object ID 645577338, index ID 1. Page (1:168583) is missing a reference from previous page (1:168582). Possible chain linkage problem.

This is a clustered index that CHECKDB  will repair by deleting pages at the leaf-level - essentially deleting a bunch of records. The pages look to be trashed (there were a bunch more errors that I didn't include here that said the page headers were all corrupted - looked like the IO subsystem trashde a whole 64KB chunk of the disk) so there's nothing much else you can do. As the table has a clustered index, you can use the error messages to find the pages on either 'logical' side of the pages being deleted - and hence figure out the range of records that have been deleted.

The errors show that pages 168576 through 168582 in file 1 are corrupt. There are also errors that say the previous page of 168576 is 168575, and the next page of 168582 is 168583. If you do a DBCC PAGE of these two pages, you can find the lower and upper bound of the clustered index key values that have been lost. Think of three ranges:

  • the lower range of records that are intact, logically before the corrupt pages in the index
  • the range of records that will be deleted by repair
  • the upper range of records that are intact, logically after the corrupt pages in the index

To find the upper bound of the lower range:

DBCC TRACEON (3604); -- allows the output to come to the console
DBCC PAGE ('dbname', 1, 168575, 3);
GO

The key value in the slot at the end of output is the upper bound of the bottom range that's intact.

Then do:

DBCC PAGE ('dbname', 1, 168583, 3);
GO

The key value in the slot at the beginning of the output is the lower bound of the upper range that's intact.

Everything in the middle will be deleted. You could also try a DBCC PAGE on the corrupt pages themselves too - you might be able to see some data in them.

I'll be blogging a bunch more about repair after my corruption session this week at TechEd - watch this space!

Monday, June 09, 2008 7:54:32 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, May 22, 2008

Before I start, I want to make it clear that you can only hit this bug if you ALREADY have corruption, that it's quite rare, and that there is a workaround.

I've noticed a few more people in the forums having CHECKDB fail with this particular error in the last month

Msg 8967, Level 16, State 216, Line 1
An internal error occured in DBCC which prevented further processing. Please contact Product Support.

instead of completing properly and listing the corruptions in the database.

Whenever CHECKDB is using a database snapshot, it must check that the page it read through the snapshot does not have an LSN (Log Sequence Number) higher than that when the snapshot was created. If it did, this would mean that the page was modified AFTER the snapshot was created and hence CHECKDB would be working from an inconsistent view of the database. If this case is discovered, CHECKDB stops immediately. When I rewrote CHECKDB for SQL Server 2005, I changed a bunch of code assertions into seperate states of the 8967 error, so that CHECKDB would fail gracefully if some condition occured that indicates a bug or something that should never happen. State 216 is for the bad LSN condition I've just described.

I used to think it was caused by a race condition with the NTFS code that implements sparse files, which is used by the hidden database snapshot that CHECKDB uses by default. However, I've come to learn that this is a bug in CHECKDB (not one of mine I should say :-)) that causes this behavior under certain circumstances when corruption is present. The bug is that if a corrupt page fails auditing inside CHECKDB, the LSN check is still performed. If the corruption affects the LSN stamped in the page header, the 8967 error could be triggered. I've seen this a handful of times in the last few weeks - hence the need for a blog post. I've discussed this with the dev team and hopefully the fix will make it into the next SPs for 2005 and 2008 (too late to fix such a rare problem in such a critical component at this stage of 2008 development). They're going to put a KB article together too - but in the meantime, I wanted to get this on the Internet so Google/Live Search pick it up.

Now let's repro the problem. Starting with a simple database and table, I'll find the first page so I can corrupt it.

CREATE DATABASE TestCheckdbBug;
GO
USE TestCheckdbBug;
GO
CREATE TABLE test (c1 INT, c2 CHAR (5000));
INSERT INTO test VALUES (1, 'a');
GO
EXEC sp_AllocationMetadata 'test';
GO

Object Name  Index ID  Alloc Unit ID      Alloc Unit Type  First Page  Root Page  First IAM Page
-----------  --------  -----------------  ---------------  ----------  ---------  --------------
test         0         72057594042318848  IN_ROW_DATA      (1:143)     (0:0)      (1:152)

Now I'm going to corrupt the page type on page (1:143) to be 255 (an invalid page type), which will guarantee the page fails the audit inside CHECKDB.

DBCC CHECKDB ('TestCheckdbBug') WITH ALL_ERRORMSGS, NO_INFOMSGS;
GO

Msg 8928, Level 16, State 1, Line 1
Object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type In-row data): Page (1:143) could not be processed. See other errors for details.
Msg 8939, Level 16, State 6, Line 1
Table error: Object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type In-row data), page (1:143). Test ((m_type >= DATA_PAGE && m_type <= UNDOFILE_HEADER_PAGE) || (m_type == UNKNOWN_PAGE && level == BASIC_HEADER)) failed. Values are 255 and 255.
CHECKDB found 0 allocation errors and 2 consistency errors in table 'test' (object ID 2073058421).
CHECKDB found 0 allocation errors and 2 consistency errors in database 'TestCheckdbBug'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (TestCheckdbBug).

Now I'm going to corrupt the LSN on that page such that it's guaranteed to be higher than the creation LSN of the database snapshot (basically by filling the first part of the page header LSN field with 0xFF).

DBCC CHECKDB ('TestCheckdbBug') WITH ALL_ERRORMSGS, NO_INFOMSGS;
GO

Msg 8967, Level 16, State 216, Line 1
An internal error occurred in DBCC that prevented further processing. Contact Customer Support Services.
Msg 8921, Level 16, State 1, Line 1
Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.

Bingo! And in the error log, there's some diagnostic information so we can tell which page caused the problem:

2008-05-22 14:55:01.95 spid53   DBCC encountered a page with an LSN greater than the current end of log LSN (31:0:1) for its internal database snapshot. Could not read page (1:143), database 'TestCheckdbBug' (database ID 15), LSN = (-1:65535:18), type = 255, isInSparseFile = 0. Please re-run this DBCC command.
2008-05-22 14:55:01.95 spid53   DBCC CHECKDB (TestCheckdbBug) WITH all_errormsgs, no_infomsgs executed by ROADRUNNERPR\paul terminated abnormally due to error state 1. Elapsed time: 0 hours 0 minutes 0 seconds.

Note the page ID (in black bold above) tells us the bad page and the LSN (in blue bold above) reflects the corruption that I caused. If the page ID field of the header was corrupt, it wouldn't be possible to tell from these diagnostics which page is corrupt.

However, all is not lost. This bug means that under these circumstances the default online behavior of CHECKDB can't run. The workaround is to use the WITH TABLOCK option of CHECKDB, which does offline checking and doesn't need the snapshot - but the trade-off is that an exclusive database lock is required for a short time and then shared table locks for all tables in the database (this is why online is the default). Running this option on my corrupt database gives:

DBCC CHECKDB ('TestCheckdbBug') WITH TABLOCK, ALL_ERRORMSGS, NO_INFOMSGS;
GO

Msg 8928, Level 16, State 1, Line 1
Object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type In-row data): Page (1:143) could not be processed. See other errors for details.
Msg 8939, Level 16, State 6, Line 1
Table error: Object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type In-row data), page (1:143). Test ((m_type >= DATA_PAGE && m_type <= UNDOFILE_HEADER_PAGE) || (m_type == UNKNOWN_PAGE && level == BASIC_HEADER)) failed. Values are 255 and 255.
CHECKDB found 0 allocation errors and 2 consistency errors in table 'test' (object ID 2073058421).
CHECKDB found 0 allocation errors and 2 consistency errors in database 'TestCheckdbBug'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (TestCheckdbBug).

Which are the exact same results we had before I corrupted the LSN field (this is expected, as there is no check of a page's LSN field EXCEPT when running from a database snapshot). Now we can proceed to restore/repair as appropriate.

So - a scary little bug that has caused some people headaches, but I want to stress again - this can only happen if the database is ALREADY corrupt, and that it's quite rare. Hope this helps some of you picking this up from search engines in the future.

Thursday, May 22, 2008 2:20:58 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, March 13, 2008

This came up several times during the week so I thought it was about time to blog about it. One of the new features we put into SQL Server 2005 was storing the last time that DBCC CHECKDB completed successfully (called the last-known good time). What does successfully mean? This is the catch - if DBCC CHECKDB runs to completion then it considers that a successful run - EVEN if it found some corruptions. However, the last-known good time is ONLY updated if there were NO corruptions found. Slightly confusing I know.

Cool - but how can you see it? Well, the only time it's ever reported is when the database starts up. Not too useful if the database has been running for months. So how to see it??? The trick is to use DBCC PAGE. The last-known good time is stored in the boot page of the database - page 9. The following code will dump page 9 for you:

-- you need this to get the DBCC PAGE output to your console
DBCC TRACEON (3604);
GO

-- page 9 is the boot page
DBCC PAGE (dbname, 1, 9, 3);
GO

You need to look for the dbi_dbccLastKnownGood field. That was the last time that DBCC CHECKDB ran without finding any corruptions.

Now - what about if you're trusting your SQL Agent jobs to run DBCC CHECKDB for you every so often (how often is a whole other discussion...) and relying on the Agent job failing if DBCC CHECKDB finds corruptions. Well, if all you do is run the DBCC command, you're never going to know about corruption unless DBCC CHECKDB itself fails for some reason - remember, it returns successfully even it it found corruptions.The key is to add another statement after you run DBCC that checks the value of @@ERROR (the last error code reported by SQL Server). Although DBCC CHECKDB doesn't stop with an error when it finds corruption, it will set @@ERROR if there are any error messages in its output.

Edit: The stuff I've struck-through above is not true. I thought I'd remembered it from testing previously and I've heard plenty of anecdotal evidence from customers too BUT an Agent job WILL fail if a DBCC CHECKDB within it fails. I went back and forth with Tara Kizer on SQLteam.com and we tested on 2005 and 2000 - and it worked. Now here is a real issue - the Job History that's captured will not contain all the output from DBCC CHECKDB - especially if you didn't use the WITH NO_INFOMSGS option. You should make sure you capture the output to a file to avoid having to go back and run DBCC CHECKDB all over again.

Summary - make sure you really know when DBCC CHECKDB runs successfully, without finding any corruptions!

Thursday, March 13, 2008 12:32:17 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Tuesday, February 12, 2008

Almost a year ago to the day I asked a question on my old blog - how long does *your* CHECKDB take to run? Well, I had a bunch of replies but I never got around to posting the results. I was reminded of this recently so threw together some graphs of the results. I've plotted database size against run-time of DBCC CHECKDB, for a number of different numbers of CPUs, and mostly on SQL Server 2000. I've also added a linear trend line too.

Now - bear in mind this doesn't take into account the multitude of other factors that can affect how long DBCC CHECKDB takes to run (see my previous post here for the list).

Hopefully these are interesting to some of you!

Tuesday, February 12, 2008 12:35:20 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, February 03, 2008

Here's an issue that I thought was a one-off but it just popped up again over the weekend so I want to publicize it.

DBA runs a DBCC CHECKDB and gets output like the following:

Msg 5172, Level 16, State 15, Line 1
The header for file 'E:\Data\namechanged.mdf:MSSQL_DBCC14' is not a valid database file header. The PageAudit property is incorrect.
Msg 5120, Level 16, State 9, Line 1
Unable to open the physical file "E:\Data\namechanged.mdf:MSSQL_DBCC14". Operating system error 0: "(null)".
Msg 1823, Level 16, State 2, Line 1
A database snapshot cannot be created because it failed to start.
Msg 7928, Level 16, State 1, Line 1
The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.

etc

Wow! Looks like something's seriously wrong with that database. Until you find that this happens on *all* the databases on the server. And on all the other servers in the data center too. What on earth's going on? I saw a similar issue once before, back in 2004 or 2005 while customers were testing pre-RTM IDW builds of SQL Server 2005. After running a test and doing some investigation for me, the person who just hit this issue over the weekend confirmed my hypothesis that he'd hit the problem I saw a few years ago.

The problem is a 3rd-party file encryption solution. It installs a file-system filter driver that filters all reads and writes to the database files so it can perform the encryption/decryption. Unfortunately it doesn't cope with NTFS alternate streams - which is how the internal database snapshots that DBCC CHECKDB uses are created - and so returns garbage for all reads from the database snapshot. The simple workaround is to create your own database snapshot (whose constituent files will exist in their own right, rather than being alternate streams of the source database's files) and then run DBCC CHECKDB against that.

Now - I don't want to cry 'wolf' here, but be careful of attributing massive corruption problems to the real IO subsystem if there's any kind of file-system filter driver installed.

Sunday, February 03, 2008 10:36:34 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, November 18, 2007

This is a question that comes up a lot - how to run consistency checks on a VLDB?

We're talking hundreds of GBs or 1 TB or more. These databases are now common on SQL Server 2000 and 2005 - at TechEd IT Forum this week (and at SQL Connections the week before) there are many customers with multi-TB databases. Any experienced DBA knows the value of running consistency checks, even when the system is behaving perfectly and the hardware is rock-solid. The two problems that people have with running a full CHECKDB on their VLDB are:

  • It takes a long time to run (based on many factors – see my previous post here for details).
  • It uses lots of resources – memory, CPU, IO bandwidth, tempdb space.

So it uses lots of resources for a long time. Even with a decent sized maintenance window, the CHECKDB may run over into normal operations. There's also the case of a system that's already pegged in more or more resource dimensions. Whatever the case, there are a number of options:

  • Don't run consistency checks
  • Run CHECKDB using the WITH PHYSICAL_ONLY option
  • Use SQL Server 2005's partitioning feature and devise a consistency checking plan around that
  • Figure out your own scheme to divide up the consistency checking work over several days
  • Offload the consistency checks to a separate system

Let's look at each in turn.

Don't run consistency checks

A lot of people end up doing this as they can’t figure out how to run consistency checks on their DB. Don't even think about using this option – there’s always a way to do it. If you absolutely cannot figure out a way to get consistency checks on your system, send me email and I'll help you – I’ve helped many customers with all the schemes below, both in and out of Microsoft.

Use WITH PHYSICAL_ONLY

A full CHECKDB does a lot of stuff - see previous posts in this series for more details. You can vastly reduce the run-time and resource usage of CHECKDB by using the WITH PHYSICAL_ONLY option. With this option, CHECKDB will:

  • Run the equivalent of DBCC CHECKALLOC (i.e. check all the allocation structures)
  • Read and audit every allocated page in the database

So it skips all the logical checks, inter-page checks, and things like DBCC CHECKCATALOG. The fact that all allocated pages are read means that:

  • Any pages that cannot be read at all (i.e. 823 errors) will be discovered
  • If page checksums are enabled in SQL Server 2005, any corruptions caused by the IO subsystem will be discovered as the page checksum will be checked as part of reading the page into the buffer pool

So there's a trade-off of consistency checking depth against run-time and resource usage - but this option will pick up problems caused by the IO subsystem as long as page checksums are enabled and present.

Use the SQL Server 2005 partitioning feature

One of the obvious ways to reduce the time/resources issue is to partition the load. If you're using the partitioning feature in SQL Server 2005 then you're already setup for this. Given that you've hopefully got your partitions stored on separate filegroups, you can use the DBCC CHECKFILEGROUP command.

Consider this example - you have the database partitioned by date such that the current month is on a read-write filegroup and the past 11 months are on 11 read-only filegroups (data from more than a year ago is on some offline storage medium). The prior months also have multiple backups on various media so are considered much 'safer' than the current month. It makes sense that you don't need to check the read-only filegroups as often as the current month's filegroup so an example consistency checking scheme could be:

  • Run a DBCC CHECKFILEGROUP on each read-only filegroup every week or two
  • Run a DBCC CHECKFILEGROUP on the read-write filegroup every day or two (depending on the stability of the hardware, the criticality of the data, and the frequency and comprehensiveness of your backup strategy).

I know of several companies who've made the decision to move to SQL Server 2005 in part because of this capability to easily divide up the consistency checking.

Beware that until SP2 of SQL Server 2005, DBCC CHECKFILEGROUP would not check a table at all if it was split over multiple filegroups. This is now fixed and DBCC CHECKFILEGROUP will check partitions on the specified filegroup even if the table is now completely contained on the filegroup.

Figure out your own way to partition the checks

If you're on SQL Server 2000, or you just haven't partitioned your database on SQL Server 2005, then there are ways you can split up the consistency checking workload so that it fits within a maintenance window. You basically need to simulate what CHECKDB does in a staggered manner. Here's one scheme that I've recommended to several customers:

  • Figure out your largest tables (by number of pages) and split the total number into 7 buckets, such that there are a roughly equal number of database pages in each bucket.
  • Take all the remaining tables in the database and divide them equally between the 7 buckets (using number of pages again)
  • On Sunday:
    • Run a DBCC CHECKALLOC
    • Run a DBCC CHECKCATALOG
    • Run a DBCC CHECKTABLE on each table in the first bucket
  • On Monday, Tuesday, Wednesday:
    • Run a DBCC CHECKTABLE on each table in the 2nd, 3rd, 4th buckets, respectively
  • On Thursday:
    • Run a DBCC CHECKALLOC
    • Run a DBCC CHECKTABLE on each table in the 5th bucket
  • On Friday and Saturday:
    • Run a DBCC CHECKTABLE on each table in the 6th and 7th buckets, respectively

In pre-RTM builds of SQL Server 2005, DBCC CHECKTABLE could not bind to the critical system tables, just like with T-SQL - but that's fixed so you can cover all system tables in SQL Server 2000 and 2005 using the method above.

There's one drawback to this method - a new internal database snapshot is created each time you start a new DBCC command, even for a DBCC CHECKTABLE. If the update workload on the database is significant, then there could be a lot of transaction log to recover each time the database snapshot is created - leading to a long total run-time. In this case, you may need to alter the number of buckets you use to make the total operation fit within your available window.

Use a separate system

This alternative is relatively simple - restore your backup (you are taking regular backups, right?) on another system and run a full CHECKDB on the restored database. This offloads the consistency checking burden from the production system and also allows you to check that your backups are valid. There are some drawbacks to this however:

  • You need to have sufficient disk space on the spare system to be able to restore the backup onto. If the production database is several TB, you need the same several TB on the spare box. This equates to a non-trivial amount of money - initial capital investment plus ongoing storage management costs. (Hopefully a future release will alleviate this – while at Microsoft I invented and patented a mechanism for consistency checking a database in a backup without restoring it.)
  • If the consistency checks find an error, you don't know for sure that the database is corrupt on the production system. It could be a problem with the spare box that's caused the corruption. The only way to know for sure is to run a consistency check on the production system. This is a small price to pay though, because most of the time the consistency checks on the spare system will be ok, so you know the production database was clean at the time the backup was taken.

Summary

You've got a bunch of choices to allow you to run consistency checks, so there's really no excuse for not knowing (within a reasonable timeframe) that something's gone wrong with your database.

Sunday, November 18, 2007 8:11:20 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, November 15, 2007

In my previous post on interpreting CHECKDB output, plus in my DBCC Internals session at TechEd IT Forum yesterday, I mentioned there are some things that CHECKDB can’t repair. In this post I want to go into a bit more detail – based on a post from my old Storage Engine blog.

Before anyone takes this the wrong way - what do I mean by "can't be repaired"? Remember that that purpose of repair is to make the database structurally consistent, and that to do this usually means deleting the corrupt data/structure (that's why the option to do this was aptly named REPAIR_ALLOW_DATA_LOSS – see this post for more explanation on why repair can be bad). A corruption is deemed unrepairable when it doesn't make sense to repair it given the damage the repair would cause, or the corruption is so rare and so complicated to repair correctly that it's not worth the engineering effort to provide a repair. Remember also that recovery from corruptions should be based on a sound backup strategy, not on running repair, so making this trade-off in functionality makes sense.

Here's a few of the more common unrepairable corruptions that people run into along with the reasons they can't be repaired by DBCC.

PFS page header corruption

An example of this is on SQL Server 2005:

Msg 8946, Level 16, State 12, Line 1
Table error: Allocation page (1:13280496) has invalid PFS_PAGE page header values.
Type is 0. Check type, alloc unit ID and page ID on the page.

CHECKDB uses the PFS pages to determine which pages are allocated - and so which pages to read to drive the various consistency checks. The only repair for a PFS page is to reconstruct it - they can't simply be deleted as they're a fixed part of the fabric of the database. PFS pages cannot be rebuilt because there is no infallible way to determine which pages are allocated or not. There are various algorithms I've experimented with to rebuild them, with optimistic or pessimistic setting of page allocation statuses and then re-running the various consistency checks to try to sort out the incorrect choices, but they all require very long run-times. Given the frequency with which these corruptions are seen, and the engineering effort required to come up with an (imperfect) solution, I made the choice to leave this as unrepairable, and I don’t think that will change in future.

Critical system table clustered-index leaf-page corruption

An example of this is on SQL Server 2000:

Server: Msg 8966, Level 16, State 1, Line 1
Could not read and latch page (1:18645) with latch type SH. sysindexes failed.

And on SQL Server 2005:

Msg 7985, Level 16, State 2, Server SUNART, Line 1
System table pre-checks: Object ID 4. Could not read and latch page (1:51) with
latch type SH. Check statement terminated due to unrepairable error.

In a previous post in the series I described why how and why we do special checks of the clustered indexes of the critical system tables. If any of the pages at the leaf-level of these indexes are corrupt, we cannot repair them. Repairing would mean deallocating the page, wiping out the most important metadata for potentially hundreds of user tables and so effectively deleting all of these tables. That's obviously an unpalatable repair for anyone to allow and so CHECKDB doesn't do it.

Column value corruption

Here's an example of this on SQL Server 2005:

Msg 2570, Level 16, State 3, Line 1
Page (1:152), slot 0 in object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type "In-row data"). Column "c1" value is out of range for data type "datetime".  Update column to a legal value.

This is where a column has a stored value that is outside the valid range for the column type. There are a couple of repairs we could do for this:

  1. delete the entire record
  2. insert a dummy value

#1 isn't very palatable because then data is lost and it’s not a structural problem in the database so doesn't have to be repaired. #2 is dangerous - what value should be chosen as the dummy value? Any value put in may adversely affect business logic, or fire a trigger, or have some unwelcome meaning in the context of the table - even a NULL. Given these problems, I chose to allow people to fix the corrupt values themselves.

Metadata corruption

Here's an example of this on SQL Server 2005:

Msg 3854, Level 16, State 1, Line 2
Attribute (referenced_major_id=2089058478) of row (class=0,object_id=2105058535,column_id=0,referenced_major_id=2089058478,referenced_minor_id=0) in sys.sql_dependencies has a matching row (object_id=2089058478) in sys.objects (type=SN) that is invalid.

This example is relatively benign. There are other examples that will cause CHECKDB to terminate - not as bad as the critical system table corruption example above, but enough that CHECKDB doesn't trust the metadata enough to use it to drive consistency checks. Repairing metadata corruption has the same problems as repairing critical system table corruption - any repair means deleting metadata about one or more tables, and hence deleting the tables themselves. It's far better to leave the corruption unrepaired so that as much data as possible can be extracted from the remaining tables.

Summary

Repair can't fix everything. You may end up having to perform manual and time-consuming data extraction from the corrupt database and losing lots of data because of, say, a critical system table corruption. Bottom line (as usual) - make sure you have valid backups so you don't get into this state!

Thursday, November 15, 2007 6:28:50 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, November 14, 2007

This is a subject I posted about last year on my old blog but it came up at SQL Connections last week several times so I want to repost it for those who’ve just started following my blog.

There's only one time when you should be trying to work out how long a CHECKDB is going to take - when you're planning your regular database maintenance. If you're faced with a corrupt (or suspected corrupt) database and you're only just starting to think about how long a CHECKDB is going to take - you've made a mistake while planning your disaster recovery strategy. You always need to know how long CHECKDB takes (on average) to run for your database so:

  • You can tell whether a particular run of CHECKDB is taking longer than usual - a sign that it's found some corruption
  • You know how long it will take to get results in a disaster recovery situation

At every conference I go to, someone asks me how long CHECKDB will take to run on their database. There are several ways I could answer this:

  • The unhelpful answer - I've got no idea.
  • The almost-helpful answer - how long did it take to run last time and are the conditions exactly the same?
  • The answer I usually give - it depends.

Now, many people would see the third answer as being somewhat equivalent to the first answer - unhelpful. The problem is that there are many factors which influence how long CHECKDB will take to run. Let me explain the ten most important factors so you get an idea why this is actually a helpful answer. These aren't in any particular order of importance.

1) The size of the database

Pretty obvious... CHECKDB has to read every allocated page in the database so the bigger it is, the longer it will take to read all the pages.

2) Concurrent IO load on the server

At the simplest level, what is CHECKDB going to do? It reads every allocated page in the database. That's a lot of IO. CHECKDB takes great pains to do the most efficient IO it can and read the database pages in their physical order with plenty of readahead so that the disk heads move smoothly across the disks (rather than jumping around randomly and incurring disk head seek delays). If there's no concurrent IO load on the server, then the IOs will be as efficient as CHECKDB can make them. However, introducing any additional IO from SQL Server means that the disk heads will be jumping around - slowing down the CHECKDB IOs. If the IO subsystem is at capacity already from CHECKDB's IO demands, any additional IO is going to reduce the IO bandwidth available to CHECKDB - slowing it down.

3) Concurrent CPU activity on the server

At the next level of simplicity, CHECKDB is going to process every page it reads in some way. Depending on the various options you've specified and the database schema (details below), that's going to use a lot of CPU - it's possible that the server may be pegged at 100% CPU when CHECKDB is running. If there's any additional workload on the server, that's going to take CPU cycles away from CHECKDB and it going to slow it down.

Basically what points #2 and #3 are saying is that CHECKDB is very resource intensive! It’s probably one of the most resource intensive things you can ask SQL Server to do and so it's usually a good idea to not run it during peak workload times, as you'll not only cause CHECKDB to take longer to run, you will slowdown the concurrent workload, possibly unacceptably.

4) Concurrent update activity on the database

This is relevant for both SQL 2000 and SQL 2005, but for different reasons.

In SQL 2000, CHECKDB gets its consistent view of the database from transaction log analysis of concurrent DML transactions (see here for details). The more concurrent DML there is while CHECKDB is running, the more transaction log will be generated - and so the longer it will take for CHECKDB to analyze that transaction log. It's possible that on a large multi-CPU box with a ton of concurrent DML and CHECKDB limited to a single CPU that this phase of CHECKDB could take several times longer than the reading and processing of the database pages! (I've seen this in real-life several times.)

In SQL 2005, CHECKDB gets its consistent view of the database from a database snapshot, which is stored on the same disk volumes as the database itself. If there are a lot of changes in the database while CHECKDB is running, the changed pages are pushed to the snapshot so that it remains consistent. As the snapshot files are stored in the same location as the database files, every time a page is pushed to the snapshot, the disk head has to move, which interrupts the efficient IO described in #2. Also, whenever CHECKDB goes to read a page and it needs to read the page from the snapshot files instead of the database files, that's another disk head move, and another efficient IO interruption. The more concurrent changes to the database, the more interruptions to efficient IO and the slower that CHECKDB runs.

5) Throughput capabilities of the IO subsystem

This one's simple. CHECKDB is going to do a boat-load of IOs and it could even end up being IO-bound (meaning that the CPUs are idle periodically waiting for IOs to complete) depending on the options specified and the database schema. This means that the throughput of the IO subsystem is going to have a direct effect on the run-time of CHECKDB. so, if you have a 1TB database and the IO subsystem can only manage 100MB/sec, it's going to take almost 3 hours just to read the database (1TB / 100MB / 3600 secs) and there's nothing you can do to speed that up except upgrade the IO subsystem.

I've lost count of the number of times I've heard customers complain that CHECKDB (or index rebuilds or other IO-heavy operations) are running sloooowly only to find that the disk queue lengths are enormous and the IO subsystem it entirely unmatched to the server and workload.

6) The number of CPUs (processing cores) on the box

This also really encompasses the Edition of SQL Server that's being run. In Enterprise Edition, CHECKDB can run in parallel across all the CPUs in the box (or as many as the query processor decides to parallelize over when the CHECKDB internal queries are compiled). Running in parallel can give a significant performance boost to CHECKDB and lower run times, as long as the database is also spread over multiple files too (so the IOs can be parallelized). There's a nifty algorithm that’s used that allows CHECKDB to run in parallel which I'll explain in detail in a future post.

On the other hand, the fact that CHECKDB can run in parallel in Enterprise Edition can be bad for some scenarios, and so some DBAs chose to force CHECKDB to be single-threaded. SAP typically recommends this to help with user query predictability. The way to do this is to turn on the documented trace flag 2528.

7) The speed of the disks where tempdb is placed

Running CHECKDB against a VLDB uses lots of memory for internal state and for VLDBs the memory requirement usually exceeds the amount of memory available to SQL Server. In this case, the state is spooled out to tempdb and so the performance of tempdb can be a critical factor in CHECKDB performance. See this post for more details of this and how CHECKDB can run out of disk space if tempdb is too small.

8) The complexity of the database schema

This can have a really high impact on the run-time of CHECKDB because it impacts the amount of CPU that CHECKDB requires. For example, the most expensive checks that CHECKDB does are for non-clustered indexes. It needs to check that each row in a non-clustered index maps to exactly one row in the heap or clustered index for the table, and that every heap/clustered index row has exactly one matching row in each non-clustered index. Although there's a highly efficient algorithm for doing this, it still takes around 30% of the total CPU that CHECKDB uses!

There are a bunch of other checks that are only done if the features have been used in the database - e.g. computed column evaluation, links between off-row LOB values, Service Broker, XML indexes, indexed views - so you can see that empirical factors along aren't enough to determine the run-time.

9) Which options are specified

This is almost the same as #7 in that by specifying various options you're limiting what checks CHECKDB actually performs. For instance, using the WITH NOINDEX option will turn off the non-clustered index checks that I described in #7 and using the WITH PHYSICAL_ONLY option will turn off all logical checks, vastly decreasing the run-time of CHECKDB and making it nearly always IO-bound rather than CPU-bound (in fact this is the most common option that DBAs of VLDBs use to make the run-time of CHECKDB manageable).

One thing to be aware of - if you specify any repair options, CHECKDB always runs single-threaded, even on a multi-proc box on Enterprise Edition.

10) The number and type of corruptions that exist in the database

Again, this is similar to #7 and #8. If there are any corruptions present, there may be extra checks triggered to try to figure out more details of the corruptions. For instance, for the non-clustered index checks, the algorithm is tuned very heavily for the case when there are no corruptions present (the overwhelming majority of cases considering the millions of times CHECKDB is run every day around the world). When a non-clustered index corruption is detected, a more in-depth algorithm has to be used to figure out exactly where the corruption is, which involves re-scanning a bunch of data and so taking a bunch more time. There are a few other algorithms like this too.

Summary

So you can see that there's no simple answer.

Wednesday, November 14, 2007 8:22:36 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 
Tuesday, November 13, 2007

Last week at SQL Connections someone said that CHECKDB’s output is ‘useless’.  Given that I wrote CHECKDB for SQL Server 2005 I was only mildly offended :-) But there’s a lot of truth in that statement – the error messages from CHECKDB do tell you exactly what’s corrupt in the database but they’re not exactly easy to read and understand unless you’re intimately familiar with the on-disk structures used to store a SQL Server database. It’s pretty easy to understand the output if there’s only one or two error messages, but once you get past 10 or so it becomes extremely difficult to figure out what the actual problem is for normal people (i.e. not CHECKDB geeks like me, Ryan or a handful of senior people in Product Support)

So, how are you supposed to figure out what’s wrong? In this post I want to run through a few tips and tricks you can use.

  • If the recommended repair level (at the very bottom of the output) is REPAIR_REBUILD then only non-clustered indexes are damaged. This meanscyou don’t need to run repair (which needs the database in single-user mode) or restore from backups. You can just manually rebuild the damaged indexes to fix the problem, possibly even online if you're running Enterprise Edition and the index doesn't have any LOB columns.
  • There are a couple of cases where even if it’s only non-clustered indexes that are damaged, REPAIR_ALLOW_DATA_LOSS may still be the recommended repair option. In this case, check the index ID in all the error messages – if all index IDs are 2 or higher, then it’s only non-clustered indexes that are damaged and you can rebuild them yourself instead of running repair/restore.
  • Figure out a list of tables that are affected by looking at the summary messages about how many corruptions were found per table. If each of the tables is isolated on a separate filegroup then these filegroups can be taken offline and restored individually, possibly without interrupting the application workload. If there's no backup, then you can run DBCC CHECKTABLE with repair, for a faster repair operation than running a full CHECKDB.
  • Figure out the list of damaged pages. There are a number of error messages that could point to a damaged page – usually 8928 or 2537 – and then you can choose to do single-page restores from your backups to minimize downtime, possibly even online in Enterprise Edition.
  • Are there any errors that CHECKDB can’t repair?  If so, you have no choice but to restore from backups or to extract as much info as possible into a new database. Things to look for that mean repair won’t be able to fix everything are:
    • CHECKDB stops early and complains about system table pre-checks failing (errors 7984 – 7988 inclusive)
    • CHECKDB reports any metadata corruption (8992, 8995 errors)
    • CHECKDB reports any errors on PFS page headers (8939, 8946 errors with a possible 8998 error as well)

While I was at Microsoft I wrote two 200-page+ documents detailing all the CHECKDB error messages for SQL Server 2000 and 2005 – the team says they should all be translated into Books Online entries by next summer – a bunch of them are there already. In the meantime, I hope the hints above will take a little of the mystery out of CHECKDB’s output.

Tuesday, November 13, 2007 11:07:50 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, November 06, 2007

This is a question that came up yesterday in out Disaster Recovery class so I'm typing it up in between attending sessions at Microsoft Day here at SQL Connections. It's an interesting experience watching all the MS speakers walking around in the distinctive blue shirts and no longer having to wear one myself.

The question is the following - why does DBCC CHECKDB terminate with an out-of-space error in SS2005? A customer had a 500GB database spread over 17 LUNs on a SAN, with each LUN having only 5GB free. There's a heavy concurrent workload running while CHECKDB is running and very often it doesn't compelte but instead stops with an error. What's going on?

The reason for this is the way that CHECKDB gets a transactionally consistent view of the database. In a nutshell, it creates a internal database snapshot of the database and then checks the database snapshot (you can read more about this in my previous post detailing all the steps of CHECKDB). A database snapshot needs to have one snapshot file for each file in the source database. In the case of CHECKDB, the snapshot files are created as alternate-streams of the existing database files - i.e. stored on the same disk volume - and you have no control over this. This means any changes to the database while CHECKDB is running will cause these alternate streams to grow.
 
If there's a significant concurrent workload while CHECKDB is running, then these alternate streams can get very big very quickly. In the case described in the question, the workload caused one of the alternate streams to take up all available space on the LUN and then it ran out of space. When this happens the snapshot is no longer valid and so CHECKDB has to stop.
 
There are two solutions to this. The obvious first one is to run CHECKDB in a period with low concurrent workload. That's not feasible for many 24x7 shops so the better solution is to create your own database snapshot (so you can control the placement of it) and then run CHECKDB on that. This is no different than having CHECKDB create its own snapshot.
 
There's another reason that CHECKDB may stop with an out-of-space error.  CHECKDB has to validate the contents of the whole database. As such it needs to store info about things it has seen at one point during database scannig so that it can match that info against things it sees at later points during database scanning. These bits of info are stored in an in-memory worktable. In many cases though, the amount of info being stored exceeds the available memory and the worktable needs to spill over to tempdb. (For example, running CHECKDB against a 1TB database on a machine with 16GB of memory - the amount of info that CHECKDB needs to store the intermediate pieces of info will likely exceed the memory available to SQL Server). So - if tempdb isn't big enough to store the worktable, it has to grow. If it can't grow, then CHECKDB will fail because it needs the worktable to operate.
 
You can check in advance how much (estimated) space will be needed by CHECKDB by running it using the WITH ESTIMATEONLY option and then making sure that tempdb is sized accordingly to accomodate the CHECKDB requirements AND the regular tempdb space requirements. See Capacity Planning for tempdb in Books Online for more info.
Tuesday, November 06, 2007 2:23:12 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, October 02, 2007

This is a follow-on article from two posts:

People get themselves into situations where they have no backups (or damaged backups) and the data or log files are damaged such that the only way to access the database is with EMERGENCY mode. In these situations, prior to SQL Server 2005, there was no documented or supported way to fix a database while in EMERGENCY mode - the only guidance could be found on the Internet or from calling Product Support and paying for help. The sequence of events was:

  1. Hack the system tables to get the database into 'emergency' mode.
  2. Use the undocumented and unsupported DBCC REBUILD_LOG command to build a new transaction log.
  3. Run DBCC CHECKDB with the REPAIR_ALLOW_DATA_LOSS option to fix up corruptions in the data files - both those that may have caused the issue, and those caused by rebuilding the transaction log (e.g. because an active transaction altering the database structure was lost).
  4. Figure out what data was lost or is transactionally inconsistent (e.g. because a transaction altering multiple tables was lost) as far as your business logic is concerned
  5. Take the database out of emergency mode
  6. And then all the other stuff like root-cause analysis and getting a better backup strategy

I decided to add a new feature to SQL Server 2005 called EMERGENCY mode repair that will do steps 2 and 3 as an atomic operation. The reasons for this were:

  • Much of the advice of how to do this on the Internet missed steps out (particularly missing step 3!)
  • The DBCC REBUILD_LOG command was unsupported and undocumented and we didn't like advising customers to use it
  • Adding a documented last-resort method of recovering from this situation would reduce calls to Product Support - saving time and money for customers and Microsoft.

So, when in EMERGENCY mode, you can use DBCC CHECKDB to bring the database back online again. The only repair option allowed in EMERGENCY mode is REPAIR_ALLOW_DATA_LOSS and it does a lot more than usual:

  • Forces recovery to run on the transaction log (if it exists). You can think of this as 'recovery with CONTINUE_AFTER_ERROR' - see this post for more details on the real CONTINUE_AFTER_ERROR option for BACKUP and RESTORE. The idea behind this is that the database is already inconsistent because either the transaction log is corrupt or something in the database is corrupt in such a way that recovery cannot complete. So, given that the database is inconsistent and we're about to rebuild the transaction log, it makes sense to salvage as much transactional information as possible from the log before we throw it away and build a new one.
  • Rebuild the transaction log - but only if the transaction log is corrupt.
  • Run DBCC CHECKDB with the REPAIR_ALLOW_DATA_LOSS option.
  • Set the database state to ONLINE.

It's a one-way operation and can't be rolled back. I always advise taking a copy of the database files before doing this in case something goes wrong or there are unrepairable errors. And if it does? Probably time to update your resume for not having a water-tight backup and disaster-recovery strategies in place. Saying that, I've never seen it fail. I can think of some pathalogical cases where it would fail though (involving the file system itself having problems) but that's really unlikely.

Let's walk-through an example of using it. I'm assuming there's a database called emergencydemo that's in the same state as at the end of the Search Engine Q&A #4 blog post - the database has no log file, is in EMERGENCY mode and the salaries table is corrupt.

First off I'll try bringing the database online, just to see what happens:

ALTER DATABASE emergencydemo SET ONLINE;

GO

File activation failure. The physical file name "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\emergencydemo_log.LDF" may be incorrect.

The log cannot be rebuilt because the database was not cleanly shut down.

Msg 945, Level 14, State 2, Line 1

Database 'emergencydemo' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.

Failed to restart the current database. The current database is switched to master.

Msg 5069, Level 16, State 1, Line 1

ALTER DATABASE statement failed.

The first message makes sense - the database knows it needs to be recovered because it wasn't cleanly shut down, but the log file simply isn't there. The second message is from the new feature in 2005 that will automatically create a log file if one is missing on startup or attach - as long as the database was cleanly shut down. The 945 and 5069 errors are self-explanatory but notice that the database gets switched to master underneath us. This has bitten me several times in the past.

Well, I expected that not to work. Let's run EMERGENCY mode repair:

DBCC CHECKDB (emergencydemo, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;

GO

Msg 945, Level 14, State 2, Line 1

Database 'emergencydemo' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.

Hmm - looks like the failed ALTER DATABASE statement did change the state - but what to?

SELECT state_desc FROM sys.databases WHERE name='emergencydemo';

GO

state_desc
------------------------------------------------------------
RECOVERY_PENDING

That makes sense I guess. Ok - back to emergency mode and run repair:

ALTER DATABASE emergencydemo SET EMERGENCY;

GO

DBCC CHECKDB (emergencydemo, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;

GO

Msg 7919, Level 16, State 3, Line 1

Repair statement not processed. Database needs to be in single user mode.

EMERGENCY mode is not SINGLE_USER mode - a database needs to be in SINGLE_USER mode for repair to run and EMERGENCY mode allows multiple connections from members of the sysadmin role. You can set SINGLE_USER mode as well as EMERGENCY mode - however, the sys.databases field state_desc will still just say EMERGENCY.

ALTER DATABASE emergencydemo SET SINGLE_USER;

GO

DBCC CHECKDB (emergencydemo, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS;

GO

File activation failure. The physical file name "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\emergencydemo_log.LDF" may be incorrect.

The log cannot be rebuilt because the database was not cleanly shut down.

Warning: The log for database 'emergencydemo' has been rebuilt. Transactional consistency has been lost. The RESTORE chain was broken, and the server no longer has context on the previous log files, so you will need to know what they were. You should run DBCC CHECKDB to validate physical consistency. The database has been put in dbo-only mode. When you are ready to make the database available for use, you will need to reset database options and delete any extra log files.

This time it worked. First of all we get the same error as if we tried to bring the database online - that's from the code that's trying to run 'recovery with CONTINUE_AFTER_ERROR' on the transaction log. Next we get a nice long warning that the transaction log has been rebuilt and the consequences of doing that (basically that you need to start a new log backup chain by taking a full backup). If there had been any corruptions we'd have seen the usual output from DBCC CHECKDB about what errors it found and fixed. There's also a bunch of stuff in the error log

2007-10-02 17:21:20.95 spid51      Starting up database 'emergencydemo'.
2007-10-02 17:21:20.96 spid51      Error: 17207, Severity: 16, State: 1.
2007-10-02 17:21:20.96 spid51      FileMgr::StartLogFiles: Operating system error 2(The system cannot find the file specified.) occurred while creating or opening file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\emergencydemo_log.LDF'. Diagnose and correct the operating system error, and retry the operation.
2007-10-02 17:21:20.96 spid51      Starting up database 'emergencydemo'.
2007-10-02 17:21:21.10 spid51      Starting up database 'emergencydemo'.<