(Check out my Pluralsight online training course: SQL Server: Detecting and Correcting Database Corruption.)
This is a post I’ve been trying to get to since I started blogging a couple of years ago: how to re-attach a detached SUSPECT database. This is a pretty common scenario I see on the forums – a database goes SUSPECT so the DBA tries to detach/attach, which fails. I wrote a demo for my corruption session at TechEd this year that shows how to create a SUSPECT database with a hex editor, then detaches it and shows how to re-attach and fix it. It’s going to be a long blog post, but bear with me – you never know when you’ll need to know how to recover from this.
Creating a SUSPECT Database
First off I’m going to create a simple database to use, called DemoSuspect with a table and some random data.
USE [master]; GO CREATE DATABASE [DemoSuspect]; GO USE [DemoSuspect]; GO CREATE TABLE [Employees] ( [FirstName] VARCHAR (20), [LastName] VARCHAR (20), [YearlyBonus] INT); GO INSERT INTO [Employees] VALUES ('Paul', 'Randal', 10000); INSERT INTO [Employees] VALUES ('Kimberly', 'Tripp', 10000); GO
Now I’ll perform an update in an explicit transaction and force it to be written out to disk with a CHECKPOINT. I’ve accidentally deleted Kimberly’s bonus!
-- Simulate an in-flight transaction BEGIN TRAN; UPDATE [Employees] SET [YearlyBonus] = 0 WHERE [LastName] = 'Tripp'; GO -- Force the update to disk CHECKPOINT; GO
Then in another window, I’ll simulate a crash using:
SHUTDOWN WITH NOWAIT; GO
Now that SQL Server is shutdown, I’m going to simulate an I/O failure that corrupts the log file. I’m going to use a hex editor to do this – my editor of choice is the immensely popular and useful XVI32, written by Christian Maas. I opened the log file, filled the first section with zeroes, and then saved it again. See the screenshot below. (As a small note of warning, this hex editor will truncate files that are over 2GB. Used the HxD editor instead for larger files.)
When I start up SQL Server again, it will try to run recovery on the DemoSuspect database and fail. This will put the database into the SUSPECT state.
So I restarted SQL Server, let’s try getting in to the DemoSuspect database.
USE [DemoSuspect]; GO
Msg 945, Level 14, State 2, Line 1 Database 'DemoSuspect' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.
Now let’s check the database status:
SELECT DATABASEPROPERTYEX (N'DemoSuspect', N'STATUS') AS N'Status'; GO
Status ------- SUSPECT
At this point, the correct procedure is to restore from backups. If there are no backups available, then the next best thing is to get the database into EMERGENCY mode and extract as much data as possible, or run EMERGENCY-mode repair. However, I’m going to try the detach/attach route instead.
Detaching the Database
On SQL Server 2005 you can detach a SUSPECT database using sp_detach_db, but on later versions SQL Server won’t let you do this:
EXEC sp_detach_db N'DemoSuspect'; GO
Msg 3707, Level 16, State 2, Line 1 Cannot detach a suspect or recovery pending database. It must be repaired or dropped.
I was *so* pleased when I saw this change was made. I’m going to have to set the database offline to release the NTFS-locks on the files, copy the files to somewhere safe, then drop the database and delete the files. It’s no longer possible to accidentally detach a SUSPECT database.
-- Not allowed on 2008 - let's copy then drop ALTER DATABASE [DemoSuspect] SET OFFLINE; GO -- ***** COPY THE FILES ***** -- Copy... then: DROP DATABASE [DemoSuspect]; GO
Now the DemoSuspect is really detached from SQL Server, and now the fun starts, which is why I’m sure many of you are reading this post.
Re-attaching a SUSPECT Database
Let’s try the obvious sp_attach_db:
EXEC sp_attach_db @dbname = N'DemoSuspect', @filename1 = N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect.mdf', @filename2 = N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect_log.ldf'; GO
Msg 5172, Level 16, State 15, Line 1 The header for file 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect_log.ldf' is not a valid database file header. The PageAudit property is incorrect.
Hmm. How about using the ATTACH_REBUILD_LOG option on CREATE DATABASE? That should create a new log file for me:
CREATE DATABASE [DemoSuspect] ON (NAME = N'DemoSuspect', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect.mdf') FOR ATTACH_REBUILD_LOG GO
Depending on the version of SQL Server you’re using, you’ll see either:
Msg 5172, Level 16, State 15, Line 1 The header for file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\DemoSuspect_LOG.ldf' is not a valid database file header. The PageAudit property is incorrect. File activation failure. The physical file name "C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data\DemoSuspect_LOG.ldf" may be incorrect. The log cannot be rebuilt because the database was not cleanly shut down. Msg 1813, Level 16, State 2, Line 1 Could not open new database 'DemoSuspect'. CREATE DATABASE is aborted.
or the slightly less helpful:
File activation failure. The physical file name "C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect_log.LDF" may be incorrect. Msg 1813, Level 16, State 2, Line 1 Could not open new database 'DemoSuspect'. CREATE DATABASE is aborted. Msg 5243, Level 22, State 8, Line 1 An inconsistency was detected during an internal operation. Please contact technical support.
Hmm. The database knows that there was an active transaction. Using the ATTACH_REBUILD_LOG command only works if the database was cleanly shut down and the log is missing. Even removing the damaged log file makes no difference.
Basically the problem is that the database wasn’t cleanly shutdown, which means that recovery HAS to run and complete before the database can be attached again. Given that our log file is corrupt, that’s impossible.
So, never detach a suspect database.
The only way to get the database back into SQL Server is to use a hack. I’m going to create a new dummy database with the exact same file layout as the detached database. Then I’m going to set the dummy database offline, swap in the corrupt database files, and bring the database online again. If all goes well, the corrupt database will be attached again.
The one major downside of this is that if the SQL Server instance doesn’t have instant initialization enabled (see How to tell if you have instant initialization enabled?), then creating the dummy database could take a long time if the data files are very big. This means that your application is offline while the files are created and zero’d out.
You’ll need to delete the existing files. Before doing this you want to make absolutely sure you’ve got multiple copies of the corrupt database files… just in case. After deleting the files, I can create my dummy database and set it offline.
CREATE DATABASE [DemoSuspect]; GO -- Check the files are there... ALTER DATABASE [DemoSuspect] SET OFFLINE; GO
If you forget to delete the existing corrupt files first, you’ll get the following error:
Msg 5170, Level 16, State 1, Line 1 Cannot create file 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect.mdf' because it already exists. Change the file path or the file name, and retry the operation. Msg 1802, Level 16, State 4, Line 1 CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
Now I’ll delete the file created for the dummy database, copy back in the corrupt database files, and bring the database online, checking its state:
ALTER DATABASE [DemoSuspect] SET ONLINE; GO SELECT DATABASEPROPERTYEX (N'DemoSuspect', N'STATUS'); GO
File activation failure. The physical file name "C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect_log.LDF" may be incorrect. Msg 945, Level 14, State 2, Line 1 Database 'DemoSuspect' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details. Msg 5069, Level 16, State 1, Line 1 ALTER DATABASE statement failed. Msg 5243, Level 22, State 8, Line 1 An inconsistency was detected during an internal operation. Please contact technical support.
This looks like it failed, but it didn’t. If I try the status check again, I get:
Status ------- SUSPECT
Woo-hoo – I’m back to having a SUSPECT database attached again – after having to mess about deleting and copying files around. Not good. Now I can actually fix it.
Repairing a SUSPECT Database
If you don’t have any backups, then the only way to get into the database is to use EMERGENCY mode. This lets you into the database but you need to be aware that recovery has not completed so the contents of the database are transactionally (and possibly structurally) inconsistent. I’m going to choose to repair the database using EMERGENCY-mode repair. See CHECKDB From Every Angle: EMERGENCY mode repair – the very, very last resort for a detailed description of this tool. Note that you have to put the database into EMERGENCY and SINGLE_USER modes to do this.
ALTER DATABASE [DemoSuspect] SET EMERGENCY; GO ALTER DATABASE [DemoSuspect] SET SINGLE_USER; GO DBCC CHECKDB (N'DemoSuspect', REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS, ALL_ERRORMSGS; GO
Msg 5172, Level 16, State 15, Line 1 The header for file 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect_log.LDF' is not a valid database file header. The PageAudit property is incorrect. File activation failure. The physical file name "C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\DemoSuspect_log.LDF" may be incorrect. The log cannot be rebuilt because there were open transactions/users when the database was shutdown, no checkpoint occurred to the database, or the database was read-only. This error could occur if the transaction log file was manually deleted or lost due to a hardware or environment failure. The Service Broker in database "DemoSuspect" will be disabled because the Service Broker GUID in the database (B72D1765-80C6-4C2F-8C12-5B78DAA2DA83) does not match the one in sys.databases (001AE95A-AE22-468F-93A4-C813F4A9112D). Warning: The log for database 'DemoSuspect' has been rebuilt. Transactional consistency has been lost. The RESTORE chain was broken, and the server no longer has context on the previous log files, so you will need to know what they were. You should run DBCC CHECKDB to validate physical consistency. The database has been put in dbo-only mode. When you are ready to make the database available for use, you will need to reset database options and delete any extra log files.
First off it tries to do the regular ATTACH_REBUILD_LOG. When that fails, DBCC CHECKDB takes over and forces the log to be rebuilt, after trying to force as much of the damaged log to be processed for recovery as it can. It then runs a full repair, in case there’s anything corrupt in the database – in this case there isn’t so there are no corruption messages in the output.
Notice the line about the Service Broker GUID being wrong. I had to use the hack method to get the database attached again, but when I created the dummy database, it created a Service Broker GUID for the DemoSuspect database in master.sys.databases. When I swapped in the corrupt database, it has a different GUID – so now I can’t use Service Broker until the Service Broker GUID is reset using the NEW BROKER option of ALTER DATABASE (see this post for details).
So what’s the state of the data after all of that?
-- Now try again... USE [DemoSuspect]; GO -- Check the state SELECT DATABASEPROPERTYEX (N'DemoSuspect', N'STATUS') AS N'Status'; GO -- What about the data? SELECT * FROM [Employees]; GO
Status ------- ONLINE (1 row(s) affected) FirstName LastName YearlyBonus ---------- --------- ------------ Paul Randal 10000 Kimberly Tripp 0 (2 row(s) affected)
Kimberly doesn’t get a bonus this year – she won’t be happy! This is contrived and flippant, of course, but it illustrates the point that after doing an EMERGENCY-mode repair, transactions that were active at the time the log was damaged will not get a chance to roll-back, most likely. In this case, I know what was going on when the crash occurred, but what about on a busy OLTP system with hundreds or thousands of active transactions? What state will the data be in?
Yes, you can recover from a detached SUSPECT database, but it’s not pretty and you have to be very careful. The best course of action is always to have a comprehensive backup strategy that allows you to restore as quickly as possible. If you do have a SUSPECT database and no backups, use EMERGENCY mode to access and/or repair the database. Hopefully this article will help people that find themselves in these situations – let me know if it helped you.