One of the hottest features in SQL Server 2005 is database mirroring, and it's helped many companies implement successful and relatively inexpensive high-availability strategies. In SQL Server 2008, Database Mirroring has been enhanced in several ways - one of which is the ability to automatically repair corrupt pages!
This feature is based on the fact that the principal and mirror databases are exactly the same. So, if a page becomes corrupt on the principal, SQL Server should be able to read the page from the mirror and use it to fix the principal. Similarly, if a page becomes corrupt on the mirror, the page can be read from the principal to fix up the mirror. Pretty cool, eh?
Details
In Action
I've got a system with a few SQL Server 2008 instances running so I decided to give it a try. Here's what I did, using the TicketSalesDB from the Always-On DVDs and labs that SQLskills produces:
And it worked! Going back into the instance with the principal database and querying the DMV gives:
C:\>sqlcmd /E /S.\KATMAI011> SELECT * FROM sys.dm_db_mirroring_auto_page_repair;2> GOdatabase_id file_id page_id error_type page_status modification_time----------- ----------- -------------------- ---------- ----------- ----------------------- 6 4 4256 -1 5 2007-09-27 17:23:20.067 (1 rows affected)1>
C:\>sqlcmd /E /S.\KATMAI011> SELECT * FROM sys.dm_db_mirroring_auto_page_repair;2> GOdatabase_id file_id page_id error_type page_status modification_time----------- ----------- -------------------- ---------- ----------- ----------------------- 6 4 4256 -1 5 2007-09-27 17:23:20.067
(1 rows affected)1>
The page I corrupted was (4:4256) and page_status of 5 means the repair succeeded. Running the query again confirms that the corruption has been fixed. The page was also logged in the suspect_pages table in msdb:
1> SELECT * FROM msdb..suspect_pages;2> GOdatabase_id file_id page_id event_type error_count last_update_date----------- ----------- -------------------- ----------- ----------- ----------------------- 6 4 4256 5 1 2007-09-27 17:23:20.407 (1 rows affected)1>
1> SELECT * FROM msdb..suspect_pages;2> GOdatabase_id file_id page_id event_type error_count last_update_date----------- ----------- -------------------- ----------- ----------- ----------------------- 6 4 4256 5 1 2007-09-27 17:23:20.407
In Books Online (both 2005 and 2008), event_type of 5 means that the page was repaired.
I checked the SQL Server error log for the principal and this is what I found (the bolding is mine):
2007-09-27 17:17:10.41 spid25s Database mirroring is active with database 'TicketSalesDB' as the principal copy. This is an informational message only. No user action is required.2007-09-27 17:23:19.92 spid51 Error: 824, Severity: 24, State: 2.2007-09-27 17:23:19.92 spid51 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x55684fbe; actual: 0x16e84fbe). It occurred during a read of page (4:4256) in database ID 6 at offset 0x00000002140000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL.6\MSSQL\Data\TicketSalesFG2005Q1.NDF'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.2007-09-27 17:23:19.96 spid26s Database mirroring is attempting to repair physical page (4:4256) in database "TicketSalesDB" by requesting a copy from the partner.2007-09-27 17:23:20.42 spid26s Database mirroring successfully repaired physical page (4:4256) in database "TicketSalesDB" by obtaining a copy from the partner.
I also checked the error log for the mirror and there's nothing relevant in there. I guess the same info would be output to the mirror database's error log if the mirror became corrupted. It's reasonably simple to check this using similar steps as above, but corrupting the database before the initial backup is taken, fixing the corruption again after the backup and before mirroring starts (so the corruption is on the mirror but not the principal), and then triggering an update on the corrupt page. When the update is replayed on the mirror, the corruption will be hit and the page repaired in the same way. I'll check later and blog if there's any difference.
Summary
SQL Server 2008 introduces an enhancement to database mirroring that can bi-directionally pull pages between the principal and mirror databases to fix page corruptions. One word of caution I'd give is that this feature doesn't mean you can ignore these errors when they occur - you still need to do root-cause analysis on the corruption and take steps to prevent them happening again before a corruption occurs that automatic page repair cannot fix (as I mentioned above).
Nevertheless, this is a tremendously useful feature that's going to save a lot of downtime. Cool!
Remember Me
a@href@title, strike
Theme design by Jelle Druyts
Pick a theme: BlogXP sqlx BlogXP sqlx
Powered by: newtelligence dasBlog 2.0.7226.0
The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.
© Copyright 2008, Paul S. Randal
E-mail