Bug: Error: 3449 and server restart during DBCC CHECKDB

This is a quick post to let you know about a bug that a few people are hitting when running DBCC CHECKDB. The symptoms are a series of errors in the error log plus SQL Server forcibly shuts itself down and restarts. I’ve heard of people hitting the bug on SQL Server 2014 and SQL Server 2012 SP1.

[Update 2/24/15] Microsoft has confirmed that it’s a bug in 2014 and 2012 and they’re planning a fix for the next CU of both.

[Update 3/18/15] The fix for SQL Server 2012 is in CU5 of SP2 and is described in KB 3044958.

The error log symptoms look something like I show below:

2014-11-18 09:04:15.69 spid64      The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x00001EE6FD8000 in file 'C:\SQLskills\\Company_file2.ndf:MSSQL_DBCC23'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
2014-11-18 09:04:15.72 spid64      Error: 3314, Severity: 17, State: 3.
2014-11-18 09:04:15.72 spid64      During undoing of a logged operation in database 'Company', an error occurred at log record ID (887654:3321:14). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.
2014-11-18 09:04:34.38 spid64      Error: 831, Severity: 20, State: 1.
2014-11-18 09:04:34.38 spid64      Unable to deallocate a kept page.
2014-11-18 09:04:34.40 spid64      Error: 3449, Severity: 21, State: 1.
2014-11-18 09:04:34.40 spid64      SQL Server must shut down in order to recover a database (database ID 23). The database is either a user database that could not be shut down or a system database. Restart SQL Server. If the database fails to recover after another startup, repair or restore the database.

The 665 error is from the snapshot file that DBCC CHECKDB creates hitting an NTFS limitation on the number of file fragments in a sparse file. This causes the snapshot creation to fail. The failure causes the undo of a log record in the snapshot to fail (remember that a database snapshot undergoes crash recovery to make it transactionally consistent). This failure then leads to SQL Server thinking it has to forcibly restart to recover the snapshot database, which is should never do for a snapshot – and that’s the bug.

I’ll update this post as soon as I hear about the builds that the fix is in.

This is a rare bug to hit, but it’s a regression (from builds people are reporting), and you can help yourself to avoid it by:

  • Creating your own database snapshot, on a volume without file-system free space fragmentation, and running DBCC CHECKDB against the snapshot
  • OR, trying to run DBCC CHECKDB when there isn’t a significant amount of change occurring in the database, so the database snapshot doesn’t have to become very large
  • OR, using the backup-copy-restore-check method of running DBCC CHECKDB on a restored backup of the database on another server

Thanks

Two new courses on Advanced Corruption Recovery Techniques and Maintenance Plans

I’ve just heard from our good friends at Pluralsight that our two latest online training courses are now live and available for watching!

My latest course is SQL Server: Advanced Corruption Recovery Techniques which follows on from my earlier SQL Server: Detecting and Recovering from Database Corruption course.

The course is 4 hours long and has the following modules:

  • Introduction
  • DBCC CHECKDB Internals and Performance
  • Useful Undocumented DBCC Commands
  • Dealing with Transaction Log Problems
  • Advanced Restore Techniques
  • Advanced Repair Techniques

Check it out here.

Jonathan’s latest course is SQL Server: Maintenance Plans. The course is two and a half hours long and has the following modules:

  • Introduction
  • Approaches to Database Maintenance
  • Configuring SQL Server Agent Settings
  • Common Maintenance Tasks
  • Other Tasks
  • Creating Maintenance Plans
  • Maintenance Plan Challenges and Alternatives

Check it out here.

Now you have something to do this weekend :-) Enjoy!

New online course: Detecting and Correcting Database Corruption

My latest Pluralsight online training course went live today: SQL Server: Detecting and Correcting Database Corruption

It’s 4 hours long and is the first in a two-part series, with a more advanced course coming in a few weeks.

The modules in this course are:

  • Introduction
  • Causes of Database Corruption
  • Detecting Page Corruption
  • Consistency Checking
  • DBCC CHECKDB and Related Commands
  • Interpreting DBCC CHECKDB Output
  • Simple Restore Techniques
  • Simple Repair Techniques

You can get to the course here.

Pluralsight starts at US$29/month, for more than 80 hours of SQLskills SQL Server training (growing all the time) and more than 770 total developer and IT Pro courses (also growing all the time).

Enjoy!