New course: Understanding and Using DBCC Commands

Erin’s new course is called SQL Server: Understanding and Using DBCC Commands and is just over 2.25 hours long. It covers all the documented DBCC commands and a few of the undocumented ones, plus Erin goes into details about how to use DMVs, when available, to provide more detailed information.

The modules are:

  • Introduction
  • Basic Commands
  • Informational Commands
  • Maintenance Commands
  • Validation Commands
  • Undocumented Commands

Check it out here.

We now have more than 135 hours of SQLskills online training available (see all our courses here), all for as little as $29/month through Pluralsight (including more than 4,500 other developer and IT training courses). That’s unbeatable value that you can’t afford to ignore.

Enjoy!

When DBCC DROPCLEANBUFFERS doesn’t work…

Here’s an interesting question that came up in our IEPTO1 class in Chicago this week (paraphrasing):

I was doing a demo recently where I was demonstrating physical I/Os occurring. I used DBCC DROPCLEANBUFFERS, then enabled SET STATISTICS IO ON and performed a SELECT operation. I was expecting to see physical reads occurring but I only saw logical reads. What’s going on? Why weren’t there any physical reads after I’d flushed the buffer pool with DBCC DROPCLEANBUFFERS?

It’s a very interesting question. If you ask most people what DBCC DROPCLEANBUFFERS does, you’ll get the response that it clears out the buffer pool (as my student expected).

But it doesn’t.

It drops *clean* pages from the buffer pool only.

A clean page is one that has not been changed since it was read into memory or last written to disk. A dirty page is one that has not been written to disk since it was last changed. Dirty pages are not dropped by DBCC DROPCLEANBUFFERS, they are only made clean by writing them to disk (either through one of the various kinds of checkpoints or by the lazy writer – or one of the per-NUMA node lazy writers if you have NUMA configured).

The demo in question had updated the table being selected, and so when DBCC DROPCLEANBUFFERS was executed, the pages from that table remained in memory – hence no physical reads were required for the subsequent SELECT.

If you want to ensure that all pages from a database are flushed from memory, you need to first perform a manual CHECKPOINT of that database and then run DBCC DROPCLEANBUFFERS.

You can then verify that there are no pages in memory for the database in question using the following code:

SELECT *,
	[DirtyPageCount] * 8 / 1024 AS [DirtyPageMB],
	[CleanPageCount] * 8 / 1024 AS [CleanPageMB]
FROM
	(SELECT 
		(CASE WHEN ([database_id] = 32767)
			THEN N'Resource Database'
			ELSE DB_NAME ([database_id]) END) AS [DatabaseName], 
		SUM (CASE WHEN ([is_modified] = 1)
			THEN 1 ELSE 0 END) AS [DirtyPageCount], 
		SUM (CASE WHEN ([is_modified] = 1)
			THEN 0 ELSE 1 END) AS [CleanPageCount]
	FROM sys.dm_os_buffer_descriptors
	GROUP BY [database_id]) AS [buffers]
ORDER BY [DatabaseName]
GO 

Hope this helps some of you out there!

Bug: Error: 3449 and server restart during DBCC CHECKDB

This is a quick post to let you know about a bug that a few people are hitting when running DBCC CHECKDB. The symptoms are a series of errors in the error log plus SQL Server forcibly shuts itself down and restarts. I’ve heard of people hitting the bug on SQL Server 2014 and SQL Server 2012 SP1.

[Update 2/24/15] Microsoft has confirmed that it’s a bug in 2014 and 2012 and they’re planning a fix for the next CU of both.

[Update 3/18/15] The fix for SQL Server 2012 is in CU5 of SP2 and is described in KB 3044958.

The error log symptoms look something like I show below:

2014-11-18 09:04:15.69 spid64      The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x00001EE6FD8000 in file 'C:\SQLskills\\Company_file2.ndf:MSSQL_DBCC23'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
2014-11-18 09:04:15.72 spid64      Error: 3314, Severity: 17, State: 3.
2014-11-18 09:04:15.72 spid64      During undoing of a logged operation in database 'Company', an error occurred at log record ID (887654:3321:14). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.
2014-11-18 09:04:34.38 spid64      Error: 831, Severity: 20, State: 1.
2014-11-18 09:04:34.38 spid64      Unable to deallocate a kept page.
2014-11-18 09:04:34.40 spid64      Error: 3449, Severity: 21, State: 1.
2014-11-18 09:04:34.40 spid64      SQL Server must shut down in order to recover a database (database ID 23). The database is either a user database that could not be shut down or a system database. Restart SQL Server. If the database fails to recover after another startup, repair or restore the database.

The 665 error is from the snapshot file that DBCC CHECKDB creates hitting an NTFS limitation on the number of file fragments in a sparse file. This causes the snapshot creation to fail. The failure causes the undo of a log record in the snapshot to fail (remember that a database snapshot undergoes crash recovery to make it transactionally consistent). This failure then leads to SQL Server thinking it has to forcibly restart to recover the snapshot database, which is should never do for a snapshot – and that’s the bug.

I’ll update this post as soon as I hear about the builds that the fix is in.

This is a rare bug to hit, but it’s a regression (from builds people are reporting), and you can help yourself to avoid it by:

  • Creating your own database snapshot, on a volume without file-system free space fragmentation, and running DBCC CHECKDB against the snapshot
  • OR, trying to run DBCC CHECKDB when there isn’t a significant amount of change occurring in the database, so the database snapshot doesn’t have to become very large
  • OR, using the backup-copy-restore-check method of running DBCC CHECKDB on a restored backup of the database on another server

Thanks