Survey: what’s the oldest version of SQL Server you have in production?

In this survey I’d like to know what the oldest version of SQL Server is in your environment – it has to be an instance that’s being used for a production workload in some way (even just occasionally).


I’ll report on the results next week.

Thanks!

Chicago classes are all sold out, don’t wait to register for London

As of this afternoon all eight classes we’re doing in Chicago in April/May are sold out – that’s 160 people we’ll be training!

The reason for this post is to remind those of you who are planning to come to one of our London classes in June that they’re all more than half full already and we expect them to sell out in the next 2-3 weeks. We had multiple people wait too long to register for Chicago and we were able to add extra IE1 and IEBI classes to cope with demand, but we can’t do that in London.

Don’t forget that the early-bird price for IE1 in London expires Friday 19th!

Check out the London classes here - hope to see you there!

Easy automation of SQL Server database maintenance

A while back I kicked off a survey asking what mechanism you use for running your regular SQL Server database maintenance.

Here are the results:

maintenance Easy automation of SQL Server database maintenance

The “Other” responses were:

  • 7 x “A combination of maintenance plans generated from SSMS wizard and a home-grown index maintenance script”
  • 5 x “Combo of own and modified scripts of Michelle Ufford”
  • 2 x “Home-grown scripts for main databases and SSMS wizard in some simple cases”
  • 1 x “A combination of enterprise backup agent and AdaptiveIndexDefrag
  • 1 x “A combination of home-grown scripts and Ola’s scripts (for Indexing and Statistics)”
  • 1 x “Combination of Ola’s scripts (indexes/stats) and homegrown (DBCC and backups)”

The main purpose of this survey is to show people that there are freely-available and comprehensive scripts that you can download to help run your regular database maintenance, and that many, many people use them in production.

I don’t like to recommend using the SSMS Maintenance Plan Wizard. It has quite limited options, has had a number of high profile bugs in the past, and to this day, even in SQL Server 2012, it still allows you to perform regular shrink operations without any warning as to the side-effects.

Home grown scripts  are OK, but time after time when I’m reviewing client maintenance scripts I see coding errors, lack of logging of what happened, and lack of useful error handling. An example of a common error is in code to figure out which indexes are fragmented, where the results from sys.dm_db_index_physical_stats are not filtered by alloc_unit_type_desc, so there are false positives from LOB_DATA and ROW_OVERFLOW_DATA allocation units. In one client, many of their large clustered indexes had no logical fragmentation but were being rebuilt every night needlessly because of benign LOB_DATA fragmentation, generating a ton of extra transaction log that had to be backed up.

More than 40% of the almost 500 respondents use some or all of Ola Hallengren’s Maintenance Solution, and I always recommend our clients download and play around with Ola’s scripts before asking us to write customized code for them, or at least let us use Ola’s code where possible to save them consulting time. Ola’s Maintenance Solution has won multiple awards and is very widely used in the SQL Server community. There are also some popular scripts for performing index maintenance from SQL Server MVPs Michelle Ufford and Tara Kizer, but Ola’s are the gold standard and allow you to do backups, index and statistics maintenance, and consistency checks.

One of the cool things about using Ola’s scripts is that they’re tested constantly by thousands of installations around the world and they’re very robust. If you’ve never seen them, or want to upgrade your database maintenance, check them out!

Survey: what mechanism do you use for regular database maintenance?

I haven’t done any surveys this year and I’d like to get back into doing one every week or so for the next few months.

In this survey I’d like to know what mechanism(s) you use to perform your regular database maintenance.


I’ll editorialize the results in a week or so.

Thanks!

DBCC CHECKDB scalability and performance benchmarking on SSDs

Back in February I ran a bunch of performance tests of DBCC CHECKDB on SSDs, to evaluate the effects of degree-of-parallelism (DOP) and various options and traceflags, and now I’m finally getting around to presenting the results. Make sure to also read the recent post where I talk about the detrimental effect of computed-column indexes on DBCC CHECKDB performance.

The two variables I altered during my tests were:

  • DOP – 32, 16, 8, 4, 2, 1
  • Configuration – full DBCC CHECKDB, using WITH PHYSICAL_ONLY (PO), using PO and TF2549, using PO and TF2562, using PO and both trace flags

Trace flags 2549 and 2562 are described in KB 2634571, and can be summarized as:

  • 2549: treat each database file as if it’s on a separate physical drive for the purposes of driving readahead
  • 2562: perform the entire set of checks in a single ‘batch’ instead of multiple batches, and increase the efficient of per-thread calls to get a new list of pages to read

These two trace flags really only apply to DBCC CHECKDB when the WITH_PHYSICAL_ONLY option is used.

My test system is as follows:

  • DELL R720 with two 8-core E5-2670 CPUs and hyper-threading enabled
  • 64GB of memory
  • The test database is AdventureWorks that Jonathan expanded to 500GB for me using his cool scripts. The database is split over 8 data files stored on two 320GB Fusion-io drives, with tempdb and its log placed on two more 320GB Fusion-io drives. I set things up this way to remove I/O waits from the test.
  • There was no additional load running concurrently with DBCC CHECKDB.

Complete Results

The complete results are shown below:

CHECKDB DOP1 DBCC CHECKDB scalability and performance benchmarking on SSDs

There are some very interesting results from this graph, for the test I’m doing using SSDs:

  • For a full DBCC CHECKDB, there is a 70-80% performance gain from DOP 1 to DOP 2, from DOP 2 to DOP 4, and from DOP 4 to DOP 8.
  • For a full DBCC CHECKDB run, there is really no scalability gain in performance for a DOP greater than 8, and in fact a slight performance degradation for higher DOPs.
  • As you can see for the results of a full DBCC CHECKDB with DOP 8 and higher, DBCC CHECKDB on that server without I/O waits can process 0.5GB of the database per second. That’s pretty cool and is down to the fast SSDs (there were no PAGEIOLATCH_SH waits) and fast processors. Back in SQL Server 2000, our performance benchmark was about 1GB per minute. See here for some example SQL Server 2000 benchmark times that show around 3-5GB/minute, but not on SSDs.
  • The main boost in performance when running on SSDs comes from using the WITH PHYSICAL_ONLY option. This is, of course, to be expected as it cuts out all of the logical checks that make up the bulk of the CPU usage of DBCC CHECKDB.

For the scalability result running a full DBCC CHECKDB, some latch contention for structures inside DBCC CHECKDB starts to show up at DOP 8, but not enough to give zero scalability at higher DOPs. This comes from a much higher time spent waiting for CXPACKET waits, roughly doubling from DOP 8 to DOP 16, and again from DOP 16 to DOP 32. I blame the query processor for that :-) In all seriousness, I didn’t expect to see a scalability gain with DOP higher than 8, based on what I’ve seen in the past.

PHYSICAL_ONLY Results

The results without DOP 1 and DOP 2 are shown below, which allows us to focus on the PHYSICAL_ONLY results:

CHECKDB DOP2 DBCC CHECKDB scalability and performance benchmarking on SSDs

Again, for the test I’m doing using SSDs, there are some interesting results:

  • DOP 4 seems to give the best performance overall, apart from the use of both trace flags with DOP 8.
  • Higher DOPs get progressively worse performance. Analysis of wait and latch statistics show that this is caused by increasing amounts of latch contention at higher DOPs.
  • The two trace flags don’t have a pronounced effect on performance in my situation.

Summary

When running with a database on SSDs and very fast processors, full DBCC CHECKDB performance scales to DOP 8, and the main performance gain comes from using the WITH PHYSICAL_ONLY option. The two trace flags introduced in late 2011 don’t have a pronounced effect on PHYSICAL_ONLY performance. This is all to be expected, as the SSDs really remove the I/O component of DBCC CHECKDB performance and scalability from consideration.

Although these tests are not conclusive until I’ve run similar tests on a slower I/O subsystem, I’m tempted to recommend limiting DBCC CHECKDB‘s DOP to 8 if you can (maybe using Resource Governor) to limit its resource drain on your system.

I obviously also have to recommend using SSDs to vastly reduce or effectively eliminate (in my case) I/O waits.

During April and May I’ll perform the same set of tests using 15k SCSI drives and also 7.2k SATA drives to see if and how the results differ.

Hope you found this interesting!

Time to take a stand

20110120 IMG 2900 Small Time to take a standWhile Kimberly was selecting underwater photos for our new website, she came across this one from our dive trip to the remote NE of Indonesia in January 2012.

After several people tried to do an underwater hand-stand, I gave one of our buddies my camera and showed them how it’s done in the middle of an archway. Not the most flattering position for a photo :-)

Click the image for a full-size version. We were at a dive site called Boo Jendela where there are fantastic swim-throughs. Click here for a Google Map of where we were based on the GPS reading I took.

Our new website and April special offer

Our New Website

Late last year we finally decided to make time to revamp our blogs and website, and move onto WordPress. I  really wish we’d done it a lot sooner as WordPress is such a joy to work with compared to hand-coded ASP.NET, but client work always seemed to get in the way. Now we’ve upgraded, it’s so much easier to change website content and write blog posts.

Kimberly sourced our new logo using the 99 Designs website, which allows artists around the world to compete with each other to win a design competition. Our new blogs went live in January and we just launched our new website on Wednesday. It took us almost five months of elapsed time to do it, mostly waiting for me to populate content, but we did it all ourselves, on our own server hosted by Amazon.  Jonathan  and I spent many, many hours on con-calls figuring stuff out, and Jonathan now knows more about WordPress and WordPress plug-ins than he ever wanted to know and was the driving force behind getting this all done (he really is utterly fabulous – thank you Jon!). Overall, we’re really proud of what we achieved together – it’s a huge improvement over the old site.

By the way, if you haven’t seen them yet, check out the image rotators on the home page and on the training pages which showcase some of Kimberly’s finest underwater photography. You can get hi-res version of those photos here.

Special Offer

To celebrate getting our new website completely live, we’re offering all new clients a special offer. If you sign up for our signature SQL Server Health Check service during the month of April, it will only cost you US$2500. This covers the health check of a single SQL Server instance, plus up to an hour of email/phone review with us about the results, once you’ve read our comprehensive report. That’s more than a 1/3 saving over the usual price!

Check out our SQL Server Health Check description and then let us know that you’re interested.

We’re looking forward to helping you. It’s what we do best.

DBCC CHECKDB performance and computed-column indexes

It’s no secret that DBCC CHECKDB has some performance quirks based on the schema of the database being checked and various kinds of corruptions. I was recently doing some scalability testing of DBCC CHECKDB for a blog post and discovered quite a nasty performance issue that exists in all versions of SQL Server back to SQL Server 2005. This isn’t a bug, it’s just the way things work.

The issue occurs when a nonclustered index exists that has a computed column as part of the index key or as one of the INCLUDEd columns and affects DBCC CHECKDB, DBCC CHECKFILEGROUP, and DBCC CHECKTABLE.

As a bit of background, here’s an excerpt from the SQL Server 2012 Internals book where I describe one of the ways nonclustered indexes are checked for consistency:

The nonclustered index cross-checks verify that

  • Every record in a nonclustered index (whether filtered or nonfiltered) must map to a valid record in the base table (that is, the heap or clustered index).
  • Every record in the base table must map to exactly one record in each nonfiltered, nonclustered index, and one record in each filtered index, where the filter allows.

The mechanism to carry out these checks efficiently has changed in every release since SQL Server 7.0—becoming progressively more and more efficient. In SQL Server 2012, two hash tables are created for each partition of each nonclustered index—one hash table is for the actual records in that partition of the nonclustered index, and the other is for the records that should exist in that partition of the nonclustered index (as calculated from the existing data records in the table).

When a nonclustered index record is processed, all columns in the record are hashed together into a BIGINT value. This includes

  • The physical or logical link back to the base table (known as the base table RID)
  • All included columns—even LOB and FILESTREAM values) are hashed together into a BIGINT value

The resulting value is added to the master hash value for actual records for the nonclustered index partition of which the record is part.

DBCC CHECKDB knows which nonclustered indexes exist for the table and what the complete nonclustered index record composition should be for each. When a data record is processed, the following algorithm is run for each matching nonclustered index record that should exist for the data record (taking into account any filter predicates for filtered nonclustered indexes):

  1. Create the nonclustered index record in memory (again, including the base table RID, plus included columns).
  2. Hash all columns in the index record together into a BIGINT value.
  3. Add the resulting value to the “should exist” master hash value for the relevant nonclustered index partition of which the index record is part.

The premise that this algorithm works on is that if no corruptions exist, the master hash values for the actual records and “should exist” records for each nonclustered index partition should match exactly at the end of the DBCC CHECKDB batch.

When a nonclustered index uses a computed column, the value of the computed column has to be computed based on the column definition. To do that, an internal mechanism called an ‘expression evaluator’ is created. The expression evaluator is provided by the Query Processor code and its behavior is entirely outside the control of DBCC CHECKDB. The drawback of the expression evaluator is that every time it is used, an exclusive latch must be held by the thread using it. This creates an incredible bottleneck and drastically affects performance.

My test system is the Dell R720 we bought last year, with 64GB of memory and 2 E5-2670 CPUs with 8 physical cores and hyperthreading enabled. It’s running SQL Server 2012 SP1 CU3.

The test database is AdventureWorks that Jonathan expanded to 500GB for me using his cool scripts. The database is split over 8 data files stored on two 320GB Fusion-io drives, with tempdb and its log placed on two more 320GB Fusion-io drives. I set things up this way to remove I/O waits from the test.

The first few runs of my testing, with unlimited parallelism produced DBCC CHECKDB elapsed times of around 340 minutes. This seemed incredibly slow to me, so I ran another test and looked at the output of sys.dm_os_waiting_tasks. I discovered that half the threads at any time were waiting for exclusive access to the DBCC_OBJECT_METADATA latch. Adding in some diagnostics to capture waits and latches gave the data below:

WaitType        Wait_S     Resource_S  Signal_S  WaitCount  Percentage  AvgWait_S  AvgRes_S  AvgSig_S
--------------- ---------- ----------- --------- ---------- ----------- ---------- --------- ---------
CXPACKET        684482.45  682667.13   1815.32   60294236   34.90       0.0114     0.0113    0.0000
OLEDB           659325.08  659325.08   0.00      207661462  33.62       0.0032     0.0032    0.0000
LATCH_EX        615168.39  605357.28   9811.11   798224634  31.37       0.0008     0.0008    0.0000

LatchClass               Wait_S     WaitCount  Percentage  AvgWait_S
------------------------ ---------- ---------- ----------- ----------
DBCC_OBJECT_METADATA     611768.00  764845636  99.16       0.0008

This didn’t make any sense to me as I couldn’t believe that such an incredible bottleneck existed – almost 1ms per latch wait and 800 million latch waits! – surely we’d have heard about it by now? I pinged my friend Ryan Stonecipher who inherited the DBCC CHECKDB code from me and he reminded me about the expression evaluator behavior.

I went back to my test system and looked for nonclustered indexes using computed columns using the code below:

SELECT
    [s].[name],
    [o].[name],
    [i].[name],
    1.[name],
    [ic].*
FROM sys.columns 1
JOIN sys.index_columns [ic]
    ON [ic].[object_id] = 1.[object_id]
    AND [ic].[column_id] = 1.[column_id]
JOIN sys.indexes [i]
    ON [i].[object_id] = [ic].[object_id]
    AND [i].[index_id] = [ic].[index_id]
JOIN sys.objects [o]
    ON [i].[object_id] = [o].[object_id]
JOIN sys.schemas [s]
    ON [o].[schema_id] = [s].[schema_id]
WHERE 1.[is_computed] = 1;
GO

And I found 6 nonclustered indexes with computed columns in, on some of the largest tables in the database. I then disabled those nonclustered indexes and re-ran the DBCC CHECKDB tests.

17-18 minutes per run.

Wow! The bottleneck from the expression evaluator was causing DBCC CHECKDB to run about 20 times slower.

The wait and latch details for the test runs with the indexes disabled are as below, which is what I expected to see from running DBCC CHECKDB:

WaitType        Wait_S     Resource_S  Signal_S  WaitCount  Percentage  AvgWait_S  AvgRes_S  AvgSig_S
--------------- ---------- ----------- --------- ---------- ----------- ---------- --------- ---------
CXPACKET        33064.56   29331.74    3732.83   42034533   48.14       0.0008     0.0007    0.0001
OLEDB           28883.78   28883.78    0.00      173940154  42.05       0.0002     0.0002    0.0000
LATCH_EX        5021.20    4605.50     415.70    30340659   7.31        0.0002     0.0002    0.0000

LatchClass               Wait_S     WaitCount  Percentage  AvgWait_S
------------------------ ---------- ---------- ----------- ----------
DBCC_CHECK_AGGREGATE     5039.36    30267451   98.82       0.0002

There’s nothing we can do about this bottleneck except to disable nonclustered indexes on computed columns while DBCC CHECKDB is running and then rebuild them afterwards, which is not a very palatable solution. I wanted to blog about this just to make the issue known and help any of you who are beating your head against a wall trying to figure out poor DBCC CHECKDB performance.

Chicago IE1 class is sold out, so we added another one!

For those of you who were planning to register for our IE1 Immersion Event (on Internals and Designing for Performance) in Chicago at the end of April, I have bad news and good news…

The bad news is that the class reached maximum capacity (30 students) and is sold out.

The good news is that we’ve added another IE1 in Chicago from May 20-24, which opens up another 30 IE1 seats!

This means we have seven classes in Chicago over the four weeks from April 29 – May 24. You can see all the details here.

We hope to see you there!

PS Please note: we do not plan to add any other Immersion Events in 2013 other than what’s currently scheduled. We chose to add another IE1 course in Chicago because we had a few folks on the wait list for which these dates work and, we’re already in Chicago during that week. Adding a class was relatively easy to do because the hotel had availability and all of our gear would already be onsite. We will NOT be adding any other courses or locations in 2013. We are planning a new location in 2014 as well as a new course – IE5: Immersion Event on Relational Data Warehousing. Stay tuned and we hope to see you at an Immersion Event soon!

Buffer pool disfavoring

Have you ever wondered why DBCC CHECKDB, for instance, doesn’t flush the buffer pool when it runs on a database that is larger than the memory capacity of the server?

There’s cool mechanism that it uses called disfavoring. When a buffer is no longer required, the buffer can be marked as the least recently used of all the buffers in the buffer pool, meaning that it will used next when the buffer pool needs a new buffer to read a page from disk. This means that a small number of buffers will be used repeatedly instead of filling the buffer pool with pages from the database being consistency checked.

This mechanism isn’t anything to do with DBCC, but DBCC makes good use of it. DBCC CHECK* commands, DBCC SHOWCONTIG, DBCC UPDATEUSAGE, and the sys.dm_db_index_physical_stats DMV (which is really part of DBCC) all use this.

Let me show you…

On our fast test machine at home (rack-mounted Dell R720) I’ve got a 500GB database called AdventureWorks_Big residing on four Fusion-io SSDs. First off I’ll show you the table I’m going to use:

SELECT  [page_count]
FROM sys.dm_db_index_physical_stats (
DB_ID (),
	OBJECT_ID (N'Production.TransactionHistoryEnlarged'),
	1,
	NULL,
	N'LIMITED');
GO
page_count
-----------
18114867

It’s more than 138GB, and my buffer pool is 56,000MB. Now I’ll make sure the buffer pool is empty, and then run DBCC CHECKTABLE on that table:

DBCC DROPCLEANBUFFERS;
GO

DBCC CHECKTABLE (N'Production.TransactionHistoryEnlarged') WITH NO_INFOMSGS;
GO

Five minutes later it completes. But how much of the table did it read into the buffer pool?

SELECT *,
	[DirtyPageCount] * 8 / 1024 AS [DirtyPageMB],
	[CleanPageCount] * 8 / 1024 AS [CleanPageMB]
FROM
	(SELECT
		(CASE WHEN ([database_id] = 32767)
			THEN N'Resource Database'
			ELSE DB_NAME ([database_id]) END) AS [DatabaseName],
		SUM (CASE WHEN ([is_modified] = 1)
			THEN 1 ELSE 0 END) AS [DirtyPageCount],
		SUM (CASE WHEN ([is_modified] = 1)
			THEN 0 ELSE 1 END) AS [CleanPageCount]
	FROM sys.dm_os_buffer_descriptors
	GROUP BY [database_id]) AS [buffers]
ORDER BY [DatabaseName];
GO
DatabaseName        DirtyPageCount CleanPageCount DirtyPageMB CleanPageMB
------------------- -------------- -------------- ----------- -----------
AdventureWorks_Big  0              188            0           1
master              3              77             0           0
model               0              32             0           0
msdb                1              39             0           0
Resource Database   0              24             0           0
tempdb              260            179            2           1

Cool eh?

There are some other places that disfavor buffers when they’re done with them, including:

  • During redo of log records, if a page already has a log record’s effects on it and so doesn’t need to be changed
  • During bulk load into a heap
  • Large table scans that are more than 10% of the buffer pool size will disfavor pages instead of forcing pages from other databases to be flushed from memory

And backups don’t use the buffer pool at all so they’re not going to cause buffer pool flushing either.

You can watch disfavoring happening for table scans using the leaf_page_disfavored Extended Event.

Bottom line – some of your common maintenance activities don’t damage the buffer pool, which is a good thing.