Time to take a stand

20110120-IMG_2900 SmallWhile Kimberly was selecting underwater photos for our new website, she came across this one from our dive trip to the remote NE of Indonesia in January 2012.

After several people tried to do an underwater hand-stand, I gave one of our buddies my camera and showed them how it’s done in the middle of an archway. Not the most flattering position for a photo :-)

Click the image for a full-size version. We were at a dive site called Boo Jendela where there are fantastic swim-throughs. Click here for a Google Map of where we were based on the GPS reading I took.

Our new website and April special offer

Our New Website

Late last year we finally decided to make time to revamp our blogs and website, and move onto WordPress. I  really wish we’d done it a lot sooner as WordPress is such a joy to work with compared to hand-coded ASP.NET, but client work always seemed to get in the way. Now we’ve upgraded, it’s so much easier to change website content and write blog posts.

Kimberly sourced our new logo using the 99 Designs website, which allows artists around the world to compete with each other to win a design competition. Our new blogs went live in January and we just launched our new website on Wednesday. It took us almost five months of elapsed time to do it, mostly waiting for me to populate content, but we did it all ourselves, on our own server hosted by Amazon.  Jonathan  and I spent many, many hours on con-calls figuring stuff out, and Jonathan now knows more about WordPress and WordPress plug-ins than he ever wanted to know and was the driving force behind getting this all done (he really is utterly fabulous – thank you Jon!). Overall, we’re really proud of what we achieved together – it’s a huge improvement over the old site.

By the way, if you haven’t seen them yet, check out the image rotators on the home page and on the training pages which showcase some of Kimberly’s finest underwater photography. You can get hi-res version of those photos here.

Special Offer

To celebrate getting our new website completely live, we’re offering all new clients a special offer. If you sign up for our signature SQL Server Health Check service during the month of April, it will only cost you US$2500. This covers the health check of a single SQL Server instance, plus up to an hour of email/phone review with us about the results, once you’ve read our comprehensive report. That’s more than a 1/3 saving over the usual price!

Check out our SQL Server Health Check description and then let us know that you’re interested.

We’re looking forward to helping you. It’s what we do best.

DBCC CHECKDB performance and computed-column indexes

[Edit 2016: The team ‘fixed’ the problem in SQL Server 2016 by skipping consistency checking these indexes unless WITH EXTENDED_LOGICAL_CHECKS is used.]

It’s no secret that DBCC CHECKDB has some performance quirks based on the schema of the database being checked and various kinds of corruptions. I was recently doing some scalability testing of DBCC CHECKDB for a blog post and discovered quite a nasty performance issue that exists in all versions of SQL Server back to SQL Server 2005. This isn’t a bug, it’s just the way things work.

The issue occurs when a nonclustered index exists that has a computed column as part of the index key or as one of the INCLUDEd columns and affects DBCC CHECKDB, DBCC CHECKFILEGROUP, and DBCC CHECKTABLE.

As a bit of background, here’s an excerpt from the SQL Server 2012 Internals book where I describe one of the ways nonclustered indexes are checked for consistency:

The nonclustered index cross-checks verify that

  • Every record in a nonclustered index (whether filtered or nonfiltered) must map to a valid record in the base table (that is, the heap or clustered index).
  • Every record in the base table must map to exactly one record in each nonfiltered, nonclustered index, and one record in each filtered index, where the filter allows.

The mechanism to carry out these checks efficiently has changed in every release since SQL Server 7.0—becoming progressively more and more efficient. In SQL Server 2012, two hash tables are created for each partition of each nonclustered index—one hash table is for the actual records in that partition of the nonclustered index, and the other is for the records that should exist in that partition of the nonclustered index (as calculated from the existing data records in the table).

When a nonclustered index record is processed, all columns in the record are hashed together into a BIGINT value. This includes

  • The physical or logical link back to the base table (known as the base table RID)
  • All included columns—even LOB and FILESTREAM values) are hashed together into a BIGINT value

The resulting value is added to the master hash value for actual records for the nonclustered index partition of which the record is part.

DBCC CHECKDB knows which nonclustered indexes exist for the table and what the complete nonclustered index record composition should be for each. When a data record is processed, the following algorithm is run for each matching nonclustered index record that should exist for the data record (taking into account any filter predicates for filtered nonclustered indexes):

  1. Create the nonclustered index record in memory (again, including the base table RID, plus included columns).
  2. Hash all columns in the index record together into a BIGINT value.
  3. Add the resulting value to the “should exist” master hash value for the relevant nonclustered index partition of which the index record is part.

The premise that this algorithm works on is that if no corruptions exist, the master hash values for the actual records and “should exist” records for each nonclustered index partition should match exactly at the end of the DBCC CHECKDB batch.

When a nonclustered index uses a computed column, the value of the computed column has to be computed based on the column definition. To do that, an internal mechanism called an ‘expression evaluator’ is created. The expression evaluator is provided by the Query Processor code and its behavior is entirely outside the control of DBCC CHECKDB. The drawback of the expression evaluator is that every time it is used, an exclusive latch must be held by the thread using it. This creates an incredible bottleneck and drastically affects performance.

My test system is the Dell R720 we bought last year, with 64GB of memory and 2 E5-2670 CPUs with 8 physical cores and hyperthreading enabled. It’s running SQL Server 2012 SP1 CU3.

The test database is AdventureWorks that Jonathan expanded to 500GB for me using his cool scripts. The database is split over 8 data files stored on two 320GB Fusion-io drives, with tempdb and its log placed on two more 320GB Fusion-io drives. I set things up this way to remove I/O waits from the test.

The first few runs of my testing, with unlimited parallelism produced DBCC CHECKDB elapsed times of around 340 minutes. This seemed incredibly slow to me, so I ran another test and looked at the output of sys.dm_os_waiting_tasks. I discovered that half the threads at any time were waiting for exclusive access to the DBCC_OBJECT_METADATA latch. Adding in some diagnostics to capture waits and latches gave the data below:

WaitType        Wait_S     Resource_S  Signal_S  WaitCount  Percentage  AvgWait_S  AvgRes_S  AvgSig_S
--------------- ---------- ----------- --------- ---------- ----------- ---------- --------- ---------
CXPACKET        684482.45  682667.13   1815.32   60294236   34.90       0.0114     0.0113    0.0000
OLEDB           659325.08  659325.08   0.00      207661462  33.62       0.0032     0.0032    0.0000
LATCH_EX        615168.39  605357.28   9811.11   798224634  31.37       0.0008     0.0008    0.0000

LatchClass               Wait_S     WaitCount  Percentage  AvgWait_S
------------------------ ---------- ---------- ----------- ----------
DBCC_OBJECT_METADATA     611768.00  764845636  99.16       0.0008

This didn’t make any sense to me as I couldn’t believe that such an incredible bottleneck existed – almost 1ms per latch wait and 800 million latch waits! – surely we’d have heard about it by now? I pinged my friend Ryan Stonecipher who inherited the DBCC CHECKDB code from me and he reminded me about the expression evaluator behavior.

I went back to my test system and looked for nonclustered indexes using computed columns using the code below:

FROM sys.columns [co]
JOIN sys.index_columns [ic]
    ON [ic].[object_id] = [co].[object_id]
    AND [ic].[column_id] = [co].[column_id]
JOIN sys.indexes [i]
    ON [i].[object_id] = [ic].[object_id]
    AND [i].[index_id] = [ic].[index_id]
JOIN sys.objects [o]
    ON [i].[object_id] = [o].[object_id]
JOIN sys.schemas [s]
    ON [o].[schema_id] = [s].[schema_id]
WHERE [co].[is_computed] = 1;

And I found 6 nonclustered indexes with computed columns in, on some of the largest tables in the database. I then disabled those nonclustered indexes and re-ran the DBCC CHECKDB tests.

17-18 minutes per run.

Wow! The bottleneck from the expression evaluator was causing DBCC CHECKDB to run about 20 times slower.

The wait and latch details for the test runs with the indexes disabled are as below, which is what I expected to see from running DBCC CHECKDB:

WaitType        Wait_S     Resource_S  Signal_S  WaitCount  Percentage  AvgWait_S  AvgRes_S  AvgSig_S
--------------- ---------- ----------- --------- ---------- ----------- ---------- --------- ---------
CXPACKET        33064.56   29331.74    3732.83   42034533   48.14       0.0008     0.0007    0.0001
OLEDB           28883.78   28883.78    0.00      173940154  42.05       0.0002     0.0002    0.0000
LATCH_EX        5021.20    4605.50     415.70    30340659   7.31        0.0002     0.0002    0.0000

LatchClass               Wait_S     WaitCount  Percentage  AvgWait_S
------------------------ ---------- ---------- ----------- ----------
DBCC_CHECK_AGGREGATE     5039.36    30267451   98.82       0.0002

There’s nothing we can do about this bottleneck except to disable nonclustered indexes on computed columns while DBCC CHECKDB is running and then rebuild them afterwards, which is not a very palatable solution. I wanted to blog about this just to make the issue known and help any of you who are beating your head against a wall trying to figure out poor DBCC CHECKDB performance.

[Edit 12/2/2014: Here’s another set of performance tests using a kCura database schema of various sizes.]