SQLskills hires deep technical expert Joseph Sack

Days like this don’t come around very often for us here at SQLskills.com – we’re expanding again! It’s been seven months since Jonathan Kehayias came on board and we’ve had a huge amount of fun together. Now business has expanded to the point where we need to grow again, so it’s time to hire the next member of our close-knit, expert team.

Specifically, we’ve asked Joseph Sack to join us and we’re extremely pleased that he accepted our offer. He’ll become employee #4 when he starts with us on Monday, October 3rd.


Joe has worked in the SQL Server space since 1997. We first met him back in 2006 when he was a Premier Field Engineer at Microsoft, taking part in one of the early SQL Server 2005 Microsoft Certified Master (MCM) rotations – which he passed! He then went on to get the SQL Server 2008 MCM certification, and finally took over responsibility for the whole SQL Server MCM program within Microsoft from 2009 to 2011.

It was while Joe was running the SQL MCM program that we got to know him really well and gained an appreciation for his extensive SQL Server knowledge and enterprise consulting expertise, as well as his passion for developing content and helping people learn – whether as a consultant or as a teacher.

In fact, we respect Joe so much that we recently asked if he’d like to get back into the wild and varied consulting life by leaving Microsoft to join us. We feel honored that he agreed. He brings a wealth of performance troubleshooting, development, scalability, and architecture expertise to the team, and maintains the high bar we set: every team member either holds the MCM certification or is an MCM instructor (or both!).

With our increased full-time consulting team we’re now able to meet the burgeoning demand for our services, including our new Remote DBA Service that we announced earlier this week. Joe will be blogging on SQLskills.com and hanging out on Twitter (@JosephSack) and in the community with the rest of us.

As you can tell, we’re very excited to have Joe join our team!

Thanks as always,

Paul and Kimberly

Who is watching over your precious SQL Servers while you’re busy?

I bet you'd love the answer to be "SQLskills.com". Well, now it can be.

Over the last few months, some of the clients whose critical servers we've health checked have asked us whether we'd consider a regular service: periodic mini health checks on those same servers, plus continuous monitoring for anything untoward happening.

Now that we're expanding our team of world-class SQL experts again (more details on Friday!) we have the capacity to do this, so today I'm announcing a new "remote DBA" service.

Once we've completed an initial health check of the SQL Server instances you'd like us to look after, we'll install a monitoring package that alerts both you and us by email if anything out of the ordinary happens – in response to which we'll log in and see what's up. Then, at a regular interval (weekly, bi-weekly, monthly, or quarterly – whatever works for you and your budget) we'll log in and perform a mini health check to ensure that things are ticking along nicely. We can log in with or without you being present – completely up to you – and of course we'll be under complete NDA. And we can do it anywhere in the world.

We're the company that shows people how to be great DBAs and we solve the nasty performance/SAN/design problems that others give up on.

Why not take the pressure off, get some peace of mind, and let us be *your* DBAs?

Exorbitantly expensive? Not at all – you're paying for hours we work, not a massive retainer.

If you're interested in finding out more details, send me an email.

It may just be the best investment you make all year.

How does DBCC CHECKDB WITH ESTIMATEONLY work?

This is a question that came up today on Twitter, and is actually something I’ve been meaning to blog about.

One of the biggest space hogs in tempdb can be DBCC CHECKDB. It generates all kinds of information about what it’s seeing in the database (called facts) and stores them in a giant worktable. A fact may be something like ‘we read page F’, or ‘record X on page Y points to an off-row LOB column in record A of page B’, or it could be something like an entire IAM page bitmap. The worktable usually requires more memory than is available to it, so it spills out to tempdb.

DBCC CHECKDB needs to use this fact-generation method because it doesn’t read the data file pages in any kind of logical or depth-first order – it reads them in allocation order, which is the fastest way. It spawns multiple threads, and each thread reads a set of pages – which is why I/O performance is sucked down while DBCC CHECKDB is running, as its special readahead drives the I/O subsystem as hard as it can. As each thread generates facts, it hands them to the query processor, which sorts them by a key DBCC CHECKDB defines (page ID, object ID, index ID, etc.) and inserts them into the worktable.

Once fact generation has finished, the query processor gives the facts back to DBCC CHECKDB so that it can match them up (e.g. page X points to page Y, so we’d better have seen page Y) – this is called the aggregation phase. If any mismatched or extra facts are found, that indicates a corruption, and the relevant error is generated.

Here’s a picture of the process: [diagram: threads generating facts, the query processor sorting them into the worktable, then the aggregation phase matching them up]
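If you want to visualize the matching idea in the aggregation phase, here’s a tiny, purely illustrative T-SQL sketch – the table variable, page IDs, and fact types below are all made up, as the real facts and aggregation live inside DBCC CHECKDB’s internal code, not in T-SQL:

-- Purely illustrative: mimic the aggregation phase with made-up facts.
-- An 'expected' fact means a page was referenced by another page; an
-- 'actual' fact means the page was actually read.
DECLARE @Facts TABLE (PageID INT, FactType VARCHAR (10));

INSERT INTO @Facts VALUES
    (100, 'expected'), (100, 'actual'),  -- page 100: referenced and read
    (200, 'expected');                   -- page 200: referenced but never read

-- An 'expected' fact with no matching 'actual' fact means a page that was
-- referenced but never seen - i.e. a corruption, so an error would be raised
SELECT [e].[PageID] AS [MissingPage]
FROM @Facts AS [e]
LEFT JOIN @Facts AS [a]
    ON [a].[PageID] = [e].[PageID] AND [a].[FactType] = 'actual'
WHERE [e].[FactType] = 'expected' AND [a].[PageID] IS NULL;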

Now, because DBCC CHECKDB can use up so much tempdb space, we put in a way to ask it to estimate how much tempdb space will be required – it’s called WITH ESTIMATEONLY. The output looks something like:

Estimated TEMPDB space needed for CHECKALLOC (KB)
-------------------------------------------------
56
(1 row(s) affected)

Estimated TEMPDB space needed for CHECKTABLES (KB)
--------------------------------------------------
3345
(1 row(s) affected)

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
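For reference, the command is just regular DBCC CHECKDB with the extra option (the database name below is made up):

-- Ask for a tempdb space estimate instead of actually running the checks
DBCC CHECKDB (N'YourDatabase') WITH ESTIMATEONLY;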

This isn’t the total amount of space required to check the entire database, because DBCC CHECKDB never checks the entire database in one go (unless using the new trace flag 2562). To try to limit the amount of tempdb space required, it breaks the database down into batches. Batches are built by adding in more and more tables until one of the following limits is reached:

  • There are 512 or more indexes in the batch
  • The total estimate for facts for this batch is more than 32MB

The smallest possible batch is one table and all its indexes – so a very large table may easily blow the 32MB fact limit (or theoretically the 512 index limit).
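Just to illustrate the first limit, here’s a hypothetical T-SQL sketch that buckets user tables into batches using only the 512-index rule – the real algorithm is internal to DBCC CHECKDB, and this ignores the 32MB fact estimate entirely:

-- Illustrative only: group tables into pretend batches using just the
-- 512-index limit described above; the 32MB fact-size limit is ignored
DECLARE @Batch INT = 1, @IndexesInBatch INT = 0,
    @TableName SYSNAME, @IndexCount INT;

DECLARE [Tables] CURSOR FAST_FORWARD FOR
    SELECT [t].[name], COUNT (*)
    FROM sys.tables AS [t]
    JOIN sys.indexes AS [i] ON [i].[object_id] = [t].[object_id]
    GROUP BY [t].[object_id], [t].[name];

OPEN [Tables];
FETCH NEXT FROM [Tables] INTO @TableName, @IndexCount;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Close the current batch if adding this table would reach the limit
    -- (but one table and all its indexes always forms at least one batch)
    IF @IndexesInBatch > 0 AND @IndexesInBatch + @IndexCount >= 512
        SELECT @Batch = @Batch + 1, @IndexesInBatch = 0;

    SET @IndexesInBatch = @IndexesInBatch + @IndexCount;
    PRINT @TableName + N' -> batch ' + CONVERT (NVARCHAR (10), @Batch);

    FETCH NEXT FROM [Tables] INTO @TableName, @IndexCount;
END;

CLOSE [Tables];
DEALLOCATE [Tables];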

The fact size estimation is calculated by looping through all partitions of all indexes for a table (remember that SQL Server treats a non-partitioned table or index as a special case with a single partition) and adding up:

  • Twice the sum of all pages allocated to the partition (HoBt, LOB, and SLOB)
  • Three times the number of HoBt pages in the clustered index
  • Two times the number of LOB columns in the table
  • Two times the number of table rows, if a heap
  • Maximum row size times the number of HoBt pages

And these totals are multiplied by the sizes of the relevant facts.
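If you’re curious, here’s a very rough, hypothetical approximation of those per-table terms using sys.dm_db_partition_stats. To be clear: this is NOT the real internal calculation – the fact size is an assumption, the LOB-column test only catches MAX-type columns, and the max-row-size term is omitted:

-- Very rough illustration of the per-table estimation terms listed above.
-- @FactSize is a made-up average fact size; the max-row-size term is omitted
DECLARE @FactSize INT = 60;

SELECT
    OBJECT_NAME ([ps].[object_id]) AS [Table],
    (2 * SUM ([ps].[used_page_count])                      -- 2x all allocated pages
     + 3 * SUM (CASE WHEN [ps].[index_id] = 1
           THEN [ps].[in_row_data_page_count] ELSE 0 END)  -- 3x clustered-index HoBt pages
     + 2 * (SELECT COUNT (*) FROM sys.columns AS [c]       -- 2x the LOB (MAX-type) columns
            WHERE [c].[object_id] = [ps].[object_id] AND [c].[max_length] = -1)
     + 2 * SUM (CASE WHEN [ps].[index_id] = 0
           THEN [ps].[row_count] ELSE 0 END)               -- 2x the rows, if a heap
    ) * @FactSize / 1024 AS [VeryRoughEstimateKB]
FROM sys.dm_db_partition_stats AS [ps]
WHERE OBJECTPROPERTY ([ps].[object_id], 'IsUserTable') = 1
GROUP BY [ps].[object_id];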

The WITH ESTIMATEONLY option runs through all the batches and spits out the largest total fact estimate from all the batches. It’s supposed to be a very conservative estimate, but certain pathological cases can trip it up as it can’t account for all possible schemas.

One more piece of knowledge I can safely page-out of my head now! :-)

PS Beware also that I’ve heard of many cases of a bug in SQL Server 2008 R2 and 2012 where the output is incorrectly very low. The dev team is aware of the issue and is working on it. It’s working again in SQL Server 2014 (I’d also heard it was working in 2012 SP2, but anecdotal evidence says it’s not).