So why do you want to come to our training? And the winners are…

Wow! We had 22 entries for our competition (writing a post is a high bar so I was expecting 20-30) to win a free seat in either our IEPTO1 or IEPTO2 training classes in Chicago in October. That was a lot of reading to get through to pick the winners, but we made it. A bunch of the entrants even started SQL Server blogs so they could participate!

First off – we’d like to thank you all for the kind words about us.

Here are some quotes:

  • “For anyone who’s seen their PluralSight or MCM videos, or been lucky enough to see them at conferences, you know the kind of quality you can expect.”
  • “When it comes to SQL Server, they don’t just work with it, they helped develop it!  That alone keeps me in a little state of awe.”
  • “I see these classes as the gold standard in SQL Server training, and plan to attend someday by hook or by crook.”
  • “If you can attend these classes, you will be a giant leap ahead in the path that is SQL Server enlightenment.”
  • “The wealth of information available on SQLskills website is astounding…”
  • “I want to attend training because I want to be better at my job today, tomorrow and next year.”
  • “I want my brain to be filled past the brim with SQL Server internals and performance tuning knowledge.”
  • “No other blogger goes into SQL’s internal mechanics like he does, and word on the street is that his class is phenomenal.”
  • “I’d be able to better support our development team, know WHY something broke, and the best way to fix it, and continue being the vital asset that the company hired me to be.”
  • “I attended the IE2 course a few years ago and it was easily the best training I’ve ever been to for anything. The attention to detail was a big part of that. No questions went unanswered and the PowerPoint slides were packed with the full content rather than high-level tag lines. So when you got home and opened the course materials again you knew what they referred to.”
  • “When you are attending these courses you are not being taught by some schmuck who is reading through a book. You are being taught and guided by people who live and breathe the subject.”
  • “…where the the mysterious is made mundane and the pieces of the puzzles, even the pieces that were lost, are found and put together.”
  • “Also, out of all the places I have learned from, SQLSkills training is by far the best.  It is the best training that I have found anywhere.”
  • “I have used the skills Jonathan and Erin conveyed to troubleshoot a plethora of SQL Server administrative problems experienced both internally at my current company and externally on many of our customer’s production sites.”
  • “SQL has a lot of Red buttons,  I want to know which ones to push,  but more importantly I want to know what they do so I can know why to push them.”

There were so many good entries that we literally couldn’t decide, and we struggled *really* hard. We narrowed it down to the top 5 entries and then tried to figure out some way to pick the winners. Then we said “screw it, we’ll give all five a seat”. So there are five winners!

The grand prize winners for IEPTO1 are…
ALAN HORSMAN, BERT WAGNER, and LISA HESS – who each win US$3,795 of training!

The grand prize winners for IEPTO2 are…
BRIAN AVERITT and DAVID STOKES – who each win US$3,795 of training!

Woohoo! Well done to all of you! I’ve sent the winners an email with instructions on what to do next (if you’re a winner and haven’t received an email, please check your junk folder). All prizes are non-transferable and have no cash value.

Commiserations to those who didn’t win, but again, thank you for trying.

We’re looking forward to seeing you in Chicago in six weeks! (Btw, we’ve extended the early-bird pricing for all Chicago classes until the end of this month…)

Paul and Kimberly

Here are all the entries, divided by class requested, in the order in which they were posted – if you need some way of justifying the training to your company, just point them here! Winning posts have an asterisk.

IEPTO1

  1. * Alan Horsman: SQL Server Performance and Optimization Training
  2. Justin Onstot: The Path to Enlightenment – SQLSkills style!
  3. Josh Smith: Why I’m Hoping I Can Attend Training with the SQL Skills Team
  4. * Bert Wagner: SQLskills is Giving Away Free Training!
  5. David Clary: When the Student is Ready, The Teacher Will Appear
  6. Sina Hassanpour: SQLSkills-Challenge
  7. Ryan Boyer: Head over to SQLSkills.com! They’re giving away training!
  8. Chris Albert: The Gift of Training
  9. Matt Bowler: Immersion Submission!!
  10. * Lisa Hess: Missing Puzzle Pieces
  11. Vijai Kingston: Learn from the expert – “For FREE?”
  12. Mike Laugle: SQLSkills Competition
  13. Zach Wagner: SQL Skills Competition
  14. Kevin Crisci: PLEASE PICK ME

IEPTO2

  1. Mike McKay: Training course rant & SQLSkills
  2. * Brian Averitt: Not All Training Is Created Equal
  3. Francis Hanlon: Improving your SQLskills
  4. Aleksey Fomchenko: The best SQL Server training ever
  5. Vedran Ikonic: Will SQLskills IEPTO2 Class make me a better DBA? IT DEPENDS!
  6. Karthick Jayavelu: Tune Yourself to Optimize SQL Performance
  7. * David Stokes: What an opportunity!
  8. Mohammad Darab: IEPTO2: Performance Tuning and Optimization – Part 2 (Competition)

Competition: win a free seat in our classes in October!

We’re really excited about the classes we’ve got coming up in Chicago and to say ‘thank you’ to the SQL community for all your support of SQLskills.com over the years, we’re running a competition. It’s been a while since we ran a competition so we’re giving away TWO prizes this time!

THE COMPETITION IS NOW CLOSED. Winner details are here.

The Prize

There are two prizes:

  • free seat in our 5-day IEPTO1: Performance Tuning and Optimization – Part 1 class in Chicago, October 2-6, 2017 (link to class details)
  • free seat in our 5-day IEPTO2: Performance Tuning and Optimization – Part 2 class in Chicago, October 9-13, 2017 (link to class details)

If you win the competition, you pay US$0 instead of the US$3,295 early-bird price or US$3,795 full price. You’ll be responsible for all travel expenses and your hotel bill. The class includes breakfast, lunch, all-day tea/coffee/soda and tons of juicy content :-)

The Challenge

All you have to do is pick which class you’d like to attend and write a blog post before end-of-day Sunday, August 13th, 2017 explaining:

  • Why you want to come to that class taught by us
  • Why you’d make the best use of the knowledge you’ll get from being in the class
  • What your favorite performance tuning challenge is

SQLskills will select what we consider to be the best entry for each prize.

The Rules

  • You can only enter the competition once (i.e. one blog post choosing one class)
  • You must write a blog post. Answers in the comments section of this post will be not be accepted. If you don’t have a blog, it’s easy to create one – e.g. wordpress.com
  • Your blog post must go live before 11.59pm Pacific Time on Sunday, August 13th, 2017
  • Your blog post must specify which class you’d like to attend and contain a link to the class details (see The Prize section above)
  • You must post a comment on this blog post that links to your blog post before 11.59pm Pacific Time on Sunday, August 13th, 2017 (comments are moderated, so don’t panic if you don’t see it appear immediately)
  • If you’re on Twitter, you must tweet your blog post using the #sqlskills #sqlserver hash tags
  • If you win, you will be able to register for the class you chose in your blog post at the special price of US$0
  • If you win, only you can attend the class at the special price of US$0 – your prize is not transferable and is only valid for the October 2017 class
  • If you win, you will be responsible for all travel and hotel costs for attending the class
  • If you enter the competition we will include your name and link to your post in a blog post announcing the lucky winners

If you don’t agree with the rules, don’t enter – simple.

The Summary

We’ll announce the lucky winners on Monday, August 14th, 2017 in a blog post on my blog and on Tuesday in the next Insider newsletter (delayed a day).

Good luck and we hope to see you there!

SQLskills SQL101: REBUILD vs. REORGANIZE

As Kimberly blogged about earlier this year, SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

Often when discussing index fragmentation I find that people aren’t aware of all the differences between rebuilding an index (using ALTER INDEX … REBUILD) and reorganizing an index (using ALTER INDEX … REORGANIZE), so that’s the topic for this post. Everything I say below applies to row-based indexes only (i.e. not columnstore indexes), and equally to a clustered index or a nonclustered index.

Space Required

Rebuilding an index requires building a new index before dropping the old index, regardless of the amount of fragmentation present in the old index. This means you need to have enough free space to accommodate the new index.

Reorganizing an index first squishes the index rows together to try to deallocate some index pages, and then shuffles the remaining pages in place to make their physical (allocation) order the same as the logical (key) order. This only requires a single 8-KB page, as a temporary storage for pages being moved around. So an index reorganize is extremely space efficient, and is one of the reasons I wrote the original DBCC INDEXDEFRAG for SQL Server 2000 (the predecessor of ALTER INDEX … REORGANIZE).

If you have space constraints, and can’t make use of single-partition rebuild, reorganizing is the way to go.

Algorithm Speed

An index rebuild will always build a new index, even if there’s no fragmentation. The length of time the rebuild takes is related to the size of the index, not the amount of fragmentation in it.

Reorganizing an index only deals with the fragmentation that exists, and the more fragmentation there is, the longer a reorganize will take.

This means that for a lightly fragmented index (e.g. less than 30% fragmentation), it’s generally faster to reorganize the index, but for a more heavily fragmented index, it’s generally faster to just rebuild the index. This is why you may have seen thresholds of 0 to 5-10% do nothing, 5-10% to 30% reorganize, 30%+ rebuild. I created this guidance while I was at Microsoft – see here.

Transaction Log Generated

In the FULL recovery mode, an index rebuild is fully logged, so the transaction log will have to accommodate the full size of the index in a single transaction. This also means the entire generated transaction log may need to be mirrored, sent to your AG replicas, scanned by replication, backed up, and so on.

In the SIMPLE and BULK_LOGGED recovery modes, the amount of transaction log generated by an index rebuild will be minimal – just the allocations of pages and extents. However, the next log backup performed (either in BULK_LOGGED or after switching to FULL) will also contain all the extents changed by the rebuild, and so the log backup will be roughly the same size as if the rebuild was done in the FULL recovery mode. The benefits are in time and the fact that the transaction log itself does not have to accommodate the full size of the index during the rebuild in a single transaction.

In all recovery modes, reorganizing an index is fully logged, but is performed as a series of small transactions so should not cause the transaction log to grow inordinately. And of course, transaction log is only generated for the operations performed, which may be less for a reorganize as it only deals with fragmentation that exists.

Locks Required

An offline index rebuild of any index holds a schema-modification (i.e. super-exclusive) table lock – no updates or reads of the entire table.

An online index rebuild of any index acquires a short-term shared table lock at the start of the operation, holds an intent-shared table lock throughout the operation (which will only block exclusive and schema-modification table locks), and then acquires a short-term schema-modification  table lock at the end of the operation. ‘Online’ is a bit of a misnomer. From SQL Server 2014, you can use the WAIT_AT_LOW_PRIORITY option to delay the potential for blocking – see this blog post.

An index reorganize holds an intent-exclusive table lock throughout the operation, which will only block shared, exclusive, and schema-modification table locks. One of the major reasons I wrote DBCC INDEXDEFRAG for SQL Server 2000 was as an online alternative to DBCC DBREINDEX.

Interruptible or Not

An index rebuild operation cannot be interrupted without it rolling back everything it’s done so far – it’s atomic – all or nothing. In SQL Server 2017, however, there is a resumable-online index rebuild feature.

An index reorganize can be interrupted and the worst that will happen is that a single page move operation is rolled back.

Progress Reporting or Not

Index rebuilds do not have proper progress reporting. You can hack it for online index operations by looking at the bigintdata1 column in the Progress Report: Online Index Profiler event, which happens to show how many rows of the old index have been scanned. You can also hack it for index operations by looking at the number of page reads the SPID has done in sys.dm_exec_requests.

Index reorganize operations populate the percent_complete column of sys.dm_exec_requests so you can easily gauge how much work remains. In fact DBCC INDEXDEFRAG also used to do progress reporting, but less elegantly, by printing a progress message to your connection every 30 seconds.

Statistics

An index rebuild will always rebuild the index column statistics with the equivalent of a full scan (or sampled, for an index partition or if the index is partitioned).

An index reorganize does not see a total view of the index and so cannot update statistics, meaning that manual index statistics maintenance is required.

Summary

As you can see, there are quite a few major differences between rebuilding and reorganizing, but there’s no right answer as to which one you should use – that’s your choice.

If you have an index maintenance routine that always rebuilds and never considers reorganizing, you should reconsider. It’s usually better to reorganize a lightly fragmented index and rebuild a more heavily fragmented index – to save time and resources. You’ll find that most index maintenance products and freely-available scripts allow you to make that choice.

And as always, rather than writing your own index maintenance solution, I recommend Ola Hallengren’s free code (yes, other people have done similar, but I think Ola’s is by far the best and most widely used).

New course: Upgrading and Migrating to SQL Server 2016

We’ve just published a new Pluralsight course!

This is Glenn’s latest course – SQL Server: Upgrading and Migrating to SQL Server 2016 – and is just over two hours long. It’s a must for anyone working on a production upgrade/migration.

The modules are:

  • Introduction
  • Upgrade Planning
  • Upgrade Testing
  • Migration Planning
  • Migration Testing
  • Production Migration

Check it out here.

We now have 150 hours of SQLskills online training available (see all our 52 courses here), all for as little as $29/month through Pluralsight (including more than 5,000 other developer and IT training courses). That’s unbeatable value that you can’t afford to ignore.

Enjoy!

SQLskills SQL101: Why DBCC CHECKDB can miss memory corruption

As Kimberly blogged about earlier this year, SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

An interesting situation was discussed online recently which prompted me to write this post. A fellow MVP was seeing periodic corruption messages in his error log, but DBCC CHECKDB on all databases didn’t find any corruptions. A subsequent restart of the instance caused the problem to go away.

My diagnosis? Memory corruption. Something had corrupted a page in memory – maybe it was bad memory chips or a memory scribbler (something that writes into SQL Server’s buffer pool, like a poorly-written extended stored procedure), or maybe a SQL Server bug. Whatever it was, restarting the instance wiped the buffer pool clean, removing the corrupt page.

So why didn’t DBCC CHECKDB encounter the corrupt page?

The answer is to do with DBCC CHECKDB‘s use of database snapshots (and all other DBCC CHECK* commands). It creates a database snapshot and then runs the consistency-checking algorithms on the database snapshot. The database snapshot is a transactionally-consistent, unchanging view of the database, which is what DBCC CHECKDB requires.

More info on DBCC CHECKDB’s use of snapshots, and potential problems can be found at:

A database snapshot is a separate database as far as the buffer pool is concerned, with its own database ID. A page in the buffer pool is owned by exactly one database ID, and cannot be shared by any other databases. So when DBCC CHECKDB reads a page in the context of the database snapshot, that page must be read from the source database on disk; it cannot use the page from the source database if it’s already in memory, as that page has the wrong database ID.

This means that DBCC CHECKDB reads the entire source database from disk when it uses a database snapshot. This is not a bad thing.

This also means that if there’s a page in the source database that’s corrupt in memory but not corrupt on disk, DBCC CHECKDB will not encounter it if it uses a database snapshot (the default).

If you suspect that a database has some corruption in memory, the only way to have DBCC CHECKDB use the in-memory pages, is to use the WITH TABLOCK option, which skips using a database snapshot and instead uses locks to quiesce changes in the database.

Hope this helps clear up any confusion!

SQLintersection Fall 2017

As we head towards our 10th SQLintersection in October, I’m excited to say that it’s once again our most diverse, complete, and information-packed show yet!

One of the pieces of feedback we hear over and over is that attendees love SQLintersection because it’s a smaller, laid-back show, where you get to actually spend time talking with the presenters 1-1. I have to say that’s one of the reasons why we love the show so much; *we* get to spend time talking to attendees, rather than being mobbed by hundreds of people after a session ends. And we only pick presenters who we know personally, and who we know to be humble, approachable, and eager to help someone out.

We have 2 pre-con days at the show and with our post-con day, there are 9 full-day workshops from which to choose. We have 40 technology-focused (NOT marketing) sessions from which to choose, plus two SQL Server keynotes, multiple industry-wide keynotes by Microsoft executives, and the ever-lively closing Q&A that we record as a RunAs Radio podcast.

You’ll learn proven problem-solving techniques and technologies you can implement immediately. Our focus is around performance monitoring, troubleshooting, designing for scale and performance, cloud, as well as new features in SQL Server 2014, 2016, and 2017. It’s time to determine your 2008 / 2008 R2 migration strategy – should you upgrade to 2016/2017 directly? This is the place to figure that out!

If you’re interested in how we got here – check out some of Kimberly’s past posts:

  1. SQLintersection: a new year, a new conference
  2. SQLintersection’s Fall Conference – It’s all about ROI!
  3. Fall SQLintersection is coming up soon and we can’t wait!
  4. SQLintersection Conference and SQLafterDark Evening Event – what a fantastic week in Vegas

And Kimberly recorded a Microsoft Channel 9 video where she discusses the Spring show – see here.

SQLafterDark

With minimal to no marketing filler, we’ve largely kept our conference focus on ROI and technical content (performance / troubleshooting / tales-from-the-trenches with best practices on how to fix them ) but we’ve also added even more social events so that you really get time to intersect with the conference attendees and speakers. The addition of the SQL-specific, pub-quiz-style evening event SQLafterDark was wildly popular from some of our past shows and that’s returning for Spring!

 

SQLintersection: Great Speakers!

Once again, I think a great show starts with great speakers and current / useful content. All of these speakers are industry-experts that have worked in data / SQL for years (some can even boast decades) but all are still focused on consulting and working in the trenches. And, they’re good presenters! Not only will you hear useful content but you’ll do so in a way that’s digestible and applicable. Every speaker is either an MCM (Master), a SQL Server MVP, or a past/present Microsoft employee (or a combination of all three!) But, regardless of their official credentials – ALL are focused on providing the most ROI that’s possible in their session(s) and/or their workshops, and ALL have spoken for SQLintersection multiple times.

Check out this phenomenal list of speakers:

  • Aaron Bertrand – MVP, SentryOne
  • David Pless – MCM, Microsoft
  • Jes Borland, past-MVP, Microsoft
  • Jonathan Kehayias – MCM, MCM Instructor, MVP
  • Justin Randall, MVP, SentryOne
  • Kimberly L. Tripp – MCM Instructor, MVP, past Microsoft, SQLskills
  • Paul S. Randal – MCM Instructor, MVP, past Microsoft, SQLskills
  • Shep Sheppard – past Microsoft, Consultant
  • Stacia Varga, MVP, Consultant
  • Tim Chapman – MCM, Microsoft
  • Tim Radney – MVP, SQLskills

You can read everyone’s full bio on our speaker page here.

SQLintersection: When is it all happening?

The conference officially runs from Tuesday, October 31 through Thursday, November 2 with pre-conference and post-conference workshops that extend the show over a total of up to 6 full days. For the full conference, you’ll want to be there from Sunday, October 29 through Friday, November 3.

  • Sunday, October 29 – pre-con day. There are two workshops running:
    • Data Due Diligence – Developing a Strategy for BI, Analytics, and Beyond with Stacia Varga
    • Performance Troubleshooting Using Waits and Latches with Paul S. Randal
    • SQL Server 2014 and 2016 New Features and Capabilities with David Pless and Tim Chapman
  • Monday, October 30 – pre-con day. There are two workshops running:
    • Building a Modern Database Architecture with Azure with Jes Borland
    • Data Science: Introduction to Statistical Learning and Graphics with R and SQL Server with Shep Sheppard
    • Extended Events: WTF OR FTW! with Jonathan Kehayias
  • Tuesday, October 31 through Thursday, November 2 is the main conference. Conference sessions will run all day in multiple tracks:
    • Check out our sessions online here
    • Be sure to check out our cross-conference events and sessions
    • Get your pop-culture trivia and techie-SQL-trivia hat on and join us for SQLafterDark on Wednesday evening, November 1
  • Friday, November 3 is our final day with three post-conference workshops running:
    • Common SQL Server Mistakes and How to Correct Them with Tim Radney
    • SQL Server 2016 / 2017 and Power BI Reporting Solutions with David Pless
    • Very Large Tables: Optimizing Performance and Availability through Partitioning with Kimberly L. Tripp

SQLintersection: Why is it for you?

If you want practical information delivered by speakers that not-only know the technologies but are competent and consistently, highly-rated presenters – this is the show for you. You will understand the RIGHT features to troubleshoot and solve your performance and availability problems now!

Check us out: www.SQLintersection.com.

We hope to see you there!

PS – Use the discount code ‘SQLskills’ when you register and receive $50 off registration!

Calling all user group leaders! We want to present for you again in 2017!

So far this year, we at SQLskills have presented remotely (and a few in-person) to 61 user groups and PASS virtual chapters around the world, and we have 13 more scheduled!

Now we’re into the second half of 2017, we’d like to schedule some more presentations through the rest of the year.

We’d love to present remotely for your user group in 2017, anywhere in the world. Doesn’t matter if we’ve already presented for you this year. It’s not feasible for us to travel to user groups or SQL Saturdays unless we’re already in that particular city, but remote presentations are easy to do and are becoming more and more popular. We haven’t had any bandwidth problems doing remote presentations so far this year to groups as far away as Brazil, Australia, and New Zealand, plus Norway, Bulgaria, UK, India, Belgium, Poland, Ireland, and Canada. This way we can spread the community love around user groups everywhere that we wouldn’t usually get to in person.

Note: we have our own Webex accounts which we generally use, or we can use your GoToMeeting or Webex, but you must use computer audio – we won’t call in by phone as the sound quality is too poor. We also will not use Skype/Lync as we’ve had too many problems with it around user group laptops and sound.

So, calling all user group leaders! If you’d like one of us (me, Kimberly, Jon, Erin, Glenn, Tim) to present remotely for you in 2017 (or maybe even multiple times), send me an email and be sure to include:

  • Details of which user group you represent (and if sending from a shared user group account, your name)
  • The usual day of the month, meeting time, and timezone of the user group
  • Which months you have available, starting in January 2017 (a list of available dates would be ideal)

And I’ll let you know who’s available with what topics so you can pick.

What’s the catch? There is no catch. We’re just continuing our community involvement next year and we all love presenting :-)

And don’t think that because you’re only reading this now (maybe a few weeks or months after the posting date) that we can’t fit you in – send me an email and we’ll see what we can do.

We’re really looking forward to engaging with you all!

Cheers

New class added to October lineup: IEPS on PowerShell

Due to popular demand, we’ve added another class this October (9th-11th) in Chicago: Immersion Event on PowerShell for SQL Server DBAs. We debuted this class in May in Chicago and it was very popular!

Note: the US$2,195 early-bird price expires on August 4th!

It’s a 3-day class, taught by MVP, MCM, and industry expert Ben Miller, and no prior PowerShell experience is necessary. PowerShell is getting more and more popular, and by the end of the class you’ll have learned the following:

  • Installation and configuration of PowerShell
  • Use of the ISE that comes with PowerShell
  • Shared Management Objects (SMO) for SQL Server
  • Programming concepts in PowerShell
  • Modules and scripts to manage SQL Server
  • Gathering data for analysis using PowerShell both with SMO and T-SQL
  • Repositories that contain scripts that you can leverage in your daily work

The modules are as follows:

  • PowerShell Environment
  • Commands, Cmdlets  and Modules
  • Environment Configuration
  • Assemblies in PowerShell
  • PowerShell Fundamentals
  • SQL Server PowerShell
  • PowerShell for the DBA Starter
  • Gathering Information
  • Power Tools
  • Scheduling

You can read a more detailed curriculum here and all the class registration and logistical details are here.

We hope to see you there!

SQLskills SQL101: Practicing disaster recovery

As Kimberly blogged about earlier this year, SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

At the weekend Kimberly and I attended our first ever SQLSaturday (in Dublin) and as well as a workshop on wait stats, I presented a session on Advanced Data Recovery Techniques. The contents of that session are way too advanced for a 101-level post (you can watch a video of it from the PASS Summit 2014 here if you’re interested) but one of the things I stressed at the start was that practicing disaster recovery techniques is crucial for success when a disaster happens for real.

It doesn’t matter how experienced you are with SQL Server, if you’re responsible for a SQL Server instance, you have to know the basics of how to recover when a disaster strikes (I touched on that earlier in the series in the post SQLskills SQL101: Dealing with SQL Server corruption) and you have to have practiced.

In this post I want to pose a short (by no means exhaustive) series of questions to you about what practicing you have (or haven’t) done, and make you think about an honest answer to each one.

Part 1: Information

  • Do you know where the latest copy of the disaster recovery handbook/run book is? Does everyone else know? (Here’s an example template.)
  • Do you know where the scripts are for automating restores of your backups?
  • Do you know where the installation media for Windows and SQL Server are kept in your environment?
  • Do you know where the Windows and SQL Server product keys are?
  • Do you know how you’ll be able to get new servers if your data center is destroyed?
  • And do you know where they will be installed? What about network? Power? HVAC?
  • Do you know who to call when/if you get stuck during the disaster recovery process?
  • Do you know the priority order for restoring databases/instances in your environment?
  • Do you know where the various SQL Server passwords and encryption keys are stored?

Part 2: Techniques

  • When was the last time you performed a full restore sequence, including tail-of-the-log backups?
  • When was the last time you performed a failover to your secondary servers/data center?
  • When was the last time you performed a bare-metal install?
  • When was the last time you rebuilt or restored master on a server?
  • When was the last time you practiced a recovery as if your main server was completely dead? (And did you successfully recover?)
  • When was the last time you practiced a recovery as if your main SAN was completely dead? (And did you successfully recover?)
  • When was the last time you practiced a recovery as if you didn’t have onsite backups and the SAN was dead? (And did you successfully recover?)

Summary

Think through the answers to the questions above and consider whether you’re comfortable with your responses. Now,think whether you’d be comfortable if someone responsible for some of your data (e.g. your bank, 401-k/retirement account holder, doctor’s office, favorite airline, credit-card companies) made those same answers about their disaster-recovery preparedness. My feeling is that you should be able to answer ‘yes’ for all the Part 1 questions, and answer ‘within the last 3 months’ for all the Part 2 questions, to feel fully comfortable that you’re practicing enough.

Increased SOS_SCHEDULER_YIELD waits on virtual machines

A few months ago while I was teaching wait statistics, I was asked whether there’s any expected differences with waits stats when SQL Server is running in a virtual machine.

My answer was yes – there’s a possibility of seeing longer wait times if something prevents the VM from running, as the wait times are based on the __rdtsc counter difference (essentially the processor clock tick count) between the wait starting and ending.

In VMware or Hyper-V, if a thread inside of SQLOS is waiting for a resource, and the VM has to wait to be scheduled to execute by the hypervisor due to the host being oversubscribed with vCPUs based on the hardware pCPUs, then the actual resource wait time noted in SQL Server will include that time that the VM was unable to run, and so the wait time will appear to be longer than it would have been had the VM not been delayed.

It’s an interesting discussion on whether this is problematic or not, but my view is that it could lead to someone chasing a SQL Server performance problem that’s actually a VM performance problem. Note: this isn’t a problem with the hypervisor, this is because of a misconfiguration of the virtual environment.

Anyway, after the class I got to thinking about thread scheduling in general on a VM that is periodically delayed from running and whether it could cause any other interesting effects around wait statistics.

Specifically, I was concerned about SOS_SCHEDULER_YIELD waits. This is a special wait type that occurs when a thread is able to run for 4ms of CPU time (called the thread quantum) without needing to get suspended waiting for an unavailable resource. In a nutshell, a thread must call into the SQLOS layer every so often to see whether it has exhausted its thread quantum, and if so it must voluntarily yield the processor. When that happens, a context switch occurs, and so a wait type must be registered: SOS_SCHEDULER_YIELD. A deeper explanation of this wait type is in my waits library here.

My theory was this: if a VM is prevented from running for a few milliseconds or more, that could mean that a thread that’s executing might exhaust its thread quantum without actually getting 4ms of CPU time, and so yield the processor causing an SOS_SCHEDULER_YIELD wait to be registered. If this happened a lot, it could produce a set of wait statistics for a virtualized workload that appears to have lots of SOS_SCHEDULER_YIELDs, when in fact it’s actually a VM performance problem and the SOS_SCHEDULER_YIELD waits are really ‘fake’.

I discussed this with my good friend Bob Ward from the SQL Product Group and after some internal discussions, they concurred that it’s a possibility because the thread quantum exhaustion time is calculated using the __rdtsc intrinsic when the thread starts executing, so any delay in the VM running could produce the effect I proposed.

Given that I’m a virtual machine neophyte, I asked Jonathan to run some tests inside of our VMware lab environment to see if he could show the issue happening. He ran a known workload that we use in our Immersion Events to demonstrate the performance impact of host oversubscription, causing a VM to be delayed, and lo and behold, he saw a substantially elevated level of SOS_SCHEDULER_YIELD waits (around 20x more) for the workload, compared to running the same workload on the same VM without any delays.

These same tests were repeated in our Hyper-V lab environment that is identical in hardware and VM configuration to the VMware environment and similar levels of elevated SOS_SCHEDULER_YIELD waits were also seen, so the issue is definitely not specific to any given hypervisor or virtual platform, it’s purely related to the host being oversubscribed for the workloads being run and the SQL Server VM having to wait for CPU resources to continue execution.

I’m deliberately not presenting Jonathan’s test results here because I’m not qualified to explain VMware esxtop output or Hyper-V performance counter values and how they correlate to the SOS_SCHEDULER_YIELD numbers to show the problem occurring. Jonathan will do a follow-up post in the next week or two that explains the results from a virtualization perspective.

However, with a simple set of tests we were able to show that with a VM that gets delayed from running, a SQL Server workload can show a much higher level of SOS_SCHEDULER_YIELD waits because of the use of the __rdtsc intrinsic to calculate thread quantum exhaustion times.

This is really interesting because this is a VM performance issue *causing* a wait type to appear, not just causing waits to be longer.

You should definitely consider this phenomenon if you’re investigating a high number of SOS_SCHEDULER_YIELD waits, a workload performance problem, and your workload is running in a VM. In the next post on this, which I’ll link to here when it’s published, Jonathan will explain how to correlate these waits with signs of a VM performance problem.

Hope this helps!