SQLskills SQL101: Is the recovery interval guaranteed?

SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

One of the concepts I find people misunderstand frequently is the recovery interval, either for the server as a whole or the per-database setting that was introduced in SQL Server 2012 for indirect checkpoints.

There are two misconceptions here:

  1. The recovery interval equals how often a checkpoint will occur
  2. SQL Server guarantees the recovery interval (i.e. crash recovery for the database will only take the amount of time specified in the recovery interval)

A bit of background: crash recovery has two tasks to perform: replaying log records from committed transactions (called REDO) and removing the effect of log records from uncommitted transactions (called UNDO). REDO only has to occur if there have been committed transactions where the changed data pages have not been written to disk (which is done by periodic checkpoints or a lazy writer background thread if there’s memory pressure on the buffer pool).

The recovery interval specifies an ideal upper bound on how long the REDO portion of crash recovery will take. The length of time REDO takes depends on how many log records need to be replayed from committed transactions. To help REDO stay on track, the recovery interval setting forces a checkpoint to occur when a certain number of log records have been generated (a number calculated using the chosen recovery interval).

So although recovery interval *does* control how often a checkpoint occurs, the recovery interval time does not equal the time between checkpoints.

For instance, if the recovery interval is set to one minute (the default), and a checkpoint occurs *now*, when will the next checkpoint occur? If activity in the database is very infrequent, a checkpoint may not occur for a very long time because the calculated recovery time for that activity will be less than one minute. If activity in the database is extremely high, checkpoints may occur every few seconds. Checkpoint frequency entirely depends on how fast log records are being generated *and* the recovery interval setting.

And of course SQL Server cannot *guarantee* the recovery interval. Imagine the recovery interval is set to one minute, and I start a transaction that updates millions of records over several hours. If SQL Server crashes just before the transaction commits, how long do you think it will take to roll back the millions of updates? Certainly it will take much longer than one minute – the configured recovery interval.

So I can update my definition to be: the recovery interval is therefore the ideal upper bound on how long the REDO portion of crash recovery will take, assuming no long-running transactions.

Hope this helps clear up any misconceptions.

No Fall U.S. classes in 2018

We’ve been finalizing our schedule for the remainder of 2018 and wanted to let everyone know that we will not have any Immersion Events in the U.S. this Fall. While we typically have several weeks of in-person classes in Chicago in October or November, we have several events and engagements already scheduled, including multiple Immersion Events in London.

The only in-person Immersion Events in the U.S. this year will be in Chicago in April/May and Bellevue, WA in June.

This means that each of our in-person Immersion Events will only be offered once in the U.S. this year (with an extra IEPTO1 in June in Bellevue). These classes are all filling up quickly, so please don’t wait to the last minute to register as you might miss out. If you haven’t checked our event list in a while, here’s what we’re offering:

  • IEPTO1: 5 days, Performance Tuning and Optimization, Part 1
  • IEPTO2: 5 days, Performance Tuning and Optimization, Part 2
  • IE0: 3 days, Junior/Accidental DBA
  • IEUpgrade: 3 days, Upgrading and New Features
  • IEAzure: 2 days, Azure SQL Database and Azure VMs
  • IECAG: 2 days, Clustering and Availability Groups
  • IEPS: 3 days, PowerShell for the SQL Server DBA
  • IEBIStrat: 3 days, Developing a BI and Analytics Strategy
  • IEPML: 5 days, Practical Machine Learning, AI, and Data Science
  • IESSIS1: 5 days, Learning SQL Server Integration Services

You can see the April/May/June U.S. class schedule here.

Thanks

PS Rest assured, there are no changes to our schedule for London in September (our two performance tuning classes, plus Azure and clustering/AGs) – we’re looking forward to seeing everyone!