(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)
I had a question in email last week on why a minimally-logged operation doesn’t make a log backup smaller.
The clue to the answer is in the description of the operation: minimally-logged.
When you perform one of a few operations in the bulk-logged or simple recovery models (e.g. offline index build/rebuild, or bulk load where all requirements for minimal logging have been met – see this whitepaper) then SQL Server vastly reduces the amount of transaction log that’s generated. It does this by only logging the allocation of pages and extents for that operation, and not the contents of the pages themselves.
Now imagine that a log backup after such an operation backed up *only* the transaction log produced. On restoration of that database, including that log backup, the result would be a table or index comprised of empty pages – which is not allowed in SQL Server.
Bottom line: a log backup after a minimally-logged operation has to also back up the data file pages that were populated by the operation, so they can also be restored during the restore of the log backup. This is why a minimally-logged operation does not result in the next log backup being a lot smaller (but it may be a little bit smaller, because of difference in size of 8K data pages vs. log records containing pages).
PS As a corollary question, you may wonder how the log backup gets those data pages, because a log backup does not start with a checkpoint, and so surely those dirty pages are still just in memory and not on disk? No – because for minimally-logged operations, data file pages are written to disk immediately. This mechanism is called ‘eager writing’ and guarantees that the results of a minimally-logged operation are persisted on disk when the operation completes.
5 thoughts on “The Curious Case of… why a minimally-logged operation doesn’t make a log backup smaller”
Very interesting, thanks! And hmm, a log backup does (sometimes?) include a checkpoint though, doesn’t it?
No – log backups don’t do checkpoints. And no need for them to do so, as they’re not backing up anything from the data file except the pages that were minimally logged. Even if those pages have since been changed in memory, the log records for those changes will be backed up by the log backup, so a data flush isn’t necessary.
I did some digging and found a case we had with Microsoft support about log backup performance, and we saw a relation to checkpoints. Quoting the support engineer:
“I have copied two snippets which shows a fast and slow run . when the safe checkpoint operation happens it is taking 10 seconds.
safe checkpoint is fired during log backup if there is some log space to be freed as a result of taking a checkpoint. If not, it skips the checkpoint and just truncates the log. This is decided when the log backup runs based on some internal checks.
Slow run : 10 seconds to complete
01/14/2019 06:33:43,spid131,Unknown,Backup([redacted]): BACKUP LOG finished
01/14/2019 06:33:43,spid131,Unknown,Backup([redacted]): Opening the backup media set
01/14/2019 06:33:43,spid131,Unknown,Backup([redacted]): SafeCheckpoint is complete
01/14/2019 06:33:33,spid131,Unknown,Backup([redacted]): SafeCheckpoint will write a checkpoint if needed
01/14/2019 06:33:33,spid131,Unknown,Backup([redacted]): Synchronizing with other operations on the database is complete
01/14/2019 06:33:33,spid131,Unknown,Backup([redacted]): Acquiring bulk-op lock on the database
01/14/2019 06:33:33,spid131,Unknown,Backup([redacted]): Acquiring S lock on the database
01/14/2019 06:33:33,spid131,Unknown,Backup([redacted]): BACKUP LOG started
Fast run : backup completed in no time and here safe checkpoint is ignored
01/14/2019 06:31:19,spid1609,Unknown,Backup([redacted]): BACKUP LOG finished
01/14/2019 06:31:19,spid1609,Unknown,Backup([redacted]): Skipped SafeCheckpoint
01/14/2019 06:31:19,spid1609,Unknown,Backup([redacted]): Synchronizing with other operations on the database is complete
01/14/2019 06:31:19,spid1609,Unknown,Backup([redacted]): Acquiring bulk-op lock on the database
01/14/2019 06:31:19,spid1609,Unknown,Backup([redacted]): Acquiring S lock on the database
01/14/2019 06:31:19,spid1609,Unknown,Backup([redacted]): BACKUP LOG started
“
Oh yeah – that’s some old code that checks whether releasing the log space reserved for writing the next checkpoint’s log messages will cause an extra VLF to be cleared during regular log clearing/truncation at the end of a log backup and forces a manual checkpoint to do that. I thought we got rid of that between 2000 and 2005 – obviously not. Thanks for sharing. However, I’m going to stick with my statement in the article as it’s way simpler for people to understand :-)
Wow, so much useful info in such a small post. And just when I get confused about the data pages, you end the post with a PS, clarifying that as well. Really excellent.
Thanks again for enlightening us!