The Curious Case of… unstoppable DBCC CHECKDB

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or answering a random question from the community.)

Last week I was asked to explain an unstoppable DBCC CHECKDB process. The client regularly runs DBCC CHECKDB at night during off-peak times and has no problems with it, but last week they ran it during the day because of an indication of corruption, then decided to stop it and couldn't. They killed the spid and nothing happened, except that the spid showed as being in the KILLED/ROLLBACK state.
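If you hit this yourself, a quick way to see what the killed spid is doing is to look at sys.dm_exec_requests and ask for a rollback progress estimate. Here's a minimal sketch, assuming the DBCC CHECKDB session is session 57 (a made-up session id):

-- Check what the killed session is doing (session 57 is hypothetical)
SELECT session_id, command, status, wait_type, wait_time, percent_complete
FROM sys.dm_exec_requests
WHERE session_id = 57;

-- Ask for a rollback progress estimate for the killed session
KILL 57 WITH STATUSONLY;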

Let’s back up a bit and explain what DBCC CHECKDB is doing under the covers.

It needs to see a transactionally-consistent, unchanging view of the database. Prior to SQL Server 2005 it did this by essentially running its own version of crash recovery inside itself, by analyzing the database’s transaction log. This was pretty tortuous code that I helped write for SQL Server 2000, there were some cases where it caused false positives, and I had great fun one week in 2001 or 2002 removing all that code forever. The replacement for that code was to instead use a private database snapshot, because a database snapshot gives a transactionally-consistent, unchanging view of a database.

So, the first thing DBCC CHECKDB does in SQL Server 2005 and later is create a database snapshot (unless you specified WITH TABLOCK, or the target database is read-only, single-user, or already a database snapshot).
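For example, here's a sketch using a hypothetical database name: the first command gets the default snapshot behavior, and the second uses WITH TABLOCK to skip the snapshot, at the cost of taking locks in the database.

-- Default behavior: an internal database snapshot is created to provide the consistent view
DBCC CHECKDB (N'MyDatabase') WITH NO_INFOMSGS;

-- WITH TABLOCK: no snapshot is created; locks are used to get the consistent view instead
DBCC CHECKDB (N'MyDatabase') WITH NO_INFOMSGS, TABLOCK;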

A database snapshot runs crash recovery of the target database *into* the database snapshot, and herein lies the problem: crash recovery cannot be interrupted, and there’s no check in the crash recovery code to tell whether it’s *real* crash recovery, or crash recovery into a database snapshot (or into the DBCC CHECKDB private snapshot).

If there's a lot of transaction log to be recovered as part of the initial crash recovery into the database snapshot, that could take a long time. And if someone tries to kill the DBCC CHECKDB while the database snapshot is still running crash recovery, nothing will happen until the crash recovery finishes. It's not a DBCC shortcoming; it's a database snapshot shortcoming, and you just have to let it finish.

So there you go – mystery explained!

The Curious Case of… faster disks causing more WRITELOG waits

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or answering a random question from the community.)

I was contacted last week by someone asking how faster disks could be causing more WRITELOG waits. They were seeing lots of these waits, with an average wait time of 18ms. The log was stored on a RAID-1 array of locally-attached spinning disks in the server. They figured that by moving the log to a RAID-1 array of SSDs, they'd reduce the WRITELOG wait time and get better workload throughput.
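(The 18ms figure came from their monitoring; if you want to check your own numbers, a minimal sketch of the usual query against sys.dm_os_wait_stats is below.)

-- Count and average duration of WRITELOG waits since the last restart or wait stats clear
SELECT
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    wait_time_ms * 1.0 / NULLIF (waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'WRITELOG';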

They did so and got better performance, but were very surprised to now see WRITELOG as the most frequent wait type on the server, even though the average wait time was less than 1ms, and asked me to explain.

Let me start by saying this is not unusual, and I’ve seen this sequence of events many times with clients and students.

A WRITELOG wait occurs when a log block is being flushed to disk, and the two most common causes are that the log block is full (60KB) or a transaction committed. If a workload consists of lots of very small transactions, then reducing the log block flush time from 18ms to less than 1ms is very likely to result in higher throughput. That means more transactions happening and hence more log block flushes when those transactions commit, so *more* WRITELOG waits.
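One way to corroborate this (a sketch, not what the client did) is to watch the log flush counters before and after the change. They're cumulative, so take two samples a known interval apart and difference them to get a rate; MyDatabase is a placeholder name:

-- Cumulative log flush counters for one database; sample twice to compute a per-second rate
SELECT instance_name AS [database], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%:Databases%'
    AND counter_name IN (N'Log Flushes/sec', N'Log Bytes Flushed/sec')
    AND instance_name = N'MyDatabase';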

Bottom line: what confused them is that performance tuning is usually about reducing the incidence of waits and/or the individual wait times. This is about the only example I can think of where you do something that increases performance and the wait type you were concerned about becomes *more* prevalent.

PREEMPTIVE_OS_FLUSHFILEBUFFERS waits on Linux

Last week I blogged about Using the Wait Statistics Report in Azure Data Studio, using a Linux VM in Azure as an example host. That was my first time running wait stats analysis on a Linux host, and I found that PREEMPTIVE_OS_FLUSHFILEBUFFERS shows up as the top wait on SQL Server 2017 CU6 and higher with default options.
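For context, the kind of ad-hoc check that surfaces this is just ordering sys.dm_os_wait_stats by total wait time; a minimal sketch:

-- Top waits by total wait time since the last restart or wait stats clear
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;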

The simple reason for this is that on some Linux servers, where the underlying I/O subsystem cannot guarantee durable writes when the power fails (i.e. it's using a volatile write cache), a power failure could lead to data loss within SQL Server. To work around this, SQL Server 2017 CU6 introduced forced-flush behavior for log and data writes, including things like backups and writes to FILESTREAM files. This means the writes are done as normal, and then a call is made to the Windows FlushFileBuffers function, which is translated via the SQLPAL/HostExtension wrapper on Linux into an fdatasync call. There's a very deep explanation of all the file system internals in Bob Dorr's excellent post SQL Server On Linux: Forced Unit Access (Fua) Internals.

If you know your I/O subsystem is resilient, you can disable this behavior, as described in KB 4131496.
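From memory, the change the KB describes is a startup trace flag plus a couple of mssql-conf settings; treat the sketch below as a reminder of the shape of it, and verify the exact settings against the KB for your build and I/O configuration before changing anything:

# Disable the forced-flush behavior (only if your I/O subsystem guarantees durable writes)
sudo /opt/mssql/bin/mssql-conf traceflag 3979 on
sudo /opt/mssql/bin/mssql-conf set control.writethrough 1
sudo /opt/mssql/bin/mssql-conf set control.alternatewritethrough 0
sudo systemctl restart mssql-server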

However, as this will be by far the highest wait on many Linux systems, and as it’s benign, I’ve added it to the list of waits I ignore and updated the Server Reports extension for Azure Data Studio to filter it out as well (this is now live as v1.4).
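If you roll your own wait stats queries rather than using Server Reports, the same idea is just a filter; a minimal sketch (extend the list with whatever other benign waits you already ignore):

-- Top waits, ignoring the benign wait discussed above
SELECT TOP (10)
    wait_type,
    waiting_tasks_count,
    wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'PREEMPTIVE_OS_FLUSHFILEBUFFERS')
ORDER BY wait_time_ms DESC;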

Enjoy!

PS I also just added the VDI_CLIENT_OTHER wait to the ignore list (it's common during Availability Group seeding), and submitted GitHub changes to Server Reports yesterday, which will be v1.5.