I woke up this morning and someone had replaced my wife with someone who likes to blog :-). Kimberly’s turned over a new leaf and is going to blog much more often – in fact she’s blogged 4 times today already. Check out her blog here.


Continuing on the transaction log theme of the last few Search Engine Q&A posts, this one addresses a question I’ve heard a few times, most recently on an MVP discussion group. Let me paraphrase:



If I have a transaction that inserts a huge amount of data, the transaction log grows to 50-GB. I then rollback the transaction. When I take the next log backup, it’s way smaller than 50-GB. What’s going on?


Let’s see if we can repro the scenario. I’ve created a database with a 500-MB data file and a 1-MB log file, with 100-MB and 1-MB auto-growth intervals. I want the log to be as small as possible and to grow in small chunks so I can see just how much it *has* to grow by, rather than having a large growth size. Then I set the recovery mode to full and took a database backup to make sure the log won’t truncate until it’s backed up.



CREATE DATABASE LogSizeTest ON
   
(NAME = N‘LogSizeTest’,
   
FILENAME = N‘C:\SQLskills\LogSizeTest.mdf’,
   
SIZE = 512MB,
   
FILEGROWTH = 100MB)
LOG ON 
   
(NAME = N‘LogSizeTest_log’,
   
FILENAME = N‘C:\SQLskills\LogSizeTest_log.ldf’,
   SIZE = 1MB,
   
FILEGROWTH = 1MB);
GO


ALTER DATABASE LogSizeTest SET RECOVERY FULL;
GO


BACKUP DATABASE LogSizeTest TO DISK = ‘C:\SQLskills\LogSizeTest.bak’;
GO


Let’s check the size of the log:



DBCC SQLPERF (LOGSPACE);
GO


Database Name  Log Size (MB) Log Space Used (%) Status
————– ————- —————— ———–
LogSizeTest    0.9921875     36.66339           0


This gives back info for all databases, I’ve trimmed down the output just for the LogSizeTest database.


Now I’m going to create a table, start an explicit transaction and add about 500-MB of info to the table.



USE LogSizeTest;
GO
SET NOCOUNT ON;
GO
CREATE
TABLE Test (c1 INT IDENTITY, C2 CHAR (8000) DEFAULT (REPLICATE (‘a’, 8000)));
GO


BEGIN TRAN;
GO


DECLARE @count INT;
SELECT @count = 0;
WHILE (@count < 64000)
BEGIN
   
INSERT INTO Test DEFAULT VALUES;
   
SELECT @count = @count + 1;
END;
GO


Checking the log file size again gives:



DBCC SQLPERF (LOGSPACE);
GO


Database Name  Log Size (MB) Log Space Used (%) Status
————– ————- —————— ———–
LogSizeTest    703.9922      99.98737           0


The log size has grown to about 700-MB, way more than the size of the data I was inserting, and it’s completely full. Now let’s rollback the transaction and check the log size again.



ROLLBACK TRAN;
GO


DBCC SQLPERF (LOGSPACE);
GO


Database Name  Log Size (MB) Log Space Used (%) Status
————– ————- —————— ———–
LogSizeTest    703.9922      85.21268           0


The size of the log file is the same, but the percentage used has actually gone down! How can that happen? Let’s take a backup and checkout it’s size:



BACKUP LOG LogSizeTest TO DISK = ‘C:\SQLskills\LogSizeTest_log.bck’;
GO
RESTORE HEADERONLY FROM DISK = ‘C:\SQLskills\LogSizeTest_log.bck’;
GO


The BackupSize in the output from the RESTORE HEADERONLY is 631454208, which is 602.2-MB. Taking the numbers from the DBCC SQLPERF output above, 85.21268% of 703.9922-MB is 599.89-MB - so the backup is roughly the same size as the used transaction log. That’s what I’d expect, but why is it smaller than the total size of the transaction log?


So what’s going on? Why did the transaction log need to grow so much larger than it needed to, and why did the percentage used actually *drop* after the transaction rolled back?


The answer is in the way the transaction log works. Whenever a logged operation occurs in a transaction, there is some transaction log space reserved in case the transaction rolls back. The idea is that there’s always enough space available in the transaction log for a transaction to roll back, without having to grow the transaction log and potentially have that fail. If a transaction could not roll back successfully because the log didn’t have enough space, the database would become transactionally inconsistent, would be taken offline and the state changed to SUSPECT.


The behavior we saw was the Storage Engine reserving transaction log space for a potential roll back. When the roll back occured, the transaction log records necessary to undo the effects of the transaction (called compensation log records) are created and written to the log. The issue is that they usually don’t take up as much space as the Storage Engine reserved, as it tends to be very conservative in its estimates of how much log space to reserve, to avoid the potential for SUSPECT databases. This explains the difference between the various sizes and percentages we saw above.


The Storage Engine code to do the reservations is quite interesting – I remember fixing a couple of bugs in it during SQL Server 2000 development in 1999 while I was getting to know the internals of the logging and recovery system before tackling some of the (since removed) log-reading code in DBCC CHECKDB in SQL Server 2000.


Anyway, there you have it. Log space reservation is the answer, and is also one of the reasons why it can be tricky to estimate how large a transaction log should be when a database is created.