Every so often (well, ok, very frequently) my internal list of ‘I must blog about that’ gets too big and I have an almost uncontrollable urge. Luckily, these urges end up with me blogging several times, otherwise who knows what kind of trouble I’d get into? :-)
First up today is some pretty deep internals about how the transaction log works in a specific situation. This has come up a few times in the last few internals/maintenance classes I’ve taught in the transaction log module so I wanted to get a blog post out there that proves my answer is correct. The question is easier to frame with a picture, borrowed from my TechNet Magazine article from February (see here).
.gif)
The image shows the circular natue of the transaction log, and how VLFs can be overwritten once the log has wrapped around, as long as they are no longer active (red = active log/VLF, blue = inactive log/VLF). If none of this makes sense to you, go read the TechNet Magazine article and it should do.
The question is: what happens if log records continue being written to VLFs 1 and 2 in the picture, and eventually the end of VLF 2 has been reached, but VLF 3 is still active, and can’t be overwritten? Does the database just stop?
Let’s find out.
I’m going to create a database and then engineer the situation above.
CREATE DATABASE LogTestDB ON PRIMARY (
NAME = LogTestDB_data,
FILENAME = N’C:\SQLskills\LogTestDB_data.mdf’)
LOG ON (
NAME = ‘LogTestDB_log’,
FILENAME = N’C:\SQLskills\LogTestDB_log.ldf’,
SIZE = 5MB);
GO
The database is in the SIMPLE recovery model. Well, to be completely honest, it’s in the FULL recovery model, but there hasn’t been a full database backup yet, so as far as transaction log behavior is concerned, it behaves as if it was in SIMPLE (the log can be cleared by a CHECKPOINT) - again, if this makes no sense, go read the TechNet Magazine article.
And then see how many VLFs it has using the DBCC LOGINFO command (and this is the *only* way to see how many VLFs there are):
DBCC LOGINFO (‘LogTestDB’);
GOFileId FileSize StartOffset FSeqNo Status Parity CreateLSN
——– ———- ————- ——– ——– —— ———–
2 1245184 8192 31 2 64 0
2 1245184 1253376 0 0 0 0
2 1245184 2498560 0 0 0 0
2 1499136 3743744 0 0 0 0
We’ve got 4 VLFs. The Status column tells us whether the VLF is active or not. A status of 2 means the VLF is active, 0 means it’s not. The sequence number (FSeqNo) is the logical order of the VLFs within the log. The FileSize is in bytes (so each VLF is about 1.25 MB). Right now there’s only one active VLF.
Now I’ll engineer the situation in the image above. I’m going to fill the log so that VLFs 1, 2, and 3 are active. Then I’m going to start an explicit transaction that will hold VLF 3 and onwards active. Then I’ll continue filling the log so it wraps around and starts to fill up VLF 1 again.
USE LogTestDB;
GO
CREATE TABLE BigRows (c1 INT IDENTITY, c2 CHAR (8000) DEFAULT ‘a’);
GOSET NOCOUNT ON;
INSERT INTO BigRows DEFAULT VALUES;
GO 300
I’ve filled up VLFs 1 and 2, and started on 3. Let’s check with DBCC LOGINFO:
DBCC LOGINFO (‘LogTestDB’);
GOFileId FileSize StartOffset FSeqNo Status Parity CreateLSN
——– ———- ————- ——– ——– —— ———–
2 1245184 8192 31 2 64 0
2 1245184 1253376 32 2 64 0
2 1245184 2498560 33 2 64 0
2 1499136 3743744 0 0 0 0
As you can see, the first 3 VLFs now have a status of 2 (active), and they’re in sequence. Now I’ll create an explicit transaction that will prevent VLF 3 and onwards being cleared.
BEGIN TRAN
INSERT INTO BigRows DEFAULT VALUES;
GO
If I explicitly do a CHECKPOINT now, VLFs 1 and 2 will clear:
CHECKPOINT
GODBCC LOGINFO (‘LogTestDB’);
GOFileId FileSize StartOffset FSeqNo Status Parity CreateLSN
——– ———- ————- ——– ——– —— ———-
2 1245184 8192 31 0 64 0
2 1245184 1253376 32 0 64 0
2 1245184 2498560 33 2 64 0
2 1499136 3743744 0 0 0 0
Now I’ll carry on filling up the BigRows table so the log wraps around and starts filling up VLFs 1 and two again.
INSERT INTO BigRows DEFAULT VALUES;
GO 300DBCC LOGINFO (‘LogTestDB’);
GOFileId FileSize StartOffset FSeqNo Status Parity CreateLSN
——– ———- ————- ——– ——– —— ———-
2 1245184 8192 35 2 128 0
2 1245184 1253376 32 0 64 0
2 1245184 2498560 33 2 64 0
2 1499136 3743744 34 2 64 0
You can see that the log has wrapped around now, but VLFs 3 and 4 are still active. Look at the sequence numbers of the active VLFs… the active log is VLF 3 then 4 then 1, with sequence numbers 33 then 34 then 35. Now if I carry on filling up the table, what’s going to happen when the log bumps up against VLF 3 that is still active? Will it stop or will it grow?
INSERT INTO BigRows DEFAULT VALUES;
GO 300DBCC LOGINFO (‘LogTestDB’);
GOFileId FileSize StartOffset FSeqNo Status Parity CreateLSN
——– ———- ————- ——– ——– —— ——————
2 1245184 8192 35 2 128 0
2 1245184 1253376 36 2 128 0
2 1245184 2498560 33 2 64 0
2 1499136 3743744 34 2 64 0
2 253952 5242880 37 2 64 36000000049300052
2 270336 5496832 38 2 64 36000000049300052
2 253952 5767168 0 0 0 36000000107500066
2 335872 6021120 0 0 0 36000000107500066
2 253952 6356992 0 0 0 36000000190700020
2 401408 6610944 0 0 0 36000000190700020
2 253952 7012352 0 0 0 37000000037300040
2 466944 7266304 0 0 0 37000000037300040
The answer is that it grew, and kind of skipped the active VLFs! Look at the sequence numbers. The new sequence of the active log is VLF 3 then 4 then 1 then 2 then 5 then 6, as you can see from the sequence numbers. Once the active transaction I created commits or rolls back, VLFs 3, 4, 1, and 2 can clear and then the ‘normal’ sequencing of VLFs in the log will resume.
Every VLF has a small header which contains the sequence number of the VLF within the transaction log, so the log can kind of do contortions to work around active VLFs in the middle of the log. Very cool.
Ok – that was fun – now I feel better!
PS In the last couple of DBCC LOGINFO dumps, where the log has wrapped around, you can see that the parity bits for the log blocks in the VLFs have changed, as I explained in one of my previous posts Search Engine Q&A #24: Why can’t the transaction log use instant initialization?.
11 Responses to Inside the Storage Engine: More on the circular nature of the log
Hi Paul,
Thank you for the excellent article. A very interesting read that is clear and concise. I thoroughly enjoyed reading it.
Thanks for this post Paul… I now have a better understanding of VLF’s and the circular nature works!
[...] A pretty deep blog post I wrote about analyzing VLF sequences in the log: Inside the Storage Engine: More on the circular nature of the log [...]
[...] Inside the Storage Engine: More on the circular nature of the log [...]
[...] Why is the first sequence number (FSeqNo) 137 and not 1? (For an explanation of the various fields in the output of the undocumented DBCC LOGINFO, see the post Inside the Storage Engine: More on the circular nature of the log.) [...]
[...] made inactive (see TechNet Magazine: feature article on understanding logging and recovery and Inside the Storage Engine: More on the circular nature of the log). This process allows the log to wrap-around and overwrite itself without having to grow – [...]
[...] waiting until all the active VLFs (marked with status 2 in the output from DBCC LOGINFO – see here) are in the first log file and then simply doing an ALTER DATABASE and removing the extra file, and [...]
[...] or overwritten – it will only be overwritten when the log wraps and the VLFs are re-used (see Inside the Storage Engine: More on the circular nature of the log). This has led to further confusion about when checkpoint information is retrievable from the log, [...]
Hi Paul. I´ve read this post and your article in Technet Magazine and i have one doubt: I´ve read about write-ahead logging in books on line and it says that all log registers are written to a log cache before going to disk in the log files. When this records stored in this log cache are written to the log files? Is it in the checkpoints just before the data? If i have a crash , will this registers in log cache be lost if they are not written to disk yet?
Regards
Andre – log records are written out:
1) whenever a transaction commits or aborts
2) whenever a data page is written to disk, all log records up to and including the one that last changed the page being written must be written to disk first
3) whenever a log block hits 60K in size
Log blocks are copied to the log cache and then async flushed to disk. This is what Books Online means.
Data pages cannot be written to disk until the log that affects them is on disk first. Transactions cannot be acknowledged as committed or rolled back unless all their log records are on disk first.
Make sense?
Yes, now it´s clear
Thanks for the answer