Lock logging and fast recovery

One of the things I love teaching is how the transaction log and logging/recovery work. I presented a session on this at both PASS and SQL Connections in the last two weeks, and in both sessions I promised to write some blog posts about the deep internals of logging operations. This is the first one in the series.

Ok, on with the show.

SQL Server 2005 introduced a feature called fast recovery in Enterprise Edition. This allows a database to become available for use after the first part of recovery (REDO) completes and before the (usually longer running) second part of recovery (UNDO) completes. See my TechNet Magazine article Understanding Logging and Recovery in SQL Server if you don’t know what I’m talking about. But how does SQL Server do this?
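
As an aside, if you're not sure which edition you're running, SERVERPROPERTY will tell you. Developer and Evaluation Editions behave like Enterprise Edition for features like this:

-- Quick edition check: EngineEdition = 3 covers Enterprise, Developer, and Evaluation.
SELECT
    SERVERPROPERTY ('Edition')       AS [Edition],
    SERVERPROPERTY ('EngineEdition') AS [EngineEdition];
GO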

The answer is lock logging. A log record describes a single change made to a database. For log records describing changes that can be used as part of UNDO (yes, some changes to the database are one-way only – for instance PFS page changes), from 2005 onwards the log record also includes a description of which locks were being held at the time the change was made. Those locks were necessary to protect the change being made when the original transaction was running (before the crash), and so the same locks will be necessary to protect the anti-operation which reverses the change. I'll explain more about these anti-operations in one of the next in-depth logging blog posts.

The Storage Engine does three passes through the log as part of crash recovery. The first pass works out which transactions were committed, and issues readahead of the necessary data file pages into the buffer pool. The second pass does REDO and also reads the log records that will be processed as part of the third pass (UNDO), looking at the lock description in each and actually acquiring those locks. For fast recovery, the database is brought online at that point. This is possible because the recovery system knows that it already has the correct locks to guarantee that it can safely generate and perform the anti-operations necessary to perform UNDO. One side-effect of this is that although the database is available for use, a query may bump into one of the locks being held to allow fast recovery, in which case it will have to wait for that lock to be dropped as UNDO progresses.
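
If you want to see this for yourself, crash a SQL Server with a long-running transaction in flight (on a test system, please!) and, once the database comes online, look at the locks still being held on behalf of recovery. Here's a rough sketch; I'm assuming the recovery task shows up in sys.dm_exec_requests with a command of DB STARTUP, which is what I've seen, but treat it as a starting point for spelunking rather than a guaranteed recipe:

-- A rough way to look for locks still held on behalf of UNDO while the
-- database is already online after fast recovery. Assumes the recovery
-- task appears in sys.dm_exec_requests with a command of 'DB STARTUP' -
-- a sketch for poking around, not a production monitoring query.
SELECT
    [tl].[resource_type],
    [tl].[resource_database_id],
    [tl].[resource_description],
    [tl].[request_mode],
    [tl].[request_status]
FROM sys.dm_tran_locks AS [tl]
JOIN sys.dm_exec_requests AS [er]
    ON [er].[session_id] = [tl].[request_session_id]
WHERE [er].[command] = N'DB STARTUP';
GO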

Cool eh?

That was the introduction to allow me to do some gratuitous spelunking around the internals :-) I’m going to create a few simple examples to show you lock logging in the log. Now, don’t get confused – it’s not logging actual locks (the memory used to hold the lock itself), it’s just logging a description of which locks were held and in which modes.

Here’s the script to create a database with a simple table. I’m using a LOB column and specifically setting it to be stored off row (see Importance of choosing the right LOB storage technique) so we can see some text page locks too. I’m using the SIMPLE recovery model for simplicity (ha ha) – so I can clear the log when a checkpoint occurs rather than having to muck around with log backups. I’ll insert the first row and then clear the log.

CREATE DATABASE LockLogging;
GO
USE LockLogging;
GO

CREATE TABLE LockLogTest (c1 INT, c2 INT, c3 VARCHAR (MAX));
GO
EXEC sp_tableoption 'LockLogTest', 'large value types out of row', 'on';
GO

INSERT INTO LockLogTest VALUES (1, 1, 'a');
GO

ALTER DATABASE LockLogging SET RECOVERY SIMPLE;
GO
CHECKPOINT;
GO
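
A quick sanity check that the recovery model switch took effect before we start looking at the log:

-- Confirm the database is now using the SIMPLE recovery model.
SELECT DATABASEPROPERTYEX (N'LockLogging', 'Recovery') AS [RecoveryModel];
GO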

Now let’s try the first operation – a simple insert – and look at the log records using fn_dblog (and I’m skipping the checkpoint log records):

INSERT INTO LockLogTest VALUES (2, 2, 'b');
GO
SELECT [Operation], [Context], [Page ID], [Slot ID], [Number of Locks] AS Locks, [Lock Information]
FROM fn_dblog (NULL, NULL);
GO
Operation        Context       Page ID        Slot ID  Locks Lock Information
---------------- ------------- -------------- -------- ----- -------------------------------------------------------------------------------------------------------
LOP_BEGIN_XACT   LCX_NULL      NULL           NULL     NULL  NULL
LOP_INSERT_ROWS  LCX_TEXT_MIX  0001:00000098  1        2     ACQUIRE_LOCK_IX PAGE: 18:1:152; ACQUIRE_LOCK_X RID: 18:1:152:1
LOP_INSERT_ROWS  LCX_HEAP      0001:0000009a  1        3     ACQUIRE_LOCK_IX OBJECT: 18:2073058421:0;ACQUIRE_LOCK_IX PAGE: 18:1:154; ACQUIRE_LOCK_X RID: 18:1:154:1
LOP_COMMIT_XACT  LCX_NULL      NULL           NULL     NULL  NULL

We can see page IX and row X locks for the LOB value being inserted into the text page, plus table IX, page IX, and row X locks for the data record being inserted into the heap. The lock resources break out as follows:

  • 18:1:152 is page 152 in file 1 of database ID 18
  • 18:1:152:1 is slot 1 on page 152 in file 1 of database ID 18
  • 18:2073058421:0 is object ID 2073058421 (the object ID of the table LockLogTest) in database ID 18
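
These IDs are from my instance, of course; if you're following along, yours will almost certainly be different. They're easy to check:

-- Verify the IDs that appear in the lock resources (the values in the
-- output above are from my instance - yours will differ).
SELECT
    DB_ID (N'LockLogging')     AS [DatabaseID],   -- 18 in the output above
    OBJECT_ID (N'LockLogTest') AS [ObjectID];     -- 2073058421 in the output above
GO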

Notice also the LOP_BEGIN_XACT and LOP_COMMIT_XACT log records – even though I didn't do an explicit transaction, SQL Server has to start one internally for me so that there's a boundary to roll back to if something goes wrong during the operation.

And now an update operation (with a checkpoint first to clear out the log):

CHECKPOINT;
GO
UPDATE LockLogTest SET c1 = 3;
GO
SELECT [Operation], [Context], [Page ID], [Slot ID], [Number of Locks] AS Locks, [Lock Information]
FROM fn_dblog (NULL, NULL);
GO
Operation        Context       Page ID        Slot ID  Locks Lock Information
---------------- ------------- -------------- -------- ----- -------------------------------------------------------------------------------------------------------
LOP_BEGIN_XACT   LCX_NULL      NULL           NULL     NULL  NULL
LOP_MODIFY_ROW   LCX_HEAP      0001:0000009a  0        3     ACQUIRE_LOCK_IX OBJECT: 18:2073058421:0;ACQUIRE_LOCK_IX PAGE: 18:1:154; ACQUIRE_LOCK_X RID: 18:1:154:0
LOP_MODIFY_ROW   LCX_HEAP      0001:0000009a  1        3     ACQUIRE_LOCK_IX OBJECT: 18:2073058421:0;ACQUIRE_LOCK_IX PAGE: 18:1:154; ACQUIRE_LOCK_X RID: 18:1:154:1
LOP_COMMIT_XACT  LCX_NULL      NULL           NULL     NULL  NULL

Just as we expected – a table IX lock, a page IX lock, and two row X locks on that page.
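
As an aside, if you want to prove to yourself that those RID lock resources line up with the actual rows, the undocumented %%physloc%% virtual column and sys.fn_PhysLocCracker (2008 onwards, and undocumented, so no guarantees) will give you the file:page:slot for each row:

-- Map each row to its physical location - the file_id:page_id:slot_id values
-- should match the RID lock resources in the log records above.
SELECT [c1], [c2], [plc].*
FROM LockLogTest
CROSS APPLY sys.fn_PhysLocCracker (%%physloc%%) AS [plc];
GO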

Now, what about something more complicated like a TRUNCATE TABLE? Have you heard the myth about it not being logged? Right – it’s a myth:

CHECKPOINT;
GO
TRUNCATE TABLE LockLogTest;
GO
SELECT [Operation], [Context], [Page ID], [Slot ID], [Number of Locks] AS Locks, [Lock Information]
FROM fn_dblog (NULL, NULL);
GO
Operation        Context       Page ID        Slot ID  Locks Lock Information
---------------- ------------- -------------- -------- ----- ----------------------------------------------------------------------------
LOP_BEGIN_XACT   LCX_NULL      NULL           NULL     NULL  NULL
LOP_LOCK_XACT    LCX_NULL      NULL           NULL     1     ACQUIRE_LOCK_SCH_M OBJECT: 18:2073058421:0
LOP_MODIFY_ROW   LCX_IAM       0001:0000009b  0        1     ACQUIRE_LOCK_X RID: 18:1:155:0
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:154
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:155
LOP_MODIFY_ROW   LCX_IAM       0001:00000099  0        1     ACQUIRE_LOCK_X RID: 18:1:153:0
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:152
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:153
LOP_SET_BITS     LCX_SGAM      0001:00000003  1        NULL  NULL
LOP_SET_BITS     LCX_GAM       0001:00000002  1        NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000014  89       NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000011  78       NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000014  90       NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000041  164      NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000041  165      NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000041  166      NULL  NULL
LOP_HOBT_DDL     LCX_NULL      NULL           NULL     NULL  NULL
LOP_MODIFY_ROW   LCX_CLUSTERED 0001:00000014  89       2     ACQUIRE_LOCK_IX OBJECT: 18:7:0;ACQUIRE_LOCK_X KEY: 18:458752 (0000c2681664)
LOP_HOBT_DDL     LCX_NULL      NULL           NULL     NULL  NULL
LOP_MODIFY_ROW   LCX_CLUSTERED 0001:00000014  90       2     ACQUIRE_LOCK_IX OBJECT: 18:7:0;ACQUIRE_LOCK_X KEY: 18:458752 (00007a581379)
LOP_MODIFY_ROW   LCX_CLUSTERED 0001:00000011  78       2     ACQUIRE_LOCK_IX OBJECT: 18:5:0;ACQUIRE_LOCK_X KEY: 18:327680 (00001df3833b)
LOP_COMMIT_XACT  LCX_NULL      NULL           NULL     NULL  NULL

Lots of logging and lots of locks. If you look at the Context column, you'll see that the operation is modifying allocation bitmaps (LCX_IAM, LCX_PFS, LCX_SGAM, LCX_GAM) but taking locks on the table's pages, not on the allocation bitmaps themselves (those are only ever latched, an internal, much lighter-weight synchronization mechanism). These locks are acquired as the pages comprising the table are deallocated. All of the deallocation happens right here because the table is small enough that the Storage Engine chooses to deallocate the storage immediately, instead of pushing the work onto the task queue for the deferred-drop background task. See my previous post Search Engine Q&A #10: When are pages from a truncated table reused?, which discusses this too.

There are no actual row operations performed on the table itself. The only row operations are down at the bottom, on the tables with object IDs 7 and 5 (sys.sysallocunits and sys.sysrowsets, respectively), updating the page counts, first IAM page, and first page entries for the table.
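
If you want to poke at the metadata being maintained here, the internals-only catalog view sys.system_internals_allocation_units sits over this information. It's undocumented for a reason, so don't build anything on it, but it's handy for spelunking:

-- Peek at the allocation metadata for the table. After the truncate, the page
-- counts should drop to zero and the first_page / first_iam_page pointers
-- (binary(6) page IDs) should be reset.
SELECT [au].[type_desc], [au].[total_pages], [au].[used_pages],
    [au].[first_page], [au].[first_iam_page]
FROM sys.system_internals_allocation_units AS [au]
JOIN sys.partitions AS [p]
    ON [p].[hobt_id] = [au].[container_id]
WHERE [p].[object_id] = OBJECT_ID (N'LockLogTest');
GO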

So – hopefully this has been useful to you. In the next post in the series, I’ll discuss compensation log records and how rollback operations work.

PS Send me an email or put in a comment if there’s something in particular about the log (or log records) you’d like to see explained.

5 thoughts on “Lock logging and fast recovery”

  1. Thanks for the post – it’s nice to be able to see exactly what is going on in the log when you run commands.

    I ran through this example on my own SQL Server (2005 SP3 & 2008 SP1). When I truncate the table, I don’t get any LOP_SET_BITS operations on the GAM/SGAM. Every other operation is exactly the same as in your example – do you know why this might be?

  2. Rob – because on your system it decided to drop the table’s allocations using the background deferred drop mechanism – on mine it didn’t. Don’t worry about it. Cheers

  3. The SQL Server log has always fascinated me, and I suspect that it holds quite a bit of information if you know where to look. I have been through some of Kalen Delaney’s material on SQL Server internals; it gets dense very quickly.

    One thing I have spent ages looking for is the last modified date of a database. I would think it could be found in the log file, but I have never come up with a method I could validate. The reason for this would be to determine the extent of risk a database might be exposed to (i.e., unbacked-up transactions).

    Any thoughts on this?

    Thanks.

  4. Glenn – the only way to tell would be to correlate log records with time. There’s no documented way to do that – maybe I’ll do some playing around and blog a script. Cheers
