One of the things I love teaching is how the transaction log and logging/recovery work. I presented a session on this at both PASS and SQL Connections in the last two weeks, and in both sessions I promised to write some blog posts about the deep internals of logging operations. This is the first one in the series. Previous blog posts that dive into logging operations are:

Ok, on with the show. 

SQL Server 2005 introduced a feature called 'fast recovery' in Enterprise Edition. This allows a database to become available for use after the first part of recovery (REDO) completes and before the (usually longer running) second part of recovery (UNDO) completes. See my TechNet Magazine article Understanding Logging and Recovery in SQL Server if you don't know what I'm talking about. But how does SQL Server do this?

The answer is lock logging. A log record describes a single change made to a database. For log records describing changes that can be used as part of UNDO (yes, some changes to the database are one-way only – for instance PFS page changes), from 2005 onwards the log record also includes a description of which locks were being held at the time the change was made. These locks were necessary to protect the change being made when the original transaction was running (before the crash) and so the same locks will be necessary to protect the anti-operation which reverses the change. I'll explain more about these anti-operations in one of the next in-depth logging blog posts.

The Storage Engine does two passes through the log as part of crash recovery. The first pass does REDO and also reads the log records that will be processed as part of the second pass (UNDO), looking at the lock description and actually acquiring those locks. For fast recovery, at that point the database is brought online. This is possible because the recovery system knows that it already has the correct locks to guarantee that it can safely generate and perform the anti-operations necessary to perfom UNDO. One side-effect of this is that although the database is available for use, a query may bump into one of the locks being held to allow fast recovery – in which case it will have to wait for that lock to be dropped as UNDO progresses.

Cool eh?

That was the introduction to allow me to do some gratuitous spelunking around the internals :-) I'm going to create a few simple examples to show you lock logging in the log. Now, don't get confused – it's not logging actual locks (the memory used to hold the lock itself), it's just logging a description of which locks were held and in which modes.

Here's the script to create a database with a simple table. I'm using a LOB column and specifically setting it to be stored off row (see Importance of choosing the right LOB storage technique) so we can see some text page locks too. I'm using the SIMPLE recovery model for simplicity (ha ha) – so I can clear the log when a checkpoint occurs rather than having to muck around with log backups. I'll insert the first row and then clear the log.

CREATE DATABASE LockLogging;
GO
USE LockLogging;
GO

CREATE TABLE LockLogTest (c1 INT, c2 INT, c3 VARCHAR (MAX));
GO
EXEC sp_tableoption 'LockLogtest', 'large value types out of row', 'on';
GO

INSERT INTO LockLogTest VALUES (1, 1, 'a');
GO

ALTER DATABASE LockLogging SET RECOVERY SIMPLE;
GO
CHECKPOINT;
GO

Now let's try the first operation – a simple insert – and look at the log records using fn_dblog (and I'm skipping the checkpoint log records):

INSERT INTO LockLogTest VALUES (2, 2, 'b');
GO
SELECT [Operation], [Context], [Page ID], [Slot ID], [Number of Locks] AS Locks, [Lock Information]
FROM fn_dblog (NULL, NULL);
GO

Operation        Context       Page ID        Slot ID  Locks Lock Information
—————- ————- ————– ——– —– ——————————————————————-
LOP_BEGIN_XACT   LCX_NULL      NULL           NULL     NULL  NULL
LOP_INSERT_ROWS  LCX_TEXT_MIX  0001:00000098  1        2     ACQUIRE_LOCK_IX PAGE: 18:1:152; ACQUIRE_LOCK_X RID: 18:1:152:1
LOP_INSERT_ROWS  LCX_HEAP      0001:0000009a  1        3     ACQUIRE_LOCK_IX OBJECT: 18:2073058421:0;
                                                                 ACQUIRE_LOCK_IX PAGE: 18:1:154; ACQUIRE_LOCK_X RID: 18:1:154:1
LOP_COMMIT_XACT  LCX_NULL      NULL           NULL     NULL  NULL

We can see page IX and row X locks for the LOB value being inserted into the text page, plus table IX, page IX, and row X locks for the data record being inserted into the heap. The lock resources break out as follows:

  • 18:1:152 is page 152 in file 1 of database ID 18
  • 18:1:152:1 is slot 1 on page 152 in file 1 of database ID 18
  • 18:2073058421:0 is object ID 2073058421 (the object ID of the table LockLogTest) in database ID 18

Notice also the LOP_BEGIN_XACT and LOP_COMMIT_XACT log records - even though I didn't do an explicit transaction, SQL Server has to start one internally for me (called an implicit transaction) so that there's a boundary for where to rollback if something goes wrong during the operation. 

And now an update operation (with a checkpoint first to clear out the log):

CHECKPOINT;
GO
UPDATE LockLogTest SET c1 = 3;
GO
SELECT [Operation], [Context], [Page ID], [Slot ID], [Number of Locks] AS Locks, [Lock Information]
FROM fn_dblog (NULL, NULL);
GO

Operation        Context       Page ID        Slot ID  Locks Lock Information
—————- ————- ————– ——– —– ——————————————————————-
LOP_BEGIN_XACT   LCX_NULL      NULL           NULL     NULL  NULL
LOP_MODIFY_ROW   LCX_HEAP      0001:0000009a  0        3     ACQUIRE_LOCK_IX OBJECT: 18:2073058421:0;
                                                                 
ACQUIRE_LOCK_IX PAGE: 18:1:154; ACQUIRE_LOCK_X RID: 18:1:154:0
LOP_MODIFY_ROW   LCX_HEAP      0001:0000009a  1        3     ACQUIRE_LOCK_IX OBJECT: 18:2073058421:0;
                                                                 ACQUIRE_LOCK_IX PAGE: 18:1:154; ACQUIRE_LOCK_X RID: 18:1:154:1
LOP_COMMIT_XACT  LCX_NULL      NULL           NULL     NULL  NULL

Just as we expected – a table IX lock, a page IX lock, and two row X locks on that page.

Now, what about something more complicated like a TRUNCATE TABLE? Have you heard the myth about it not being logged? Right – it's a myth:

CHECKPOINT;
GO
TRUNCATE TABLE LockLogTest;
GO
SELECT [Operation], [Context], [Page ID], [Slot ID], [Number of Locks] AS Locks, [Lock Information]
FROM fn_dblog (NULL, NULL);
GO

Operation        Context       Page ID        Slot ID  Locks Lock Information
—————- ————- ————– ——– —– ——————————————————————-
LOP_BEGIN_XACT   LCX_NULL      NULL           NULL     NULL  NULL
LOP_LOCK_XACT    LCX_NULL      NULL           NULL     1     ACQUIRE_LOCK_SCH_M OBJECT: 18:2073058421:0
LOP_MODIFY_ROW   LCX_IAM       0001:0000009b  0        1     ACQUIRE_LOCK_X RID: 18:1:155:0
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:154
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:155
LOP_MODIFY_ROW   LCX_IAM       0001:00000099  0        1     ACQUIRE_LOCK_X RID: 18:1:153:0
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:152
LOP_MODIFY_ROW   LCX_PFS       0001:00000001  0        1     ACQUIRE_LOCK_X PAGE: 18:1:153
LOP_SET_BITS     LCX_SGAM      0001:00000003  1        NULL  NULL
LOP_SET_BITS     LCX_GAM       0001:00000002  1        NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000014  89       NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000011  78       NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000014  90       NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000041  164      NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000041  165      NULL  NULL
LOP_COUNT_DELTA  LCX_CLUSTERED 0001:00000041  166      NULL  NULL
LOP_HOBT_DDL     LCX_NULL      NULL           NULL     NULL  NULL
LOP_MODIFY_ROW   LCX_CLUSTERED 0001:00000014  89       2     ACQUIRE_LOCK_IX OBJECT: 18:7:0;
                                                                
ACQUIRE_LOCK_X KEY: 18:458752 (0000c2681664)
LOP_HOBT_DDL     LCX_NULL      NULL           NULL     NULL  NULL
LOP_MODIFY_ROW   LCX_CLUSTERED 0001:00000014  90       2     ACQUIRE_LOCK_IX OBJECT: 18:7:0;
                                                                 ACQUIRE_LOCK_X KEY: 18:458752 (00007a581379)
LOP_MODIFY_ROW   LCX_CLUSTERED 0001:00000011  78       2     ACQUIRE_LOCK_IX OBJECT: 18:5:0;
                                                                 ACQUIRE_LOCK_X KEY: 18:327680 (00001df3833b)
LOP_COMMIT_XACT  LCX_NULL      NULL           NULL     NULL  NULL

Lots of logging and lots of locks. If you look at the Context column, you'll see that the operation is modifying allocation bitmaps (LCX_IAM, LCX_PFS, LCX_SGAM, LCX_GAM) but taking locks on the table pages, not on the allocation bitmaps themselves – they're only ever latched (an internal, much lighter-weight, synchronization mechanism). This is done as the pages comprising the table are deallocated – this is all done because the table's small enough that the Storage Engine chooses to deallocate all the storage immediately, instead of pushing it all onto the task queue for the deferred drop background task. See my previous post Search Engine Q&A #10: When are pages from a truncated table reused? which discusses this too.

There are no actual row operations performed on the table itself. The only table row operations are down at the bottom on table with object IDs 7 and 5 (sysallocunits and sysrowsets, respectively) to update the page counts, first IAM, and first page entries for the table.

So – hopefully this has been useful to you. In the next post in the series, I'll discuss compensation log records and how rollback operations work.

PS Send me an email or put in a comment if there's something in particular about the log (or log records) you'd like to see explained.