Tracking page splits using the transaction log

Whenever I’m teaching about index fragmentation I get asked how to track page splits proactively. This can be useful for discovering fragmentation occurring in indexes you didn’t know had problems, without running the sys.dm_db_index_physical_stats DMV (see here for how that works) against all the indexes in your databases. Today this came up multiple times, both in class and in email, so it’s time to bubble this blog post up to the top of the list.

You might think this is easy, as there’s a page split counter in sys.dm_db_index_operational_stats and in the Access Methods perfmon object. However, neither of these distinguishes between ‘good’ splits and ‘nasty’ splits, which are my terms :-). A ‘nasty’ split is what we think of as just a page split – a data or index page having to split into two pages to make space for a record to be inserted or an existing record to expand. A ‘good’ split is what the Storage Engine calls adding a page on the right-hand side of the index leaf level as part of inserting new records in an ascending key index (e.g. a clustered index with a bigint identity column as the cluster key).

This is really annoying, as it makes both these methods of tracking page splits essentially useless.
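To see the problem for yourself, here’s a minimal sketch (the table name is hypothetical, and the counter is instance-wide and cumulative since startup) showing that purely ascending-key inserts, which only cause ‘good’ splits, still drive the perfmon counter upwards:

-- Hypothetical demo: ascending-key inserts cause only 'good' splits,
-- but the Page Splits/sec counter still climbs
CREATE TABLE [SplitDemo] (
    [c1] BIGINT IDENTITY PRIMARY KEY CLUSTERED,
    [c2] CHAR (1000) DEFAULT 'a');
GO

-- Note the cumulative counter value before the inserts
SELECT [cntr_value] AS [PageSplits]
FROM sys.dm_os_performance_counters
WHERE [counter_name] = N'Page Splits/sec'
    AND [object_name] LIKE N'%Access Methods%';
GO

-- Add rows at the right-hand side of the index leaf level only
-- (GO 1000 is an SSMS/sqlcmd batch-repeat shortcut)
INSERT INTO [SplitDemo] DEFAULT VALUES;
GO 1000

-- The counter has increased, even though no 'nasty' splits occurred
SELECT [cntr_value] AS [PageSplits]
FROM sys.dm_os_performance_counters
WHERE [counter_name] = N'Page Splits/sec'
    AND [object_name] LIKE N'%Access Methods%';
GO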

If you’re running SQL Server 2012 or later, the solution is to use Extended Events, based on the new sqlserver.transaction_log event. Jonathan wrote a great post here that gives you the Extended Events sessions to use. Be careful of doing this on a production system though, as there’s a lot of overhead from using that event.
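As a rough illustration, here’s a minimal sketch of such a session (the session name and the database ID of 6 are just examples; see Jonathan’s post for the complete sessions, including how to query the histogram target):

-- Minimal sketch: track 'nasty' splits using the transaction_log event
-- (operation 11 = LOP_DELETE_SPLIT; database_id 6 is just an example)
CREATE EVENT SESSION [TrackPageSplits] ON SERVER
ADD EVENT sqlserver.transaction_log (
    WHERE operation = 11
        AND database_id = 6)
ADD TARGET package0.histogram (
    SET filtering_event_name = 'sqlserver.transaction_log',
        source = 'alloc_unit_id',
        source_type = 0);
GO

ALTER EVENT SESSION [TrackPageSplits] ON SERVER STATE = START;
GO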

If you’re not running SQL Server 2012 or later, read on.

Before the sqlserver.transaction_log event was added, there was (and still is) the sqlserver.page_split event, but that doesn’t distinguish between ‘good’ splits and ‘nasty’ splits either, so some post-processing is involved (essentially reading the page referenced in the event to see whether it really split or not).

So what’s the answer?

Scanning the log for page splits

The easiest way to proactively see page splits occurring is to look in the transaction log. Whenever a page splits, an LOP_DELETE_SPLIT log record is generated so querying the transaction log can let you know what’s going on.

Some simple code to do this is:

SELECT
    [AllocUnitName] AS N'Index',
    (CASE [Context]
        WHEN N'LCX_INDEX_LEAF' THEN N'Nonclustered'
        WHEN N'LCX_CLUSTERED' THEN N'Clustered'
        ELSE N'Non-Leaf'
    END) AS [SplitType],
    COUNT (1) AS [SplitCount]
FROM
    fn_dblog (NULL, NULL)
WHERE
    [Operation] = N'LOP_DELETE_SPLIT'
GROUP BY [AllocUnitName], [Context];
GO

However, I don’t recommend doing this, for two reasons:

  1. Running fn_dblog will cause read I/Os on the transaction log, which can cause performance issues, especially if you’re running the scanner regularly and it happens to coincide with a log backup, for instance.
  2. Log clearing is disabled while fn_dblog is running, so on a system with a large amount of log to scan, this could interrupt the ability of the log to clear and cause log growth.

If you’re running in the full or bulk-logged recovery model, I recommend scanning your log backups for page splits instead of your actual log. If you’re only running in the simple recovery model, and you *really* want to run the script regularly, you’re going to have to run the script just before each checkpoint operation clears the log. But still, be careful you don’t interrupt the log clearing process.
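For example, here’s a minimal sketch (the database name and path are hypothetical, chosen to match the file name used in the scripts below) of taking the log backup that you’d then scan:

-- Hypothetical example: back up the log, then scan the backup file with
-- fn_dump_dblog (below) instead of scanning the live log with fn_dblog
BACKUP LOG [SplitTest]
TO DISK = N'C:\SQLskills\SplitTest_log.bck'
WITH INIT;
GO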

Scanning a log backup for page splits

There are two options for this, using the fn_dump_dblog function I blogged about here:

  • Scanning a log backup on a system other than the production system.
  • Scanning a log backup on the production system.

If you choose to use a system other than the production system then, unless you have a restored copy of the database, you won’t be able to get the index name: fn_dump_dblog doesn’t give you the name, and you won’t have the metadata to look up the index name from the allocation unit ID in the log.

Edit 8/15/13: Beware – we just found out from a customer system that uses this extensively that every time fn_dump_dblog is called, it creates a new hidden SQLOS scheduler and up to three threads, which will never go away and never be reused. Use with caution.

Edit 5/15/15: It’s fixed in SQL Server 2012 SP2+ and SQL Server 2014. The fix won’t be back-ported any earlier.

So I’ve created two scripts for you, for when the database is and isn’t available on the server where the backup is located. I’ll extend these in future posts.

Have fun!

Scanning a log backup where the database is not available

SELECT
    [AllocUnitId],
    (CASE [Context]
        WHEN N'LCX_INDEX_LEAF' THEN N'Nonclustered'
        WHEN N'LCX_CLUSTERED' THEN N'Clustered'
        ELSE N'Non-Leaf'
    END) AS [SplitType],
    COUNT (1) AS [SplitCount]
FROM
    fn_dump_dblog (NULL, NULL, N'DISK', 1, N'C:\SQLskills\SplitTest_log.bck',
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
WHERE
    [Operation] = N'LOP_DELETE_SPLIT'
GROUP BY [AllocUnitId], [Context];
GO

Scanning a log backup where the database is available

SELECT
    CAST ([s].[name] AS VARCHAR) + '.' + CAST ([o].[name] AS VARCHAR) + '.' + CAST ([i].[name] AS VARCHAR) AS [Index],
    [f].[SplitType],
    [f].[SplitCount]
FROM
    (SELECT
        [AllocUnitId],
        (CASE [Context]
            WHEN N'LCX_INDEX_LEAF' THEN N'Nonclustered'
            WHEN N'LCX_CLUSTERED' THEN N'Clustered'
            ELSE N'Non-Leaf'
        END) AS [SplitType],
        COUNT (1) AS [SplitCount]
    FROM
        fn_dump_dblog (NULL, NULL, N'DISK', 1, N'C:\SQLskills\SplitTest_log.bck',
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
    WHERE
        [Operation] = N'LOP_DELETE_SPLIT'
    GROUP BY [AllocUnitId], [Context]) f
JOIN sys.system_internals_allocation_units [a]
    ON [a].[allocation_unit_id] = [f].[AllocUnitId]
JOIN sys.partitions [p]
    ON [p].[partition_id] = [a].[container_id]
JOIN sys.indexes [i]
    ON [i].[index_id] = [p].[index_id] AND [i].[object_id] = [p].[object_id]
JOIN sys.objects [o]
    ON [o].[object_id] = [p].[object_id]
JOIN sys.schemas [s]
    ON [s].[schema_id] = [o].[schema_id];
GO

DBCC WRITEPAGE: an introduction

(Check out my online training courses: SQL Server: Detecting and Correcting Database Corruption and SQL Server: Advanced Corruption Recovery Techniques. We can also help you with disaster recovery.)

SQL Server’s undocumented commands, options, and trace flags are sometimes carefully guarded secrets. I’ve been a major secret-keeper of lots of these, having written/re-written a bunch of the undocumented DBCC commands. I’ve been assiduously avoiding writing this blog post for 14 years, but now my hand has been forced.

There’s one command in particular, DBCC WRITEPAGE, that has been effectively hidden since it was introduced pre-SQL Server 2000. However, more and more people are learning about it since some of the exam training books publicized trace flag 2588, which allows DBCC HELP (‘?’) to show all the undocumented DBCC commands and their syntax (it used to be trace flag 2520, but I changed it for SQL Server 2005 to help obfuscate access to the undocumented DBCCs).

Some people are even starting to blog and present about the more obscure commands, including DBCC WRITEPAGE. I asked, in vain, for DBCC WRITEPAGE not to be publicized, so I now feel compelled to blog about it so that there’s authoritative information available about it (I rewrote and extended its functionality for SQL Server 2005). I discussed this issue with the SQL product team and CSS last week, and they agree with me doing this. You won’t see a Microsoft post about this, but my great friend Bob Ward from CSS wanted me to quote him saying “Microsoft does not support the usage of DBCC WRITEPAGE and its use would invalidate the support for a customer database on which it is used.”

I present to you DBCC WRITEPAGE – the most dangerous command you can use in SQL Server.

*** DISCLAIMER: The information in this blog post should not be used on a production SQL Server system. Any problem, corruption, damage, or loss you cause by using the information presented here is entirely your own responsibility. Use at your own risk. Danger of death. ***

Well, not danger of death :-)

DBCC WRITEPAGE allows you to alter any byte on any page in any database, as long as you have sysadmin privileges. It also allows you to completely circumvent the buffer pool; in other words, you can force page checksum failures – this was the part I wrote from scratch, and I’m rather proud of its sneakiness.

The purposes of DBCC WRITEPAGE are:

  1. To allow automated testing of DBCC CHECKDB and repair by the SQL Server team.
  2. To engineer corruptions for demos and testing.
  3. To allow for last-ditch disaster recovery by manually editing a live, corrupt database.

I use it for #2 and #3 (and obviously used to do #1). I most recently used it on a live client system in October 2012, after a massive, multi-site SAN failure trashed databases and backups. I do not advise that you attempt #3 unless you’re confident you know what you’re doing and understand the side-effects on the Storage Engine of the byte(s) you’re changing.

It is a very dangerous command because:

  1. It is an entirely physical change to the page – nothing is logged in the transaction log, and it cannot be rolled back.
  2. You can change any page in any database. For instance, you could use it to modify a page in master so that the instance immediately shuts down and will not start until master is rebuilt and restored.
  3. Anyone with sysadmin privileges can use it and there is no way to stop it.
  4. It breaks the support of your database.

You can very easily shoot yourself in the foot very badly playing around with this command. This isn’t hyperbole – it’s just the truth.

I’m going to do a series of posts showing the use of this command – as there are some cool things you can do with it for demo purposes.

(Toolbars to easily copy and paste code have been deliberately disabled in this post.)

The syntax is:

DBCC TRACEON (2588);
GO
DBCC HELP ('WRITEPAGE');
GO

 

dbcc WRITEPAGE ({'dbname' | dbid}, fileid, pageid, offset, length, data [, directORbufferpool])

The parameters mean:

  • ‘dbname’ | dbid : self-explanatory
  • fileid : file ID containing the page to change
  • pageid : zero-based page number within that file
  • offset : zero-based offset in bytes from the start of the page
  • length : number of bytes to change, from 1 to 8
  • data : the new data to insert (in hex, in the form ‘0xAABBCC’ – example three-byte string)
  • directORbufferpool : whether to bypass the buffer pool or not (0/1)

I’m actually a bit dismayed that someone added the final option to the syntax print-out, as it’s the most dangerous option – and the part I added in SQL Server 2005. If you specify ‘1’ for the directORbufferpool parameter, DBCC WRITEPAGE does the following:

  • Checkpoints the database and flushes all its pages out of the buffer pool
  • Unhooks SQL Server’s FCB (File Control Block) from the data file
  • Creates its own FCB for the file
  • Does a direct read of the page into DBCC’s memory
  • Modifies the page directly
  • Writes the page directly to disk, bypassing the buffer pool and any page protection generation (i.e. not recalculating the page checksum)
  • Fixes up the FCBs again

So the buffer pool knows nothing about the change to the page – it’s as if the I/O subsystem changed/corrupted the page. If you don’t specify that parameter, or specify ‘0’, the change occurs on the page in the buffer pool and the page checksum will be calculated correctly when the page is written to disk (albeit possibly with other corruption issues on the page).
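To make the contrast concrete, here’s a hedged sketch of the buffer-pool path (the database name and page location are hypothetical):

-- Hypothetical sketch: directORbufferpool = 0 changes the page in the
-- buffer pool, so the checksum is recalculated when the page is written
-- to disk and the next read does NOT fail with an 824 error (DBCC CHECKDB
-- may still find logical corruption from the changed byte, though)
DBCC WRITEPAGE (N'TestDB', 1, 280, 4000, 1, 0x45, 0);
GO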

The direct path, on the other hand, means that you can force page checksum failures from within SQL Server – great for demos and testing. This is how I create corrupt databases.

To finish off this introductory post I’ll show you how to do just that, using a non-production server please…

CREATE DATABASE [CorruptDB];
GO
USE [CorruptDB];
GO

CREATE TABLE [Test] (
    [c1] INT IDENTITY,
    [c2] CHAR (8000) DEFAULT 'a');
GO

INSERT INTO [Test] DEFAULT VALUES;
GO

DBCC IND (N'CorruptDB', N'Test', -1);
GO

 

PageFID PagePID     IAMFID IAMPID      ObjectID    ...
------- ----------- ------ ----------- ----------- ...
1       154         NULL   NULL        101575400   ...
1       153         1      154         101575400   ...

I’m going to corrupt the data page – page (1:153):

ALTER DATABASE [CorruptDB] SET SINGLE_USER;
GO
DBCC WRITEPAGE (N'CorruptDB', 1, 153, 4000, 1, 0x45, 1);
GO

Now the database is corrupt, and because I circumvented the buffer pool the page checksum is incorrect – watch…

SELECT * FROM [Test];
GO

 

c1          c2
----------- ---------- ...
Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x41fb2e55; actual: 0x41fb4b55). It occurred during a read of page (1:153) in database ID 29 at offset 0x00000000132000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\CorruptDB.mdf'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

Nice eh? Yes, if that’s what you’re trying to do.

In the error log:

EXEC xp_readerrorlog;
GO

 

2013-02-05 14:32:56.760 spid55       Starting up database 'CorruptDB'.
2013-02-05 14:36:37.320 spid55       Setting database option SINGLE_USER to ON for database CorruptDB.
2013-02-05 14:36:37.320 spid55       User "APPLECROSS\paul" is modifying bytes 4000 to 4001 of page (1:153) in database "CorruptDB".
2013-02-05 14:37:55.160 spid55       Error: 824, Severity: 24, State: 2.
2013-02-05 14:37:55.160 spid55       SQL Server detected a logical consistency-based I/O error: blah blah blah

I’m not going to discuss how the use of DBCC WRITEPAGE is tracked within the database itself.

I’m going to blog a bunch more about DBCC WRITEPAGE over the next few months. If you’re going to play with it, please do not use it on a production system, and be very, very careful. It’s very easy to do something disastrous. And remember, if you use DBCC WRITEPAGE, you do so entirely at your own risk.

PS I really struggled with whether to post this or not but, in the end, given that it has been and will continue to be posted about by others, I want some authoritative information out there with warnings. There are all kinds of things you can do to destroy SQL Server data using Windows and the I/O subsystem (e.g. formatting an array) – that doesn’t mean you go and try them on a production system without realizing the consequences. The same principle applies here. It isn’t irresponsible to post this information in the face of it becoming public anyway – better to have the complete story IMHO.

SQLintersection: your new SQL Server conference

This is a quick post to announce that we’ve designed a cool new SQL Server conference, where we control all aspects of the show (including the speakers, the sessions, the workshops, the format, the schedule, and everything else) to make sure you get the best possible conference experience and the best possible return on your time and budget investment.

[Just to be clear, because there’s some confusion, this conference has nothing to do with Penton Media or Connections – this conference is owned by us. Connections has disappeared as far as we know.]

It’s going to be twice per year, and the first one is in Las Vegas, April 9th to 11th, with two pre-con days and one post-con day. We have an incredible line-up of speakers for the first show:

  • Aaron Bertrand, Sr. Consultant, SQL Sentry, Inc. (blog | twitter)
  • Andrew J. Kelly, Mentor, SolidQ (blog | twitter)
  • Bob Ward, Principal Architect Escalation Engineer, Microsoft (blog | twitter)
  • Brent Ozar, Brent Ozar Unlimited (blog | twitter)
  • Conor Cunningham, Principal Architect, SQL Server, Microsoft (blog)
  • Grant Fritchey, Product Evangelist, Red Gate Software (blog | twitter)
  • Jeremiah Peschka, Brent Ozar Unlimited (blog | twitter)
  • Joe Sack, Principal Consultant, SQLskills.com (blog | twitter)
  • Kendra Little, Managing Director, Brent Ozar Unlimited (blog | twitter)
  • Kimberly L. Tripp, President/Founder, SQLskills.com (blog | twitter)
  • Paul S. Randal, CEO / Owner, SQLskills.com (blog | twitter)
  • Paul White, SQL Kiwi Limited (blog | twitter)
  • Steve Jones, Editor, SQLServerCentral.com (blog | twitter)

Kimberly wrote a long blog post with all the details about the conference so rather than duplicate all that on my blog, I’ll point you at her post here.

We’ll be there – hopefully you will too!