Tracking page splits using the transaction log

Whenever I’m teaching about index fragmentation I get asked how to track page splits proactively. This can be useful for discovering fragmentation in indexes you didn’t know had problems, without running the sys.dm_db_index_physical_stats DMV (see here for how that works) against all the indexes in your databases. Today this came up multiple times, both in class and in email, so it’s time to bubble this blog post up to the top of the list.

You might think this is easy, as there’s a page split counter in sys.dm_db_index_operational_stats and in the Access Methods perfmon object. However, neither of these distinguishes between ‘good’ splits and ‘nasty’ splits, which are my terms :-). A ‘nasty’ split is what we think of as just a page split – a data or index page having to split into two pages to make space for a record to be inserted or an existing record to expand. A ‘good’ split is what the Storage Engine calls adding a page on the right-hand side of the index leaf level as part of inserting new records in an ascending-key index (e.g. a clustered index with a bigint identity column as the cluster key).
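
For example, the allocation counters in sys.dm_db_index_operational_stats bump for every new leaf-level page, no matter which kind of split added it, so a query like the sketch below (included here purely for illustration) can’t tell you whether the splits were ‘good’ or ‘nasty’:

SELECT
    OBJECT_NAME ([ios].[object_id]) AS [Object],
    [ios].[index_id],
    [ios].[leaf_allocation_count],     -- leaf pages added by BOTH kinds of split
    [ios].[nonleaf_allocation_count]   -- non-leaf pages added
FROM sys.dm_db_index_operational_stats (DB_ID (), NULL, NULL, NULL) AS [ios]
ORDER BY [ios].[leaf_allocation_count] DESC;
GO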

This is really annoying, as it makes both of these methods of tracking page splits essentially useless.

If you’re running SQL Server 2012 or later, the solution is to use Extended Events, based on the new sqlserver.transaction_log event. Jonathan wrote a great post here that gives you the Extended Events sessions to use. Be careful of doing this on a production system though, as there’s a lot of overhead from using that event.
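
To give you a flavor, here’s a minimal sketch of such a session along the lines of Jonathan’s approach. The operation map value 11 for LOP_DELETE_SPLIT and the database_id value are assumptions – verify them against sys.dm_xe_map_values and your own database before using this:

CREATE EVENT SESSION [TrackPageSplits] ON SERVER
ADD EVENT sqlserver.transaction_log (
    WHERE operation = 11     -- LOP_DELETE_SPLIT (check sys.dm_xe_map_values)
        AND database_id = 5) -- your target database ID
ADD TARGET package0.histogram (
    SET filtering_event_name = 'sqlserver.transaction_log',
        source_type = 0,     -- bucket on an event column, not an action
        source = 'alloc_unit_id');
GO

ALTER EVENT SESSION [TrackPageSplits] ON SERVER STATE = START;
GO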

If you’re not running SQL Server 2012 or later, read on.

Before the sqlserver.transaction_log event was added, there was (and still is) the sqlserver.page_split event, but that doesn’t distinguish between ‘good’ splits and ‘nasty’ splits either, so some post-processing is involved (essentially reading the page referenced in the event to see whether it really split or not).

So what’s the answer?

Scanning the log for page splits

The easiest way to proactively see page splits occurring is to look in the transaction log. Whenever a page splits, an LOP_DELETE_SPLIT log record is generated so querying the transaction log can let you know what’s going on.

Some simple code to do this is:

SELECT
    [AllocUnitName] AS N'Index',
    (CASE [Context]
        WHEN N'LCX_INDEX_LEAF' THEN N'Nonclustered'
        WHEN N'LCX_CLUSTERED' THEN N'Clustered'
        ELSE N'Non-Leaf'
    END) AS [SplitType],
    COUNT (1) AS [SplitCount]
FROM
    -- NULL, NULL = scan the whole active log of the current database
    fn_dblog (NULL, NULL)
WHERE
    [Operation] = N'LOP_DELETE_SPLIT'
GROUP BY [AllocUnitName], [Context];
GO
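
If you want to see this in action, here’s a hypothetical demo (the table and names are mine, for illustration only): two ~4,000-byte rows fill a page, so inserting a row into the middle of a full page forces a ‘nasty’ split, which the query above will then report for the clustered index:

CREATE TABLE [SplitTest] (
    [c1] INT NOT NULL,
    [c2] CHAR (4000) NOT NULL DEFAULT 'a',
    CONSTRAINT [PK_SplitTest] PRIMARY KEY CLUSTERED ([c1]));
GO

INSERT INTO [SplitTest] ([c1]) VALUES (1), (2), (4), (5); -- appends: 'good' splits only
INSERT INTO [SplitTest] ([c1]) VALUES (3);                -- first page is full: 'nasty' split
GO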

However, I don’t recommend doing this, for two reasons:

  1. Running fn_dblog will cause read I/Os on the transaction log, which can cause performance issues, especially if you’re running the scanner regularly and it happens to coincide with a log backup, for instance.
  2. Log clearing is disabled while fn_dblog is running, so on a system with a large amount of log to scan, this could interrupt the ability of the log to clear and cause log growth.

If you’re running in the full or bulk-logged recovery model, I recommend scanning your log backups for page splits instead of your actual log. If you’re only running in the simple recovery model, and you *really* want to run the script regularly, you’re going to have to run the script just before each checkpoint operation clears the log. But still, be careful you don’t interrupt the log clearing process.
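
If you go the log backup route and want to automate it, you’ll need the path of each log backup file to feed into the scan. Here’s one way to pull the most recent one from the backup history in msdb (a sketch; the database name is a placeholder):

SELECT TOP (1)
    [bmf].[physical_device_name]        -- path to pass to fn_dump_dblog below
FROM msdb.dbo.backupset AS [b]
JOIN msdb.dbo.backupmediafamily AS [bmf]
    ON [bmf].[media_set_id] = [b].[media_set_id]
WHERE
    [b].[database_name] = N'SplitTest'  -- placeholder database name
    AND [b].[type] = 'L'                -- log backups only
ORDER BY [b].[backup_finish_date] DESC;
GO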

Scanning a log backup for page splits

There are two options for this, using the fn_dump_dblog function I blogged about here:

  • Scanning a log backup on a system other than the production system.
  • Scanning a log backup on the production system.

If you choose to use a system other than the production system then, unless you have a restored copy of the database, you won’t be able to get the index name, as fn_dump_dblog doesn’t give you the name and you won’t have the metadata to allow looking up the index name from the allocation unit ID in the log.

Edit 8/15/13: Beware – we just found out from a customer system that uses this extensively that every time fn_dump_dblog is called, it creates a new hidden SQLOS scheduler and up to three threads, which will never go away and never be reused. Use with caution.

Edit 5/15/15: It’s fixed in SQL Server 2012 SP2+ and SQL Server 2014. The fix won’t be back-ported any earlier.

So I’ve created two scripts for you, for when the database is and isn’t available on the server where the backup is located. I’ll extend these in future posts.

Have fun!

Scanning a log backup where the database is not available

SELECT
    [AllocUnitId],
    (CASE [Context]
        WHEN N'LCX_INDEX_LEAF' THEN N'Nonclustered'
        WHEN N'LCX_CLUSTERED' THEN N'Clustered'
        ELSE N'Non-Leaf'
    END) AS [SplitType],
    COUNT (1) AS [SplitCount]
FROM
    -- All 63 trailing parameters must be supplied; DEFAULT works for each
    fn_dump_dblog (NULL, NULL, N'DISK', 1, N'C:\SQLskills\SplitTest_log.bck',
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
WHERE
    [Operation] = N'LOP_DELETE_SPLIT'
GROUP BY [AllocUnitId], [Context];
GO

Scanning a log backup where the database is available

SELECT
    CAST ([s].[name] AS VARCHAR (128)) + '.' + CAST ([o].[name] AS VARCHAR (128)) + '.' + CAST ([i].[name] AS VARCHAR (128)) AS [Index],
    [f].[SplitType],
    [f].[SplitCount]
FROM
    (SELECT
        [AllocUnitId],
        (CASE [Context]
            WHEN N'LCX_INDEX_LEAF' THEN N'Nonclustered'
            WHEN N'LCX_CLUSTERED' THEN N'Clustered'
            ELSE N'Non-Leaf'
        END) AS [SplitType],
        COUNT (1) AS [SplitCount]
    FROM
        fn_dump_dblog (NULL, NULL, N'DISK', 1, N'C:\SQLskills\SplitTest_log.bck',
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
            DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT)
    WHERE
        [Operation] = N'LOP_DELETE_SPLIT'
    GROUP BY [AllocUnitId], [Context]) AS [f]
JOIN sys.system_internals_allocation_units [a]
    ON [a].[allocation_unit_id] = [f].[AllocUnitId]
JOIN sys.partitions [p]
    ON [p].[partition_id] = [a].[container_id]
JOIN sys.indexes [i]
    ON [i].[index_id] = [p].[index_id] AND [i].[object_id] = [p].[object_id]
JOIN sys.objects [o]
    ON [o].[object_id] = [p].[object_id]
JOIN sys.schemas [s]
    ON [s].[schema_id] = [o].[schema_id];
GO
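
Once you’ve identified the indexes with the highest ‘nasty’ split counts, a common follow-up (my addition here, not part of the scripts above; the names are placeholders) is to rebuild them with a lower fill factor so there’s free space on each page to absorb future inserts:

-- Placeholder names: leave 10% free space per page to absorb future inserts
ALTER INDEX [IX_Example] ON [dbo].[SplitTest]
    REBUILD WITH (FILLFACTOR = 90);
GO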
