An XEvent a Day (28 of 31) – Tracking Page Compression Operations

The Data Compression feature in SQL Server 2008 Enterprise Edition can provide significant reductions in the storage requirements for SQL Server databases, and in the right implementations and scenarios performance improvements as well.  There isn't a lot of documented information about how to monitor compression operations through the DMVs or SQL Trace.  Paul Randal pointed out on Twitter today that sys.dm_db_index_operational_stats() provides the page_compression_attempt_count and page_compression_success_count columns.  Beyond that, the only other documented options for monitoring Data Compression are the Page Compression Attempts/sec and Pages Compressed/sec Performance Counters in the SQL Server:Access Methods object in Perfmon (http://msdn.microsoft.com/en-us/library/cc280449.aspx). 
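
For reference, the raw values behind those Performance Counters can also be pulled from inside SQL Server through sys.dm_os_performance_counters.  The DMV exposes the cumulative counts rather than a calculated per-second rate, and the object_name prefix differs on a named instance, so the sketch below just uses LIKE matches to find the two compression counters in the Access Methods object.

-- Look at the cumulative values behind the Access Methods compression counters
SELECT 
    object_name, 
    counter_name, 
    cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Access Methods%'
  AND counter_name LIKE '%compress%'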

The documented methods of monitoring Data Compression all have one thing in common: they deal only with Page compression, not Row compression, and Extended Events is no different, since there are no Row compression Events in Extended Events.  There are two Page compression Events: sqlserver.page_compression_attempt_failed and sqlserver.page_compression_tracing.  These two Events can be used to track Page compression operations at multiple levels, including database, object, and index, and even down to the individual page.  The sqlserver.page_compression_tracing Event provides Start and End tracing of Page compression operations inside the Database Engine and returns the database_id, file_id, rowset_id, page_id, and duration of the compression operation.  The sqlserver.page_compression_attempt_failed Event is rather poorly named; it doesn't report failures in the sense that something broke, but instead tells you why a page compression attempt did not actually change the compression of the data on the page.  It returns the database_id, file_id, rowset_id, and page_id for the compression attempt, along with a failure_reason column that correlates to the page_compression_failure_reason Map.
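
Since failure_reason is a mapped column, its possible values can be listed from sys.dm_xe_map_values before any Events are captured:

-- List the Map Values for the page_compression_failure_reason Map
SELECT 
    map_key, 
    map_value
FROM sys.dm_xe_map_values
WHERE name = 'page_compression_failure_reason'
ORDER BY map_key

The payload columns for both Events can be pulled from sys.dm_xe_object_columns as well.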

-- Get the payload information for the Events 
SELECT 
    object_name, 
    column_id, 
    name, 
    type_name
FROM sys.dm_xe_object_columns
WHERE object_name IN ('page_compression_tracing', 
                      'page_compression_attempt_failed')
  AND column_type = 'data'
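
The Event descriptions are also available in the Extended Events metadata, which is a useful cross-check when deciding which Events to use:

-- Get the descriptions for the two Page compression Events
SELECT 
    name, 
    description
FROM sys.dm_xe_objects
WHERE object_type = 'event'
  AND name IN ('page_compression_tracing', 
               'page_compression_attempt_failed')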


To demonstrate how these Events function, I am going to use the LineItem table from the TPC-H Benchmark, created by Quest Benchmark Factory using Level 2 for the table sizing, which puts the table right at 1.8GB in size.  All of the indexes on the table will be rebuilt using PAGE compression, and then 10,000 rows will be added to the table.  To set up the environment, first load the TPC-H LineItem table with the appropriate seed of data; this can be done with the free trial version of Benchmark Factory.  Then rebuild all of the indexes on the LineItem table using PAGE compression, and review the PAGE compression statistics from sys.dm_db_index_operational_stats() for the database and object.
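
If you want a rough idea of the expected space savings before committing to the rebuild, sp_estimate_data_compression_savings can provide an estimate; the call below assumes the H_Lineitem table is in the dbo schema.

USE [TPCH]
GO
-- Estimate the space savings from PAGE compression before rebuilding
EXEC sp_estimate_data_compression_savings 
    @schema_name = 'dbo', 
    @object_name = 'H_Lineitem', 
    @index_id = NULL, 
    @partition_number = NULL, 
    @data_compression = 'PAGE'
GO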

USE [TPCH]
GO
-- Rebuild the indexes with Page compression 
ALTER INDEX ALL ON dbo.H_Lineitem REBUILD WITH (DATA_COMPRESSION = PAGE)
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO


Once the table and its indexes have been rebuilt using PAGE compression, we can create our Event Session, start it, and add 10,000 rows to the LineItem table.  After the rows are added, we check the page compression statistics in sys.dm_db_index_operational_stats() again and then drop the Event Session from the server.

-- Create an Event Session to track the Page compression Events
CREATE EVENT SESSION PageCompressionTracing
ON SERVER
ADD EVENT sqlserver.page_compression_attempt_failed,
ADD EVENT sqlserver.page_compression_tracing
ADD TARGET package0.asynchronous_file_target(
     SET filename='C:\SQLBlog\PageCompressionTracing.xel',
         metadatafile='C:\SQLBlog\PageCompressionTracing.xem')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY = 5 SECONDS)
GO
-- Start the Event Session
ALTER EVENT SESSION PageCompressionTracing
ON SERVER
STATE=START
GO
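-- (Optional) Confirm the Event Session is running before generating activity;
-- a started session shows up in sys.dm_xe_sessions
SELECT 
    name, 
    create_time
FROM sys.dm_xe_sessions
WHERE name = 'PageCompressionTracing'
GO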
-- Insert 10000 rows into the H_Lineitem table
INSERT INTO H_Lineitem
    (l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
     l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, 
     l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, 
     l_comment)
SELECT TOP 10000 
     l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
     l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, 
     l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, 
     l_comment
FROM H_Lineitem
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO
-- Drop the Event Session
DROP EVENT SESSION PageCompressionTracing
ON SERVER
GO


Now we can parse the Events that were captured by our Event Session and compare the information presented by sys.dm_db_index_operational_stats() with what was collected by Extended Events.

-- Create our result Analysis database
CREATE DATABASE [PageCompTestResults]
GO
USE [PageCompTestResults]
GO
-- Create an intermediate table to hold the raw event data
CREATE TABLE RawEventData
(Rowid int identity primary key, event_data xml)
GO
-- Read the target file data into the intermediate table
INSERT INTO RawEventData (event_data)
SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\SQLBlog\PageCompressionTracing*.xel', 
                                     'C:\SQLBlog\PageCompressionTracing*.xem', 
                                     null, null)
GO
-- Shred the raw Event data into a ParsedResults table
SELECT 
    RowID,
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    -- opcode is needed to separate the Begin and End events for page_compression_tracing
    event_data.value('(event/data[@name="opcode"]/text)[1]', 'nvarchar(10)') AS [opcode],
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
             event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
    event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
    event_data.value('(event/data[@name="rowset_id"]/value)[1]', 'bigint') AS [rowset_id],
    event_data.value('(event/data[@name="failure_reason"]/text)[1]', 'nvarchar(150)') AS [failure_reason],
    event_data.value('(event/action[@name="system_thread_id"]/value)[1]', 'int') AS [system_thread_id],
    event_data.value('(event/action[@name="scheduler_id"]/value)[1]', 'int') AS [scheduler_id],
    event_data.value('(event/action[@name="cpu_id"]/value)[1]', 'int') AS [cpu_id]
INTO ParsedResults
FROM RawEventData
GO
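
Before aggregating anything, a quick sanity check of what actually landed in the ParsedResults table is worthwhile; a simple count by Event name confirms that both Events were captured.

-- Sanity check: how many of each Event were captured?
SELECT 
    event_name, 
    COUNT(*) AS event_count
FROM ParsedResults
GROUP BY event_name
GO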

After parsing out the data, we can begin to leverage the information we've gathered.  If we join the ParsedResults table to sys.partitions in the TPCH database on rowset_id = hobt_id, we can get the object_id and index_id and aggregate the failure reasons up to the object and index level.

SELECT 
    pr.database_id, 
    p.object_id, 
    p.index_id,
    failure_reason, 
    COUNT(*) as failure_count
FROM TPCH.sys.partitions p
JOIN ParsedResults pr
    ON pr.rowset_id = p.hobt_id
WHERE event_name = 'page_compression_attempt_failed'
GROUP BY 
    pr.database_id, 
    p.object_id, 
    p.index_id,
    failure_reason
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO


With this we can see that the Extended Events sqlserver.page_compression_attempt_failed Event captures attempts and failures that are not counted in sys.dm_db_index_operational_stats().  The PageModCountBelowThreshold failure isn't really a failed attempt at compression; it means the page was evaluated for recompression, but the page's modification counter hadn't passed the internal threshold for recalculation, so the actual compression operation was never performed.  If we look at the sqlserver.page_compression_tracing Event information, we can see how the numbers come together to match the output of sys.dm_db_index_operational_stats().

SELECT 
    pr.database_id, 
    p.object_id, 
    p.index_id,
    COUNT(*) as attempt_count
FROM TPCH.sys.partitions p
JOIN ParsedResults pr
    ON pr.rowset_id = p.hobt_id
WHERE event_name = 'page_compression_tracing'
  AND opcode = 'Begin'
GROUP BY 
    pr.database_id, 
    p.object_id, 
    p.index_id
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO


We have 193 attempts from this Event and 72 PageModCountBelowThreshold failures; 193 minus 72 matches the 121 actual attempts from the DMF.  Subtracting the remaining failures from those 121 attempts gives the 93 successful operations from the DMF as well.
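
To pull that arithmetic together in one place, the reconciliation can also be run directly against the ParsedResults table; the sketch below counts the Begin events and the PageModCountBelowThreshold events and takes the difference, which should line up with the page_compression_attempt_count from the DMF.

-- Reconcile the Extended Events counts with the DMF attempt count
SELECT 
    SUM(CASE WHEN event_name = 'page_compression_tracing' 
              AND opcode = 'Begin' THEN 1 ELSE 0 END) AS xe_attempt_count,
    SUM(CASE WHEN event_name = 'page_compression_attempt_failed' 
              AND failure_reason = 'PageModCountBelowThreshold' THEN 1 ELSE 0 END) AS below_threshold_count,
    SUM(CASE WHEN event_name = 'page_compression_tracing' 
              AND opcode = 'Begin' THEN 1 ELSE 0 END) 
      - SUM(CASE WHEN event_name = 'page_compression_attempt_failed' 
              AND failure_reason = 'PageModCountBelowThreshold' THEN 1 ELSE 0 END) AS dmf_style_attempt_count
FROM ParsedResults
GO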
