New Article Online – Great SQL Server Debates: Lock Pages in Memory

After a recent discussion about Lock Pages in Memory on Twitter, I wrote an article that covers the history of Lock Pages in Memory and the differing opinions on its appropriate usage for 64-bit instances of SQL Server.

Great SQL Server Debates: Lock Pages in Memory

Thanks to Brent Ozar (Blog|Twitter) and Glenn Berry (Blog|Twitter) for their part in discussing this on Twitter and by email, and for reviewing the article before I submitted it.

Tracking Problematic Page Splits in SQL Server 2012 Extended Events – No Really This Time!

Just over a year ago I blogged about the enhancements that were made to the sqlserver.page_split event in SQL Server 2012 to make it easier to identify the splitting object and the type of split being performed.  Sadly, what I discovered writing that post was that even with the extra information about the split type, the event didn’t give you enough information to really focus on the problematic splits that lead to fragmentation and page density issues in the database.  I didn’t do a whole lot with this again until recently, when Ami Levin (Blog | Twitter) posted a question on the MVP email list commenting that the page_split event was broken in SQL Server 2012, based on a presentation he’d seen by Guy Glantser (Blog | Twitter).

Let me start off by saying that the event isn’t broken: it tracks page splits, but it doesn’t differentiate between an end-page split that occurs for an ever-increasing index and a mid-page split for a random index that leads to fragmentation and page density issues in the database.  Both of these are technically splits inside the storage engine, even if we as DBAs don’t really care about the end-page split for an increasing key value like an IDENTITY column.  I had Ami pass my information along to the presenter, and we traded a few emails on the subject of tracking splits, with the specific focus on trying to pull out the mid-page, fragmenting splits.  While going through things for the third time, it dawned on me that this is incredibly simple, based on one of the demos that was sent to me.  Just over a year ago, I also blogged about tracking transaction log activity in SQL Server 2012 using the sqlserver.transaction_log event, which can be used to track mid-page splits in a database.

Last year when I wrote about the sqlserver.transaction_log event, there were 10 columns output by the event in CTP1, but as of RC0 the event’s output has changed and only 9 columns are returned, as the following query against the Extended Events metadata shows:

SELECT 
    oc.name, 
    oc.type_name, 
    oc.description
FROM sys.dm_xe_packages AS p
INNER JOIN sys.dm_xe_objects AS o
    ON p.guid = o.package_guid
INNER JOIN sys.dm_xe_object_columns AS oc
    ON oc.object_name = o.name
        AND oc.object_package_guid = o.package_guid
WHERE o.name = 'transaction_log'
  AND oc.column_type = 'data';

[Image: query results showing the data columns output by the transaction_log event]

For the purposes of identifying the mid-page splits, we want to look at the operation column that is output by the event, which contains the specific operation being logged.  In the case of a mid-page split occurring, the operation will be LOP_DELETE_SPLIT, which marks the deletion of rows from a page as a result of the split.  To build our event session, we need the map_key for the LOP_DELETE_SPLIT log_op map value, which can be obtained from the sys.dm_xe_map_values DMV:

SELECT *
FROM sys.dm_xe_map_values
WHERE name = 'log_op'
  AND map_value = 'LOP_DELETE_SPLIT';

With the map_key value, we have a couple of ways to collect the information with our targets.  We could collect everything into an event_file, but that doesn’t really make sense for this event.  Instead, the best target for this type of information is the histogram target, which will bucket our results based on how we configure the target and tell us how frequently the event fires based on our bucketing criteria.  If we don’t know anything about the server in question, we can start off with a very general event session that has a predicate on the operation only, and then aggregate the information in the histogram target based on database_id to find the databases in the instance that have the most mid-page splits occurring.

-- If the Event Session exists DROP it
IF EXISTS (SELECT 1 
            FROM sys.server_event_sessions 
            WHERE name = 'SQLskills_TrackPageSplits')
    DROP EVENT SESSION [SQLskills_TrackPageSplits] ON SERVER

-- Create the Event Session to track LOP_DELETE_SPLIT transaction_log operations in the server
CREATE EVENT SESSION [SQLskills_TrackPageSplits]
ON    SERVER
ADD EVENT sqlserver.transaction_log(
    WHERE operation = 11  -- LOP_DELETE_SPLIT 
)
ADD TARGET package0.histogram(
    SET filtering_event_name = 'sqlserver.transaction_log',
        source_type = 0, -- Event Column
        source = 'database_id');
GO
        
-- Start the Event Session
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=START;
GO

This event session will allow you to track the worst splitting database on the server, and the event data can be parsed out of the histogram target.  To demonstrate this, we can create a database that has tables and indexes prone to mid-page splits and run a workload that repeatedly inserts default values to test the event session:

USE [master];
GO
-- Drop the PageSplits database if it exists
IF DB_ID('PageSplits') IS NOT NULL
BEGIN
    ALTER DATABASE PageSplits SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE PageSplits;
END
GO
-- Create the database
CREATE DATABASE PageSplits
GO
USE [PageSplits]
GO
-- Create a bad splitting clustered index table
CREATE TABLE BadSplitsPK
( ROWID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
  ColVal INT NOT NULL DEFAULT (RAND()*1000),
  ChangeDate DATETIME2 NOT NULL DEFAULT CURRENT_TIMESTAMP);
GO
--  This index should mid-split based on the DEFAULT column value
CREATE INDEX IX_BadSplitsPK_ColVal ON BadSplitsPK (ColVal);
GO
--  This index should end-split based on the DEFAULT column value
CREATE INDEX IX_BadSplitsPK_ChangeDate ON BadSplitsPK (ChangeDate);
GO
-- Create a table with an increasing clustered index
CREATE TABLE EndSplitsPK
( ROWID INT IDENTITY NOT NULL PRIMARY KEY,
  ColVal INT NOT NULL DEFAULT (RAND()*1000),
  ChangeDate DATETIME2 NOT NULL DEFAULT DATEADD(mi, RAND()*-1000, CURRENT_TIMESTAMP));
GO
--  This index should mid-split based on the DEFAULT column value
CREATE INDEX IX_EndSplitsPK_ChangeDate ON EndSplitsPK (ChangeDate);
GO
-- Insert the default values repeatedly into the tables
WHILE 1=1
BEGIN
    INSERT INTO dbo.BadSplitsPK DEFAULT VALUES;
    INSERT INTO dbo.EndSplitsPK DEFAULT VALUES;
    WAITFOR DELAY '00:00:00.005';
END
GO

If we start up this workload and allow it to run for a couple of minutes, we can then query the histogram target for our session to find the database where the mid-page splits are occurring.

-- Query the target data to identify the worst splitting database_id
SELECT 
    n.value('(value)[1]', 'bigint') AS database_id,
    DB_NAME(n.value('(value)[1]', 'bigint')) AS database_name,
    n.value('(@count)[1]', 'bigint') AS split_count
FROM
(SELECT CAST(target_data as XML) target_data
 FROM sys.dm_xe_sessions AS s 
 JOIN sys.dm_xe_session_targets t
     ON s.address = t.event_session_address
 WHERE s.name = 'SQLskills_TrackPageSplits'
  AND t.target_name = 'histogram' ) as tab
CROSS APPLY target_data.nodes('HistogramTarget/Slot') as q(n)

[Image: histogram target output showing split counts by database_id]

With the database_id of the worst splitting database, we can then change our event session configuration to look only at this database, and change our histogram target configuration to bucket on alloc_unit_id so that we can track down the worst splitting indexes in that database.

-- Drop the Event Session so we can recreate it 
-- to focus on the highest splitting database
DROP EVENT SESSION [SQLskills_TrackPageSplits] 
ON SERVER

-- Create the Event Session to track LOP_DELETE_SPLIT transaction_log operations in the server
CREATE EVENT SESSION [SQLskills_TrackPageSplits]
ON    SERVER
ADD EVENT sqlserver.transaction_log(
    WHERE operation = 11  -- LOP_DELETE_SPLIT 
      AND database_id = 8 -- CHANGE THIS BASED ON TOP SPLITTING DATABASE!
)
ADD TARGET package0.histogram(
    SET filtering_event_name = 'sqlserver.transaction_log',
        source_type = 0, -- Event Column
        source = 'alloc_unit_id');
GO

-- Start the Event Session Again
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=START;
GO

With the new event session definition, we can now rerun our problematic workload for a two-minute period and look at the worst splitting indexes based on the alloc_unit_ids that are in the histogram target:

-- Query Target Data to get the top splitting objects in the database:
SELECT
    o.name AS table_name,
    i.name AS index_name,
    tab.split_count,
    i.fill_factor
FROM (    SELECT 
            n.value('(value)[1]', 'bigint') AS alloc_unit_id,
            n.value('(@count)[1]', 'bigint') AS split_count
        FROM
        (SELECT CAST(target_data as XML) target_data
         FROM sys.dm_xe_sessions AS s 
         JOIN sys.dm_xe_session_targets t
             ON s.address = t.event_session_address
         WHERE s.name = 'SQLskills_TrackPageSplits'
          AND t.target_name = 'histogram' ) as tab
        CROSS APPLY target_data.nodes('HistogramTarget/Slot') as q(n)
) AS tab
JOIN sys.allocation_units AS au
    ON tab.alloc_unit_id = au.allocation_unit_id
JOIN sys.partitions AS p
    ON au.container_id = p.partition_id
JOIN sys.indexes AS i
    ON p.object_id = i.object_id
        AND p.index_id = i.index_id
JOIN sys.objects AS o
    ON p.object_id = o.object_id
WHERE o.is_ms_shipped = 0;

[Image: query output showing split counts by table name, index name, and fill factor]

With this information we can now go back and change our FillFactor specifications and retest/monitor the impact to determine whether we’ve had the appropriate reduction in mid-page splits to accommodate the time between our index rebuild operations:

-- Change FillFactor based on split occurrences
ALTER INDEX PK__BadSplit__97BD02EB726FCA55 ON BadSplitsPK REBUILD WITH (FILLFACTOR=70)
ALTER INDEX IX_BadSplitsPK_ColVal ON BadSplitsPK REBUILD WITH (FILLFACTOR=70)
ALTER INDEX IX_EndSplitsPK_ChangeDate ON EndSplitsPK REBUILD WITH (FILLFACTOR=80)
GO
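
Note that the clustered primary key name in the first ALTER INDEX above is system-generated, so it will be different on your instance. A quick lookup like the following (a minimal sketch against the dbo.BadSplitsPK demo table created earlier) returns the actual name to use:

-- Look up the system-generated primary key index name for the demo table
SELECT name
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.BadSplitsPK')
  AND is_primary_key = 1;
GO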

-- Stop the Event Session to clear the target
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=STOP;
GO

-- Start the Event Session Again
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=START;
GO

With the reset performed, we can again start up our workload generation and begin monitoring the effect of the FillFactor specifications on the indexes.  After another two-minute period, the following splits were noted.

[Image: histogram target output after rebuilding the indexes with the new FillFactor values]

With this information we can go back and again attempt to tune our FillFactor values for the worst splitting indexes, and rinse/repeat until we determine the best FillFactor for each of the indexes to minimize splits.  This is an incredibly powerful tool for the DBA moving to SQL Server 2012, and it will definitely change how we perform index fragmentation analysis and troubleshoot problems with excessive log generation from SQL Server 2012 onwards.
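
When you’ve finished tuning, it’s worth cleaning up rather than leaving the session collecting data indefinitely; a minimal sketch, using the same session name as above:

-- Stop and remove the Event Session once tuning is complete
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=STOP;
GO

DROP EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER;
GO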

Cheers!

Finding Replication Schema Options

This afternoon, Orson Weston (Twitter) asked on the #SQLHelp hashtag on Twitter how to find the difference between two binary sets of schema options for replication.  The valid values for the @schema_options parameter in replication are documented in the BOL topic for sp_addarticle (http://msdn.microsoft.com/en-us/library/ms173857.aspx).  However, just having a table of values doesn’t really help you figure out what is different between two binary @schema_options values without doing some bitwise operations on them.  To do this, we can create a table variable to hold the valid values and descriptions for the schema options, and then use the & (bitwise AND) operator to find which options are set for each of the binary values.

DECLARE @options TABLE
(Value VARBINARY(8), [Description] VARCHAR(1000))

INSERT INTO @options (Value, Description)
VALUES 
    (0x00, 'Disables scripting by the Snapshot Agent and uses creation_script.'),
    (0x01, 'Generates the object creation script (CREATE TABLE, CREATE PROCEDURE, and so on). This value is the default for stored procedure articles.'),
    (0x02, 'Generates the stored procedures that propagate changes for the article, if defined.'),
    (0x04, 'Identity columns are scripted using the IDENTITY property.'),
    (0x08, 'Replicate timestamp columns. If not set, timestamp columns are replicated as binary.'),
    (0x10, 'Generates a corresponding clustered index. Even if this option is not set, indexes related to primary keys and unique constraints are generated if they are already defined on a published table.'),
    (0x20, 'Converts user-defined data types (UDT) to base data types at the Subscriber. This option cannot be used when there is a CHECK or DEFAULT constraint on a UDT column, if a UDT column is part of the primary key, or if a computed column references a UDT column. Not supported for Oracle Publishers.'),
    (0x40, 'Generates corresponding nonclustered indexes. Even if this option is not set, indexes related to primary keys and unique constraints are generated if they are already defined on a published table.'),
    (0x80, 'Replicates primary key constraints. Any indexes related to the constraint are also replicated, even if options 0x10 and 0x40 are not enabled.'),
    (0x100, 'Replicates user triggers on a table article, if defined. Not supported for Oracle Publishers.'),
    (0x200, 'Replicates foreign key constraints. If the referenced table is not part of a publication, all foreign key constraints on a published table are not replicated. Not supported for Oracle Publishers.'),
    (0x400, 'Replicates check constraints. Not supported for Oracle Publishers.'),
    (0x800, 'Replicates defaults. Not supported for Oracle Publishers.'),
    (0x1000, 'Replicates column-level collation.'),
    (0x2000, 'Replicates extended properties associated with the published article source object. Not supported for Oracle Publishers.'),
    (0x4000, 'Replicates UNIQUE constraints. Any indexes related to the constraint are also replicated, even if options 0x10 and 0x40 are not enabled.'),
    (0x8000, 'This option is not valid for SQL Server 2005 Publishers.'),
    (0x10000, 'Replicates CHECK constraints as NOT FOR REPLICATION so that the constraints are not enforced during synchronization.'),
    (0x20000, 'Replicates FOREIGN KEY constraints as NOT FOR REPLICATION so that the constraints are not enforced during synchronization.'),
    (0x40000, 'Replicates filegroups associated with a partitioned table or index.'),
    (0x80000, 'Replicates the partition scheme for a partitioned table.'),
    (0x100000, 'Replicates the partition scheme for a partitioned index.'),
    (0x200000, 'Replicates table statistics.'),
    (0x400000, 'Default Bindings'),
    (0x800000, 'Rule Bindings'),
    (0x1000000, 'Full-text index'),
    (0x2000000, 'XML schema collections bound to xml columns are not replicated.'),
    (0x4000000, 'Replicates indexes on xml columns.'),
    (0x8000000, 'Create any schemas not already present on the subscriber.'),
    (0x10000000, 'Converts xml columns to ntext on the Subscriber.'),
    (0x20000000, 'Converts large object data types (nvarchar(max), varchar(max), and varbinary(max)) introduced in SQL Server 2005 to data types that are supported on SQL Server 2000. For information about how these types are mapped, see the "Mapping New Data Types for Earlier Versions" section in Using Multiple Versions of SQL Server in a Replication Topology.'),
    (0x40000000, 'Replicate permissions.'),
    (0x80000000, 'Attempt to drop dependencies to any objects that are not part of the publication.'),
    (0x100000000, 'Use this option to replicate the FILESTREAM attribute if it is specified on varbinary(max) columns. Do not specify this option if you are replicating tables to SQL Server 2005 Subscribers. Replicating tables that have FILESTREAM columns to SQL Server 2000 Subscribers is not supported, regardless of how this schema option is set. '),
    (0x200000000, 'Converts date and time data types (date, time, datetimeoffset, and datetime2) introduced in SQL Server 2008 to data types that are supported on earlier versions of SQL Server. For information about how these types are mapped, see the "Mapping New Data Types for Earlier Versions" section in Using Multiple Versions of SQL Server in a Replication Topology.'),
    (0x400000000, 'Replicates the compression option for data and indexes. For more information, see Creating Compressed Tables and Indexes.'),
    (0x800000000, 'Set this option to store FILESTREAM data on its own filegroup at the Subscriber. If this option is not set, FILESTREAM data is stored on the default filegroup. Replication does not create filegroups, therefore, if you set this option, you must create the filegroup before you apply the snapshot at the Subscriber. For more information about how to create objects before you apply the snapshot, see Executing Scripts Before and After the Snapshot Is Applied.'),
    (0x1000000000, 'Converts common language runtime (CLR) user-defined types (UDTs) that are larger than 8000 bytes to varbinary(max) so that columns of type UDT can be replicated to Subscribers that are running SQL Server 2005.'),
    (0x2000000000, 'Converts the hierarchyid data type to varbinary(max) so that columns of type hierarchyid can be replicated to Subscribers that are running SQL Server 2005. For more information about how to use hierarchyid columns in replicated tables, see hierarchyid (Transact-SQL).'),
    (0x4000000000, 'Replicates any filtered indexes on the table. For more information about filtered indexes, see Filtered Index Design Guidelines.'),
    (0x8000000000, 'Converts the geography and geometry data types to varbinary(max) so that columns of these types can be replicated to Subscribers that are running SQL Server 2005.'),
    (0x10000000000, 'Replicates indexes on columns of type geography and geometry.'),
    (0x20000000000, 'Replicates the SPARSE attribute for columns. For more information about this attribute, see Using Sparse Columns.')

DECLARE @schema_option VARBINARY(8) = 0x000000000807509F;

-- Use BIGINT so that schema options above 0x80000000 are not truncated
SELECT CONVERT(BIGINT, @schema_option) & CONVERT(BIGINT, Value) AS matched_bit, *
FROM @options
WHERE CONVERT(BIGINT, @schema_option) & CONVERT(BIGINT, Value) <> 0;

SET @schema_option = 0x000000000803509F;

SELECT CONVERT(BIGINT, @schema_option) & CONVERT(BIGINT, Value) AS matched_bit, *
FROM @options
WHERE CONVERT(BIGINT, @schema_option) & CONVERT(BIGINT, Value) <> 0;

Using Orson’s two values, we can see that the difference between the two is that the first value includes the “Replicates filegroups associated with a partitioned table or index” option (0x40000) while the second value does not.
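
If you want the query to return only the options that differ between two values, a small variation on the same bitwise approach works. This sketch reuses the @options table variable from the script above (so it has to run in the same batch), and the @first/@second variable names are just introduced here for readability:

-- Return only the schema options that are set in one value but not the other
DECLARE @first VARBINARY(8) = 0x000000000807509F;
DECLARE @second VARBINARY(8) = 0x000000000803509F;

SELECT o.Value, o.[Description]
FROM @options AS o
WHERE (CONVERT(BIGINT, @first) & CONVERT(BIGINT, o.Value))
   <> (CONVERT(BIGINT, @second) & CONVERT(BIGINT, o.Value));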