In a word; YES! In a lot more words, not always in the way that we want it to, but there are plenty of cases where it actually works and changes are made to the product as a result.

Now with that said, it doesn’t work all the time, and it helps to realize that what is important to us as an individual user might not be important to the product as a whole.  Before you post comments, I am sure that there are plenty of cases out there where people can say that Microsoft Connect for SQL Server is broken.  I have personally been irritated to the point of posting negative comments on Twitter about the whole Connect process.  I feel that it is about time that I show the other side of the story as well and talk about some Connect successes that have occurred in the past year, and of course what better topic to do this with than Extended Events.  Over the next few weeks, I’ll post a couple of different examples of Connect actually working and bringing about changes to the product that are beneficial to the community, starting with this post.

Extended Events does not track insert statements

This Connect item is actually incorrectly titled and is based on some confusion about what the sqlserver.plan_handle action actually returns when executed in the engine.  I blogged about this with much more detail last year in my blog post; What plan_handle is Extended Events sqlserver.plan_handle action returning?

If we revisit the Connect item there is a note that the sql_statement_completed event in SQL Server 2012 now includes a parameterized_plan_handle customizable column that can be used to retrieve the parameterized plan handle for statements that are auto-parameterized by SQL Server during their execution.  Taking the same original demo code from my previous blog post, we can now see how this Connect item has improved the ability to find information about plan caching in SQL Server 2012:

-- Create the Event Session
IF EXISTS(SELECT * 
          FROM sys.server_event_sessions 
          WHERE name='SQLStmtEvents')
    DROP EVENT SESSION SQLStmtEvents 
    ON SERVER;
GO
CREATE EVENT SESSION SQLStmtEvents
ON SERVER
ADD EVENT sqlserver.sql_statement_completed(
    SET collect_parameterized_plan_handle = 1
    ACTION (sqlserver.client_app_name,
            sqlserver.plan_handle,
            sqlserver.sql_text,
            sqlserver.tsql_stack,
            package0.callstack,
            sqlserver.request_id)
--Change this to match the AdventureWorks, 
--AdventureWorks2008 or AdventureWorks2008 SELECT DB_ID('AdventureWorks2008R2')
WHERE sqlserver.database_id=9
)
ADD TARGET package0.ring_buffer
WITH (MAX_DISPATCH_LATENCY=5SECONDS, TRACK_CAUSALITY=ON)
GO
 
-- Start the Event Session
ALTER EVENT SESSION SQLStmtEvents 
ON SERVER 
STATE = START;
GO
 
-- Change database contexts and insert one row
USE AdventureWorks2008R2;
GO
INSERT INTO [dbo].[ErrorLog]([ErrorTime],[UserName],[ErrorNumber],[ErrorSeverity],[ErrorState],[ErrorProcedure],[ErrorLine],[ErrorMessage])
VALUES(getdate(),SYSTEM_USER,-1,-1,-1,'ErrorProcedure',-1,'An error occurred')
GO 10
 
-- Drop the Event
ALTER EVENT SESSION SQLStmtEvents
ON SERVER
DROP EVENT sqlserver.sql_statement_completed;
GO

-- Retrieve the Event Data from the Event Session Target
SELECT
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    event_data.value('xs:hexBinary((event/data[@name="parameterized_plan_handle"]/value)[1])', 'varbinary(64)') as parameterized_plan_handle,
    event_data.value('xs:hexBinary((event/action[@name="plan_handle"]/value)[1])', 'varbinary(64)') as plan_handle,
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'varchar(max)') AS sql_text
FROM(    SELECT evnt.query('.') AS event_data
        FROM
        (   SELECT CAST(target_data AS xml) AS TargetData
            FROM sys.dm_xe_sessions AS s
            JOIN sys.dm_xe_session_targets AS t
                ON s.address = t.event_session_address
            WHERE s.name = 'SQLStmtEvents'
              AND t.target_name = 'ring_buffer'
        ) AS tab
        CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS split(evnt) 
     ) AS evts(event_data)

If we look at the output of this, we will get the parameterized plan handle for each subsequent call of the statement after the initial call caches the parameterized plan into the cache.

image

If we plug one of the original plan_handle values from the sqlserver.plan_handle action into a query of sys.dm_exec_cached_plans() it will return nothing, but using the new parameterized_plan_handle value from the customizable column will give us the appropriate cached plan for the statement from cache:

-- Use the plan_handle from one of the Events action to get the query plan
DECLARE @plan_handle varbinary(64) = 0x06000900DFC9DD12608B18EE0100000001000000000000000000000000000000000000000000000000000000
SELECT * 
FROM sys.dm_exec_query_plan(@plan_handle)
GO

-- Use the parameterized_plan_handle from the same Events to get the query plan
DECLARE @plan_handle varbinary(64) = 0x06000900DD8D6D08601E70EE01000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
SELECT * 
FROM sys.dm_exec_query_plan(@plan_handle)
GO

image

Now, you might point out the different lengths of the plan handles in the above two queries.  If you look back at the source, the same code is being used to perform the xhexBinary conversion in the XML so the values are exactly the same as what was originally provided by the event and the action.  The non-parameterized plan is not cached because it is not likely to be reused, which is why we have the auto-parameterized plan in cache.

In SQL Server 2012 RC0 there are a number of event session templates provided that make creating a commonly used session easier using the Event Session Wizard in SQL Server Management Studio.  One of these has a bug in it’s definition XML file that was filed in the following connect item:

https://connect.microsoft.com/SQLServer/feedback/details/705840/the-object-sqlserver-event-sequence-does-not-exist#tabs

If you attempt to pick the Activity Tracking template you will get the following error:

image

The error is occurring because the event_sequence action is provided by package0 and not sqlserver. To work around this, you can edit the template file and replace the sqlserver package references with package0. The template is saved in the following location:

C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Templates\sql\xevent\xe_activity.xml

If you do a find for:

<action package="sqlserver" name="event_sequence" />

and replace it with:

<action package="package0" name="event_sequence" />

th template will work correctly once saved. This will at least let you play around with this while Microsoft works out the bug in the template XML.  A copy of the corrected file is attached to this blog post as well.

xe_activity.xml (18.28 kb)

Just over a year ago I blogged about the enhancements that were made to the sqlserver.page_split Event in SQL Server 2012 to make it easier to identify what the splitting object was and the type of split that was being performed.  Sadly what I discovered writing that post was that even with the extra information about the split type, the event didn’t give you enough information to really focus on the problematic splits that lead to fragmentation and page density issues in the database.  I didn’t do a whole lot with this again until recently when a question was posted by Ami Levin (Blog | Twitter) on the MVP email list that commented that the page_split event was broken in SQL Server 2012 based on a presentation he’d seen by Guy Glantser (Blog | Twitter).

Let me start off by saying, the event isn’t broken, it tracks page splits, but it doesn’t differentiate between an end page split that occurs for an ever increasing index, versus a mid-page split for a random index that leads to fragmentation and page density issues in the database.  Both of these are technically splits inside the storage engine, even if we as DBA’s don’t really care about the end-page split for a increasing key value like an IDENTITY column in the database.  I had Ami pass my information along to the presenter and we traded a few emails on the subject of tracking splits with the specific focus on trying to pull out the mid-page, fragmenting splits.  While going through things for the third time, it dawned on me that this is incredibly simple, based one of the demo’s that was sent to me.  Just over a year ago, I also blogged about tracking transaction log activity in SQL Server 2012 using the sqlserver.transaction_log event, which can be used to track mid-page splits in a database.

Last year when I wrote about the sqlserver.transaction_log event, there were 10 columns output by the event in CTP1, but as of RC0, the events output has changed and only 9 columns are output by the event.

SELECT 
    oc.name, 
    oc.type_name, 
    oc.description
FROM sys.dm_xe_packages AS p
INNER JOIN sys.dm_xe_objects AS o
    ON p.guid = o.package_guid
INNER JOIN sys.dm_xe_object_columns AS oc
    ON oc.object_name = o.name
        AND oc.object_package_guid = o.package_guid
WHERE o.name = 'transaction_log'
  AND oc.column_type = 'data';

image

For the purposes of identifying the mid-page splits, we want to look at the operation column that is output by the event, which contains the specific operation being logged.  In the case of a mid-page split occurring, the operation will be a LOP_DELETE_SPLIT, which marks the delete of rows from a page as a result of the split.  To build our event session, we are going to need the map_key for the LOP_DELETE_SPLIT log_op map.  This can be obtained from the sys.dm_xe_map_values DMV:

SELECT *
FROM sys.dm_xe_map_values
WHERE name = 'log_op'
  AND map_value = 'LOP_DELETE_SPLIT';

With the map_key value, we have a couple of ways to collect the information with our targets.  We could collect everything into an event_file, but that doesn’t really make sense for this event.  Instead the best target for this type of information is the histogram target which will bucket our results based on how we configure the target and tell us how frequently the event fires based on our bucketing criteria.  If we don’t know anything about the server in question, we can start off with a very general event session that has a predicate on the operation only, and then aggregate the information in the histogram target based on the database_id to find the databases that have the most mid-page splits occurring in them in the instance.

-- If the Event Session exists DROP it
IF EXISTS (SELECT 1 
            FROM sys.server_event_sessions 
            WHERE name = 'SQLskills_TrackPageSplits')
    DROP EVENT SESSION [SQLskills_TrackPageSplits] ON SERVER

-- Create the Event Session to track LOP_DELETE_SPLIT transaction_log operations in the server
CREATE EVENT SESSION [SQLskills_TrackPageSplits]
ON    SERVER
ADD EVENT sqlserver.transaction_log(
    WHERE operation = 11  -- LOP_DELETE_SPLIT 
)
ADD TARGET package0.histogram(
    SET filtering_event_name = 'sqlserver.transaction_log',
        source_type = 0, -- Event Column
        source = 'database_id');
GO
        
-- Start the Event Session
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=START;
GO

This event session will allow you to track the worst splitting database on the server, and the event data can be parsed out of the histogram target.  To demonstrate this, we can create a database that has tables and indexes prone to mid-page splits and run a default workload to test the event session:

USE [master];
GO
-- Drop the PageSplits database if it exists
IF DB_ID('PageSplits') IS NOT NULL
BEGIN
    ALTER DATABASE PageSplits SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE PageSplits;
END
GO
-- Create the database
CREATE DATABASE PageSplits
GO
USE [PageSplits]
GO
-- Create a bad splitting clustered index table
CREATE TABLE BadSplitsPK
( ROWID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
  ColVal INT NOT NULL DEFAULT (RAND()*1000),
  ChangeDate DATETIME2 NOT NULL DEFAULT CURRENT_TIMESTAMP);
GO
--  This index should mid-split based on the DEFAULT column value
CREATE INDEX IX_BadSplitsPK_ColVal ON BadSplitsPK (ColVal);
GO
--  This index should end-split based on the DEFAULT column value
CREATE INDEX IX_BadSplitsPK_ChangeDate ON BadSplitsPK (ChangeDate);
GO
-- Create a table with an increasing clustered index
CREATE TABLE EndSplitsPK
( ROWID INT IDENTITY NOT NULL PRIMARY KEY,
  ColVal INT NOT NULL DEFAULT (RAND()*1000),
  ChangeDate DATETIME2 NOT NULL DEFAULT DATEADD(mi, RAND()*-1000, CURRENT_TIMESTAMP));
GO
--  This index should mid-split based on the DEFAULT column value
CREATE INDEX IX_EndSplitsPK_ChangeDate ON EndSplitsPK (ChangeDate);
GO
-- Insert the default values repeatedly into the tables
WHILE 1=1
BEGIN
    INSERT INTO dbo.BadSplitsPK DEFAULT VALUES;
    INSERT INTO dbo.EndSplitsPK DEFAULT VALUES;
    WAITFOR DELAY '00:00:00.005';
END
GO

If we startup this workload and allow it to run for a couple of minutes, we can then query the histogram target for our session to find the database that has the mid-page splits occurring.

-- Query the target data to identify the worst splitting database_id
SELECT 
    n.value('(value)[1]', 'bigint') AS database_id,
    DB_NAME(n.value('(value)[1]', 'bigint')) AS database_name,
    n.value('(@count)[1]', 'bigint') AS split_count
FROM
(SELECT CAST(target_data as XML) target_data
 FROM sys.dm_xe_sessions AS s 
 JOIN sys.dm_xe_session_targets t
     ON s.address = t.event_session_address
 WHERE s.name = 'SQLskills_TrackPageSplits'
  AND t.target_name = 'histogram' ) as tab
CROSS APPLY target_data.nodes('HistogramTarget/Slot') as q(n)

image

With the database_id of the worst splitting database, we can then change our event session configuration to only look at this database, and then change our histogram target configuration to bucket on the alloc_unit_id so that we can then track down the worst splitting indexes in the database experiencing the worst mid-page splits.

-- Drop the Event Session so we can recreate it 
-- to focus on the highest splitting database
DROP EVENT SESSION [SQLskills_TrackPageSplits] 
ON SERVER

-- Create the Event Session to track LOP_DELETE_SPLIT transaction_log operations in the server
CREATE EVENT SESSION [SQLskills_TrackPageSplits]
ON    SERVER
ADD EVENT sqlserver.transaction_log(
    WHERE operation = 11  -- LOP_DELETE_SPLIT 
      AND database_id = 8 -- CHANGE THIS BASED ON TOP SPLITTING DATABASE!
)
ADD TARGET package0.histogram(
    SET filtering_event_name = 'sqlserver.transaction_log',
        source_type = 0, -- Event Column
        source = 'alloc_unit_id');
GO

-- Start the Event Session Again
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=START;
GO

With the new event session definition, we can now rerun our problematic workload for a 2 minute period and look at the worst splitting indexes based on the alloc_unit_id’s that are in the histogram target:

 

-- Query Target Data to get the top splitting objects in the database:
SELECT
    o.name AS table_name,
    i.name AS index_name,
    tab.split_count,
    i.fill_factor
FROM (    SELECT 
            n.value('(value)[1]', 'bigint') AS alloc_unit_id,
            n.value('(@count)[1]', 'bigint') AS split_count
        FROM
        (SELECT CAST(target_data as XML) target_data
         FROM sys.dm_xe_sessions AS s 
         JOIN sys.dm_xe_session_targets t
             ON s.address = t.event_session_address
         WHERE s.name = 'SQLskills_TrackPageSplits'
          AND t.target_name = 'histogram' ) as tab
        CROSS APPLY target_data.nodes('HistogramTarget/Slot') as q(n)
) AS tab
JOIN sys.allocation_units AS au
    ON tab.alloc_unit_id = au.allocation_unit_id
JOIN sys.partitions AS p
    ON au.container_id = p.partition_id
JOIN sys.indexes AS i
    ON p.object_id = i.object_id
        AND p.index_id = i.index_id
JOIN sys.objects AS o
    ON p.object_id = o.object_id
WHERE o.is_ms_shipped = 0;

image

With this information we can now go back and change our FillFactor specifications and retest/monitor the impact to determine whether we’ve had the appropriate reduction in mid-page splits to accommodate the time between our index rebuild operations:

-- Change FillFactor based on split occurences
ALTER INDEX PK__BadSplit__97BD02EB726FCA55 ON BadSplitsPK REBUILD WITH (FILLFACTOR=70)
ALTER INDEX IX_BadSplitsPK_ColVal ON BadSplitsPK REBUILD WITH (FILLFACTOR=70)
ALTER INDEX IX_EndSplitsPK_ChangeDate ON EndSplitsPK REBUILD WITH (FILLFACTOR=80)
GO

-- Stop the Event Session to clear the target
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=STOP;
GO

-- Start the Event Session Again
ALTER EVENT SESSION [SQLskills_TrackPageSplits]
ON SERVER
STATE=START;
GO

With the reset performed we can again start up our workload generation and begin monitoring the effect of the FillFactor specifications on the indexes with our code.  After another 2 minute period, the following splits were noted.

image

With this information we can go back and again attempt to tune our FillFactor values for the worst splitting indexes and rinse/repeat until we determine the best FillFactor for each of the indexes to minimize splits.  This is an incredibly powerful tool for the DBA moving into SQL Server 2012, and will definitely change how we perform index fragmentation analysis and troubleshoot problems with excessive log generation in SQL Server 2012 onwards.

Cheers!

While setting up my new Availability Group using SQL Server 2012 RC0 tonight, I noticed an interesting new addition to Extended Events associated with Availability Group configuration in the Release Candidate.  When you setup an Availability Group in RC0, another default Event Session is created on the servers that participate in the Availability Group to provide monitoring of the health of the Availability Group overall.  The definition of the monitoring session is as follows:

CREATE EVENT SESSION [AlwaysOn_health] ON SERVER
ADD EVENT sqlserver.alwayson_ddl_executed,
ADD EVENT sqlserver.availability_group_lease_expired,
ADD EVENT sqlserver.availability_replica_automatic_failover_validation,
ADD EVENT sqlserver.availability_replica_manager_state_change,
ADD EVENT sqlserver.availability_replica_state_change,
ADD EVENT sqlserver.error_reported(
    WHERE ([error_number]=(9691) OR [error_number]=(35204) OR [error_number]=(9693) OR [error_number]=(26024) OR [error_number]=(28047) OR [error_number]=(26023) OR [error_number]=(9692) OR [error_number]=(28034) OR [error_number]=(28036) OR [error_number]=(28048) OR [error_number]=(28080) OR [error_number]=(28091) OR [error_number]=(26022) OR [error_number]=(9642) OR [error_number]=(35201) OR [error_number]=(35202) OR [error_number]=(35206) OR [error_number]=(35207) OR [error_number]=(26069) OR [error_number]=(26070) OR [error_number]>(41047) AND [error_number]<(41056) OR [error_number]=(41142) OR [error_number]=(41144) OR [error_number]=(1480) OR [error_number]=(823) OR [error_number]=(824) OR [error_number]=(829) OR [error_number]=(35264) OR [error_number]=(35265))),
ADD EVENT sqlserver.lock_redo_blocked
ADD TARGET package0.event_file(SET filename=N'AlwaysOn_health.xel',max_file_size=(5),max_rollover_files=(4))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=ON)
GO

This Extended Event Session monitors a number of critical events in the system but one of the problems with figuring out what exactly this Event Session is monitoring is to figure out what all the predicate values on the sqlserver.error_reported event are actually firing on.  To that aspect of things, we can do a quick reuse of the predicate on this event be doing a replace SSMS on the [error_number] value with a replace for an alias to a query to sys.messages, on the message_id column from the DMV as follows:

SELECT message_id, severity, is_event_logged, text
FROM sys.messages AS m
WHERE m.language_id = SERVERPROPERTY('LCID')
  AND  (m.message_id=(9691)
        OR m.message_id=(35204)
        OR m.message_id=(9693)
        OR m.message_id=(26024)
        OR m.message_id=(28047)
        OR m.message_id=(26023)
        OR m.message_id=(9692)
        OR m.message_id=(28034)
        OR m.message_id=(28036)
        OR m.message_id=(28048)
        OR m.message_id=(28080)
        OR m.message_id=(28091)
        OR m.message_id=(26022)
        OR m.message_id=(9642)
        OR m.message_id=(35201)
        OR m.message_id=(35202)
        OR m.message_id=(35206)
        OR m.message_id=(35207)
        OR m.message_id=(26069)
        OR m.message_id=(26070)
        OR m.message_id>(41047)
        AND m.message_id<(41056)
        OR m.message_id=(41142)
        OR m.message_id=(41144)
        OR m.message_id=(1480)
        OR m.message_id=(823)
        OR m.message_id=(824)
        OR m.message_id=(829)
        OR m.message_id=(35264)
        OR m.message_id=(35265)
)

This will give us a list of the error messages that the Event Session will actually fire events for:

message_id

severity is_event_logged text
823 24 1 The operating system returned error %ls to SQL Server during a %S_MSG at offset %#016I64x in file '%ls'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
824 24 1 SQL Server detected a logical consistency-based I/O error: %ls. It occurred during a %S_MSG of page %S_PGID in database ID %d at offset %#016I64x in file '%ls'.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
829 21 1 Database ID %d, Page %S_PGID is marked RestorePending, which may indicate disk corruption. To recover from this state, perform a restore.
1480 10 0 The %S_MSG database "%.*ls" is changing roles from "%ls" to "%ls" because the mirroring session or availability group failed over due to %S_MSG. This is an informational message only. No user action is required.
9642 16 0 An error occurred in a Service Broker/Database Mirroring transport connection endpoint, Error: %i, State: %i. (Near endpoint role: %S_MSG, far endpoint address: '%.*hs')
9691 10 0 The %S_MSG endpoint has stopped listening for connections.
9692 16 0 The %S_MSG endpoint cannot listen on port %d because it is in use by another process.
9693 16 0 The %S_MSG endpoint cannot listen for connections due to the following error: '%.*ls'.
26022 10 1 Server is listening on [ %hs <%hs> %d].
26023 16 1 Server TCP provider failed to listen on [ %hs <%hs> %d]. Tcp port is already in use.
26024 16 1 Server failed to listen on %hs <%hs> %d. Error: %#x. To proceed, notify your system administrator.
26069 10 1 Started listening on virtual network name '%ls'. No user action is required.
26070 10 1 Stopped listening on virtual network name '%ls'. No user action is required.
28034 10 0 Connection handshake failed. The login '%.*ls' does not have CONNECT permission on the endpoint. State %d.
28036 10 0 Connection handshake failed. The certificate used by this endpoint was not found: %S_MSG. Use DBCC CHECKDB in master database to verify the metadata integrity of the endpoints. State %d.
28047 10 1 %S_MSG login attempt failed with error: '%.*ls'. %.*ls
28048 10 1 %S_MSG login attempt by user '%.*ls' failed with error: '%.*ls'. %.*ls
28080 10 0 Connection handshake failed. The %S_MSG endpoint is not configured. State %d.
28091 10 0  Starting endpoint for %S_MSG with no authentication is not supported.
35201 10 0 A connection timeout has occurred while attempting to establish a connection to availability replica '%ls' with id [%ls]. Either a networking or firewall issue exists, or the endpoint address provided for the replica is not the database mirroring endpoint of the host server instance.
35202 10 0 A connection for availability group '%ls' from availability replica '%ls' with id  [%ls] to '%ls' with id [%ls] has been successfully established.  This is an informational message only. No user action is required.
35204 10 0 The connection between server instances '%ls' with id [%ls] and '%ls' with id [%ls] has been disabled because the database mirroring endpoint was either disabled or stopped. Restart the endpoint by using the ALTER ENDPOINT Transact-SQL statement with STATE = STARTED.
35206 10 0 A connection timeout has occurred on a previously established connection to availability replica '%ls' with id [%ls].  Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.
35207 16 0 Connection attempt on availability group id '%ls' from replica id '%ls' to replica id '%ls' failed because of error %d, severity %d, state %d.
35264 10 0 AlwaysOn Availability Groups data movement for database '%.*ls' has been suspended for the following reason: "%S_MSG" (Source ID %d; Source string: '%.*ls'). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
35265 10 0 AlwaysOn Availability Groups data movement for database '%.*ls' has been resumed. This is an informational message only. No user action is required.
41048 10 1 AlwaysOn Availability Groups: Local Windows Server Failover Clustering service has become unavailable. This is an informational message only. No user action is required.
41049 10 1 AlwaysOn Availability Groups: Local Windows Server Failover Clustering node is no longer online. This is an informational message only. No user action is required.
41050 10 1 AlwaysOn Availability Groups: Waiting for local Windows Server Failover Clustering service to start. This is an informational message only. No user action is required.
41051 10 1 AlwaysOn Availability Groups: Local Windows Server Failover Clustering service started. This is an informational message only. No user action is required.
41052 10 1 AlwaysOn Availability Groups: Waiting for local Windows Server Failover Clustering node to start. This is an informational message only. No user action is required.
41053 10 1 AlwaysOn Availability Groups: Local Windows Server Failover Clustering node started. This is an informational message only. No user action is required.
41054 10 1 AlwaysOn Availability Groups: Waiting for local Windows Server Failover Clustering node to come online. This is an informational message only. No user action is required.
41055 10 1 AlwaysOn Availability Groups: Local Windows Server Failover Clustering node is online. This is an informational message only. No user action is required.
41142 16 0 The availability replica for availability group '%.*ls' on this instance of SQL Server cannot become the primary replica. One or more databases are not synchronized or have not joined the availability group, or the WSFC cluster was started in Force Quorum mode. If the cluster was started in Force Quorum mode or the availability replica uses the asynchronous-commit mode, consider performing a forced manual failover (with possible data loss). Otherwise, once all local secondary databases are joined and synchronized, you can perform a planned manual failover to this secondary replica (without data loss). For more information, see SQL Server Books Online.
41144 16 0 The local availability replica of availability group '%.*ls' is in a failed state.  The replica failed to read or update the persisted configuration data (SQL Server error: %d).  To recover from this failure, either restart the local Windows Server Failover Clustering (WSFC) service or restart the local instance of SQL Server.

Based on this output, and the output of the following query:

SELECT name, description
FROM sys.dm_xe_objects
WHERE NAME IN (
'alwayson_ddl_executed',
'availability_group_lease_expired',
'availability_replica_automatic_failover_validation',
'availability_replica_manager_state_change',
'availability_replica_state_change',
'error_reported',
'lock_redo_blocked')

We can deduce the following about the Event Session:

CREATE EVENT SESSION [AlwaysOn_health] ON SERVER
--Occurs when AlwaysOn DDL is executed including CREATE, ALTER or DROP
ADD EVENT sqlserver.alwayson_ddl_executed,
--Occurs when there is a connectivity issue between the cluster and the Availability Group resulting
--in a failure to renew the lease
ADD EVENT sqlserver.availability_group_lease_expired,
--Occurs when the failover validates the readiness of replica as a primary. For instance, the failover
--validation will return false when not all databases are synchronized or not joined
ADD EVENT sqlserver.availability_replica_automatic_failover_validation,
--Occurs when the state of the Availability Replica Manager has changed.
ADD EVENT sqlserver.availability_replica_manager_state_change,
--Occurs when the state of the Availability Replica has changed.
ADD EVENT sqlserver.availability_replica_state_change,
--Occurs when an error is reported based on the previously listed table
ADD EVENT sqlserver.error_reported(
    WHERE ([error_number]=(9691) OR [error_number]=(35204) OR [error_number]=(9693) OR [error_number]=(26024) OR [error_number]=(28047) OR [error_number]=(26023) OR [error_number]=(9692) OR [error_number]=(28034) OR [error_number]=(28036) OR [error_number]=(28048) OR [error_number]=(28080) OR [error_number]=(28091) OR [error_number]=(26022) OR [error_number]=(9642) OR [error_number]=(35201) OR [error_number]=(35202) OR [error_number]=(35206) OR [error_number]=(35207) OR [error_number]=(26069) OR [error_number]=(26070) OR [error_number]>(41047) AND [error_number]<(41056) OR [error_number]=(41142) OR [error_number]=(41144) OR [error_number]=(1480) OR [error_number]=(823) OR [error_number]=(824) OR [error_number]=(829) OR [error_number]=(35264) OR [error_number]=(35265))),
--Occurs when the redo thread blocks when trying to acquire a lock.
ADD EVENT sqlserver.lock_redo_blocked
--Writes to the file target for persistence in the system beyond failovers and service restarts
ADD TARGET package0.event_file(SET filename=N'AlwaysOn_health.xel',max_file_size=(5),max_rollover_files=(4))
WITH (MAX_MEMORY=4096 KB, EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY=30 SECONDS, MAX_EVENT_SIZE=0 KB, MEMORY_PARTITION_MODE=NONE, TRACK_CAUSALITY=OFF, STARTUP_STATE=ON)
GO

What is really cool is that this Event Session is used by the Availability Groups Dashboard to provide an overall status of the health of the Availability Group in Management Studio.

In the first post in this blog series on using SQL Server 2012 Distributed Replay, Installing and Configuring SQL Server 2012 Distributed Replay, we looked at how to configure a Distributed Replay environment using multiple clients and a dedicated replay controller.  In this post we’ll actually make use of the previously configured servers to perform a distributed replay using a random workload that has been generated against the AdventureWorks2008R2 database installed on our Replay SQL Server.

Collecting the Replay Trace Data

For the purposes of generating a random workload against AdventureWorks2008R2, I created a workload generator that can be found on my blog post The AdventureWorks2008R2 Books Online Random Workload Generator.  I used this with 2 different PowerShell Windows from SQL2012-DRU1 and SQL2012-DRU2 to run a random workload across multiple sessions against the SQL2012-DB1 server.  To capture the trace data required for performing the replay, SQL Server Profiler was used along with the TSQL_Replay template to create the capture.

image

For production systems, the best way to go about capturing a Replay Trace is to script the trace definition to a file, and then create the trace as a server side trace that is writing to a trace file on local disks for the server.  This has a significantly lower impact that tracing directly from Profiler, which uses the rowset provider for Trace.  With the replay trace running, and the workload generating events I waited for the trace to collect around 80000 rows of data and then shutdown the trace so that I could access the trace file to copy it from the SQL2012-DB1 server to the SQL2012-DRU server where the Distributed Replay Controller is installed.

Preprocessing the Trace File(s)

At the point that I went to perform the preprocessing of the trace file for replay, I realized a difference in my environment using multiple servers to build this blog series versus my original setup using a single server for learning how to use Distributed Replay.  In order to preprocess the trace file for replay, you have to have the Management Tools Basic installed on the server that will be used for preprocessing the trace data.  If you have been following this blog series to learn how to use Distributed Replay, you will need to run Setup on the SQL2012-DRU server to add this feature before it can be used for pre-processing the trace file.  This is necessary to administer Distributed Replay.

image

Once the Management Tools Basic have been installed the server will have to be restarted and then it is possible to make use of the DReplay.Exe executable to administer the Distributed Replay components on the controller server. The DReplay executable has multiple options that can be discovered by using a –? from the command line as follows:

C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn>dreplay -?
Info DReplay    Usage:
DReplay.exe {preprocess|replay|status|cancel} [options] [-?]}

Verbs:
preprocess Apply filters and prepare trace data for intermediate file on controller.
replay     Transfer the dispatch files to the clients, launch and synchronize replay.
status     Query and display the current status of the controller.
cancel     Cancel the current operation on the controller.
-?         Display the command syntax summary.

Options:
dreplay preprocess [-m controller] -i input_trace_file -d controller_working_dir [-c config_file] [-f status_interval]
dreplay replay [-m controller] -d controller_working_dir [-o] [-s target_server] -w clients [-c config_file] [-f status_interval]
dreplay status [-m controller] [-f status_interval]
dreplay cancel [-m controller] [-q]
Run dreplay <verb> -? for detailed help on each verb.

To perform the preprocessing, you will need to do a couple of different steps.  The first thing you need to do is edit any options that you want to set for the pre-processing by editing the DReplay.Exe.Preproces.config file in the C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn path on the server.  There are two configuration files for DReplay.Exe as highlighted below.  At this time make sure that you are only editing the Preprocess.config file.

image

The DReplay.Exe.Preproces.config file contains a schema defined XML document that controls the configuration of the preprocessing.  In general the options set for preprocessing should not need to be changed but if you want to include system sessions as a part of the replay, you can change the options in the XML, which is listed below.

<?xml version="1.0" encoding="utf-8"?>
<Options>
    <PreprocessModifiers>
        <IncSystemSession>No</IncSystemSession>
        <MaxIdleTime>-1</MaxIdleTime>
    </PreprocessModifiers>
</Options>

To preprocess the trace data, open a new command prompt window and change directories to the C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn path.  The trace file has been copied onto the SQL2012-DRU server as C:\DReplay\SQL2012_ReplayTrace.trc.  To preprocess this file first start the “SQL Server Distributed Replay Controller” service by using NET START:

NET START "SQL Server Distributed Replay Controller"

Then execute the following command from within the Binn path to actually preprocess the trace file and output:

dreplay preprocess -i "C:\DReplay\SQL2012_ReplayTrace.trc" -d "C:\DReplay"

This will process the trace file and output the working files for performing the Distributed Replay to the C:\DReplay path.  Below is a screenshot of the full window for preprocessing the trace file.

image

Note: The dreplay executable can be called from any path within the server because the Binn path is a part of the Path Environmental variables.  However, the executable has to be called from within the Binn folder to access the necessary .config files and .xsd schema files for the configuration.  If you want to be able to run this executable from another location on the server, you will need to copy the .config and .xsd files out of the Binn folder to the folder that you want to be able to run dreplay within for it to work.

Performing the Replay

The first step in performing the replay is to start the “SQL Server Distributed Replay Client” service on each of the replay clients using NET START.

NET START "SQL Server Distributed Replay Client"

You will want to verify that each of the clients was able to successfully connect to the controller in the logs as shown in the previous post in this series.  Once this has been done, your environment is almost ready for replay.  For the purposes of this blog series, a SELECT only workload has been generated for replay against AdventureWorks2008R2.  However, in most environments you won’t have a SELECT only workload, so you will have to plan for and prepare your replay environment using a BACKUP/RESTORE of the production database from a point within the captured workload so that the database can be replayed against without having problems associated with Primary Key constraint violations during the replay.

If you want to change any of the parameters associated with the replay operation, you can edit the DReplay.Exe.Replay.config file in the C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn path.  The default contents of the configuration file are shown below:

<?xml version="1.0" encoding="utf-8"?>
<Options>
    <ReplayOptions>
        <Server></Server>
        <SequencingMode>stress</SequencingMode>
        <ConnectTimeScale>100</ConnectTimeScale>
        <ThinkTimeScale>100</ThinkTimeScale>
        <HealthmonInterval>60</HealthmonInterval>
        <QueryTimeout>3600</QueryTimeout>
        <ThreadsPerClient>255</ThreadsPerClient>
        <EnableConnectionPooling>No</EnableConnectionPooling>
        <StressScaleGranularity>SPID</StressScaleGranularity>
    </ReplayOptions>
    <OutputOptions>
        <ResultTrace>
            <RecordRowCount>Yes</RecordRowCount>
            <RecordResultSet>No</RecordResultSet>
        </ResultTrace>
    </OutputOptions>
</Options>

Before performing the actual replay, make sure that the account being used to run the SQL Server Distributed Replay Client service has been granted appropriate access to the target SQL Server and database to be able to perform the replay operations.  Once this has been done replay can be performed using the command line options for DReplay.Exe by providing the appropriate switches, or you can alternately provide the –c command line switch to specify the configuration file that should be used for performing the replay.  If you change any of the default values listed above in the DReplay.Exe.Replay.config file, you will need to specify the –c command line switch for those to take effect.  To perform a replay with the defaults, the following command line execution can be run:

dreplay replay -s "SQL2012-DB1" -d "C:\DReplay" -w "SQL2012-DRU1, SQL2012-DRU2"

Once this is executed, the Distributed Replay Controller will take read in the preprocessed replay file, and then synchronize the replay across all of the clients specified with the –w command line parameter.  While the replay operation occurs, the command window for the controller will output periodic updates about the current status of the replay process.

image

The frequency of the status updates can be controlled using the –f command line switch to specify the number of seconds between each of the updates.  Each of the status updates will provide information about each of the clients including the total number of events that have been replayed, the success rate of the replay operations per client, as well as an estimate for the total amount of time remaining to complete the replay operation.  When the replay completes the total elapsed time and pass rate for the events is output.

image

In the next and final post in this series, we’ll look at some of the common problems with using Distributed Replay and how to resolve them, including manually configuring the Controller and add additional Client Service accounts to the environment after Setup has been completed.

Over time, I’ve had a number of reasons to need to run a random workload against SQL Server to be able to demonstrate troubleshooting, how SQLOS works, and most recently how to capture a Replay Trace for my series on the Distributed Replay Utility in SQL Server 2012.  For a while I’ve maintained a large workload script that I would run using multiple sqlcmd command line windows to fire off the workload, but one of the problems with this has been that it was incredibly predictable, and it didn’t scale the way I really wanted it to.

When I was working with Distributed Replay, this became somewhat problematic with generating a randomized workload to capture a Replay Trace off of, so I took a few hours and went back to the drawing board with my idea.  What I came up with was a large script file that contains all of the SELECT statement examples from the SQL Server Books Online (http://msdn.microsoft.com/en-us/library/ms187731.aspx).  This script is divided into separate sections using a delimiter, and then I wrote a PowerShell script that reads the file and breaks it down into individual scripts that are randomly executed against the configured SQL Server using SMO.

The two files required to make use of this are attached to this blog post and can be used with minimal modifications against any SQL Server 2008+ system that has the AdventureWorks2008R2 database attached to it.  To make use of the PowerShell script, you will either have to sign it, or if you work like I do in my VMs, allow unsigned script execution with Set-ExecutionPolicy Unrestricted.

The PowerShell script is incredibly simple code wise.  It loads the SMO assembly, splits the file contents on the delimiter, then inside a infinite loop, it picks a random query and executes it against the SQL Server.

# Load the SMO assembly
[void][reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo");

# Set the server to run the workload against
$ServerName = "SQL2012-DB1";

# Split the input on the delimeter
$Queries = Get-Content -Delimiter "------" -Path "AdventureWorks BOL Workload.sql"

WHILE(1 -eq 1)
{
    # Pick a Random Query from the input object
    $Query = Get-Random -InputObject $Queries;

    #Get a server object which corresponds to the default instance
    $srv = New-Object -TypeName Microsoft.SqlServer.Management.SMO.Server $ServerName

    # Use the AdventureWorks2008R2 database
    $srv.ConnectionContext.set_DatabaseName("AdventureWorks2008R2")

    # Execute the query with ExecuteNonQuery
    $srv.ConnectionContext.ExecuteNonQuery($Query);

    # Disconnect from the server
    $srv.ConnectionContext.Disconnect();
   
    # Sleep for 100 miliseconds between loops
    Start-Sleep -Milliseconds 100
}

To generate random workloads, I generally fire up 3-5 copies of this script on a client concurrently and leave it running in the background to generate the load. 

AdventureWorks BOL Workload.zip (6.35 kb)

This blog post is the first in a series of posts that will cover how to install, configure, and use the new Distributed Replay Utility in SQL Server 2012.  The distributed replay utility can be used to assess the impact of changes and upgrades by replaying workload activity against a test environment based on a replay trace captured from the current production SQL Server environment.  Prior to SQL Server 2012, two features existed that provided replay capabilities for performance benchmarking and stress testing SQL Server based on a replay workload; SQL Server Profiler trace replay and RML Utilities.  Unlike these two tools, Distributed Replay is not limited to replaying events from a single computer, which allows you to drive higher loads against the test environment to better simulate mission-critical workloads by driving load from multiple clients concurrently.

This blog post will show how to configure a Distributed Replay Controller, using a dedicated server named SQL2012-DRU, and two Distributed Replay Clients using dedicated servers named SQL2012-DRU1 and SQL2012-DRU2.  For the purposes of performing an actual replay using Distributed Replay in another post, a separate server named SQL2012-DB1 with the Database Engine Services and the AdventureWorks2008R2 database installed on it will be used.

Create Replay Service Accounts in Active Directory

The first step in installing and configuring Distributed Replay is to setup the necessary Active Directory service accounts to run the Distributed Replay Client and Distributed Replay Controller under.  For the purposes of service isolation, the Distributed Replay Clients and Distributed Replay Controller will make use of separate Active Directory service Accounts.  For the purposes of this blog post, two different accounts will be created.  The Distributed Replay Client will use the SQLskillsDemos\DRUClientSvcAcct as shown in the below screenshots.

image

The Distributed Replay Controller will use the SQLskillsDemos\DRUCtrlSvcAcct as shown in the below screenshots:

image

Install the Replay Controller

With the service accounts setup in Active Directory, we can begin the installation of our Distributed Replay Controller.  To do this run SQL Server 2012 Setup on the controller and perform a SQL Server Feature Installation.  On the Feature Selection page select the Distributed Replay Client feature.

image

On the Server Configuration page set the Service Account credentials to the previously created Active Directory account for the client; in this case SQLSKILLSDEMOS\DRUCtrlSvcAcct.

image

On the Distributed Replay Controller page, click the Add button and search directory services for the client service account previous created in Active Directory; in this case SQLSKILLSDEMOS\DRUClientSvcAcct.

image

Finish the installation and then close SQL Server Setup.

Install the Replay Clients

With the service accounts setup in Active Directory, we can begin the installation of our Distributed Replay Client machines.  To do this run SQL Server 2012 Setup on each of the clients to be configured and perform a SQL Server Feature Installation.  On the Feature Selection page select the Distributed Replay Client feature.

image

On the Server Configuration page set the Service Account credentials to the previously created Active Directory account for the client.  In this case SQLSKILLSDEMOS\DRUClientSvcAcct.

image

On the Distributed Replay Client page, type in the name of the server that you previously installed the Distributed Replay Controller service on in the Controller Name box.

image

Finish the installation and then close SQL Server Setup.

Configure the Windows Firewall for the Services

In order for the Distributed Replay Clients (SQL2012-DRU1 and SQL2012-DRU2) to connect and register with the Distributed Replay Controller, in this case SQL2012-DRU, the firewall must be configured to allow inbound connections for the DReplayClient.exe application on each of the client machines, as well as for the DReplayController.exe application on the Distributed Replay Controller.  To do this, firewall rules need to be added through the use of the Windows Firewall with Advanced Security snapin, available through Start | Administrative Tools | Windows Firewall with Advanced Security, or through the use of NETSH command line statement executions (detailed later in this blog post). 

To add a new firewall rule for the DReplay Client using the Windows Firewall with Advanced Security snapin, Right-Click on Inbound Rules and click on the New Rule menu item.  Specify Program for the Rule Type and click Next.

image

On the Program page, click Browse and navigate to C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient and select the DReplayClient.exe executable.

image

On the Action page select the option Allow the connection and then click Next.

image

On the Profile page, select the appropriate network profiles/locations for the environment, generally Domain would be sufficient for most corporate domains, and then click Next.

image

Finally provide a Name for the rule and click Finish to make the firewall changes.

image

To setup the firewall rule for the Distributed Replay Controller, follow the same steps but instead of selecting the DReplayClient folder and DReplayClient.exe application, select the DReplayController folder and DReplayController.exe application

image

These rules can also be added using NETSH from the command line using the following commands:

NETSH advfirewall firewall add rule name="Allow DReplay Client" dir=in program="C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\DReplayClient.exe" action=allow

NETSH advfirewall firewall add rule name="Allow DReplay Controller" dir=in program="C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayController\DReplayController.exe" action=allow

Start the Services and Verify Client Registrations

The final step in the process of installing and configuring SQL Server 2012 Distributed Replay is to start the services and verify that the Distributed Replay Clients register with the Distributed Replay Controller correctly.  To do this start the “Distributed Replay Controller” service from the Services.msc snap-in or from the command line using NET START on the controller server.  Then start the “Distributed Replay Client” service on from the Services.msc snap-in or from the command line using NET START on each of the client machines.  As each of the client services starts, verify that the client was able to successfully register itself with the controller by reading the log file, which is located in the following path:

C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\Log

What you want to see in the DReplay Client Log file is:

2011-11-09 00:41:09:162 OPERATIONAL  [Client Service]      Microsoft SQL Server Distributed Replay Client - 11.0.1440.19.
2011-11-09 00:41:09:162 OPERATIONAL  [Client Service]      (c) Microsoft Corporation.
2011-11-09 00:41:09:162 OPERATIONAL  [Client Service]      All rights reserved.
2011-11-09 00:41:09:178 OPERATIONAL  [Client Service]      Current edition is: [Enterprise Evaluation Edition].
2011-11-09 00:41:09:178 OPERATIONAL  [Common]              Initializing dump support.
2011-11-09 00:41:09:178 OPERATIONAL  [Common]              Dump support is ready.
2011-11-09 00:41:09:193 OPERATIONAL  [Client Service]      Windows service "Microsoft SQL Server Distributed Replay Client" has started under service account "SQLSKILLSDEMOS\DRUClientSvcAcct". Process ID is 2928.
2011-11-09 00:41:09:193 OPERATIONAL  [Client Service]      Time Zone: Eastern Standard Time.
2011-11-09 00:41:09:193 OPERATIONAL  [Client Service]      Controller name is "SQL2012-DRU".
2011-11-09 00:41:09:193 OPERATIONAL  [Client Service]      Working directory is "C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\WorkingDir".
2011-11-09 00:41:09:193 OPERATIONAL  [Client Service]      Result directory is "C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\ResultDir".
2011-11-09 00:41:09:193 OPERATIONAL  [Client Service]      Heartbeat Frequency(ms): 3000
2011-11-09 00:41:09:193 OPERATIONAL  [Client Service]      Heartbeats Before Timeout: 3
2011-11-09 00:41:09:367 OPERATIONAL  [Client Service]      Registered with controller "SQL2012-DRU".

What you don’t want to see in the DReplay Client Log file is:

2011-11-09 00:40:50:207 OPERATIONAL  [Client Service]      Microsoft SQL Server Distributed Replay Client - 11.0.1440.19.
2011-11-09 00:40:50:223 OPERATIONAL  [Client Service]      (c) Microsoft Corporation.
2011-11-09 00:40:50:223 OPERATIONAL  [Client Service]      All rights reserved.
2011-11-09 00:40:50:223 OPERATIONAL  [Client Service]      Current edition is: [Enterprise Evaluation Edition].
2011-11-09 00:40:50:223 OPERATIONAL  [Common]              Initializing dump support.
2011-11-09 00:40:50:239 OPERATIONAL  [Common]              Dump support is ready.
2011-11-09 00:40:50:239 OPERATIONAL  [Client Service]      Windows service "Microsoft SQL Server Distributed Replay Client" has started under service account "SQLSKILLSDEMOS\DRUClientSvcAcct". Process ID is 776.
2011-11-09 00:40:50:239 OPERATIONAL  [Client Service]      Time Zone: Eastern Standard Time.
2011-11-09 00:40:50:239 OPERATIONAL  [Client Service]      Controller name is "SQL2012-DRU".
2011-11-09 00:40:50:239 OPERATIONAL  [Client Service]      Working directory is "C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\WorkingDir".
2011-11-09 00:40:50:239 OPERATIONAL  [Client Service]      Result directory is "C:\Program Files (x86)\Microsoft SQL Server\110\Tools\DReplayClient\ResultDir".
2011-11-09 00:40:50:239 OPERATIONAL  [Client Service]      Heartbeat Frequency(ms): 3000
2011-11-09 00:40:50:239 OPERATIONAL  [Client Service]      Heartbeats Before Timeout: 3
2011-11-09 00:40:50:476 CRITICAL     [Client Service]     [0xC8100013] Failed to register with controller SQL2012-DRU.

If the client fails to register, and the controller name is correct, you need to verify first that the firewall rules have been established for the client to allow the appropriate inbound connections from the controller service and that the firewall rules have been established for the controller to allow the appropriate inbound connections from the client services.  If this all checks out, you need to go troubleshoot the permissions and setup of the services in the environment, which I’ll cover in a later post.

Tonight, a question was asked on the #sqlhelp tag on Twitter about how to capture all of the output from a stored procedures execution, to include the informational and error message outputs that may be returned by using PRINT or RAISERROR in the code.  This was a problem I ran into in the past with C# and the way you do it in .NET is to handle the InfoMessage Event for the SqlConnection object using a SqlInfoMessageEventHandler in your code.  The same thing can be done in PowerShell to have these messages written out to the output of a scripts execution.

$conn = New-Object System.Data.SqlClient.SqlConnection "Server=.\LAB1;Database=tempdb;Integrated Security=SSPI;";
$conn.Open();

## Standard default connection with no EventHandler
$cmd = $conn.CreateCommand();
$cmd.CommandText = "PRINT 'This is the message from the PRINT statement'";
$res = $cmd.ExecuteNonQuery();
$cmd.CommandText = "RAISERROR('This is the message from the RAISERROR statement', 10, 1)"; 
$res = $cmd.ExecuteNonQuery();
$conn.Close();

## Attach the InfoMessage Event Handler to the connection to write out the messages
$handler = [System.Data.SqlClient.SqlInfoMessageEventHandler] {param($sender, $event) Write-Host $event.Message };
$conn.add_InfoMessage($handler);
$conn.FireInfoMessageEventOnUserErrors = $true;

$conn.Open();

$cmd = $conn.CreateCommand();
$cmd.CommandText = "PRINT 'This is the message from the PRINT statement'";
$res = $cmd.ExecuteNonQuery();
$cmd.CommandText = "RAISERROR('This is the message from the RAISERROR statement', 10, 1)"; 
$res = $cmd.ExecuteNonQuery();
$conn.Close();

The bold code region above creates the EventHandler so that it writes the Message out to the host and then adds the handler to the InfoMessage Event for the connection.  The last line of code specifies for the event to fire on lower priority user errors as well.

While working on validating my demos for the 24 Hours of PASS and my PASS Summit 2011 Precon – Extended Events Deep Dive, I noticed a significant, and breaking change to the Event XML output for the raw event data in the ring_buffer and file_target in SQL Server Denali.  In SQL Server 2008 and 2008R2, the Event XML represented the output of XML data elements differently than in SQL Server Denali CTP3.  A good example of this is the xml_deadlock_report output, which I previously discussed in my SQL Server Central article,

In SQL Server 2008 and 2008R2, the query to retrieve the deadlock graph from the system_health session was (excluding the work around that was included in the original article since the xml_deadlock_report was fixed in a later Cumulative Update and the latest Service Pack for SQL Server 2008 and 2008 R2).

SELECT
    CAST(event_data.value('(data/value)[1]', 'varchar(max)')) AS XML) AS DeadlockGraph
FROM
(   SELECT XEvent.query('.') AS event_data
    FROM
    (    -- Cast the target_data to XML
        SELECT CAST(target_data AS XML) AS TargetData
        FROM sys.dm_xe_session_targets st
        JOIN sys.dm_xe_sessions s
            ON s.address = st.event_session_address
        WHERE name = 'system_health'
          AND target_name = 'ring_buffer'
    ) AS Data
    -- Split out the Event Nodes
    CROSS APPLY TargetData.nodes ('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData (XEvent)  
) AS tab (event_data)

If you run this same code in SQL Server Denali CTP3, the output will not be the xml_deadlock_report but instead the textual data that was included in the sub-nodes of the value node for the xml_deadlock_report events in the target.  This unfortunately has broken a number of my scripts that were initially written for SQL Server 2008 and 2008R2 that expect the XML output as text in the value element as follows:

<event name="xml_deadlock_report" package="sqlserver" id="123" version="1" timestamp="2011-09-15T23:29:02.851Z">
  <data name="xml_report">
    <type name="unicode_string" package="package0" />
    <value>&lt;deadlock-list&gt;
&lt;victim-list&gt;
  &lt;victimProcess id="process806e2088"/&gt;
  &lt;process-list&gt;
   &lt;process id="process806e2088" taskpriority="0" logused="10000" waitresource="DATABASE: 15 " waittime="1477" schedulerid="2" kpid="3720" status="suspended" spid="57" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2011-09-15T19:29:01.370" lastbatchcompleted="2011-09-15T19:27:21.193" clientapp="Microsoft SQL Server Management Studio - Query" hostname="SQL2K8R2-IE2" hostpid="4464" loginname="SQLSKILLSDEMOS\administrator" isolationlevel="read committed (2)" xactid="68641" currentdb="15" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200"&gt;
    &lt;executionStack&gt;
     &lt;frame procname="" line="1" sqlhandle="0x01000100721ac42240ff1285000000000000000000000000"&gt;
     &lt;/frame&gt;
    &lt;/executionStack&gt;
    &lt;inputbuf&gt;
ALTER DATABASE DemoNCIndex SET MULTI_USER
    &lt;/inputbuf&gt;
   &lt;/process&gt;
   &lt;process id="process469b88" taskpriority="0" logused="10000" waitresource="DATABASE: 15 " waittime="1892" schedulerid="2" kpid="4188" status="suspended" spid="58" sbid="0" ecid="0" priority="0" trancount="0" lastbatchstarted="2011-09-15T19:29:00.957" lastbatchcompleted="2011-09-15T19:29:00.947" clientapp="Microsoft SQL Server Management Studio - Transact-SQL IntelliSense" hostname="SQL2K8R2-IE2" hostpid="4464" loginname="SQLSKILLSDEMOS\administrator" isolationlevel="read committed (2)" xactid="68638" currentdb="15" lockTimeout="4294967295" clientoption1="671088672" clientoption2="128056"&gt;
    &lt;executionStack&gt;
     &lt;frame procname="" line="1" sqlhandle="0x010001008af5b714605a1f85000000000000000000000000"&gt;
     &lt;/frame&gt;
    &lt;/executionStack&gt;
    &lt;inputbuf&gt;
use [DemoNCIndex]    &lt;/inputbuf&gt;
   &lt;/process&gt;
  &lt;/process-list&gt;
  &lt;resource-list&gt;
   &lt;databaselock subresource="FULL" dbid="15" dbname="" id="lock83168d80" mode="S"&gt;
    &lt;owner-list&gt;
     &lt;owner id="process469b88" mode="S"/&gt;
    &lt;/owner-list&gt;
    &lt;waiter-list&gt;
     &lt;waiter id="process806e2088" mode="X" requestType="wait"/&gt;
    &lt;/waiter-list&gt;
   &lt;/databaselock&gt;
   &lt;databaselock subresource="FULL" dbid="15" dbname="" id="lock83168d80" mode="S"&gt;
    &lt;owner-list&gt;
     &lt;owner id="process806e2088" mode="S"/&gt;
     &lt;owner id="process806e2088" mode="S"/&gt;
    &lt;/owner-list&gt;
    &lt;waiter-list&gt;
     &lt;waiter id="process469b88" mode="X" requestType="wait"/&gt;
    &lt;/waiter-list&gt;
   &lt;/databaselock&gt;
  &lt;/resource-list&gt;
&lt;/deadlock&gt;
&lt;/deadlock-list&gt;
</value>
    <text />
  </data>
</event>

Instead in SQL Server Denali CTP3, the event output is as follows:

<event name="xml_deadlock_report" package="sqlserver" timestamp="2011-09-17T18:49:03.654Z">
  <data name="xml_report">
    <type name="xml" package="package0" />
    <value>
      <deadlock>
        <victim-list>
          <victimProcess id="processf7034a18" />
        </victim-list>
        <process-list>
          <process id="processf7034a18" taskpriority="0" logused="144" waitresource="RID: 2:1:281:0" waittime="2394" ownerId="162349" transactionname="user_transaction" lasttranstarted="2011-09-17T11:48:48.410" XDES="0xff047120" lockMode="S" schedulerid="2" kpid="692" status="suspended" spid="58" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2011-09-17T11:49:01.247" lastbatchcompleted="2011-09-17T11:48:48.410" lastattention="2011-09-17T11:39:47.393" clientapp="Microsoft SQL Server Management Studio - Query" hostname="WIN-QSTGAPD63IN" hostpid="3004" loginname="WIN-QSTGAPD63IN\Administrator" isolationlevel="read committed (2)" xactid="162349" currentdb="2" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
            <executionStack>
              <frame procname="adhoc" line="1" sqlhandle="0x02000000303b01237c6994b4eab30fb77cbb5a8e46f2b2540000000000000000000000000000000000000000">
SELECT Column2
FROM TableB    </frame>
            </executionStack>
            <inputbuf>
SELECT Column2
FROM TableB   </inputbuf>
          </process>
          <process id="processf7035168" taskpriority="0" logused="144" waitresource="RID: 2:1:271:0" waittime="7494" ownerId="162369" transactionname="user_transaction" lasttranstarted="2011-09-17T11:48:53.693" XDES="0xf7044dd0" lockMode="S" schedulerid="2" kpid="3244" status="suspended" spid="60" sbid="0" ecid="0" priority="0" trancount="1" lastbatchstarted="2011-09-17T11:48:56.150" lastbatchcompleted="2011-09-17T11:48:53.693" lastattention="1900-01-01T00:00:00.693" clientapp="Microsoft SQL Server Management Studio - Query" hostname="WIN-QSTGAPD63IN" hostpid="3004" loginname="WIN-QSTGAPD63IN\Administrator" isolationlevel="read committed (2)" xactid="162369" currentdb="2" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
            <executionStack>
              <frame procname="adhoc" line="2" stmtstart="4" sqlhandle="0x020000002e8952007a6c36a78a2aa436877a27f57a0725c80000000000000000000000000000000000000000">
SELECT Column1
FROM TableA    </frame>
            </executionStack>
            <inputbuf>

SELECT Column1
FROM TableA   </inputbuf>
          </process>
        </process-list>
        <resource-list>
          <ridlock fileid="1" pageid="281" dbid="2" objectname="tempdb.dbo.TABLEB" id="lockf7d4ff80" mode="X" associatedObjectId="2161727822326792192">
            <owner-list>
              <owner id="processf7035168" mode="X" />
            </owner-list>
            <waiter-list>
              <waiter id="processf7034a18" mode="S" requestType="wait" />
            </waiter-list>
          </ridlock>
          <ridlock fileid="1" pageid="271" dbid="2" objectname="tempdb.dbo.TABLEA" id="lockf7d51380" mode="X" associatedObjectId="2089670228247904256">
            <owner-list>
              <owner id="processf7034a18" mode="X" />
            </owner-list>
            <waiter-list>
              <waiter id="processf7035168" mode="S" requestType="wait" />
            </waiter-list>
          </ridlock>
        </resource-list>
      </deadlock>
    </value>
  </data>
</event>

If you compare the two bold sections to each other you will notice the difference.  In SQL Server 2008 and 2008R2, the value element is XML escaped entirely as text, but in SQL Server Denali CTP3, the value attribute contains a valid XML document as a child node in the XML itself.  This has a significant impact to how you actually access the XML data in Denali CTP3.  To read the XML Document, you have to switch from using the .value() XML function along with a CAST() operation to using a .query() operation on the Event XML specifying the deadlock node as a part of the .query() XPATH for it as shown in the following code example:

SELECT
    event_data.query('(event/data/value/deadlock)[1]') AS DeadlockGraph
FROM
(   SELECT XEvent.query('.') AS event_data
    FROM
    (    -- Cast the target_data to XML
        SELECT CAST(target_data AS XML) AS TargetData
        FROM sys.dm_xe_session_targets st
        JOIN sys.dm_xe_sessions s
            ON s.address = st.event_session_address
        WHERE name = 'system_health'
          AND target_name = 'ring_buffer'
    ) AS Data
    -- Split out the Event Nodes
    CROSS APPLY TargetData.nodes ('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData (XEvent)  
) AS tab (event_data)

This same thing applies to all of the XML data elements including the sqlserver.tsql_stack and sqlserver.tsql_frame Actions.  In addition other actions such as the sqlserver.plan_handle have similar changes that require changing the code to process the Event XML to capture the values being output.

For the last few years, I have always had a personal virtual playground for SQL Server setup on my laptop that allowed me to not only learn about new features in SQL Server, but also demonstrate complex topics hands on during presentations.  I recently had to rebuild my playground from the ground up and as a part of doing this; I decided that I would blog about all the steps I took to build a completely FREE environment to play with SQL Server, including a virtual iSCSI “SAN” to support failover clustering.  I have been using VMware Workstation personally for the last few years as the platform for my playground, but only because I received a free license a few years ago when I spoke about virtualizing SQL Server using ESX at the VMware Open Forum in Orlando, FL.  One of the requirements for building this playground is that the virtual machines have to support 64 bit guests, which is not supported by Microsoft Virtual PC but is supported by VMware Workstation.  However, as much as I love my copy of VMware Workstation, it isn’t free and you may not have the budget to purchase it as a base platform, so for the purposes of this series I decided to go with VirtualBox, which is a completely free alternative that also supports 64 bit guest VMs.

To get started you are going to have to download and install VirtualBox from their site:

http://www.virtualbox.org/

While we are covering downloads, the following additional items will be needed to setup the environment:

Windows Server 2008 R2 with Service Pack 1 Evaluation Edition
SQL Server 2008 R2 Evaluation Edition
SQL Server Denali CTP3
Microsoft iSCSI Software Target 3.3 for Windows Server 2008 R2
VMware ESXi VSphere 4.1 Evaluation 

With VirtualBox installed, I start out by building a “base” VM installation that I clone to create all of the other VMs that I need in my playground.  By creating a standardized “base” or template VM I save a ton of time in the long term by not having to install an patch Windows Server 2008 R2 a half dozen times, I can do it once and then by using the sysprep functionality in Windows, I can reuse the image for each of my servers.  I can also add the basic set of features to the template, for example .NET Framework 3.5.1, Failover Clustering, and Multipath I/O to keep from having to manually add them to my VMs that specifically need them.  Some of the VMs that get created from the clone will not need these features specifically, for example the Active Directory Domain Controller we are going to create, but these features don’t impact the ability of the machine to perform as a Domain Controller for the environment.

To get started building the base VM template, create a new VM in VirtualBox (Machine > New) and configure the VM options.  For my template, I chose to allocate 1024MB RAM and create a dynamically expanding virtual hard disk that was 30GB in size, using the Windows Server 2008 x64 template for the VM.  Once the VM is created, you will need to edit the VMs settings to configure the additional networking, and if appropriate additional vCPUs.   For the Networking configuration, I leave NAT enabled for Adapter 1, and then configure internal networking for Adapter 2, 3 and 4 with Adapter 2 as a Internal Network named “Domain Network” and Adapter 3 and 4 on another Internal Network named “iSCSI Network”. 

image      image

Once the virtual networking has been configured for the VM, the last thing to do before powering the VM on is to set the CD/DVD device to the Windows Server 2008 R2 Evaluation Edition ISO file that was downloaded from Microsoft.

image

Once the VM boots it will immediately begin running the Windows Server 2008 Setup off the ISO and you will be on your way to setting up the template VM to base the rest of the servers needed in the environment off of.  Once Windows Server 2008 R2 is installed, I add the three features I mentioned previously in this post to the VM; .NET Framework 3.5.1, Failover Clustering, and Multipath I/O.  Once these have been installed, I run Windows Update to download and apply the most recent updates for Windows to the Server and then allow it to reboot.  Once the VM reboots, depending on how anal retentive I intend to be with the environment, I may or may not disable the Windows Firewall, but as a best practice it should be left turned on.  After I complete the configuration of the “base” template I want for the OS, it is time to run sysprep and reset the VM image for cloning to create multiple VMs based on its image.  Sysprep can be executed from Start > Run.  The options for sysprep should be set to “Enter System Out-of-Box Experience (OOBE)” and the checkbox for Generalize should be checked.

image

Once the VM shuts down it can be used to create clone VMs to build out the necessary infrastructure for the playground.  In the next post in this series, we’ll build the VM clones for the environments Active Directory Domain Controller and iSCSI “virtual” SAN using the Windows Server iSCSI Target.

At SQL Connections, I presented a session titled “Learn SQL Server Internals with Extended Events” where I demonstrated a number ways to use Extended Events to learn about the internal workings of the database engine for SQL Server.  The morning of the session I was chatting with someone about a problem they had seen and the topic of proportional fill came up and how the database engine stripes data across multiple files in a user database.  From this discussion, I got the idea to play around with multiple database files and built a demo using Extended Events that showed how proportional fill worked inside of the database engine.  This wasn’t a planned demo for my presentation, and I had plenty of other demo’s that showed various SQL Server Internals, but it became a really good demo that I decided to throw into the mix, which put me way over budget for time.  I decided to leave it up the audience which demo they wanted to see for the last demo of the session, an originally planned one, or the one I wrote that morning for proportional fill; the majority wanted to see the one on proportional fill and it turned out to be the best demo of the entire session based on the crowd interest and feedback.  What was most interesting to me was the number of people attending the session that had never heard of the concept of proportional fill with SQL Server.  Proportional fill is the algorithm used by SQL Server to determine how much information is written to each of the files in a multi-file filegroup based on the proportion of free space within each file; which allows the files to become full at approximately the same time.  Proportional fill has nothing to do with the actual file sizes, it is strictly based on the free space within a file. 

To demonstrate proportional fill in SQL Server, we’ll take a look at how the page writes and I/O operations are distributed across varying configurations for a test database using the same test and event session for each configuration

Basic Example

To look at how proportional fill functions, we’ll start with a basic example, using a database with a separate filegroup for user objects that has been marked as the default filegroup for the database.  The UserObjects filegroup will have four data files, each 32MB in size.

-- Create a multi-file database with evenly sized files
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects2', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects3', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects4', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB )
 LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO

With the database created, we’ll create a table to load data into for our tests based on the AdventureWorks SalesOrderHeader table which will be the source for the test data:

- Create a table to load data into for the tests
USE [MultipleDataFiles]
GO
IF OBJECT_ID('SalesOrderHeader') IS NOT NULL
    DROP TABLE [dbo].[SalesOrderHeader]
GO
SET NOCOUNT ON
GO
CREATE TABLE [dbo].[SalesOrderHeader](
    [SalesOrderID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [RevisionNumber] [tinyint] NOT NULL,
    [OrderDate] [datetime] NOT NULL,
    [DueDate] [datetime] NOT NULL,
    [ShipDate] [datetime] NULL,
    [Status] [tinyint] NOT NULL,
    [OnlineOrderFlag] [bit] NOT NULL,
    [SalesOrderNumber] [nvarchar](25) NOT NULL,
    [PurchaseOrderNumber] [nvarchar](25) NULL,
    [AccountNumber] [nvarchar](15) NULL,
    [CustomerID] [int] NOT NULL,
    [SalesPersonID] [int] NULL,
    [TerritoryID] [int] NULL,
    [BillToAddressID] [int] NOT NULL,
    [ShipToAddressID] [int] NOT NULL,
    [ShipMethodID] [int] NOT NULL,
    [CreditCardID] [int] NULL,
    [CreditCardApprovalCode] [varchar](15) NULL,
    [CurrencyRateID] [int] NULL,
    [SubTotal] [money] NOT NULL,
    [TaxAmt] [money] NOT NULL,
    [Freight] [money] NOT NULL,
    [TotalDue] [money] NOT NULL,
    [Comment] [nvarchar](128) NULL,
    [rowguid] [uniqueidentifier] NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
) 
GO

Now we’ll create our event session using dynamic SQL to add the database_id for our test database to the predicate for each of the events, restricting them to firing only for our test database to minimize the need to filter through the event session data later on.  The event session is going to collect the sqlserver.checkpoint_begin, sqlserver.checkpoint_end. sqlserver.file_written, sqlserver.file_write_completed, sqlserver.physical_page_write, sqlos.async_io_requested, and sqlos.async_io_completed events.  The checkpoint events are included in the event session to show that writes don’t immediately begin to occur to the data files, but instead occur in response to the checkpoint operations in the database engine.

-- Create our Event Session dynamically
IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='MultipleDataFiles')
    DROP EVENT SESSION [MultipleDataFiles] ON SERVER;
DECLARE @sqlcmd nvarchar(4000) = '
CREATE EVENT SESSION MultipleDataFiles
ON SERVER
ADD EVENT sqlserver.checkpoint_begin
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.checkpoint_end
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_written
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_write_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.physical_page_write
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_requested
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+'))--,
ADD TARGET package0.asynchronous_file_target(
     SET filename=''C:\SQLskills\MultipleDataFiles.xel'',
         metadatafile=''C:\SQLskills\MultipleDataFiles.xem'')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY=5SECONDS)'
EXEC (@sqlcmd)

With the event session created we can run our data load:

-- Start the Event Session
ALTER EVENT SESSION MultipleDataFiles
ON SERVER
STATE=START
GO
 
-- Load data into the test database
INSERT INTO dbo.SalesOrderHeader
SELECT RevisionNumber, 
    DATEADD(DD, 1126+number, OrderDate), 
    DATEADD(DD, 1126+number, DueDate), 
    DATEADD(DD, 1126+number, ShipDate), 
    soh.Status, OnlineOrderFlag, SalesOrderNumber, 
    PurchaseOrderNumber, AccountNumber, 
    CustomerID, SalesPersonID, TerritoryID, 
    BillToAddressID, ShipToAddressID, 
    ShipMethodID, CreditCardID, CreditCardApprovalCode, 
    CurrencyRateID, SubTotal, TaxAmt, Freight, 
    TotalDue, Comment, rowguid, 
    DATEADD(DD, 1126+number, ModifiedDate)
FROM AdventureWorks2008R2.Sales.SalesOrderHeader AS soh
CROSS JOIN master.dbo.spt_values AS sv
WHERE sv.type = N'P'
  AND sv.number > 0 AND sv.number < 6
GO 3
 
-- Flush all dirty pages to disk
CHECKPOINT
GO
 
-- Stop the Event Session
ALTER EVENT SESSION MultipleDataFiles
ON SERVER
STATE=STOP
GO

The above script will load roughly 92MB of data into the test table which makes the insert operation smaller than the size of all four data files.  Note that there is an explicit checkpoint in the test to force all dirty pages to be written to the appropriate data files on disk.  Without this checkpoint, the file writes may appear to be imbalanced incorrectly due to the timing of the last automatic checkpoint and the dirty pages in cache when it occurred.  By manually checkpointing the database before ending the event session we ensure we have the file writes captured accurately.  To view the information captured by the event session we will read the event data into a table and then shred the XML into another intermediate table to allow for further analysis of the detailed information if you so desire.  Finally we’ll aggregate the events and pivot the results based on the file_id to see how SQL Server wrote to the database files for the database. 

-- Drop Results tables if they exist
IF OBJECT_ID('MultipleDataFileResults') IS NOT NULL
    DROP TABLE MultipleDataFileResults 
GO
IF OBJECT_ID('MultipleDataFileResultsParsed') IS NOT NULL
    DROP TABLE MultipleDataFileResultsParsed 
GO
 
-- Create results table to load data from XE files
CREATE TABLE MultipleDataFileResults
(RowID int identity primary key, event_data XML)
GO
 
-- Load the event data from the file target
INSERT INTO MultipleDataFileResults(event_data)
SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\SQLskills\MultipleDataFiles*.xel', 
                                     'C:\SQLskills\MultipleDataFiles*.xem', 
                                     null, null)
GO
 
-- Parse the event data
SELECT 
    RowID,
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
             event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/data[@name="mode"]/text)[1]', 'nvarchar(4000)') AS [mode],
    event_data.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(4000)') AS [file_handle],
    event_data.value('(event/data[@name="offset"]/value)[1]', 'bigint') AS [offset],
    event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
    event_data.value('(event/data[@name="file_group_id"]/value)[1]', 'int') AS [file_group_id],
    event_data.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(100)') AS [wait_type],
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS [sql_text],
    event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
    event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
    event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
INTO MultipleDataFileResultsParsed
FROM MultipleDataFileResults
ORDER BY Rowid
 
-- Aggregate the results by the event name and file_id
SELECT file_id, [file_write_completed],[file_written], [physical_page_write]
FROM
(    SELECT event_name, file_id, COUNT(*) AS occurences
    FROM MultipleDataFileResultsParsed
    WHERE event_name IN ('file_write_completed', 'file_written', 'physical_page_write')
    GROUP BY event_name, file_id
) AS tab
PIVOT
(    MAX(occurences) 
    FOR event_name IN ([file_write_completed],[file_written], [physical_page_write])) AS pvt

The output of our pivot operation shows that the four data files in the UserObjects filegroup are written to relatively evenly, which should be expected based on the amount of free space being equal across the four data files.  One item that should become incredibly apparent is the importance of the transaction log which has twenty-three times the file writes occurring to it than the nearest data file. 

image

Different file size but same free space

As previously stated in the intro to this blog post, proportional fill is based on the amount of free space in each file in relation to the other files and not the actual file sizes themselves.  To demonstrate this, we’ll create a single file database with our table and then load approximately 31MB of data into the table.  Then we’ll increase the size of the first file to 63MB and add three additional files that are 32MB each to the database.

-- Delete target files from previous tests
EXECUTE sp_configure 'show advanced options', 1; RECONFIGURE;
EXECUTE sp_configure 'xp_cmdshell', 1; RECONFIGURE; 
EXEC xp_cmdshell 'DEL C:\SQLskills\MultipleDataFiles*';
EXECUTE sp_configure 'xp_cmdshell', 0; RECONFIGURE;
EXECUTE sp_configure 'show advanced options', 0; RECONFIGURE;
 
-- Drop the test database from the server
USE [master]
GO
IF DB_ID('MultipleDataFiles') IS NOT NULL
BEGIN
    ALTER DATABASE [MultipleDataFiles] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [MultipleDataFiles];
END
GO
-- Create a single-file database 
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB )
LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO
 
-- Create a table to load data into for the tests
USE [MultipleDataFiles]
GO
CREATE TABLE [dbo].[SalesOrderHeader](
    [SalesOrderID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [RevisionNumber] [tinyint] NOT NULL,
    [OrderDate] [datetime] NOT NULL,
    [DueDate] [datetime] NOT NULL,
    [ShipDate] [datetime] NULL,
    [Status] [tinyint] NOT NULL,
    [OnlineOrderFlag] [bit] NOT NULL,
    [SalesOrderNumber] [nvarchar](25) NOT NULL,
    [PurchaseOrderNumber] [nvarchar](25) NULL,
    [AccountNumber] [nvarchar](15) NULL,
    [CustomerID] [int] NOT NULL,
    [SalesPersonID] [int] NULL,
    [TerritoryID] [int] NULL,
    [BillToAddressID] [int] NOT NULL,
    [ShipToAddressID] [int] NOT NULL,
    [ShipMethodID] [int] NOT NULL,
    [CreditCardID] [int] NULL,
    [CreditCardApprovalCode] [varchar](15) NULL,
    [CurrencyRateID] [int] NULL,
    [SubTotal] [money] NOT NULL,
    [TaxAmt] [money] NOT NULL,
    [Freight] [money] NOT NULL,
    [TotalDue] [money] NOT NULL,
    [Comment] [nvarchar](128) NULL,
    [rowguid] [uniqueidentifier] NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
) 
GO
 
-- Load ~31MB of data into the test database
INSERT INTO dbo.SalesOrderHeader
SELECT RevisionNumber, 
    DATEADD(DD, 1126+number, OrderDate), 
    DATEADD(DD, 1126+number, DueDate), 
    DATEADD(DD, 1126+number, ShipDate), 
    soh.Status, OnlineOrderFlag, SalesOrderNumber, 
    PurchaseOrderNumber, AccountNumber, 
    CustomerID, SalesPersonID, TerritoryID, 
    BillToAddressID, ShipToAddressID, 
    ShipMethodID, CreditCardID, CreditCardApprovalCode, 
    CurrencyRateID, SubTotal, TaxAmt, Freight, 
    TotalDue, Comment, rowguid, 
    DATEADD(DD, 1126+number, ModifiedDate)
FROM AdventureWorks2008R2.Sales.SalesOrderHeader AS soh
CROSS JOIN master.dbo.spt_values AS sv
WHERE sv.type = N'P'
  AND sv.number > 0 AND sv.number < 6
 
-- Grow the first data file to 63MB leaving 32MB free space
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILE (    NAME = N'MultipleDataFiles_UserObjects1', 
                SIZE = 64512KB )
GO
 
-- Add second file with 32MB size
ALTER DATABASE [MultipleDataFiles] 
ADD FILE (    NAME = N'MultipleDataFiles_UserObjects2', 
            FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
            SIZE = 32768KB , 
            FILEGROWTH = 32768KB ) 
TO FILEGROUP [UserObjects]
GO
 
-- Add third file with 32MB size
ALTER DATABASE [MultipleDataFiles] 
ADD FILE (    NAME = N'MultipleDataFiles_UserObjects3', 
            FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
            SIZE = 32768KB , 
            FILEGROWTH = 32768KB ) 
TO FILEGROUP [UserObjects]
GO
 
-- Add fourth file with 32MB size
ALTER DATABASE [MultipleDataFiles] 
ADD FILE (    NAME = N'MultipleDataFiles_UserObjects4', 
            FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
            SIZE = 32768KB , 
            FILEGROWTH = 32768KB ) 
TO FILEGROUP [UserObjects]
GO

With new database setup, the exact same test from the first demo can be run to view how proportional fill functions with one data file larger than the others, but with the same free space.

image

The impact of different free space amounts

Since proportional fill is free space based, lets look at the impact that having different free space in one file has to the writes that occur.  To setup this test, the database will be created with one file sized at 64MB and remaining files sized at 32MB.

-- Delete target files from previous tests
EXECUTE sp_configure 'show advanced options', 1; RECONFIGURE;
EXECUTE sp_configure 'xp_cmdshell', 1; RECONFIGURE; 
EXEC xp_cmdshell 'DEL C:\SQLskills\MultipleDataFiles*';
EXECUTE sp_configure 'xp_cmdshell', 0; RECONFIGURE;
EXECUTE sp_configure 'show advanced options', 0; RECONFIGURE;
 
-- Drop the database from the server
USE [master]
GO
IF DB_ID('MultipleDataFiles') IS NOT NULL
BEGIN
    ALTER DATABASE [MultipleDataFiles] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [MultipleDataFiles];
END
GO
-- Create a multi-file database with one file larger than the others
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 65536KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects2', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects3', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects4', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB )
 LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO

The exact same test can be rerun and when the event data is parsed, the 64MB file will show roughly twice as many write operations and pages as the 32MB files, right in proportion to its free space.

  image

The impact of autogrowth

One question I had after beginning to look at this was, what impact autogrowth would have on data files that were evenly sized, and using the same sizes for autogrowth? We could easily test this by upping the number of executions for our insert operation from the original test to force the database to grow, but I was really interested in the impact over repeated growth operations, and being impatient I wasn’t really willing to wait for large insert operations to complete.  I instead went back a recreated the database using 8MB data files set to grow by 8MB.  Then I changed the GO 3 batch terminator for the INSERT in the test to a GO 5 to retest the impact of autogrowth.

-- Delete target files from previous tests
EXECUTE sp_configure 'show advanced options', 1; RECONFIGURE;
EXECUTE sp_configure 'xp_cmdshell', 1; RECONFIGURE; 
EXEC xp_cmdshell 'DEL C:\SQLskills\MultipleDataFiles*';
EXECUTE sp_configure 'xp_cmdshell', 0; RECONFIGURE;
EXECUTE sp_configure 'show advanced options', 0; RECONFIGURE;
 
-- Drop the database from the server
USE [master]
GO
IF DB_ID('MultipleDataFiles') IS NOT NULL
BEGIN
    ALTER DATABASE [MultipleDataFiles] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [MultipleDataFiles];
END
GO
-- Create a multi-file database with one file larger than the others
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB ), 
( NAME = N'MultipleDataFiles_UserObjects2', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB ), 
( NAME = N'MultipleDataFiles_UserObjects3', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB ), 
( NAME = N'MultipleDataFiles_UserObjects4', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB )
 LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO

The outcome of the autogrowth tests at five iterations for the INSERT batch seemed odd to me because one of the files was lagging the other three significantly.

image

I wasn’t satisfied with the initial results of a couple of iterations with five batch executions for the INSERT so I decided to validate the results at various scales including ten, twenty and fifty iterations of the INSERT operation.

Using GO 10

image

Using GO 20

image

Using GO 50

image

At the end of each of these series of tests, the files were always within a single autogrowth size of each other, so it is obvious that proportional fill is keeping things relatively equal throughout the tests, but there is the potential for hot spotting of a single data file when auto grow occurs, at least until the other data files grow as well.  In this test the auto grow numbers were kept small, primarily due to storage limitations on the SSD in my laptop, and in the configuration of the VM I was working on, but I am definitely going to make it a point to test this again at a later date using larger autogrowth numbers to see what the impact is longer term and whether the hot spots caused by autogrowth can impact performance significantly?  I have always espoused manual file size management, even in large environments so that certain factors like a filegroup with multiple files can be addressed at the same time.

So there you have it, evidence of how proportional fill functions inside of SQL Server.  Hope you found it interesting.

See you on the playground!

Two weeks ago at SQL Connections in Orlando, FL, I got to participate in a session that Paul and Kimberly do at the end of the conference titled, “Follow the Rabbit.”  The premise of the session is that Paul and Kimberly throw a big list of topics up on the screen and anyone in the audience can ask any question they have about one of those topics and Paul and Kimberly will try to answer it.  I wasn’t the only person to end up participating in this session answering questions, Maciej Pilecki another MVP who also recently passed the Microsoft Certified Masters Exam for SQL Server 2008 answered a number of questions as well.  One of my favorite questions that was asked during this session was “Does index fragmentation matter with SSD’s anymore?”  Paul’s answer to the question was very practical and dealt with the wasted space utilization that excessive index fragmentation can cause in a database that uses a uniqueidentifier column with newid() as the primary key, and given the cost per gigabyte for SSD storage this could be quite significant.  Paul pointed out a couple of other points that escape me, primarily because my mind immediately started churning ideas about how to test the impact that index fragmentation actually has on a database.

It should be no surprise that this blog post is going to use Extended Events, it has after all been my favorite feature in SQL Server since it was first released in an early CTP for SQL Server 2008.  One of the sessions that I presented at SQL Connections last month was on Extended Events and I made some pretty heavy use of the I/O related events as a part of that session.  I also made heavy usage of the I/O related events back in December for a couple of my XEvent a Day blog posts, so I knew that I could really get some detailed information back from the system about what kind of impact fragmentation really had on a database, and since I have a SSD in my laptop, which while not enterprise class is more than worthy of performing the tests on, I thought I’d give a whirl at figuring out how much impact fragmentation really had.

When I began working on this problem, I started out using Windows Server 2008R2 and SQL Server 2008R2, primarily because it is my standard VM platform for the Immersion Training, and for presentations and it was immediately available. I originally intended to use the I/O size information from the file_read_completed event to aggregate the I/O sizes being read from disk, and I soon learned that this was not available in SQL Server 2008 as I had originally thought.  I knew I had used it in the past so I went back to my XEvent a Day blog series and found An XEvent a Day (29 of 31) – The Future – Looking at Database Startup in Denali

Bummer!

I happened to have my old Denali CTP1 VM that I used to create that blog post on a external hard disk so I decided to make use of it for the investigation.  I had originally hoped to be able to create a reproducible demo that anyone could use in their SQL Server 2008 environment to see the impact of fragmentation, but since it is not possible this post will be based on the information available through SQL Server Denali instead.

The first thing that we will need is a database with two tables that have identical data in them; one fragmented and the other not. To create this, we’ll use a uniqueidentifier with newid() as the primary key for the first table and a uniqueidentifier with newsequentialid() as the primary key for the second table. Then we will load approximately 1000 pages worth of data into the first table and then copy that data into the second table, and rebuild the indexes on the second table to remove any residual fragmentation from it.

CREATE DATABASE FragmentationTest
GO
USE FragmentationTest
GO
CREATE TABLE GuidHighFragmentation
(UniqueID UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
 FirstName nvarchar(50) NOT NULL,
 LastName nvarchar(50) NOT NULL)
GO
 
CREATE NONCLUSTERED INDEX IX_GuidHighFragmentation_LastName
ON GuidHighFragmentation(LastName)
GO
 
CREATE TABLE GuidLowFragmentation
(UniqueID UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
 FirstName nvarchar(50) NOT NULL,
 LastName nvarchar(50) NOT NULL)
GO
 
CREATE NONCLUSTERED INDEX IX_GuidLowFragmentation_LastName
ON GuidLowFragmentation(LastName)
GO
 
INSERT INTO GuidHighFragmentation (FirstName, LastName)
SELECT TOP 1000
    a.name, b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.name IS NOT NULL 
    AND b.name IS NOT NULL
ORDER BY NEWID()
GO 70
 
INSERT INTO GuidLowFragmentation (FirstName, LastName)
SELECT FirstName, LastName
FROM GuidHighFragmentation
GO
 
ALTER INDEX ALL ON GuidLowFragmentation REBUILD

With our tables built, we can validate the fragmentation information by querying the sys.dm_index_physical_stats() DMF:

SELECT 
    OBJECT_NAME(ps.object_id),
    i.name,
    ps.index_id,
    ps.index_depth,
    avg_fragmentation_in_percent,
    fragment_count,
    page_count,
    avg_page_space_used_in_percent,
    record_count
FROM sys.dm_db_index_physical_stats(
        DB_ID(), 
        NULL, 
        NULL, 
        NULL, 
        'DETAILED') AS ps
JOIN sys.indexes AS i
    ON ps.object_id = i.object_id
        AND ps.index_id = i.index_id
WHERE index_level = 0

image

Next we will create our event session to capture the I/O events that are related to physical reads from disk.  The event session is being created using dynamic SQL so that each of the events has a predicate on the sqlserver.database_id for the current database.

-- Create an Event Session to investigate our IO operations
IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='FragmentationEffect')
    DROP EVENT SESSION [FragmentationEffect] ON SERVER;
DECLARE @sqlcmd nvarchar(4000) = '
CREATE EVENT SESSION FragmentationEffect
ON SERVER
ADD EVENT sqlserver.sql_statement_starting
( ACTION (sqlserver.sql_text)),
ADD EVENT sqlserver.sql_statement_completed
( ACTION (sqlserver.sql_text)),
ADD EVENT sqlserver.file_read
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_read_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.physical_page_read
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_requested
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+'))--,
ADD TARGET package0.asynchronous_file_target(
     SET filename=''C:\SQLskills\EE_FragmentationEffect.xel'',
         metadatafile=''C:\SQLskills\EE_FragmentationEffect.xem'')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, 
      TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY=5SECONDS)'
EXEC (@sqlcmd)
GO

This event session is configured to have MAX_MEMORY set at 8MB to account for the partitioning of the memory buffers and to ensure that there is ample buffer space for the session to try and mitigate against the potential for event loss.  The EVENT_RETENTION_MODE for the session can not be configured for NO_EVENT_LOSS since the sqlserver.physical_page_read event is a part of the event session.  TRACK_CAUSALITY is turned ON for the event session so that correlation between events can be made to tie the I/O operations back to the statement that generated them, and the MAX_DISPATCH_LATENCY has been set at five seconds because I am impatient and don’t want to have to wait the default of thirty seconds for the events to be dispatched to the targets while doing a demo.

Prior to starting the event session, a manual CHECKPOINT will be issued against the database to flush any dirty pages from the buffer cache to disk, and then the buffer cache will be cleared using DBCC DROPCLEANBUFFERS to ensure that the test statements have to physically read the data from disk into cache.

-- Issue checkpoint to flush dirty buffers to disk
CHECKPOINT
GO
 
-- Clear the Buffer Cache to force reads from Disk 
DBCC DROPCLEANBUFFERS 
GO 

With this completed, we can now start our event session and run the same query against each table to force a scan of all the pages in the table from disk, then stop the event session so that our target file only has events associated with this specific test in it.

-- Start the Event Session
ALTER EVENT SESSION FragmentationEffect
ON SERVER
STATE=START
GO
 
-- Aggregate the data from both tables 
SELECT LastName, COUNT(*)
FROM GuidLowFragmentation
GROUP BY LastName
GO        
SELECT LastName, COUNT(*)
FROM GuidHighFragmentation
GROUP BY LastName
GO
 
-- Wait for the events to dispatch to the target
WAITFOR DELAY '00:00:10'
GO
 
-- Stop the Event Session
ALTER EVENT SESSION FragmentationEffect
ON SERVER
STATE=STOP
GO

Now that we have our data, we need to read it from the file and break down the event data into columns to simplify analysis of the information.  To do this, we’ll first load the event data as XML into a staging table.  Trying to parse the XML while reading from the file through the DMF is incredibly slow by comparison.  Then we can shred the XML into a tabular format based on the events being collected.

-- Drop Results tables if they exist
IF OBJECT_ID('FragmentationEffectResults') IS NOT NULL
    DROP TABLE FragmentationEffectResults 
GO
IF OBJECT_ID('FragmentationEffectResultsParsed') IS NOT NULL
    DROP TABLE FragmentationEffectResultsParsed 
GO
 
-- Create results table to load data from XE files
CREATE TABLE FragmentationEffectResults
(RowID int identity primary key, event_data XML)
GO
 
-- Load the event data from the file target
INSERT INTO FragmentationEffectResults(event_data)
SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\SQLskills\EE_FragmentationEffect*.xel', 
                                     'C:\SQLskills\EE_FragmentationEffect*.xem', 
                                     null, null)
GO
 
-- Parse the event data
SELECT 
    RowID,
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
             event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/data[@name="mode"]/text)[1]', 'nvarchar(4000)') AS [mode],
    event_data.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(4000)') AS [file_handle],
    event_data.value('(event/data[@name="offset"]/value)[1]', 'bigint') AS [offset],
    event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
    event_data.value('(event/data[@name="file_group_id"]/value)[1]', 'int') AS [file_group_id],
    event_data.value('(event/data[@name="size"]/value)[1]', 'bigint') AS [size],
    event_data.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(100)') AS [wait_type],
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS [sql_text],
    event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
    event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
    event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
INTO FragmentationEffectResultsParsed
FROM FragmentationEffectResults
ORDER BY Rowid

This is where the real fun begins.  There are too many events in results set to be useful in raw form, so what I decided to do was to locate the activity_id from TRACK_CAUSALITY for the sqlserver.sql_statement_starting events for both of the tables, and store those into variables for use in querying off the results set.  The first thing I looked at was how many times did each of the events actually fire for each of the tables.

DECLARE @FragmentationHighActivityID varchar(50),
        @FragmentationLowActivityID varchar(50)
        
SELECT @FragmentationHighActivityID = activity_id
FROM FragmentationEffectResultsParsed
WHERE event_name = 'sql_statement_starting'
  AND sql_text LIKE '%GuidHighFragmentation%'
 
SELECT @FragmentationLowActivityID = activity_id
FROM FragmentationEffectResultsParsed
WHERE event_name = 'sql_statement_starting'
  AND sql_text LIKE '%GuidLowFragmentation%'
 
-- Aggregate the results by the event name and file_id
SELECT event_name, 
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END as QueryTable, 
    COUNT(*) AS occurences
FROM FragmentationEffectResultsParsed
WHERE activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
  AND event_name IN ('async_io_completed', 'async_io_requested', 
        'file_read', 'file_read_completed', 'physical_page_read')
GROUP BY event_name, activity_id
ORDER BY activity_id, event_name

image

Impressive!  SQL Server had to do ten times the I/O operations against the fragmented table than it did against the non-fragmented table.  You might notice in the output that the number of physical_page_read events is larger than the number of actual pages in the indexes for the table.  Yes and no, I restricted the output of sys.dm_db_index_physical_stats to just level 0 of the indexes, or the leaf level.  There are two additional levels in the nonclustered indexes being scanned but they only account for nine additional pages for the fragmented index and five additional pages for the non-fragmented index.  The other pages are from the system metadata tables which have to be read by the engine as well, which can be confirmed by reading the parsed event data from the dbo.FragmentationEffectResultsParsed table.

Where the impact really gets interesting is when we take a look at the I/O sizes for file_read_completed events and aggregate the results based on the table and the I/O size being read from disk using the same activity_id’s from the previous query for aggregation.

-- Aggregate the file_read_completed events by the IO size and table
SELECT
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END as QueryTable, 
    size/1024 as read_size_kb, 
    size/1024/8 as read_size_pages, 
    COUNT(*) AS occurences
FROM FragmentationEffectResultsParsed
WHERE activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
  AND event_name IN ('file_read_completed')
GROUP BY database_id, file_id, size, activity_id
ORDER BY QueryTable, size desc
GO

image

I suspected that fragmentation would impact the ability of the database engine to do large sequential reads from the table index while scanning it, but I was not expecting the impact to be as bad as it actually was.  I ran this test numerous times to isolate out if it was some kind of a fluke and the results are consistently repeatable within a relatively low margin of change.  The smallest read size for the non-fragmented index was 64KB where the fragmented table did a majority of its I/O using 64KB reads or smaller.

Now I know what you are thinking, the non-fragmented index in this case is smaller than the fragmented one so it should do less total I/O.  You’d be right to think that, but keep in mind that they both contain the same number of records, demonstrating the wasted space that Paul talked about in the session at SQL Connections.  However, not to worry, we can fix the non-fragmented table and add additional records to it to push the index size over the size of the fragmented one in page count.

-- Add additional data to bring page count higher than the GuidHighFragmentation table!
INSERT INTO GuidLowFragmentation (FirstName, LastName)
SELECT TOP 10000
    a.name, b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.name IS NOT NULL 
    AND b.name IS NOT NULL
ORDER BY NEWID()
GO 10
 
-- Rebuild the indexes to remove any fragmentation
ALTER INDEX ALL ON GuidLowFragmentation REBUILD
GO

If we go back and rerun our query against sys.dm_db_index_physical_stats the non-fragmented index will have an additional 100000 records and just over 1000 pages.

image

Now we can delete the files from our original test, execute CHECKPOINT and flush the buffer cache and rerun the tests to see what affect more than double the index pages has on the non-fragmented indexes I/O operations.

image

For the event counts, even with more pages in the index, the non-fragmented table still incurs significantly less I/O operations.

image

For the I/O size, the smallest IO performed for the non-fragmented table is still 64KB, and there was a increase in reads at 256KB and 512KB sizes. 

While SSD’s might reduce the IO latency for random read operations, they don’t negate the need to continue to adhere to proper design principals for primary keys, fill factor on indexes, and index maintenance.  The actual runtimes of the two tests were nearly identical, and often flipped back and forth between which one took a few milliseconds longer than the other to complete for this demo.  However, keep in mind that these tables were less than 16MB data total until the non-fragmented table had additional records added to it.  Even at larger data sizes the performance of the two scans will be close enough that the average end user wouldn’t notice the difference, but under the covers the number of I/O operations being performed is significantly different.  One thing to consider is that an SSD while faster still has an upper limit to the number of I/O operations per second (IOPS) that it can perform.  Your average database might not be able to push that limit even with heavy fragmentation, but with the cost of SSD’s still at a premium I wouldn’t waste any of what was available if I had them in my server.

So to wrap up, the answer to the question “Does index fragmentation matter with SSD’s?”  It Depends on if you care about wasting space and wasting I/O more than anything else, but YES it still matters and it still has an impact.

One note about this post:

The demo for this post works with SQL Server 2008 and 2008 R2 with the exception that you don’t get the I/O size back from the file_read_completed event.  You can still see the impact that fragmentation has on the number of I/O operations being performed using the demo exactly as provided from this blog post with SQL Server 2008 and 2008 R2.  I chose to go the Denali route because the impact to the I/O size is quite interesting IMO.

See you on the playground!

While attending PASS Summit this year, I got the opportunity to hang out with Brent Ozar (Blog|Twitter) one afternoon while he did some work for Yanni Robel (Blog|Twitter).  After looking at the wait stats information, Brent pointed out some potential problem points, and based on that information I pulled up my code for my PASS session the next day on Wait Statistics and Extended Events and made some changes to one of the demo’s so that the Event Session only focused on those potentially problematic waits that had been identified, and sent Brent the DDL so that he could give Extended Events a shot.  Within a few minutes, we were able track down to the statement level in a couple of stored procedures, the causes of those waits, and after some analysis Brent was able to offer some suggestions to Yanni about how to reduce the waits.

Understanding how SQL Server waits to continue execution can be key to improving performance since time spent waiting is time lost during the execution of a SQL Statement.  I love looking at wait statistics and the chapter that I wrote for SQL Server 2008 Internals and Troubleshooting was SQL Server Waits and Extended Events.  Information about wait statistics has been available in SQL Server for a long time, and many of the vendors that develop monitoring applications for SQL Server have polling methods that query sys.dm_os_waiting_tasks or sys.sysprocesses to capture wait information about the tasks that are currently active in the system.  However one of the shortcomings of a polling method is that it misses a lot of the wait information because it is a point in time snapshot only.  If the polling interval is every second, only the active waits that exist at that second are captured, and any waits that occur between the polling interval is missed.  This information is still accumulated in sys.dm_os_wait_stats, but it is impossible to track it back to the statement level from that DMV.

Extended Events offers us the ability to capture information about waits without missing any of the information.  Already in this series we’ve seen how to use Extended Events with the Bucketizer Target to count the occurrences of waits by type.  This isn’t really a great use of Extended Events since sys.dm_os_wait_stats counts the occurrences of the wait types already, and a differential analysis of the information contained in sys.dm_os_wait_stats can provide this information.  The purpose of that example was to discuss the bug that existed in the RTM release of SQL Server 2008 more than it was to provide a sensible use for the target.  However, if we wanted to break our waits down by database, we could bucket on the database_id, and begin to understand which database had the most waits associated with it, but not by the individual wait type.  To get to that level of information, we need to collect all of the waits and the associated information for them to do further analysis.

There are two Events in Extended Events associated with wait types; sqlos.wait_info and sqlos.wait_info_external.  Looking at the description of the Events in the Metadata DMV’s we can get an idea of when each Event will fire.

SELECT name, description 
FROM sys.dm_xe_objects
WHERE name LIKE 'wait_info%'

image

The sqlos.wait_info_external Event will fire for wait types that begin with PREEMPTIVE_ in the name, and the sqlos.wait_info Event will fire for the other wait types that occur on the server.  Glenn Alan Berry (Blog|Twitter) has a great script that filters queries sys.dm_os_wait_stats and filters out common waits that are not generally problematic.  You can find his script on his blog post Updated SQL 2005 and 2008 Diagnostic Queries.  You can use this script to identify the most common waits on a server, and then use that information to build an Event Session that captures the session and statement information for those individual wait types.  In SQL Server 2008, there are 484 wait types listed in sys.dm_os_wait_stats and there are 599 map_value’s for the wait_types Map in sys.dm_xe_map_values.  The reason this is different is that the Map was created from the header file for the wait types and there are padded values that exist in the Map that don’t really correspond to wait types that exist in SQL Server.  However, there are also a couple of Maps for the wait_types that don’t match the wait type in sys.dm_os_wait_stats, the most notable being the ASYNC_NETWORK_IO to NETWORK_IO.

To build the Event Session, we just need to query sys.dm_map_values for our wait_types and use the map_key's in the Predicate definition of the sqlos.wait_info or sqlos.wait_info_external Event as appropriate.  We can also do the same thing to build a generic Event Session that tracks the most common resource related waits.

SELECT map_key, map_value 
FROM sys.dm_xe_map_values
WHERE name = 'wait_types'
  AND ((map_key > 0 AND map_key < 22) -- LCK_ waits
            OR (map_key > 31 AND map_key < 38) -- LATCH_ waits
            OR (map_key > 47 AND map_key < 54) -- PAGELATCH_ waits
            OR (map_key > 63 AND map_key < 70) -- PAGEIOLATCH_ waits
            OR (map_key > 96 AND map_key < 100) -- IO (Disk/Network) waits
            OR (map_key = 107) -- RESOURCE_SEMAPHORE waits
            OR (map_key = 113) -- SOS_WORKER waits
            OR (map_key = 120) -- SOS_SCHEDULER_YIELD waits
            OR (map_key = 178) -- WRITELOG waits
            OR (map_key > 174 AND map_key < 177) -- FCB_REPLICA_ waits
            OR (map_key = 186) -- CMEMTHREAD waits
            OR (map_key = 187) -- CXPACKET waits
            OR (map_key = 207) -- TRACEWRITE waits
            OR (map_key = 269) -- RESOURCE_SEMAPHORE_MUTEX waits
            OR (map_key = 283) -- RESOURCE_SEMAPHORE_QUERY_COMPILE waits
            OR (map_key = 284) -- RESOURCE_SEMAPHORE_SMALL_QUERY waits
        )

Once we have the list of map_key’s defined, we can do a replace in SSMS and change map_key to wait_type and define the predicate for the sqlos.wait_info Event for our Event Session.

CREATE EVENT SESSION [TrackResourceWaits] ON SERVER 
ADD EVENT  sqlos.wait_info
(    -- Capture the database_id, session_id, plan_handle, and sql_text
    ACTION(sqlserver.database_id,sqlserver.session_id,sqlserver.sql_text,sqlserver.plan_handle)
    WHERE
        (opcode = 1 --End Events Only
            AND duration > 100 -- had to accumulate 100ms of time
            AND ((wait_type > 0 AND wait_type < 22) -- LCK_ waits
                    OR (wait_type > 31 AND wait_type < 38) -- LATCH_ waits
                    OR (wait_type > 47 AND wait_type < 54) -- PAGELATCH_ waits
                    OR (wait_type > 63 AND wait_type < 70) -- PAGEIOLATCH_ waits
                    OR (wait_type > 96 AND wait_type < 100) -- IO (Disk/Network) waits
                    OR (wait_type = 107) -- RESOURCE_SEMAPHORE waits
                    OR (wait_type = 113) -- SOS_WORKER waits
                    OR (wait_type = 120) -- SOS_SCHEDULER_YIELD waits
                    OR (wait_type = 178) -- WRITELOG waits
                    OR (wait_type > 174 AND wait_type < 177) -- FCB_REPLICA_ waits
                    OR (wait_type = 186) -- CMEMTHREAD waits
                    OR (wait_type = 187) -- CXPACKET waits
                    OR (wait_type = 207) -- TRACEWRITE waits
                    OR (wait_type = 269) -- RESOURCE_SEMAPHORE_MUTEX waits
                    OR (wait_type = 283) -- RESOURCE_SEMAPHORE_QUERY_COMPILE waits
                    OR (wait_type = 284) -- RESOURCE_SEMAPHORE_SMALL_QUERY waits
                )
        )
)
ADD TARGET package0.ring_buffer(SET max_memory=4096)
WITH (EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
      MAX_DISPATCH_LATENCY=5 SECONDS)
GO

 

Now that we have the Event Session defined, we can start it as needed to collect the resource wait information for our system.  The only concern with this Event Session is the Target being used.  If the Event Session is going to run for a long period of time, or if the waits on the server being monitored occur in large quantities, the Target should be changed away from the ring_buffer to the asynchronous_file_target.  I configured this session to only collect waits that exceed 100ms in duration.  If you want waits that have shorter durations this can easily be changed.  If you set the duration to be > 0 a lot of 1-5ms waits will be collected that aren’t generally interesting.  To query the wait information from this Event Session using the ring_buffer:

-- Extract the Event information from the Event Session 
SELECT 
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
        DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
        event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
        event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/action[@name="session_id"]/value)[1]', 'int') AS [session_id],
    event_data.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(4000)') AS [wait_type],
    event_data.value('(event/data[@name="opcode"]/text)[1]', 'nvarchar(4000)') AS [opcode],
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
    event_data.value('(event/data[@name="max_duration"]/value)[1]', 'bigint') AS [max_duration],
    event_data.value('(event/data[@name="total_duration"]/value)[1]', 'bigint') AS [total_duration],
    event_data.value('(event/data[@name="signal_duration"]/value)[1]', 'bigint') AS [signal_duration],
    event_data.value('(event/data[@name="completed_count"]/value)[1]', 'bigint') AS [completed_count],
    event_data.value('(event/action[@name="plan_handle"]/value)[1]', 'nvarchar(4000)') AS [plan_handle],
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS [sql_text]
FROM 
(    SELECT XEvent.query('.') AS event_data 
    FROM 
    (    -- Cast the target_data to XML 
        SELECT CAST(target_data AS XML) AS TargetData 
        FROM sys.dm_xe_session_targets st 
        JOIN sys.dm_xe_sessions s 
            ON s.address = st.event_session_address 
        WHERE name = 'TrackResourceWaits' 
          AND target_name = 'ring_buffer'
    ) AS Data 
    -- Split out the Event Nodes 
    CROSS APPLY TargetData.nodes ('RingBufferTarget/event') AS XEventData (XEvent)   
) AS tab (event_data)

 

In the result set, you will notice that some of the wait_info Events do not have an associated session_id, database_id, plan_handle, or sql_text value.  Depending on where the wait actually occurs in code, this information is not available to the firing Event, for example, the NETWORK_IO Event generally does not successfully collect these Actions.

While collecting session and statement level waits like this is certainly interesting, there are some considerations that have to be made whenever you look at wait information like this.  The first consideration is that, while a specific session or statement waited on a resource, that doesn’t necessarily mean that the problem exists within that session or statement.  Take for example a query that has to wait 500ms on ASYNC_IO_COMPLETION waits.  At the same time that query is executing there are 10 DSS queries running that scan large ranges of data from the database data files and generate a lot of IO activity.  Where exactly is the problem?  The root problem is that there is a disk IO bottleneck, but not necessarily related to the query that is waiting on disk IO, it could be another query performing Table Scan that is leading to the heavy IO activity.

As I have said previously in this series, one of my favorite aspects of Extended Events is that it allows you to look at what is going on under the covers in SQL Server, at a level that has never previously been possible.  SQL Server Denali CTP1 includes a number of new Events that expand on the information that we can learn about how SQL Server operates and in today’s blog post we’ll look at how we can use those Events to look at what happens when a database starts up inside of SQL Server.

First lets create our Event Session, which will collect a large number of events that relate to the operations that occur when a database starts in SQL Server.  

DECLARE @sqlcmd nvarchar(max) =
'IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE
name=''MonitorStartupLogRecovery'')
   DROP EVENT SESSION [MonitorStartupLogRecovery] ON SERVER;
CREATE EVENT SESSION [MonitorStartupLogRecovery]
ON SERVER
ADD EVENT sqlserver.database_started 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.databases_log_file_used_size_changed 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.databases_log_flush 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.databases_log_flush_wait 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.file_read 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.file_read_completed 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.file_write_completed 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.file_written 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_block_cache 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_block_consume 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_blocks_uncache 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_cache_buffer_refcounter_change 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_consumer_act 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_flush_complete 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_flush_requested 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_flush_start 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.log_single_record 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.new_log_interest_flip 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.redo_single_record 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.redo_target_set 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + ')),
ADD EVENT sqlserver.transaction_log 
(WHERE (database_id = ' + CAST(DB_ID('AdventureWorks2008R2') AS varchar(3)) + '))
ADD TARGET package0.asynchronous_file_target(
     SET filename=''C:\SQLBlog\MonitorStartupLogRecovery.xel'',
         metadatafile=''C:\SQLBlog\MonitorStartupLogRecovery.xem'')
WITH (MAX_MEMORY = 8192KB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, STARTUP_STATE = ON)'
EXEC(@sqlcmd)
GO

With the Event Session created, we can make some changes that write to our test database to see what happens when the database is recovered at startup. We are going to can make two changes to the database.  First we’ll begin a transaction and create a table with 10 rows of data in it without committing the transaction. 

USE AdventureWorks2008R2
GO
-- Begin a Transaction and leave it open
BEGIN TRANSACTION
-- Create the First Table
SELECT TOP 10 *
INTO TestTable
FROM Sales.SalesOrderDetail sod
GO

Now in a New Query Window, we’ll create a second table with 10 rows of data without opening a transaction, and then force a dirty shutdown of the Database Engine.

USE AdventureWorks2008R2
GO
-- Create a Second Table
SELECT TOP 10 *
INTO TestTable2
FROM Sales.SalesOrderDetail sod
GO
-- Flush changes to data file
CHECKPOINT
GO
-- Force Shutdown the Engine
SHUTDOWN

Once SHUTDOWN is issued, the process terminates, and the Database Engine will need to be restarted from the Services Snapin, the SQL Server Configuration Manager, or through SSMS.  When the Engine starts up, the Event Session will become active, and the Events will be logged to the package0.asynchronous_file_target for analysis.  Once recovery completes, we can drop the Event Session from the Server, so that the buffers flush, and we can then begin our analysis of the collected information.

USE tempdb
GO

-- Read the Raw Event data into a table
SELECT CAST(event_data AS XML) AS event_data
INTO TargetEvents
FROM sys.fn_xe_file_target_read_file('C:\SQLBlog\MonitorStartupLogRecovery*.xel', 'C:\SQLBlog\MonitorStartupLogRecovery*.xem', null, null)

-- Fetch the Event Data from the Raw Event Data into another table
SELECT 
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
             event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/data[@name="count"]/value)[1]', 'bigint') AS [count],
    event_data.value('(event/data[@name="start_log_block_id"]/value)[1]', 'bigint') AS [start_log_block_id],
    event_data.value('(event/data[@name="is_read_ahead"]/value)[1]', 'nvarchar(4000)') AS [is_read_ahead],
    event_data.value('(event/data[@name="private_consumer_id"]/value)[1]', 'bigint') AS [private_consumer_id],
    event_data.value('(event/data[@name="mode"]/text)[1]', 'nvarchar(4000)') AS [mode],
    event_data.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(4000)') AS [file_handle],
    event_data.value('(event/data[@name="offset"]/value)[1]', 'bigint') AS [offset],
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
    event_data.value('(event/data[@name="filegroup_id"]/value)[1]', 'int') AS [filegroup_id],
    event_data.value('(event/data[@name="size"]/value)[1]', 'bigint') AS [size],
    event_data.value('(event/data[@name="path"]/value)[1]', 'nvarchar(4000)') AS [path],
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
    event_data.value('(event/data[@name="io_data"]/value)[1]', 'nvarchar(4000)') AS [io_data],
    event_data.value('(event/data[@name="resource_type"]/text)[1]', 'nvarchar(4000)') AS [resource_type],
    event_data.value('(event/data[@name="owner_type"]/text)[1]', 'nvarchar(4000)') AS [owner_type],
    event_data.value('(event/data[@name="transaction_id"]/value)[1]', 'bigint') AS [transaction_id],
    event_data.value('(event/data[@name="lockspace_workspace_id"]/value)[1]', 'nvarchar(4000)') AS [lockspace_workspace_id],
    event_data.value('(event/data[@name="lockspace_sub_id"]/value)[1]', 'int') AS [lockspace_sub_id],
    event_data.value('(event/data[@name="lockspace_nest_id"]/value)[1]', 'int') AS [lockspace_nest_id],
    event_data.value('(event/data[@name="resource_0"]/value)[1]', 'int') AS [resource_0],
    event_data.value('(event/data[@name="resource_1"]/value)[1]', 'int') AS [resource_1],
    event_data.value('(event/data[@name="resource_2"]/value)[1]', 'int') AS [resource_2],
    event_data.value('(event/data[@name="object_id"]/value)[1]', 'int') AS [object_id],
    event_data.value('(event/data[@name="associated_object_id"]/value)[1]', 'bigint') AS [associated_object_id],
    event_data.value('(event/data[@name="resource_description"]/value)[1]', 'nvarchar(4000)') AS [resource_description],
    event_data.value('(event/data[@name="database_name"]/value)[1]', 'nvarchar(4000)') AS [database_name],
    event_data.value('(event/data[@name="log_block_id"]/value)[1]', 'bigint') AS [log_block_id],
    event_data.value('(event/data[@name="log_block_size"]/value)[1]', 'int') AS [log_block_size],
    event_data.value('(event/data[@name="from_disk"]/value)[1]', 'nvarchar(4000)') AS [from_disk],
    event_data.value('(event/data[@name="incomplete"]/value)[1]', 'nvarchar(4000)') AS [incomplete],
    event_data.value('(event/data[@name="cache_buffer_pointer"]/value)[1]', 'nvarchar(4000)') AS [cache_buffer_pointer],
    event_data.value('(event/data[@name="consumer_id"]/value)[1]', 'bigint') AS [consumer_id],
    event_data.value('(event/data[@name="old_weight"]/value)[1]', 'int') AS [old_weight],
    event_data.value('(event/data[@name="new_weight"]/value)[1]', 'int') AS [new_weight],
    event_data.value('(event/data[@name="new_position"]/value)[1]', 'int') AS [new_position],
    event_data.value('(event/data[@name="last_log_block_id"]/value)[1]', 'bigint') AS [last_log_block_id],
    event_data.value('(event/data[@name="weight"]/value)[1]', 'int') AS [weight],
    event_data.value('(event/data[@name="address"]/value)[1]', 'nvarchar(4000)') AS [address],
    event_data.value('(event/data[@name="type"]/text)[1]', 'nvarchar(4000)') AS [type],
    event_data.value('(event/data[@name="current_count"]/value)[1]', 'int') AS [current_count],
    event_data.value('(event/data[@name="change_type"]/value)[1]', 'int') AS [change_type],
    event_data.value('(event/data[@name="activity_id"]/value)[1]', 'int') AS [activity_id],
    event_data.value('(event/data[@name="write_size"]/value)[1]', 'int') AS [write_size],
    event_data.value('(event/data[@name="rows"]/value)[1]', 'int') AS [rows],
    event_data.value('(event/data[@name="pending_writes"]/value)[1]', 'int') AS [pending_writes],
    event_data.value('(event/data[@name="pending_bytes"]/value)[1]', 'int') AS [pending_bytes],
    event_data.value('(event/data[@name="reason"]/text)[1]', 'nvarchar(4000)') AS [reason],
    event_data.value('(event/data[@name="waiters"]/value)[1]', 'int') AS [waiters],
    event_data.value('(event/data[@name="error"]/value)[1]', 'int') AS [error],
    event_data.value('(event/data[@name="slot_id"]/value)[1]', 'int') AS [slot_id],
    event_data.value('(event/data[@name="used_size"]/value)[1]', 'int') AS [used_size],
    event_data.value('(event/data[@name="reservation_size"]/value)[1]', 'bigint') AS [reservation_size],
    event_data.value('(event/data[@name="log_op_id"]/value)[1]', 'int') AS [log_op_id],
    event_data.value('(event/data[@name="log_op_name"]/value)[1]', 'nvarchar(4000)') AS [log_op_name],
    event_data.value('(event/data[@name="interest"]/value)[1]', 'nvarchar(4000)') AS [interest],
    event_data.value('(event/data[@name="cache_type"]/value)[1]', 'int') AS [cache_type],
    event_data.value('(event/data[@name="keys"]/value)[1]', 'nvarchar(4000)') AS [keys],
    event_data.value('(event/data[@name="stop_mark"]/value)[1]', 'nvarchar(4000)') AS [stop_mark],
    event_data.value('(event/data[@name="operation"]/text)[1]', 'nvarchar(4000)') AS [operation],
    event_data.value('(event/data[@name="success"]/value)[1]', 'nvarchar(4000)') AS [success],
    event_data.value('(event/data[@name="index_id"]/value)[1]', 'int') AS [index_id],
    event_data.value('(event/data[@name="log_record_size"]/value)[1]', 'int') AS [log_record_size],
    event_data.value('(event/data[@name="context"]/text)[1]', 'nvarchar(4000)') AS [context],
    event_data.value('(event/data[@name="replication_command"]/value)[1]', 'int') AS [replication_command],
    event_data.value('(event/data[@name="transaction_start_time"]/value)[1]', 'nvarchar(4000)') AS [transaction_start_time]
INTO Results
FROM TargetEvents

Now we can begin to analyze the information that we collected by querying the Results table.  Looking at the Results as a whole, we can see the database opened by reading the first page of the database and then the database boot page (page_id=9) and page 32 of the primary data file.  Then the engine scans each of the VLF’s of the transaction log.  We can tell that the log reads are scans of the VLF’s by looking at the DBCC LOGINFO information for the database and comparing the file_read offsets from the Event Session to the StartOffset of each of the VLF’s in the DBCC LOGINFO output.

DBCC LOGINFO

image

SELECT event_name, timestamp, database_id, file_id, mode, offset, 
    CASE WHEN file_id = 1 THEN offset/8192 ELSE NULL END AS page_id, size, 
    log_block_id, log_block_size, start_log_block_id, last_log_block_id, 
    from_disk, consumer_id, activity_id, log_op_id, log_op_name, change_type, 
    operation, object_id, index_id, log_record_size, slot_id, used_size, 
    reservation_size, write_size, rows, pending_writes, pending_bytes, 
    context, waiters
FROM Results

image 

After the startup scans the VLF’s 120K of information is read from the log file, and the log buffers start to be consumed to determine the redo start point for recovery.  We can filter our Event data to remove Events that while interesting are not necessary for our analysis at the moment, as well as to reduce the number of columns being returned from the data set.

SELECT 
    event_name, timestamp, file_id, mode, offset, size, log_block_id, 
    COALESCE(log_op_name, operation) as [operation], slot_id, 
    object_id, index_id, log_record_size, context, write_size, rows
FROM Results
WHERE event_name NOT IN ('log_consumer_act', 'log_single_record', 'log_cache_buffer_refcounter_change', 'file_read') 

image

With the filtered results, we can see the log reads into cache and the setting of the redo target.  If we scroll down further, we can get a better picture of what is happening.

image

The last active log blog was consumed, and the redo target was set at that log_block_id.  Then the log is reread starting at offset 318976 and the blocks get cached and the redo operations begin against the database.  Scrolling through the results further, we can see that the redo operations continue as the log blocks increase up to the Checkpoint operation that was executed immediately before the Shutdown of the instance occurred, at the log block that was set as the redo target originally.

image

At this point the data file begins to be read so that the undo operations can be performed before making the database available.

image

After the undo completes the database_started Event is raised and that database becomes available for use.

image

To validate that the changes we see occurring after the redo operations and before the database_started Event, we can set the database OFFLINE, recreate our Event Session, and then bring the database back ONLINE again, and then compare the logged Events when no changes have occurred to our original Events when known changes have occurred.  I am not going to do that in this blog post, but will instead leave that up to the reader to investigate on their own.

The Database Compression feature in SQL Server 2008 Enterprise Edition can provide some significant reductions in storage requirements for SQL Server databases, and in the right implementations and scenarios performance improvements as well.  There isn’t really a whole lot of information about the operations of database compression that is documented as being available in the DMV’s or SQL Trace.  Paul Randal pointed out on Twitter today that sys.dm_db_index_operational_stats() provides the page_compression_attempt_count and page_compression_success_count available.  Beyond that the only other documented information for monitoring Data Compression are the Page Compression Attempts/sec and Pages Compressed/sec Performance Counters of the SQL Server:Access Methods object in Perfmon (http://msdn.microsoft.com/en-us/library/cc280449.aspx). 

There is one thing in common about the documented methods of monitoring Data Compression, and that is they all only deal with Page compression, and not Row compression, and in Extended Events we find the same commonality as there are no Row compression Events in Extended Events.  There are two Page compression Events in Extended Events; sqlserver.page_compression_attempt_failed and sqlserver.page_compression_tracing.  These two Events can be used to track Page compression operations at multiple levels, including database, object, index, and even down to the individual page. The sqlserver.page_compression_tracing Event provides Start and End tracing of Page compression operations inside of the Database Engine and returns the database_id, index_id, rowset_id, page_id, and duration of the compression operation.  The sqlserver.page_compression_attempt_failed is really poorly named, and doesn’t provide information about failures in the sense that something broke, but provides information for why a page compression attempt did not actually change the compression of the data in the page.  It also returns the database_id, index_id, rowset_id, and page_id for the compression attempt, and it also includes a failure_reason column which correlates to the page_compression_failure_reason Map Value.

-- Get the payload information for the Events 
SELECT 
    object_name, 
    column_id, 
    name, 
    type_name
FROM sys.dm_xe_object_columns
WHERE object_name IN ('page_compression_tracing', 
                      'page_compression_attempt_failed')
  AND column_type = 'data'

image

To demonstrate how these Events function, I am going to use the LineItem table from the TPC-H Benchmark that was created by Quest Benchmark Factory using Level 2 for the table sizing, which makes the table just at 1.8GB in size.  All of the indexes on the table will be rebuilt using PAGE compression, and then 10,000 rows will be added to the table.  To setup the environment, first load the TPC-H LineItem table with the appropriate seed of data, this can be done with the free trial version of Benchmark Factory.  Then rebuild all of the indexes on the LineItem table using PAGE compression, and review the PAGE compression statistics from sys.dm_db_index_operational_stats for the database and object.

USE [TPCH]
GO
-- Rebuild the indexes with Page compression 
ALTER INDEX ALL ON dbo.H_Lineitem REBUILD WITH (DATA_COMPRESSION = PAGE)
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO

image

Once the table and its indexes have been rebuilt using PAGE compression, we can then create our Event Session, start it, and add 10,000 rows to the LineItem table.  After we add the rows, we can then check the page compression statistics in sys.dm_db_index_operational_stats, and drop our Event Session from the server.

-- Create an Event Session to Track the Failed attempts
CREATE EVENT SESSION PageCompressionTracing
ON SERVER
ADD EVENT sqlserver.page_compression_attempt_failed,
ADD EVENT sqlserver.page_compression_tracing
ADD TARGET package0.asynchronous_file_target(
     SET filename='C:\SQLBlog\PageCompressionTracing.xel',
         metadatafile='C:\SQLBlog\PageCompressionTracing.xem')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY=5SECONDS)
GO
-- Start the Event Session
ALTER EVENT SESSION PageCompressionTracing
ON SERVER
STATE=START
GO
-- Insert 10000 rows into the H_Lineitem table
INSERT INTO H_Lineitem
    (l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
     l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, 
     l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, 
     l_comment)
SELECT TOP 10000 
     l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
     l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, 
     l_shipdate, l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, 
     l_comment
FROM H_Lineitem
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO
-- Drop the Event Session
DROP EVENT SESSION PageCompressionTracing
ON SERVER
GO

image

Now we can parse the Events that were captured by our Event Session and compare the information presented by sys.dm_db_index_operational_stats() with what was collected by Extended Events.

-- Create our result Analysis database
CREATE DATABASE [PageCompTestResults]
GO
USE [PageCompTestResults]
GO
-- Create intermediate temp table for raw event data
CREATE TABLE RawEventData
(Rowid int identity primary key, event_data xml)
GO
-- Read the file data into intermediate temp table
INSERT INTO RawEventData (event_data)
SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\SQLBlog\PageCompressionTracing*.xel', 
                                     'C:\SQLBlog\PageCompressionTracing*.xem', 
                                     null, null)
GO
-- Fetch the Event Data from the Event Session Target
SELECT 
    RowID,
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
             event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
    event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
    event_data.value('(event/data[@name="rowset_id"]/value)[1]', 'bigint') AS [rowset_id],
    event_data.value('(event/data[@name="failure_reason"]/text)[1]', 'nvarchar(150)') AS [failure_reason],
    event_data.value('(event/action[@name="system_thread_id"]/value)[1]', 'int') AS [system_thread_id],
    event_data.value('(event/action[@name="scheduler_id"]/value)[1]', 'int') AS [scheduler_id],
    event_data.value('(event/action[@name="cpu_id"]/value)[1]', 'int') AS [cpu_id]
INTO ParsedResults
FROM RawEventData
GO

After parsing out the data, we can begin to really leverage the information we’ve gathered.  If we join the ParsedResults table to sys.partitions for our TPCH database by rowset_id = hobt_id, we can get the object_id and index_id and aggregate the failure reasons up to the object and index level.

SELECT 
    pr.database_id, 
    p.object_id, 
    p.index_id,
    failure_reason, 
    COUNT(*) as failure_count
FROM TPCH.sys.partitions p
JOIN ParsedResults pr
    ON pr.rowset_id = p.hobt_id
WHERE event_name = 'page_compression_attempt_failed'
GROUP BY     pr.database_id, 
    p.object_id, 
    p.index_id,
    failure_reason
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO

image

With this we can se that the Extended Events sqlserver.page_compression_attempt_failed Event tracks failures and attempts that are not counted in sys.dm_db_index_operational_stats().  The PageModCountBelowThreshold failure isn’t really a failed attempt at compression.  This reason shows that the page was evaluated for recalculation, and the modified counter for the page hadn’t passed the internal threshold for recalculation so the actual compression operation wasn’t performed.  If we look at the sqlserver.page_compression_tracing Event information, we can see how the numbers begin to come together to match what is output by sys.dm_db_index_operational_stats().

SELECT 
    pr.database_id, 
    p.object_id, 
    p.index_id,
    COUNT(*) as attempt_count
FROM TPCH.sys.partitions p
JOIN ParsedResults pr
    ON pr.rowset_id = p.hobt_id
WHERE event_name = 'page_compression_tracing'
  AND opcode = 'Begin'
GROUP BY     pr.database_id, 
    p.object_id, 
    p.index_id
GO
-- Look at the compression information in sys.dm_db_index_operational_stats
SELECT 
    database_id, 
    object_id, 
    index_id, 
    page_compression_attempt_count, 
    page_compression_success_count,
    (page_compression_attempt_count - page_compression_success_count) as page_compression_failure_count
FROM sys.dm_db_index_operational_stats(db_id('TPCH'), object_id('H_Lineitem'), null, null)
GO

image

We have 193 attempts by this Event, and we have 72 PageModCountBelowThreshold failures, matching our actual attempts of 121 from the DMF.  We can then subtract out the other failures and get the 93 successful operations matching the DMF as well.

Nearly two years ago Kalen Delaney blogged about Splitting a page into multiple pages, showing how page splits occur inside of SQL Server.  Following her blog post, Michael Zilberstein wrote a post, Monitoring Page Splits with Extended Events, that showed how to see the sqlserver.page_split Events using Extended Events.  Eladio Rincón also blogged about Using XEvents (Extended Events) in SQL Server 2008 to detect which queries are causing Page Splits, but not in relation to Kalen’s blog post.  Both of these blog posts demonstrate how to get the sqlserver.page_split Events, but as discussed in the comments section of Michael Zilberstein’s blog post, the Event fires for all page splits and Adam Machanic and I talked after Eladio’s blog post and opened a connect item to have the sqlserver.page_split Event extended in the product so that you know what kind of split is actually occurring.

https://connect.microsoft.com/SQLServer/feedback/details/388482/sql-server-extended-events-page-split-event-additions

The CTP1 release of Denali has significant changes to the sqlserver.page_split Event, that makes it easier to find the splitting object as well the type of split that is occurring.  Before we look at that, I am going to show the code required to get the object and index information from SQL Server 2008, which is based on Adam’s comments to use sys.dm_os_buffer_descriptors.  For the examples in this blog post I am going use Kalen’s multipage split example from her blog post referenced above.

	-- Create the table 
	USE tempdb;
	GO
	SET NOCOUNT ON
	GO
	IF EXISTS (SELECT * FROM sys.tables
	            WHERE name = 'split_page')
	    DROP TABLE split_page;
	GO
	CREATE TABLE split_page 
	(id INT IDENTITY(0,2) PRIMARY KEY,
	id2 bigint DEFAULT 0,
	data1 VARCHAR(33) NULL, 
	data2 VARCHAR(8000) NULL);
	GO
	-- fill page until no more rows fit
	INSERT INTO split_page DEFAULT VALUES;
	GO 385
	-- verify that there is only one data page 
	DBCC IND(tempdb, split_page, -1);
	-- Create MonitorPageSplits Extended Event Session 
	IF (SELECT 1 FROM sys.server_event_sessions WHERE name = 'MonitorPageSplits') IS NOT NULL 
	   DROP EVENT SESSION MonitorPageSplits ON SERVER 
	GO 
	CREATE EVENT SESSION MonitorPageSplits ON SERVER 
	ADD EVENT sqlserver.page_split 
	( 
	    ACTION (sqlserver.database_id, sqlserver.sql_text)   
	    WHERE sqlserver.database_id = 2 
	) 
	ADD TARGET package0.ring_buffer 
	WITH(MAX_DISPATCH_LATENCY = 1 SECONDS)
	GO 
	-- Start the MonitorPageSplits Event Session 
	ALTER EVENT SESSION MonitorPageSplits ON SERVER STATE = start; 
	GO 
	-- Now insert one more row, this time filling the VARCHARs to the maximum length. 
	SET IDENTITY_INSERT split_page  ON;
	GO
	INSERT INTO split_page (id, id2, data1, data2)
	      SELECT 111, 0, REPLICATE('a', 33), REPLICATE('b', 8000);
	GO
	SET IDENTITY_INSERT split_page  OFF;
	GO 
	ALTER EVENT SESSION MonitorPageSplits ON SERVER 
	DROP EVENT sqlserver.page_split; 
	GO 
	-- Wait to allow dispatch to complete
	WAITFOR DELAY '00:00:01.000' 
	GO
	SELECT oTab.*
	  , p.OBJECT_ID
	  , p.index_id
	  , OBJECT_NAME(p.OBJECT_ID)
	  , i.name
	FROM
	(
	SELECT 
	    XEvent            = XEvent.query('.') 
	  , time              = XEvent.value('(@timestamp)[1]','datetime') 
	  , FILE_ID           = XEvent.value('(data[@name=''file_id'']/value)[1]','int') 
	  , page_id           = XEvent.value('(data[@name=''page_id'']/value)[1]','int') 
	  , database_id       = XEvent.value('(action[@name=''database_id'']/value)[1]','int') 
	  , sql_text          = XEvent.value('(action[@name=''sql_text'']/value)[1]','varchar(max)') 
	FROM 
	( 
	   SELECT CAST(target_data AS XML) AS target_data 
	   FROM sys.dm_xe_session_targets xst 
	   JOIN sys.dm_xe_sessions xs ON xs.address = xst.event_session_address 
	   WHERE xs.name = 'MonitorPageSplits' 
	) AS tab (target_data) 
	CROSS APPLY target_data.nodes('/RingBufferTarget/event') AS EventNodes(XEvent) 
	) AS oTab
	LEFT JOIN sys.dm_os_buffer_descriptors AS obd
	   ON obd.database_id = oTab.database_id
	       AND obd.FILE_ID = oTab.FILE_ID
	       AND obd.page_id = oTab.page_id
	LEFT JOIN sys.allocation_units au
	   ON au.allocation_unit_id = obd.allocation_unit_id
	LEFT JOIN sys.partitions p 
	   ON p.partition_id = au.container_id  
	LEFT JOIN sys.indexes i
	   ON p.OBJECT_ID = i.OBJECT_ID
	       AND p.index_id = i.index_id
	-- verify that there is only one data page 
	DBCC IND(tempdb, split_page, -1);
	
	

The above code creates a table in tempdb, loads one page of data in it exactly as in Kalen’s blog post, and then creates an Event Session for the sqlserver.page_split Event in tempdb, that also collects the sqlserver.database_id and sqlserver.sql_text actions when the Event fires.  After triggering the page split, it drops the Event from the Event Session and then uses WAITFOR DELAY to allow the events to be buffered to the package0.ring_buffer Target.  Then it shreds the XML and joins to the DMV’s to get the object and index names.  The output of running the above script in SQL Server 2008 should be similar to the following, showing 10 split events and 10 additional pages in the database table.

 image

Note that the only two columns returned by the sqlserver.page_split Event are the file_id and page_id.  In SQL Server Denali CTP1, the sqlserver.page_split event now has a much larger Event payload associated with it.  It now returns the file_id, page_id, database_id (as a part of the event, not requiring an action), rowset_id, splitOperation, new_page_file_id, and the new_page_page_id associated with the page_split Event.   This makes the Event much more useful and allows it to be used without having to query the buffer descriptors to find the object association.  The following demo is identical to the demo for SQL Server 2008 listed above with the exception of that the XQuery is slightly different (a requirement to pull the new information from the XML).

	-- Create the table 
	USE tempdb;
	GO
	SET NOCOUNT ON
	GO
	IF EXISTS (SELECT * FROM sys.tables
	            WHERE name = 'split_page')
	    DROP TABLE split_page;
	GO
	CREATE TABLE split_page 
	(id INT IDENTITY(0,2) PRIMARY KEY,
	id2 bigint DEFAULT 0,
	data1 VARCHAR(33) NULL, 
	data2 VARCHAR(8000) NULL);
	GO
	-- fill page until no more rows fit
	INSERT INTO split_page DEFAULT VALUES;
	GO 385
	-- verify that there is only one data page 
	DBCC IND(tempdb, split_page, -1);
	-- Create MonitorPageSplits Extended Event Session 
	IF (SELECT 1 FROM sys.server_event_sessions WHERE name = 'MonitorPageSplits') IS NOT NULL 
	   DROP EVENT SESSION MonitorPageSplits ON SERVER 
	GO 
	CREATE EVENT SESSION MonitorPageSplits ON SERVER 
	ADD EVENT sqlserver.page_split 
	( 
	    ACTION (sqlserver.database_id, sqlserver.sql_text)   
	    WHERE sqlserver.database_id = 2 
	) 
	ADD TARGET package0.ring_buffer 
	WITH (MAX_DISPATCH_LATENCY = 1 SECONDS)
	GO 
	-- Start the MonitorPageSplits Event Session 
	ALTER EVENT SESSION MonitorPageSplits ON SERVER STATE = start; 
	GO 
	-- Now insert one more row, this time filling the VARCHARs to the maximum length. 
	SET IDENTITY_INSERT split_page  ON;
	GO
	INSERT INTO split_page (id, id2, data1, data2)
	      SELECT 111, 0, REPLICATE('a', 33), REPLICATE('b', 8000);
	GO
	SET IDENTITY_INSERT split_page  OFF;
	GO 
	ALTER EVENT SESSION MonitorPageSplits ON SERVER 
	DROP EVENT sqlserver.page_split; 
	
	GO
	SELECT 
	    event_time         = XEvent.value('(@timestamp)[1]','datetime') 
	  , orig_file_id      = XEvent.value('(data[@name=''file_id'']/value)[1]','int') 
	  , orig_page_id      = XEvent.value('(data[@name=''page_id'']/value)[1]','int') 
	  , database_id           = XEvent.value('(data[@name=''database_id'']/value)[1]','int') 
	  , OBJECT_ID         = p.OBJECT_ID
	  , index_id          = p.index_id
	  , OBJECT_NAME           = OBJECT_NAME(p.OBJECT_ID)
	  , index_name            = i.name
	  , rowset_id         = XEvent.value('(data[@name=''rowset_id'']/value)[1]','bigint') 
	  , splitOperation        = XEvent.value('(data[@name=''splitOperation'']/text)[1]','varchar(255)') 
	  , new_page_file_id  = XEvent.value('(data[@name=''new_page_file_id'']/value)[1]','int') 
	  , new_page_page_id  = XEvent.value('(data[@name=''new_page_page_id'']/value)[1]','int') 
	  , sql_text          = XEvent.value('(action[@name=''sql_text'']/value)[1]','varchar(max)') 
	FROM 
	( 
	   SELECT CAST(target_data AS XML) AS target_data 
	   FROM sys.dm_xe_session_targets xst 
	   JOIN sys.dm_xe_sessions xs ON xs.address = xst.event_session_address 
	   WHERE xs.name = 'MonitorPageSplits' 
	) AS tab (target_data) 
	CROSS APPLY target_data.nodes('/RingBufferTarget/event') AS EventNodes(XEvent) 
	LEFT JOIN sys.allocation_units au
	   ON au.container_id = XEvent.value('(data[@name=''rowset_id'']/value)[1]','bigint') 
	LEFT JOIN sys.partitions p 
	   ON p.partition_id = au.container_id  
	LEFT JOIN sys.indexes i
	   ON p.OBJECT_ID = i.OBJECT_ID
	       AND p.index_id = i.index_id
	-- View the Page allocations 
	DBCC IND(tempdb, split_page, -1);
	

If you run the above demo the output should be similar to the below (if you click on the picture, it will open up larger).  One thing that should become immediately obvious is that the same demo in Denali is doing 1/3rd of the page splits that occur in SQL Server 2008. 

image

The old_page_id and new_page_id tell where the page originated and moved to, and the splitOperation tells the type of split.  In this case only two of the type of splits are occurring; SPLIT_FOR_ROOT_NODE which occurs when the first page allocated is split into multiple pages, and SPLIT_FOR_INSERT which occurs as the inserts continue and the pages are split to accommodate the data.  There are a number of additional split operations that exist in SQL Server Denali CTP1 including, SPLIT_FOR_DELETE, SPLIT_FOR_GHOST, SPLIT_FOR_INTERNAL_NODE, and SPLIT_FOR_UPDATE.  I’ve tried to figure out how to correlate the output from DBCC IND with the data held in the Event Session for page splits to correlate the old_page_id and new_page_id to identify problematic splits, but haven’t finalized validation of my tests yet (hopefully I can finish this work and I’ll write an update to this blog post showing how to do this at some point in the near future).  One item that I have noted in my testing is that mid-page splits generally generate multiple sqlserver.page_split Events in the same operation, similar to the demonstrations used in this example, where as end-page splits for identity and sequential GUID inserts do not.  I am not certain that this is a valid conclusion to come to at this point and have further testing to do to investigate page splits more.

There are 7 Session level options that can be configured in Extended Events that affect the way an Event Session operates.  These options can impact performance and should be considered when configuring an Event Session.  I have made use of a few of these periodically throughout this months blog posts, and in today’s blog post I’ll cover each of the options separately, and provide further information about their usage.  Mike Wachal from the Extended Events team at Microsoft, talked about the Session options on his blog post, Option Trading: Getting the most out of the event session options, and I’d recommend giving it a read for additional information as well.

EVENT_RETENTION_MODE

The EVENT_RETENTION_MODE option specifies how the Event Session handles Event loss when Events generate faster than they can be dispatched to the Targets.  There are three possible values for this option; ALLOW_SINGLE_EVENT_LOSS, ALLOW_MULTIPLE_EVENT_LOSS, and NO_EVENT_LOSS.  This option directly affects the possible impact that an Event Session may have on the performance of a system while the Event Session is active.  A trade off occurs between performance impact and the guarantee whether all Events are captured or not.

ALLOW_SINGLE_EVENT_LOSS

The ALLOW_SINGLE_EVENT_LOSS value is the system default for all Event Sessions where the EVENT_RETENTION_MODE is not explicitly specified as a part of the Event Session definition.  This value allows single events to be dropped and lost from the session when the memory buffers for the Event Session are full and dispatch to the Targets can not keep up with the Event generation. 

ALLOW_MULTIPLE_EVENT_LOSS

The ALLOW_MULTIPLE_EVENT_LOSS value allows an entire memory buffer containing multiple events to be dropped and lost when the memory buffers are full and the Events are generating faster than the buffers can be dispatched to the Targets.  This can minimize the performance impact on the server at the trade off that many Events could potentially be lost, with the number of Events lost depending on the size of the Events being generated, the configuration of the MAX_MEMORY session option, and the MEMORY_PARTITION_MODE session option. 

NO_EVENT_LOSS

The NO_EVENT_LOSS value guarantees that all Events that fire are captured, but at the expense of possible system performance degradation when the Event Session is active.  If the memory buffers are all full and an Event fires, the task firing the Event will wait until space is available in a memory buffer for the Event to be buffered.  This option value is not recommended by the Extended Events team at Microsoft for most Event Sessions and should be used with extreme caution and only when it is absolutely necessary that every Event be captured, even at the expense of degraded performance of the system.

MAX_DISPATCH_LATENCY

The MAX_DISPATCH_LATENCY option specifies the time in seconds that Events are held in a memory buffer that is not full before being dispatched to the asynchronous session Targets.  The default value if the MAX_DISPATCH_LATENCY is not explicitly defined in the Session definition is 30 seconds, and the option has a minimum value of 1 second.  If a value of 0 or INFINITE is specified, the Events held in a memory buffer will not be dispatched until the memory buffer becomes full.

MAX_EVENT_SIZE

The MAX_EVENT_SIZE option specifies the maximum size in kilobytes or megabytes an individual Event can be.  The default value for this option when it is not explicitly set in the Session definition is 0KB, allowing the maximum Event size to be the size of a single memory buffer in the Event Session.  This option can be explicitly set to allow Events that are larger than a single memory buffer to be captured by the Event Session.  The minimum value for this option is 64KB.

MAX_MEMORY

The MAX_MEMORY option specifies the amount of memory in kilobytes or megabytes that is allocated to the memory buffers for the Event Session.  The value of this options is divided evenly amongst the memory buffers that are created for the Event Session based on the configuration of the MEMORY_PARTITION_MODE session option.  The MAX_MEMORY option can be used to increase the memory available for buffering Events when a large number of Events are expected to fire, minimizing Event loss due to full memory buffers.  The default value for this option is 4 megabytes (MB) or 4096 kilobytes (KB). 

Mike Wachal blogged about this option on the Extended Events blog Take it to the MAX (and beyond), and again in response to a number of questions that I sent him early on in this blog series when I was working on a large NUMA based server, Session memory – who’s this guy named Max and what’s he doing with my memory?

MEMORY_PARTITION_MODE

The MEMORY_PARTITION_MODE option specifies how the memory buffers for the Event Session are created and/or partitioned.  For servers with multiple processors and/or multiple NUMA nodes the memory buffers can become a bottleneck performance wise if multiple CPU’s are firing Events and have to wait on a memory buffer to buffer the Event information being collected.  There are three values for this option; NONE, PER_NODE, and PER_CPU. 

NONE

The NONE value specifies that a single set of memory buffers will be created for the Event Session.  In this configuration, three memory buffers are created for the Event Session, and the memory for the Event Session is divided evenly, to the nearest 64KB boundary, amongst the three memory buffers.  This is the default value for an Event Session if the MEMORY_PARTITION_MODE is not explicitly defined.

PER_NODE

The PER_NODE value specifies that a separate set of three memory buffers will be created.  In this configuration, three memory buffers are created for each NUMA node that exists for the SQL Server Instance, and the memory is divided evenly, to the nearest 64KB boundary, amongst all of the memory buffers.  

PER_CPU

The PER_CPU value specifies that a set of memory buffers is created for each CPUs/Scheduler that is assigned to the SQL Server Instance.  In this configuration, the number of memory buffers is 2.5 times the number of CPUs/Schedulers available, and the memory is divided evenly, to the nearest 64KB boundary, amongst all of the memory buffers.

STARTUP_STATE

The STARTUP_STATE option specifies whether an Event Session automatically starts in an Active state when the SQL Server Instance starts up.  There are two valid values for this option, ON and OFF, with OFF being the default.

TRACK_CAUSALITY

The TRACK_CAUSALITY option specifies whether causality tracking across multiple Events is turned ON or OFF.  The default configuration for this option is OFF.  When TRACK_CAUSALITY is turned on, an additional Action, package0.attach_activity_id, is added to each Event that fires in the Event Session.  This Action is a combination GUID and sequence number that allows related Events to be tracked for cause and effect analysis of the Events that fired in the order in which they have fired.

I should make note of the fact that in many cases, the options specified in the blog posts, may not be appropriate for a production implementation, and may have been made based on the fact that I just didn’t want to wait over multiple test cycles for Events to dispatch to the Targets.

One of my favorite features that was added to SQL Server 2005 has been the Blocked Process Report trace event which collects an XML report whenever a process is blocked inside of the database engine longer than the user configurable threshold.  I wrote an article about this feature on SQL Server Central  two years ago titled Using the Blocked Process Report in SQL Server 2005/2008.  One of the aspects of this feature is that it requires that you either have a SQL Trace running that captures the event, or you configure Event Notifications on the server to capture the event information in a Service Broker Queue, neither of which is overly difficult to accomplish.  In SQL Server Denali CTP1, there is a new Extended Event, sqlserver.blocked_process_report, that makes this even easier to use.  We can now create an Event Session that exists in our SQL Server and is waiting to be started to capture blocked process information as needed.  However, we still have to set the ‘blocked process threshold’ sp_configure option to set the threshold at which blocked process report information is generated by Database Engine, firing the Event in our Event Session.

CREATE EVENT SESSION MonitorBlocking
ON SERVER
ADD EVENT sqlserver.blocked_process_report
ADD TARGET package0.ring_buffer(SET MAX_MEMORY=2048)
WITH (MAX_DISPATCH_LATENCY = 5SECONDS)
GO
ALTER EVENT SESSION MonitorBlocking
ON SERVER
STATE=START
GO
EXECUTE sp_configure 'show advanced options', 1
GO
RECONFIGURE
GO
EXECUTE sp_configure 'blocked process threshold', 15
GO
RECONFIGURE
GO
EXECUTE sp_configure 'show advanced options', 0
GO
RECONFIGURE
GO

To test this Event Session, we can open two New Query Windows in SSMS and connect them to our database engine.  Then in one window run the following code:

USE [tempdb]
GO
CREATE TABLE t1 (RowID int identity primary key)
GO
BEGIN TRANSACTION
INSERT INTO t1 DEFAULT VALUES
WAITFOR DELAY '00:00:30'
COMMIT

Then in the other window run the following code:

USE [tempdb]
GO
SELECT * FROM t1

The first query will block the execution of the second query until it completes, generating our blocked process report in the ring_buffer Target for our Event Session.  To get the information from the ring_buffer target, we can run a quick XQuery to parse the Target (at this point in the series, you should be an XQuery pro):

-- Query the XML to get the Target Data
SELECT 
    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    n.value('(event/@package)[1]', 'varchar(50)') AS package_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            n.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    ISNULL(n.value('(event/data[@name="database_id"]/value)[1]', 'int'),
            n.value('(event/action[@name="database_id"]/value)[1]', 'int')) as [database_id],
    n.value('(event/data[@name="database_name"]/value)[1]', 'nvarchar(128)') as [database_name],
    n.value('(event/data[@name="object_id"]/value)[1]', 'int') as [object_id],
    n.value('(event/data[@name="index_id"]/value)[1]', 'int') as [index_id],
    CAST(n.value('(event/data[@name="duration"]/value)[1]', 'bigint')/1000000.0 AS decimal(6,2)) as [duration_seconds],
    n.value('(event/data[@name="lock_mode"]/text)[1]', 'nvarchar(10)') as [file_handle],
    n.value('(event/data[@name="transaction_id"]/value)[1]', 'bigint') as [transaction_id],
    n.value('(event/data[@name="resource_owner_type"]/text)[1]', 'nvarchar(10)') as [resource_owner_type],
    CAST(n.value('(event/data[@name="blocked_process"]/value)[1]', 'nvarchar(max)') as XML) as [blocked_process_report]
FROM
(    SELECT td.query('.') as n
    FROM 
    (
        SELECT CAST(target_data AS XML) as target_data
        FROM sys.dm_xe_sessions AS s    
        JOIN sys.dm_xe_session_targets AS t
            ON s.address = t.event_session_address
        WHERE s.name = 'MonitorBlocking'
          AND t.target_name = 'ring_buffer'
    ) AS sub
    CROSS APPLY target_data.nodes('RingBufferTarget/event') AS q(td)
) as tab
GO

image

The output of the Event in Extended Events contains some additional information to the blocked process report in XML format including the database_id, object_id, index_id, duration, lock_mode, transaction_id, and resource_owner_type for the blocking resource.  Clicking on the blocked process report XML will open it up in Management Studio as an XML document allowing detailed analysis of the blocking to be performed.

<blocked-process-report>
  <blocked-process>
    <process id="process2eb8bda8" taskpriority="0" logused="0" waitresource="KEY: 2:2666130980878942208 (61a06abd401c)" 
             waittime="25480" ownerId="12748" transactionname="SELECT" lasttranstarted="2010-12-21T18:19:03.263" 
             XDES="0x2dfb9c10" lockMode="S" schedulerid="1" kpid="2484" status="suspended" spid="60" sbid="0" ecid="0" 
             priority="0" trancount="0" lastbatchstarted="2010-12-21T18:19:03.263" 
             lastbatchcompleted="2010-12-21T18:19:03.263" clientapp="Microsoft SQL Server Management Studio - Query" 
             hostname="WIN-5B9V8JPLP3H" hostpid="2708" loginname="WIN-5B9V8JPLP3H\Administrator" 
             isolationlevel="read committed (2)" xactid="12748" currentdb="2" lockTimeout="4294967295" 
             clientoption1="671090784" clientoption2="390200">
      <executionStack>
        <frame line="1" sqlhandle="0x02000000d9de7b2f4f3a78e40f100bc02a84efbb9f01a84d" />
      </executionStack>
      <inputbuf>
SELECT * FROM t1   </inputbuf>
    </process>
  </blocked-process>
  <blocking-process>
    <process status="suspended" waittime="27430" spid="57" sbid="0" ecid="0" priority="0" trancount="1" 
             lastbatchstarted="2010-12-21T18:19:01.437" lastbatchcompleted="2010-12-21T18:13:25.637" 
             clientapp="Microsoft SQL Server Management Studio - Query" hostname="WIN-5B9V8JPLP3H" 
             hostpid="2708" loginname="WIN-5B9V8JPLP3H\Administrator" isolationlevel="read committed (2)" 
             xactid="12733" currentdb="2" lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
      <executionStack>
        <frame line="3" stmtstart="100" stmtend="150" sqlhandle="0x020000005a74b3030117e049389a93b2ce5bfb48e272f938" />
      </executionStack>
      <inputbuf>
BEGIN TRANSACTION
INSERT INTO t1 DEFAULT VALUES
WAITFOR DELAY '00:00:30'
COMMIT   </inputbuf>
    </process>
  </blocking-process>
</blocked-process-report>

The output of the blocked process report in Extended Events is the same as the blocked process report from SQL Trace and Event Notifications.  This is only a new mechanism of collecting this information, and when Denali releases RTM, this will be an Event Session that I install on my Denali based servers as a part of my configuration scripts, having it ready to activate when necessary.

Make sure that you turn off the generation of blocked process reports by changing the ‘blocked process threshold’ sp_configure option back to 0 (zero) whenever you are not actively looking for blocked processes.

Today’s post will be somewhat short, but we’ll look at Customizable Fields on Events in Extended Events and how they are used to collect additional information.  Customizable Fields generally represent information of potential interest that may be expensive to collect, and is therefore made available for collection if specified by the Event Session.  In SQL Server 2008 and 2008 R2, there are 50 Events that have customizable columns in their payload.  In SQL Server Denali CTP1, there are 124 Events that have customizable columns in their payload. The customizable columns and Events that have them can be found with the following query.

SELECT 
    p.name AS package_name,
    o.name AS event_name,
    oc.name AS column_name,
    oc.column_type,
    oc.type_name,
    oc.description
FROM sys.dm_xe_packages p
JOIN sys.dm_xe_objects o
    ON p.guid = o.package_guid
JOIN sys.dm_xe_object_columns oc
    ON o.name = oc.object_name 
        AND o.package_guid = oc.object_package_guid
WHERE ((p.capabilities is null or p.capabilities & 1 = 0)
  AND (o.capabilities is null or o.capabilities & 1 = 0)
  AND (oc.capabilities is null or oc.capabilities & 1 = 0))
  AND o.object_type = 'event'
  AND oc.column_type = 'customizable'

If we look at a specific Event containing a customizable column, in this case the sqlserver.file_read_completed Event, we will see that there is a customizable column as well as a data column for the data collected by the customizable column.

SELECT 
    p.name AS package_name,
    o.name AS event_name,
    oc.column_id,
    oc.name AS column_name,
    oc.column_type,
    oc.type_name,
    oc.description
FROM sys.dm_xe_packages p
JOIN sys.dm_xe_objects o
    ON p.guid = o.package_guid
JOIN sys.dm_xe_object_columns oc
    ON o.name = oc.object_name 
        AND o.package_guid = oc.object_package_guid
WHERE ((p.capabilities is null or p.capabilities & 1 = 0)
  AND (o.capabilities is null or o.capabilities & 1 = 0)
  AND (oc.capabilities is null or oc.capabilities & 1 = 0))
  AND o.object_type = 'event'
  AND o.name = 'file_read_completed'
  AND oc.column_type <> 'readonly'
ORDER BY oc.column_type, oc.column_id
image

In the red box are the customizable columns, and the blue box has the associated data columns to the customizable columns.  The data columns will exist in the Event data from the Event firing, but they will only have a value in the Event data if the customizable column is set to collect the information.

CREATE EVENT SESSION CustomizableColumnDemo
ON SERVER
ADD EVENT sqlserver.file_read_completed
(
    WHERE (database_id = 4)    
)
ADD TARGET package0.ring_buffer
GO
ALTER EVENT SESSION CustomizableColumnDemo
ON SERVER
STATE=START
GO
DBCC DROPCLEANBUFFERS
GO
SELECT TOP 10 * FROM msdb.dbo.backupset
GO

If we query the Target data for the above Event, we’ll see that the path and io_data columns are included in the Event XML, but there is no value in the XML nodes.

SELECT CAST(target_data AS XML) as target_data
FROM sys.dm_xe_sessions AS s    
JOIN sys.dm_xe_session_targets AS t
    ON s.address = t.event_session_address
WHERE s.name = 'CustomizableColumnDemo'
  AND t.target_name = 'ring_buffer'
<event name="file_read_completed" package="sqlserver" id="83" version="1" timestamp="2010-12-20T03:14:20.393Z">
  <data name="mode">
    <type name="file_io_mode" package="sqlserver" />
    <value>0</value>
    <text>Contiguous</text>
  </data>
  <data name="duration">
    <type name="uint64" package="package0" />
    <value>0</value>
    <text />
  </data>
  <data name="file_handle">
    <type name="ulong_ptr" package="package0" />
    <value>0x0000000000000b38</value>
    <text />
  </data>
  <data name="offset">
    <type name="uint64" package="package0" />
    <value>14352384</value>
    <text />
  </data>
  <data name="database_id">
    <type name="uint16" package="package0" />
    <value>4</value>
    <text />
  </data>
  <data name="file_id">
    <type name="uint16" package="package0" />
    <value>1</value>
    <text />
  </data>
  <data name="file_group_id">
    <type name="uint16" package="package0" />
    <value>1</value>
    <text />
  </data>
  <data name="path">
    <type name="unicode_string" package="package0" />
    <value />
    <text />
  </data>
  <data name="io_data">
    <type name="binary_data" package="package0" />
    <value />
    <text />
  </data>
</event>

To set the customizable column to collect the data, in the ADD EVENT section of the CREATE EVENT SESSION or ALTER EVENT SESSION DDL command, the SET option is used to turn data collection on for the column.

DROP EVENT SESSION CustomizableColumnDemo
ON SERVER
GO
CREATE EVENT SESSION CustomizableColumnDemo
ON SERVER
ADD EVENT sqlserver.file_read_completed
(    
    SET collect_path = 1
    WHERE(database_id = 4)
)
ADD TARGET package0.ring_buffer
GO
ALTER EVENT SESSION CustomizableColumnDemo
ON SERVER
STATE=START
GO
DBCC DROPCLEANBUFFERS
GO
SELECT TOP 10 * FROM msdb.dbo.backupset
GO
ALTER EVENT SESSION CustomizableColumnDemo
ON SERVER
DROP EVENT sqlserver.file_read_completed
GO

Notice that the SET option does not use parenthesis, they are not allowed in the DDL definition.  By setting the collect_path customizable column to 1 the Event XML now contains the path to the data file that was read.

-- Query the XML to get the Target Data
SELECT 
    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    n.value('(event/@package)[1]', 'varchar(50)') AS package_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            n.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    ISNULL(n.value('(event/data[@name="database_id"]/value)[1]', 'int'),
            n.value('(event/action[@name="database_id"]/value)[1]', 'int')) as [database_id],
    n.value('(event/data[@name="mode"]/value)[1]', 'nvarchar(50)') as [mode],
    n.value('(event/data[@name="duration"]/value)[1]', 'bigint') as [duration],
    n.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(50)') as [file_handle],
    n.value('(event/data[@name="offset"]/value)[1]', 'int') as [offset],
    n.value('(event/data[@name="file_id"]/value)[1]', 'int') as [file_id],
    n.value('(event/data[@name="path"]/value)[1]', 'nvarchar(250)') as [path],
    n.value('(event/data[@name="id_data"]/value)[1]', 'nvarchar(max)') as [io_data]   
FROM
(    SELECT td.query('.') as n
    FROM 
    (
        SELECT CAST(target_data AS XML) as target_data
        FROM sys.dm_xe_sessions AS s    
        JOIN sys.dm_xe_session_targets AS t
            ON s.address = t.event_session_address
        WHERE s.name = 'CustomizableColumnDemo'
          AND t.target_name = 'ring_buffer'
    ) AS sub
    CROSS APPLY target_data.nodes('RingBufferTarget/event') AS q(td)
) as tab
GO

image

The increase in the number of Events with customizable columns in Denali CTP1 is, in my own opinion, a great step in the right direction for Extended Events.  The use of customizable columns to add data into the Event payload extends the flexibility of Extended Events by providing a mechanism to gather additional data related to Events that is specific to the Event and not globally available like Actions.

If you don’t know anything about Ghost Cleanup, I recommend highly that you go read Paul Randal’s blog posts Inside the Storage Engine: Ghost cleanup in depth, Ghost cleanup redux, and Turning off the ghost cleanup task for a performance gain.  To my knowledge Paul’s posts are the only things that cover Ghost Cleanup at any level online.

In this post we’ll look at how you can use Extended Events to track the activity of Ghost Cleanup inside of your SQL Server.  To do this, we’ll first take a look at the ghost_cleanup Event and what it returns.

-- Find the Event
SELECT 
    p.name, 
    o.name, 
    o.description
FROM sys.dm_xe_packages AS p
JOIN sys.dm_xe_objects AS o
    ON p.guid = o.package_guid
WHERE o.name = 'ghost_cleanup'

-- Get the data columns for the Event
SELECT 
    name, 
    TYPE_NAME
FROM sys.dm_xe_object_columns
WHERE OBJECT_NAME = 'ghost_cleanup'
  AND column_type = 'data'

image

The ghost_cleanup Event is in the sqlserver Package and returns the file_id and page_id that the ghost cleanup process is working on.  Since most SQL Servers have multiple databases, we probably will want to track the database_id through an Action as well.  Since ghost cleanup is a background process, and we don’t know much about how it works, or how many Events are going to be generated, we could start off with the synchronous_event_counter Target, but in this case, I just want to capture Events and all of them, so we will just go with the asynchronous_file_target.

CREATE EVENT SESSION TrackGhostCleanup
ON SERVER
ADD EVENT sqlserver.ghost_cleanup
( ACTION(sqlserver.database_id))
ADD TARGET package0.asynchronous_file_target(
     SET filename='C:\SQLBlog\TrackGhostCleanup.xel',
         metadatafile='C:\SQLBlog\TrackGhostCleanup.xem')
WITH (MAX_MEMORY = 4MB, EVENT_RETENTION_MODE = NO_EVENT_LOSS )
GO
ALTER EVENT SESSION TrackGhostCleanup
ON SERVER
STATE=START

This is a really basic Event Session, it captures one Event, sqlserver.ghost_cleanup and collects the sqlserver.database_id Action when the Event fires.  The Event data is captured by the package0.asynchronous_file_target, and the Event Session is configured to not allow Event Loss.  After the starting the Event Session and allowing it to run, we can query the files for the captured events and see how ghost_cleanup is running on our instance.

-- Query the Event data from the Target.
SELECT 
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    event_data.value('(event/@package)[1]', 'varchar(50)') AS package_name,
    event_data.value('(event/@id)[1]', 'int') AS id,
    event_data.value('(event/@version)[1]', 'int') AS version,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    event_data.value('(event/action[@name="database_id"]/value)[1]', 'int') as database_id,
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') as file_id,
    event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') as page_id
FROM 
(SELECT
    CAST(event_data AS XML) AS event_data
 FROM sys.fn_xe_file_target_read_file('C:\SQLBlog\TrackGhostCleanup*.xel', 'C:\SQLBlog\TrackGhostCleanup*xem', null, null)
) as tab

From around 15 minutes of runtime on one of my development servers, over 17.5K Events have fired, much more than I initially anticipated, and after nearly 30 minutes of runtime, I had just over 37K Events. 

image

Some interesting information can be found in the Events.  In SQL Server 2008, the Ghost Cleanup process runs every 10 seconds, just as Paul has documented in his blog posts, which was a change from every 5 seconds in SQL Server 2005.

image

The process in 2008 cleans up 200 pages at a time, something Paul hasn’t specifically blogged about for SQL Server 2008.  Before anyone debates this, Paul’s statement “It will check through or cleanup a limited number of pages each time it wakes up - I remember the limit is 10 pages - to ensure it doesn't swamp the system.” from his Inside the Storage Engine: Ghost cleanup in depth is based on SQL Server 2005, which also ran ghost cleanup every 5 seconds instead of 10 seconds.  We can look at the Event information over subsequent 10 second intervals and see that 200 pages are cleaned up each time ghost_cleanup runs.

image

image

This has to be one of my favorite aspects of Extended Events.  You get to really learn about SQL Server Internals by just playing with SQL Server.  I have a couple more blog posts that show how you can learn about SQL Server Internals using Extended Events for this series, and if you are interested in a previous post on the subject check out TSQL Tuesday #11 – Physical IO’s Don’t Always Accumulate Wait Times.

Until tomorrow….

When working with SQL Trace, one of my biggest frustrations has been the limitations that exist in filtering.  Using sp_trace_setfilter to establish the filter criteria is a non-trivial task, and it falls short of being able to deliver complex filtering that is sometimes needed to simplify analysis.  Filtering of trace data was performed globally and applied to the trace affecting all of the events being collected.  Extended Events introduces a much better system of filtering using Predicates that are applied at the individual Event level, allow for short circuiting of evaluation, and provide the ability to create complex groups of independent criteria, ensuring only Events of interest are captured by the Event Session.

In yesterdays post, The system_health Session, I talked about the default system_health session that is running on every SQL Server 2008/2008R2 and Denali CTP1 instance out of the box.  The Predicate definition for the sqlos.wait_info event in the system_health session is a good example to follow for complex, short-circuiting Predicate definition in Extended Events.

ADD EVENT sqlos.wait_info
(
    ACTION (package0.callstack, sqlserver.session_id, sqlserver.sql_text)
    WHERE 
    (duration > 15000 AND 
        (    
            (wait_type > 31    -- Waits for latches and important wait resources (not locks) 
                            -- that have exceeded 15 seconds. 
                AND
                (
                    (wait_type > 47 AND wait_type < 54)
                    OR wait_type < 38
                    OR (wait_type > 63 AND wait_type < 70)
                    OR (wait_type > 96 AND wait_type < 100)
                    OR (wait_type = 107)
                    OR (wait_type = 113)
                    OR (wait_type > 174 AND wait_type < 179)
                    OR (wait_type = 186)
                    OR (wait_type = 207)
                    OR (wait_type = 269)
                    OR (wait_type = 283)
                    OR (wait_type = 284)
                )
            )
            OR 
            (duration > 30000        -- Waits for locks that have exceeded 30 secs.
                AND wait_type < 22
            ) 
        )
    )
),

Since Predicates perform short-circuit evaluation, where the criteria groups are evaluated in order and the first failure in the criteria causes the Predicate evaluation to stop and preventing the Event from being fired in the engine, the order of the criteria can directly impact the performance of an Event Session.  If we look at the definition for the sqlos.wait_info Event, the first Predicate criteria specifies that the duration of the wait has to be greater than 15 seconds.  Since the majority of waits in SQL Server generally occur with durations less than 15 seconds, the Predicate evaluations shortcut immediately and the Event does not fire.  If the wait exceeds the 15 second duration, the evaluation continues and checks that the wait_type matches one of defined values.  How do we know what these values are? 

When looking at an Event, all of the columns have a type_name associated with them that can be found in the sys.dm_xe_object_columns DMV as previously discussed in this series.  If we take a look at the type_name for the wait_info Event wait_type column, we’ll see that it has a type of wait_types.

SELECT 
    name, 
    type_name, 
    column_type
FROM sys.dm_xe_object_columns
WHERE object_name = 'wait_info'
  AND column_type <> 'readonly'
image

When a column has a non-standard type_name like this, it corresponds to a Map that has been loaded in Extended Events.  We can find a list of the wait_types that the Event will fire for by querying the sys.dm_xe_map_values DMV for the map_keys defined in the Event Session:

SELECT map_key, map_value
FROM sys.dm_xe_map_values
WHERE name = 'wait_types'
  AND 
    (map_key > 31    -- Waits for latches and important wait resources (not locks) 
                    -- that have exceeded 15 seconds. 
        AND
        (
            (map_key > 47 AND map_key < 54)
            OR map_key < 38
            OR (map_key > 63 AND map_key < 70)
            OR (map_key > 96 AND map_key < 100)
            OR (map_key = 107)
            OR (map_key = 113)
            OR (map_key > 174 AND map_key < 179)
            OR (map_key = 186)
            OR (map_key = 207)
            OR (map_key = 269)
            OR (map_key = 283)
            OR (map_key = 284)
        )
    )

The wait_types that correspond to the first complex grouping are:

map_key map_value
32 LATCH_NL
33 LATCH_KP
34 LATCH_SH
35 LATCH_UP
36 LATCH_EX
37 LATCH_DT
48 PAGELATCH_NL
49 PAGELATCH_KP
50 PAGELATCH_SH
51 PAGELATCH_UP
52 PAGELATCH_EX
53 PAGELATCH_DT
64 PAGEIOLATCH_NL
65 PAGEIOLATCH_KP
66 PAGEIOLATCH_SH
67 PAGEIOLATCH_UP
 
map_key map_value
68 PAGEIOLATCH_EX
69 PAGEIOLATCH_DT
97 IO_COMPLETION
98 ASYNC_IO_COMPLETION
99 NETWORK_IO
107 RESOURCE_SEMAPHORE
113 SOS_WORKER
175 FCB_REPLICA_WRITE
176 FCB_REPLICA_READ
177 HOLDER11
178 WRITELOG
186 CMEMTHREAD
207 TRACEWRITE
269 RESOURCE_SEMAPHORE_MUTEX
283 RESOURCE_SEMAPHORE_QUERY_COMPILE
284 RESOURCE_SEMAPHORE_SMALL_QUERY

If you look at the way the Predicate is defined, it is much closer to how you’d write a WHERE clause with complex filtering criteria, allowing groups of specific criteria to be defined within sets of parenthesis’s that are evaluated together, something that was impossible with SQL Trace.

In addition to being able to define Predicates based on the Event columns returned by an Event, it is possible to also define Predicates on the global state data available in the Extended Events Engine.  If you’ll recall, the global state predicates are available in the sys.dm_xe_objects DMV as pred_source object_type’s.

SELECT 
    p.name AS package_name,
    o.name AS predicate_name,
    o.description
FROM sys.dm_xe_packages AS p
INNER JOIN sys.dm_xe_objects AS o
    ON p.guid = o.package_guid
WHERE (p.capabilities IS NULL OR p.capabilities & 1 = 0)
  AND o.object_type = 'pred_source'
  image

Two of the predicate sources are special, the package0.counter and package0.partitioned_counter, and can be used to restrict the number of occurrences of an Events that are captured by an Event Session.  The following demonstration creates an Event Session that captures the first five occurrences of the sqlserver.sql_statement_completed Event, and then executes six statements in sequence.  When the target_data is queried the last statement SELECT @@SPID is not included in the results.

CREATE EVENT SESSION CounterPredicateDemo
ON SERVER
ADD EVENT sqlserver.sql_statement_completed
( ACTION (sqlserver.sql_text)
  WHERE (package0.counter <=5))
ADD TARGET package0.ring_buffer
WITH (MAX_DISPATCH_LATENCY = 1 SECONDS)
GO
ALTER EVENT SESSION CounterPredicateDemo
ON SERVER
STATE = START
GO
SELECT @@VERSION
GO
SELECT @@SERVERNAME
GO
SELECT @@SPID
GO
SELECT @@VERSION
GO
SELECT @@SERVERNAME
GO
SELECT @@SPID
GO
ALTER EVENT SESSION CounterPredicateDemo
ON SERVER
DROP EVENT sqlserver.sql_statement_completed
GO
-- Wait in a delay for Events to Buffer
WAITFOR DELAY '00:00:05'
GO

-- Query the XML to get the Target Data
SELECT 
    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    n.value('(event/@package)[1]', 'varchar(50)') AS package_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            n.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],  
    n.value('(event/data[@name="object_id"]/value)[1]', 'int') AS [object_id],
    n.value('(event/data[@name="object_type"]/value)[1]', 'nvarchar(128)') AS [object_type],
    n.value('(event/data[@name="duration"]/value)[1]', 'int') AS [duration],
    n.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
    n.value('(event/data[@name="reads"]/value)[1]', 'int') AS [reads],
    n.value('(event/data[@name="writes"]/value)[1]', 'int') AS [writes],
    n.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS [sql_text]
FROM
(    SELECT td.query('.') as n
    FROM 
    (
        SELECT CAST(target_data AS XML) as target_data
        FROM sys.dm_xe_sessions AS s 
        JOIN sys.dm_xe_session_targets AS t 
            ON t.event_session_address = s.address
        WHERE s.name = 'CounterPredicateDemo'
          AND t.target_name = 'ring_buffer'
    ) AS sub
    CROSS APPLY target_data.nodes('RingBufferTarget/event') AS q(td)
) as tab
GO
-- Drop the Event Session
DROP EVENT SESSION CounterPredicateDemo
ON SERVER

image

This capabilities behind Predicate definition in Extended Events makes it much more flexible, and powerful for troubleshooting than SQL Trace.  It also makes Extended Events much more performant than Trace by preempting Event firing for Events that are not of interest.

Today’s post was originally planned for this coming weekend, but seems I’ve caught whatever bug my kids had over the weekend so I am changing up today’s blog post with one that is easier to cover and shorter.  If you’ve been running some of the queries from the posts in this series, you have no doubt come across an Event Session running on your server with the name of system_health.  In today’s post I’ll go over this session and provide links to references related to it.

When Extended Events was introduced in SQL Server 2008, the Produce Support Services group worked with the Extended Events developers to create the definition for an Event Session that could be shipped with SQL Server 2008, would startup automatically when SQL Server starts up, and contained Events of interest in troubleshooting common problems seen by the PSS Engineers.  Bob Ward (Blog|Twitter) blogged about the details of the system_health session that shipped with SQL Server 2008 in his blog post Supporting SQL Server 2008: The system_health session.  The script for this Event Session is inside of the utables.sql script file that is in the instance Install folder (for example c:\Program Files\Microsoft SQL Server\<InstanceDesignator>\MSSQL\Install) and can be used to recreate the Event Session if you inadvertently change it.

-- The predicates in this session have been carefully crafted to minimize impact of event collection
-- Changing the predicate definition may impact system performance
--
CREATE EVENT SESSION system_health ON SERVER
  • The sql_text and session_id for any sessions that encounter an error that has a severity >=20.
  • The sql_text and session_id for any sessions that encounter a memory-related error. The errors include 17803, 701, 802, 8645, 8651, 8657 and 8902.
ADD EVENT sqlserver.error_reported
(
    ACTION (package0.callstack, sqlserver.session_id, sqlserver.sql_text, sqlserver.tsql_stack)
    -- Get callstack, SPID, and query for all high severity errors ( above sev 20 )
    WHERE severity >= 20
    -- Get callstack, SPID, and query for OOM errors ( 17803 , 701 , 802 , 8645 , 8651 , 8657 , 8902 )
    OR (ERROR = 17803 OR ERROR = 701 OR ERROR = 802 OR ERROR = 8645 OR ERROR = 8651 OR ERROR = 8657 OR ERROR = 8902)
),
  • A record of any non-yielding scheduler problems. (These appear in the SQL Server error log as error 17883.)
ADD EVENT sqlos.scheduler_monitor_non_yielding_ring_buffer_recorded,
  • Any deadlocks that are detected.
ADD EVENT sqlserver.xml_deadlock_report,
  • The callstack, sql_text, and session_id for any sessions that have waited on latches (or other interesting resources) for > 15 seconds.
ADD EVENT sqlos.wait_info
(
    ACTION (package0.callstack, sqlserver.session_id, sqlserver.sql_text)
    WHERE 
    (duration > 15000 AND 
        (    
            (wait_type > 31    -- Waits for latches and important wait resources (not locks ) that have exceeded 15 seconds. 
                AND
                (
                    (wait_type > 47 AND wait_type < 54)
                    OR wait_type < 38
                    OR (wait_type > 63 AND wait_type < 70)
                    OR (wait_type > 96 AND wait_type < 100)
                    OR (wait_type = 107)
                    OR (wait_type = 113)
                    OR (wait_type > 174 AND wait_type < 179)
                    OR (wait_type = 186)
                    OR (wait_type = 207)
                    OR (wait_type = 269)
                    OR (wait_type = 283)
                    OR (wait_type = 284)
                )
            )
  • The callstack, sql_text, and session_id for any sessions that have waited on locks for > 30 seconds.
            OR 
            (duration > 30000        -- Waits for locks that have exceeded 30 secs.
                AND wait_type < 22
            ) 
        )
    )
),
  • The callstack, sql_text, and session_id for any sessions that have waited for a long time for preemptive waits. The duration varies by wait type. A preemptive wait is where SQL Server is waiting for external API calls.
ADD EVENT sqlos.wait_info_external
(
    ACTION (package0.callstack, sqlserver.session_id, sqlserver.sql_text)
    WHERE 
    (duration > 5000 AND
        (   
            (    -- Login related preemptive waits that have exceeded 5 seconds.
                (wait_type > 365 AND wait_type < 372)
                OR    (wait_type > 372 AND wait_type < 377)
                OR    (wait_type > 377 AND wait_type < 383)
                OR    (wait_type > 420 AND wait_type < 424)
                OR    (wait_type > 426 AND wait_type < 432)
                OR    (wait_type > 432 AND wait_type < 435)
            )
            OR 
            (duration > 45000     -- Preemptive OS waits that have exceeded 45 seconds. 
                AND 
                (    
                    (wait_type > 382 AND wait_type < 386)
                    OR    (wait_type > 423 AND wait_type < 427)
                    OR    (wait_type > 434 AND wait_type < 437)
                    OR    (wait_type > 442 AND wait_type < 451)
                    OR    (wait_type > 451 AND wait_type < 473)
                    OR    (wait_type > 484 AND wait_type < 499)
                    OR wait_type = 365
                    OR wait_type = 372
                    OR wait_type = 377
                    OR wait_type = 387
                    OR wait_type = 432
                    OR wait_type = 502
                )
            )
        )
    )
)
  • Capture Event information using the ring_buffer target.
ADD target package0.ring_buffer        -- Store events in the ring buffer target
    (SET max_memory = 4096)
  • Set the session to start automatically with SQL Server
WITH (startup_state = ON)
GO

In SQL Server Denali CTP1, two new Events have been added to the system_health session specific to SQLCLR.

  • A SQLCLR memory allocation failed.
ADD EVENT sqlclr.allocation_failure,
  • A SQLCLR virtual memory allocation failed.
ADD EVENT sqlclr.virtual_alloc_failure,

While the system_health session captures very useful information, it uses the ring_buffer Target to store the Event data.  In a scenario where the database engine fails completely the information that may have been captured by the system_health session will be lost when the process terminates.  Also since the Event Session uses the ring_buffer Target, it is possible that you may not receive back all of the Event data contained in the target, or the Events that you might have expected to exist.  Bob Ward talked about the limitation of the DMV’s to return 4MB of XML data and how this impacts the in memory Targets in Extended Events in his blog post You may not see the data you expect in Extended Event Ring Buffer Targets…. 

One of my favorite aspects of the system_health session is that it includes deadlock tracing through Extended Events by default.  However, in order to make use of the deadlock graph captured by Extended Events, you have to be on CU6 for SQL Server 2008 SP1 (http://support.microsoft.com/kb/978629), or you could try to hack your way around the bug as I showed in my article Retrieving Deadlock Graphs with SQL Server 2008 Extended Events, and Michael Zilberstein’s update to correct a problem with the code in that article, Parsing Extended Events xml_deadlock_report.  The deadlock graph in Extended Events will not open graphically in SSMS like a SQL Trace XML Deadlock Graph will due to changes in its output to support multi-victim deadlocks, which I covered in my blog post Changes to the Deadlock Monitor for the Extended Events xml_deadlock_report and Multi-Victim Deadlocks.

Yesterday’s blog post Targets Week – etw_classic_sync_target covered the ETW integration that is built into Extended Events and how the etw_classic_sync_target can be used in conjunction with other ETW traces to provide troubleshooting at a level previously not possible with SQL Server.  In today’s post we’ll look at how to use multiple targets to simplify analysis of Event collection.

Why Multiple Targets?

You might ask why you would want to use multiple Targets in an Event Session with Extended Events?  The best answer is because each of the Targets handles the Event data in different ways, and by combining their uses, you can easily simplify the task of tracking down problems.  Two days ago I talked about the pair_matching Target and how it only retains Events that have not been matched.  Two years ago out of nowhere, one of the production databases I supported started having transaction log growth problems.  After growing the log for the second time in a day, I started looking at the database because something was obviously not right. 

The first place I looked was the sys.databases DMV and specifically the log_reuse_wait_desc column, which provides the reason that the transaction log for a database is not truncating.  When I queried this DMV, I found that the log_reuse_wait_desc was ACTIVE_TRANSACTION, meaning that the database had an open transaction.  So I ran DBCC OPENTRAN to get some more information about the open transaction (you can query the DMV’s for this information as well) and found that the transaction had been open for over 8 hours.  I queried sys.dm_exec_sessions for the session_id and found that the session was still being used and had submitted a request within the last minute, so it seemed to be an orphaned transaction.

SELECT 
    session_id,
    login_time, 
    last_request_start_time, 
    last_request_end_time
FROM sys.dm_exec_sessions
WHERE session_id = 76

After discussions with the Analyst for the application and the applications vendor, it was decided that the session should be killed forcing a ROLLBACK of the transaction and allowing the log to be truncate.  (I’ll discuss why this might prove be problematic later in this post)  This resolved the problem, at least until the next day when the database began running out of space in the transaction log again, and once again had an open transaction that had been open for hours on a session that was sill being used by the application.  What was really interesting was there was no correlation between the previous days open transactions begin time, and the begin time of the second occurrence of the problem, so it seemed to be a completely random occurrence which was not going to be easy to troubleshoot. 

I created a server side trace and tried over the next two days to figure out what the issue actually was, but didn’t make much head way until I expanded the trace to have the statement starting and completed events along with the Errors and Warnings Events in the trace.  When the problem reoccurred, I was able to read through the trace files using filtering to minimize the Trace Events captured down to the specific spid that held the open transaction and the events that occurred five minutes before and after the transaction_begin_time for the open transaction.  While looking at the event information I found an Attention Event and was able to deduce what had happened.

The application was a ASP.NET application, and the vendor used the CommandTimeout default which is 30 seconds.  What happened was that a process was invoked that called a stored procedure to archive information from a transactional table into an archive table inside of the database, and the number of rows being archived caused the stored procedures execution to exceeded 30 seconds resulting in a timeout in the ASP.NET application, and the application silently handled the exception by doing nothing.  The problem was that the stored procedure issued a BEGIN TRANSACTION before archiving the rows, and when the timeout occurred and the Attention Event was raised, the command aborted leaving the open transaction and creating the problem with the log not truncating. 

The connection was returned to the ASP.NET Connection Pool, and was constantly being reused by the application to do who knows what other operations, which is where killing the connection was potentially a very bad thing to do.  All of the activity performed by this session was performed under the open transaction, so by killing the session, all of the activity would be rolled back.  With no way to tell what exactly would be rolled back, killing the session should not be taken lightly.

For the remainder of this post I am going to show a repro of this particular problem and how to use Multiple Targets in Extended Events to simplify the troubleshooting this.

Setting Up the Demo

To setup a reproduction of this problem you will need two instances of SSMS open.  One of them will be used to connect to the SQL Server normally, and the other will be used to act like the ASP.NET application that originally had the problem.  To setup the second instance of SSMS to act like the ASP.NET application, we are going to set the Execution Timeout using the Options of the Connect to Database Engine window, and we are also going to add an Additional Connection Parameter to the connection to set the Application Name on the connection to “Some Poorly Written App” as shown in the below screenshots from SSMS.

image   image

For the remainder of this blog post I am going to refer to the two different instances of SSMS as Normal and PoorApp in the hopes that this prevents confusion. 

In the Normal SSMS we will create a database and some objects to support the repro of the problem:

CREATE DATABASE [MultiTargetDemo]
GO
USE [MultiTargetDemo]
GO
CREATE TABLE dbo.RandomObjectsArchive
(ArchiveID int identity primary key,
 TableName nvarchar(128),
 IndexName nvarchar(128),
 ColumnName nvarchar(128))
GO
CREATE TABLE dbo.TrackArchiveRunTimes
(RuntimeID int identity primary key,
 ArchiveRuntime datetime DEFAULT(CURRENT_TIMESTAMP))
GO
CREATE PROCEDURE dbo.GenerateRandomObjects
AS
BEGIN TRANSACTION
    INSERT INTO dbo.TrackArchiveRunTimes
    DEFAULT VALUES;

    INSERT INTO RandomObjectsArchive
        (TableName, IndexName, ColumnName)
    SELECT TOP 10000 a.name, i.name, c.name
    FROM sys.objects AS a
        CROSS JOIN sys.indexes AS i
        CROSS JOIN sys.columns AS c
        CROSS JOIN master.dbo.spt_values AS sv
    WHERE sv.type = 'P' 
      AND sv.number < 6 --Adjust to increase runtime
    ORDER BY NEWID() DESC
COMMIT TRANSACTION
GO
USE [master]
GO

The GenerateRandomObjects stored procedure Inserts a row into a tracking table that tracks when the stored procedure was executed, and then simulates a long running archive process by doing something that you should never do in production code.  The sv.number predicate in the query can be increased or decreased based on the performance of the system being tested against to ensure that the stored procedure runs longer than the Execution Timeout setting, which on my PoorApp SSMS instance was set to 10 seconds.  Increasing the value by 1 has an exponential impact on the performance degradation of the stored procedure, so any changes should be made incrementally to ensure that you don’t create a tempdb bloat problem with the Cartesian product of the query being executed.

Setting Up the Event Session

To troubleshoot this problem using Extended Events we will create an Event Session that captures the following Events:

sqlserver.database_transaction_begin
sqlserver.database_transaction_end
sqlserver.sql_statement_starting
sqlserver.sql_statement_completed
sqlserver.sp_statement_starting
sqlserver.sp_statement_completed
sqlserver.rpc_starting
sqlserver.rpc_completed
sqlserver.module_start
sqlserver.module_end
sqlserver.error_reported

We’ll add the following Actions to each of the Events:

sqlserver.session_id
sqlserver.database_id
sqlserver.tsql_stack

and add the sqlserver.sql_text Action to the starting Events so that we can track what is actually being executed.  Every Event in the Event Session will have a Predicate on the sqlserver.client_app_name so that the Event only fires for connections and requests from “Some Poorly Written App”. 

IF EXISTS(SELECT * 
         FROM sys.server_event_sessions 
         WHERE name='OrphanedTransactionHunter') 
    DROP EVENT SESSION [OrphanedTransactionHunter] ON SERVER; 
CREATE EVENT SESSION OrphanedTransactionHunter
ON SERVER
ADD EVENT sqlserver.database_transaction_begin
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.database_transaction_end
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.sql_statement_starting
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack, sqlserver.sql_text)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.sql_statement_completed
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.sp_statement_starting
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack, sqlserver.sql_text)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.sp_statement_completed
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.rpc_starting
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack, sqlserver.sql_text)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.rpc_completed
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.module_start
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack, sqlserver.sql_text)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.module_end
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App')),
ADD EVENT sqlserver.error_reported
(    ACTION(sqlserver.session_id, sqlserver.database_id, sqlserver.tsql_stack)
    WHERE (sqlserver.client_app_name = 'Some Poorly Written App'))
ADD TARGET package0.ring_buffer,
ADD TARGET package0.pair_matching
( SET begin_event = 'sqlserver.database_transaction_begin',
      begin_matching_actions = 'sqlserver.session_id',
      end_event = 'sqlserver.database_transaction_end',
      end_matching_actions = 'sqlserver.session_id',
      respond_to_memory_pressure = 1
)
WITH (MAX_DISPATCH_LATENCY=5 SECONDS, TRACK_CAUSALITY=ON)
GO

ALTER EVENT SESSION OrphanedTransactionHunter
ON SERVER
STATE=START
GO

Fired Events will be dispatched to two different Targets, the package0.ring_buffer to capture the Raw Data (in a true production environment, the package0.asynchronous_file_target would generally be a better Target for Raw Data capture of any volume), and the package0.pair_matching Target which has been configured to match on the sqlserver.database_transaction_begin/end Events based on the sqlserver.session_id Action.  To ensure that we can track the relationship between events, the Event Session will have TRACK_CAUSALITY set to ON, and to minimize the time it takes for Events to be dispatched for our test, the MAX_DISPATCH_LATENCY will be set to 5 seconds.

Putting It All Together

With the Event Session running, we can change over to our PoorApp SSMS instance and execute the GenerateRandomObjects stored procedure inside of the MultiTargetDemo database.

EXECUTE MultiTargetDemo.dbo.GenerateRandomObjects

When this executes, the command will timeout and leave the transaction open, simulating the original problem exactly.  Once the query times out, switch back to the Normal SSMS Instance and in a new window execute the stored procedure again and allow it complete its execution.  Since the default timeout of 0 is used in the Normal SSMS Instance, the execution will not time out.  Then we can look at the sys.databases DMV and see that the log_reuse_wait_desc is ACTIVE_TRANSACTION.

SELECT log_reuse_wait_desc 
FROM sys.databases
WHERE database_id = DB_ID('MultiTargetDemo')

image

If we look at DBCC OPENTRAN for the MultiTargetDemo database we will see our orphaned transaction:

DBCC OPENTRAN([MultiTargetDemo])

image

As I mentioned earlier in this post, the transaction can also be seen in the DMV’s:

SELECT 
    dtst.session_id, 
    dtdt.database_id,
    dtst.transaction_id, 
    dtat.name,
    dtdt.database_transaction_begin_time
FROM sys.dm_tran_session_transactions AS dtst
JOIN sys.dm_tran_active_transactions AS dtat
    ON dtst.transaction_id = dtat.transaction_id
JOIN sys.dm_tran_database_transactions AS dtdt
    ON dtdt.transaction_id = dtst.transaction_id
WHERE database_id = DB_ID('MultiTargetDemo')

image

Now that we have our problem reproduced, lets look at how we can use the information captured by our Extended Event Session to track it back to the source of the problem.  First we’ll query the pair_matching Target to find out information about the sqlserver.database_transaction_begin Event that was unmatched.

-- Query the XML to get the Target Data
SELECT 
    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    n.value('(event/@package)[1]', 'varchar(50)') AS package_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            n.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    n.value('(event/action[@name="session_id"]/value)[1]', 'int') as session_id,
    n.value('(event/action[@name="database_id"]/value)[1]', 'int') as [database_id],
    n.value('(event/action[@name="tsql_stack"]/value)[1]', 'nvarchar(max)') as tsql_stack,    
    n.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)') as attach_activity_id
FROM
(    SELECT td.query('.') as n
    FROM 
    (
        SELECT CAST(target_data AS XML) as target_data
        FROM sys.dm_xe_sessions AS s 
        JOIN sys.dm_xe_session_targets AS t 
            ON t.event_session_address = s.address
        WHERE s.name = 'OrphanedTransactionHunter'
          AND t.target_name = 'pair_matching'
    ) AS sub
    CROSS APPLY target_data.nodes('PairingTarget/event') AS q(td)
) as tab
-- We are interested in unmatched sqlserver.database_transaction_begin Events
WHERE n.value('(event/@name)[1]', 'varchar(50)') = 'database_transaction_begin'
ORDER BY session_id, activity_id
GO

image

From this we can see our orphaned transaction event, and find the attach_activity_id of that Event.  The attach_activity_id Action is added to the Events in an Event Session when TRACK_CAUSALITY is turned ON.  There are two pieces of information contained in the attach_activity_id Action, the activity Guid (the first 36 characters of the value) and the sequence number for the Event, the number following the Guid.  The Guid can be used to find related Events, and the sequence number can be used to determine the order that the Events occurred.  By using the Guid from the attach_activity_id Action from our first query, we can query the ring_buffer Target and parse out the specific Events we are interested in looking at further.

-- Query the XML to get the Target Data
SELECT 
    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    n.value('(event/@package)[1]', 'varchar(50)') AS package_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            n.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],  
    n.value('(event/action[@name="database_id"]/value)[1]', 'int') as [database_id],
    n.value('(event/action[@name="session_id"]/value)[1]', 'int') as [session_id],
    n.value('(event/data[@name="object_id"]/value)[1]', 'int') AS [object_id],
    n.value('(event/data[@name="object_type"]/value)[1]', 'nvarchar(128)') AS [object_type],
    n.value('(event/data[@name="object_name"]/value)[1]', 'nvarchar(128)') AS [object_name],
    n.value('(event/data[@name="error"]/value)[1]', 'int') AS [error],
    n.value('(event/data[@name="severity"]/value)[1]', 'int') AS [severity],
    n.value('(event/data[@name="state"]/value)[1]', 'int') AS [state],
    n.value('(event/data[@name="user_defined"]/value)[1]', 'bit') AS [user_defined],
    n.value('(event/data[@name="message"]/value)[1]', 'nvarchar(4000)') AS [message],
    n.value('(event/data[@name="duration"]/value)[1]', 'int') AS [duration],
    n.value('(event/data[@name="row_count"]/value)[1]', 'int') AS [row_count],
    n.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
    n.value('(event/data[@name="reads"]/value)[1]', 'int') AS [reads],
    n.value('(event/data[@name="writes"]/value)[1]', 'int') AS [writes],
    n.value('(event/action[@name="tsql_stack"]/value)[1]', 'nvarchar(max)') AS [tsql_stack],
    n.value('(event/data[@name="offset"]/value)[1]', 'int') AS [offset],
    n.value('(event/data[@name="offset_end"]/value)[1]', 'int') AS [offset_end],
    n.value('(event/data[@name="nest_level"]/value)[1]', 'int') AS [nest_level],           
     n.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(max)') AS [sql_text],
    CAST(SUBSTRING(n.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) AS activity_id,
    CAST(SUBSTRING(n.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) AS event_sequence
FROM
(    SELECT td.query('.') as n
    FROM 
    (
        SELECT CAST(target_data AS XML) as target_data
        FROM sys.dm_xe_sessions AS s 
        JOIN sys.dm_xe_session_targets AS t 
            ON t.event_session_address = s.address
        WHERE s.name = 'OrphanedTransactionHunter'
          AND t.target_name = 'ring_buffer'
    ) AS sub
    CROSS APPLY target_data.nodes('RingBufferTarget/event') AS q(td)
) as tab
-- We are interested in unmatched sqlserver.database_transaction_begin Events
WHERE SUBSTRING(n.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) = 'ADCF379A-4BCA-41BA-9B08-4C2265894392'
ORDER BY session_id, event_sequence
GO

 

With a little bit more work, we can reduce the XML parsing to only the important data elements that we need, and we can parse the tsql_stack Action to retrieve the related Event level statement_text from the sys.dm_exec_sql_text() DMF, since the sql_text Action did not have the intended information.

-- Query the XML to get the Target Data
SELECT 
    event_name,
    timestamp,
    database_id, 
    OBJECT_NAME(st.objectid, st.dbid) AS ObjectName,
    SUBSTRING(st.text, (tsql_stack.value('(/frame/@offsetStart)[1]', 'int')/2)+1, 
        ((CASE tsql_stack.value('(/frame/@offsetEnd)[1]', 'int')
            WHEN -1 THEN DATALENGTH(st.text)
            ELSE tsql_stack.value('(/frame/@offsetEnd)[1]', 'int')
            END - tsql_stack.value('(/frame/@offsetStart)[1]', 'int'))/2) + 1) AS statement_text,
    duration,
    activity_id,
    event_sequence
FROM
(
SELECT 
    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            n.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],  
    n.value('(event/action[@name="database_id"]/value)[1]', 'int') as [database_id],
    n.value('(event/data[@name="duration"]/value)[1]', 'int') AS [duration],
    CAST(n.value('(event/action[@name="tsql_stack"]/value)[1]', 'nvarchar(max)') AS XML) AS [tsql_stack],
    CAST(SUBSTRING(n.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) AS activity_id,
    CAST(SUBSTRING(n.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) AS event_sequence
FROM
(    SELECT td.query('.') as n
    FROM 
    (
        SELECT CAST(target_data AS XML) as target_data
        FROM sys.dm_xe_sessions AS s 
        JOIN sys.dm_xe_session_targets AS t 
            ON t.event_session_address = s.address
        WHERE s.name = 'OrphanedTransactionHunter'
          AND t.target_name = 'ring_buffer'
    ) AS sub
    CROSS APPLY target_data.nodes('RingBufferTarget/event') AS q(td)
) as tab
) as tab2
CROSS APPLY sys.dm_exec_sql_text(tsql_stack.value('xs:hexBinary(substring((/frame/@handle)[1], 3))', 'varbinary(max)')) AS st
-- We are interested in Events in activity_id sequence of the orphaned transaction only
WHERE activity_id = 'ADCF379A-4BCA-41BA-9B08-4C2265894392'
ORDER BY session_id, event_sequence
GO

image

The highlighted value shows that the stored procedure execution ended at the Execution Timeout limit that was set for the PoorApp SSMS Instance.  Beyond that we can track each of the statements and see that when the execution ended, it was in the INSERT INTO RandomObjectsArchive statement in the stored procedure allowing us to target our efforts at troubleshooting to the specific problem in a short amount of time.

Back in April, Paul Randal (Blog|Twitter) did a 30 day series titled A SQL Server Myth a Day, where he covered a different myth about SQL Server every day of the month.  At the same time Glenn Berry (Blog|Twitter) did a 30 day series titled A DMV a Day, where he blogged about a different DMV every day of the month.  Being so inspired by these two guys, I have decided to attempt a month long series on Extended Events that I am going to call A XEvent a Day.  I originally wanted to do this series during the month of November, but with school requirements and preparations for PASS Summit, I just couldn’t make it work out.  Instead I am going to end 2010 with a bang and at the same time double my blog post count for they year by doing it in December.

This post will be the master post and will have a link to each of the posts throughout the month as I post them.

An XEvent a Day (1 of 31) – An Overview of Extended Events
An XEvent a Day (2 of 31) – Querying the Extended Events Metadata
An XEvent a Day (3 of 31) – Managing Event Sessions
An XEvent a Day (4 of 31) – Querying the Session Definition and Active Session DMV’s
An XEvent a Day (5 of 31) – Targets Week – ring_buffer
An XEvent a Day (6 of 31) – Targets Week – asynchronous_file_target
An XEvent a Day (7 of 31) – Targets Week – bucketizers
An XEvent a Day (8 of 31) – Targets Week – synchronous_event_counter
An XEvent a Day (9 of 31) – Targets Week – pair_matching
An XEvent a Day (10 of 31) – Targets Week – etw_classic_sync_target
An XEvent a Day (11 of 31) – Targets Week – Using multiple targets to simplify analysis
An XEvent a Day (12 of 31) – Using the Extended Events SSMS Addin
An XEvent a Day (13 of 31) – The system_health Session
An XEvent a Day (14 of 31) – A Closer Look at Predicates
An XEvent a Day (15 of 31) – Tracking Ghost Cleanup
An XEvent a Day (16 of 31) – How Many Checkpoints are Issued During a Full Backup?
An XEvent a Day (17 of 31) – A Look at Backup Internals and How to Track Backup and Restore Throughput (Part 1)
An XEvent a Day (18 of 31) – A Look at Backup Internals and How to Track Backup and Restore Throughput (Part 2)
An XEvent a Day (19 of 31) – Using Customizable Fields
An XEvent a Day (20 of 31) – Mapping Extended Events to SQL Trace
An XEvent a Day (21 of 31) – The Future – Tracking Blocking in Denali
An XEvent a Day (22 of 31) – The Future – fn_dblog() No More? Tracking Transaction Log Activity in Denali
An XEvent a Day (23 of 31) – How it Works – Multiple Transaction Log Files
An XEvent a Day (24 of 31) – What is the package0.callstack Action?
An XEvent a Day (25 of 31) – The Twelve Days of Christmas
An XEvent a Day (26 of 31) – Configuring Session Options
An XEvent a Day (27 of 31) – The Future - Tracking Page Splits in SQL Server Denali CTP1
An XEvent a Day (28 of 31) – Tracking Page Compression Operations
An XEvent a Day (29 of 31) – The Future – Looking at Database Startup in Denali
An XEvent a Day (30 of 31) – Tracking Session and Statement Level Waits
An XEvent a Day (31 of 31) – Event Session DDL Events

Theme design by Nukeation based on Jelle Druyts