This is a performance tuning post that's been on my to-do list for quite a while. Wait stats analysis is a great way of looking at the symptoms of performance problems (see my Wait Stats category for more posts on this) but using the sys.dm_os_wait_stats DMV shows everything that's happening on a server. If you want to see what waits are occuring because of a single query or operation on a production system (e.g. the effect of MAXDOP hints on number and length of CXPACKET waits for a query) then using the DMV isn't usually feasible – you'd need to clear the wait stats and then have nothing else running on the system except the query/operation you want to investigate. The way to do it is with extended events.

Whenever a thread state changes to SUSPENDED because of a wait, the wait_info event from the sqlos event package fires to mark the beginning of the wait. When the thread gets back on the CPU again, the wait_info event fires again, noting the total wait time and the signal wait time (spent on the RUNNABLE queue). Just like with the DMV, we need to calculate the resource wait time (spent being SUSPENDED) ourselves.

Here's a simple example for you – imagine we want to investigate the waits that occur during an index rebuild. First thing is to create the event session:

– Drop the session if it exists.
IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'MonitorWaits')
    DROP EVENT SESSION MonitorWaits ON SERVER
GO

CREATE EVENT SESSION MonitorWaits ON SERVER
ADD EVENT sqlos.wait_info
    (WHERE sqlserver.session_id = 53 /* session_id of connection to monitor */)
ADD TARGET package0.asynchronous_file_target
    (SET FILENAME = N'C:\SQLskills\EE_WaitStats.xel',
    METADATAFILE = N'C:\SQLskills\EE_WaitStats.xem')
WITH (max_dispatch_latency = 1 seconds);
GO

You'll need to plug in the session ID of the SQL Server connection where you'll run the operation under investigation. I created a dummy production database with a table and some data so I can monitor the waits from rebuilding its clustered index.

You can also filter out any wait types you're not interested in by adding another predicate to the session that filters on the sqlos.wait_type. You'll need to specify numbers, and you can list the mapping between numbers and human-understandable wait types using this code:

SELECT xmv.map_key, xmv.map_value
FROM sys.dm_xe_map_values xmv
JOIN sys.dm_xe_packages xp
    ON xmv.object_package_guid = xp.guid
WHERE xmv.name = 'wait_types'
    AND xp.name = 'sqlos';
GO

Now I turn on the session and run my rebuild.

– Start the session
ALTER EVENT SESSION MonitorWaits ON SERVER STATE = START;
GO

– Go do the query

– Stop the event session
ALTER EVENT SESSION MonitorWaits ON SERVER STATE = STOP;
GO

And then I can see how many events fired:

SELECT COUNT (*)
FROM sys.fn_xe_file_target_read_file
    ('C:\SQLskills\EE_WaitStats*.xel',
    'C:\SQLskills\EE_WaitStats*.xem', null, null);
GO

13324

And then pull them into a temporary table and aggregate them (various ways of doing this, I prefer this one):

– Create intermediate temp table for raw event data
CREATE TABLE #RawEventData (
    Rowid  INT IDENTITY PRIMARY KEY,
    event_data XML);
 
GO

– Read the file data into intermediate temp table
INSERT INTO #RawEventData (event_data)
SELECT
    CAST (event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file (
    'C:\SQLskills\EE_WaitStats*.xel',
    'C:\SQLskills\EE_WaitStats*.xem', null, null);
GO

SELECT
    waits.[Wait Type],
    COUNT (*) AS [Wait Count],
    SUM (waits.[Duration]) AS [Total Wait Time (ms)],
    SUM (waits.[Duration]) – SUM (waits.[Signal Duration]) AS [Total Resource Wait Time (ms)],
    SUM (waits.[Signal Duration]) AS [Total Signal Wait Time (ms)]
FROM
    (SELECT
        event_data.value ('(
/event/@timestamp)[1]', 'DATETIME') AS [Time],
        event_data.value ('(/event/data[@name=''wait_type'']/text)[1]', 'VARCHAR(100)') AS [Wait Type],
        event_data.value ('(/event/data[@name=''opcode'']/text)[1]', 'VARCHAR(100)') AS [Op],
        event_data.value ('(/event/data[@name=''duration'']/value)[1]', 'BIGINT') AS [Duration],
        event_data.value ('(/event/data[@name=''signal_duration'']/value)[1]', 'BIGINT') AS [Signal Duration]
     FROM #RawEventData
    ) AS waits
WHERE waits.[op] = 'End'
GROUP BY waits.[Wait Type]
ORDER BY [Total Wait Time (ms)] DESC;
GO

– Cleanup
DROP TABLE #RawEventData;
GO

Wait Type             Wait Count  Total Wait Time (ms) Total Resource Wait Time (ms) Total Signal Wait Time (ms)
——————— ———– ——————– —————————– —————————
LATCH_EX              4919        15783                15693                         90
LATCH_SH              18          4399                 4398                          1
CXPACKET              16          3039                 3038                          1
PAGEIOLATCH_SH        813         468                  457                           11
PAGELATCH_SH          246         113                  109                           4
WRITELOG              99          71                   66                            5
PAGELATCH_UP          203         65                   63                            2
LOGBUFFER             123         58                   57                            1
WRITE_COMPLETION      82          55                   55                            0
PAGEIOLATCH_UP        35          27                   27                            0
IO_COMPLETION         80          25                   25                            0
SQLTRACE_LOCK         1           6                    6                             0
PAGEIOLATCH_EX        4           3                    3                             0
CMEMTHREAD            17          1                    0                             1
SOS_SCHEDULER_YIELD   6           0                    0                             0

You can easily change this to give average times and add filters to remove uninteresting waits.

If you're interested in getting more involved with extended events, check out Jonathan's excellent series from last December.

Happy waiting!