This is a performance tuning post that's been on my to-do list for quite a while. Wait stats analysis is a great way of looking at the symptoms of performance problems (see my Wait Stats category for more posts on this) but using the sys.dm_os_wait_stats DMV shows everything that's happening on a server. If you want to see what waits are occuring because of a single query or operation on a production system (e.g. the effect of MAXDOP hints on number and length of CXPACKET waits for a query) then using the DMV isn't usually feasible – you'd need to clear the wait stats and then have nothing else running on the system except the query/operation you want to investigate. The way to do it is with extended events.
Whenever a thread state changes to SUSPENDED because of a wait, the wait_info event from the sqlos event package fires to mark the beginning of the wait. When the thread gets back on the CPU again, the wait_info event fires again, noting the total wait time and the signal wait time (spent on the RUNNABLE queue). Just like with the DMV, we need to calculate the resource wait time (spent being SUSPENDED) ourselves.
Here's a simple example for you – imagine we want to investigate the waits that occur during an index rebuild. First thing is to create the event session:
– Drop the session if it exists.
IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'MonitorWaits')
DROP EVENT SESSION MonitorWaits ON SERVER
CREATE EVENT SESSION MonitorWaits ON SERVER
ADD EVENT sqlos.wait_info
(WHERE sqlserver.session_id = 53 /* session_id of connection to monitor */)
ADD TARGET package0.asynchronous_file_target
(SET FILENAME = N'C:\SQLskills\EE_WaitStats.xel',
METADATAFILE = N'C:\SQLskills\EE_WaitStats.xem')
WITH (max_dispatch_latency = 1 seconds);
You'll need to plug in the session ID of the SQL Server connection where you'll run the operation under investigation. I created a dummy production database with a table and some data so I can monitor the waits from rebuilding its clustered index.
You can also filter out any wait types you're not interested in by adding another predicate to the session that filters on the sqlos.wait_type. You'll need to specify numbers, and you can list the mapping between numbers and human-understandable wait types using this code:
SELECT xmv.map_key, xmv.map_value
FROM sys.dm_xe_map_values xmv
JOIN sys.dm_xe_packages xp
ON xmv.object_package_guid = xp.guid
WHERE xmv.name = 'wait_types'
AND xp.name = 'sqlos';
Now I turn on the session and run my rebuild.
– Start the session
ALTER EVENT SESSION MonitorWaits ON SERVER STATE = START;
– Go do the query
– Stop the event session
ALTER EVENT SESSION MonitorWaits ON SERVER STATE = STOP;
And then I can see how many events fired:
SELECT COUNT (*)
'C:\SQLskills\EE_WaitStats*.xem', null, null);
And then pull them into a temporary table and aggregate them (various ways of doing this, I prefer this one):
– Create intermediate temp table for raw event data
CREATE TABLE #RawEventData (
Rowid INT IDENTITY PRIMARY KEY,
– Read the file data into intermediate temp table
INSERT INTO #RawEventData (event_data)
CAST (event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file (
'C:\SQLskills\EE_WaitStats*.xem', null, null);
COUNT (*) AS [Wait Count],
SUM (waits.[Duration]) AS [Total Wait Time (ms)],
SUM (waits.[Duration]) – SUM (waits.[Signal Duration]) AS [Total Resource Wait Time (ms)],
SUM (waits.[Signal Duration]) AS [Total Signal Wait Time (ms)]
event_data.value ('(/event/@timestamp)', 'DATETIME') AS [Time],
event_data.value ('(/event/data[@name=''wait_type'']/text)', 'VARCHAR(100)') AS [Wait Type],
event_data.value ('(/event/data[@name=''opcode'']/text)', 'VARCHAR(100)') AS [Op],
event_data.value ('(/event/data[@name=''duration'']/value)', 'BIGINT') AS [Duration],
event_data.value ('(/event/data[@name=''signal_duration'']/value)', 'BIGINT') AS [Signal Duration]
) AS waits
WHERE waits.[op] = 'End'
GROUP BY waits.[Wait Type]
ORDER BY [Total Wait Time (ms)] DESC;
DROP TABLE #RawEventData;
Wait Type Wait Count Total Wait Time (ms) Total Resource Wait Time (ms) Total Signal Wait Time (ms)
——————— ———– ——————– —————————– —————————
LATCH_EX 4919 15783 15693 90
LATCH_SH 18 4399 4398 1
CXPACKET 16 3039 3038 1
PAGEIOLATCH_SH 813 468 457 11
PAGELATCH_SH 246 113 109 4
WRITELOG 99 71 66 5
PAGELATCH_UP 203 65 63 2
LOGBUFFER 123 58 57 1
WRITE_COMPLETION 82 55 55 0
PAGEIOLATCH_UP 35 27 27 0
IO_COMPLETION 80 25 25 0
SQLTRACE_LOCK 1 6 6 0
PAGEIOLATCH_EX 4 3 3 0
CMEMTHREAD 17 1 0 1
SOS_SCHEDULER_YIELD 6 0 0 0
You can easily change this to give average times and add filters to remove uninteresting waits.
If you're interested in getting more involved with extended events, check out Jonathan's excellent series from last December.