SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.
Wait statistics analysis is one of my favorite things to talk about because it’s so incredibly useful for performance tuning and can dramatically shorten the time it takes to zero in on the root cause of a performance problem. But you have to do it correctly. You can’t just do a SELECT * FROM sys.dm_os_wait_stats. Various people have published scripts online to aggregate and display wait statistics in an actionable way, and my script is one of the most popular (latest version is always in this post).
One question I’m often asked is why does my script have a list of wait types that it specifically filters out? The answer is that those wait types are what I call ‘benign’ – they’re usually not a problem but happen frequently enough from regular SQL Server operations that they would show up as the top waits and so would obscure the waits that you can do something about.
For instance, if I take my waits script and remove all the filtering of benign waits, the results on my laptop where I’m forcing a tempdb contention problem are as follows:
WaitType Wait_S Resource_S Signal_S WaitCount Percentage AvgWait_S AvgRes_S AvgSig_S Help/Info URL ----------------------------------- --------- ---------- -------- --------- ---------- --------- -------- -------- ----------------------------------------------------------------------- SLEEP_TASK 123335.21 123326.43 8.77 5232828 10.68 0.0236 0.0236 0.0000 https://www.sqlskills.com/help/waits/SLEEP_TASK DIRTY_PAGE_POLL 82215.60 82214.61 0.98 808502 7.12 0.1017 0.1017 0.0000 https://www.sqlskills.com/help/waits/DIRTY_PAGE_POLL HADR_FILESTREAM_IOMGR_IOCOMPLETION 82215.08 82214.43 0.65 163809 7.12 0.5019 0.5019 0.0000 https://www.sqlskills.com/help/waits/HADR_FILESTREAM_IOMGR_IOCOMPLETION LOGMGR_QUEUE 82213.89 82210.58 3.31 669980 7.12 0.1227 0.1227 0.0000 https://www.sqlskills.com/help/waits/LOGMGR_QUEUE SQLTRACE_INCREMENTAL_FLUSH_SLEEP 82212.97 82212.94 0.03 20546 7.12 4.0014 4.0014 0.0000 https://www.sqlskills.com/help/waits/SQLTRACE_INCREMENTAL_FLUSH_SLEEP REQUEST_FOR_DEADLOCK_SEARCH 82212.74 0.00 82212.74 16442 7.12 5.0002 0.0000 5.0002 https://www.sqlskills.com/help/waits/REQUEST_FOR_DEADLOCK_SEARCH LAZYWRITER_SLEEP 82210.41 82209.82 0.59 86524 7.12 0.9501 0.9501 0.0000 https://www.sqlskills.com/help/waits/LAZYWRITER_SLEEP CHECKPOINT_QUEUE 82204.96 82204.92 0.04 125 7.12 657.6396 657.6394 0.0003 https://www.sqlskills.com/help/waits/CHECKPOINT_QUEUE XE_TIMER_EVENT 82204.08 0.00 82204.08 37409 7.12 2.1974 0.0000 2.1974 https://www.sqlskills.com/help/waits/XE_TIMER_EVENT QDS_CLEANUP_STALE_QUERIES_TASK_ MAIN_LOOP_SLEEP 82201.37 82201.36 0.01 1371 7.12 59.9572 59.9572 0.0000 https://www.sqlskills.com/help/waits/QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP QDS_PERSIST_TASK_MAIN_LOOP_SLEEP 82201.29 82201.28 0.01 1371 7.12 59.9572 59.9572 0.0000 https://www.sqlskills.com/help/waits/QDS_PERSIST_TASK_MAIN_LOOP_SLEEP SP_SERVER_DIAGNOSTICS_SLEEP 82200.36 0.00 82200.36 299612 7.12 0.2744 0.0000 0.2744 https://www.sqlskills.com/help/waits/SP_SERVER_DIAGNOSTICS_SLEEP XE_DISPATCHER_WAIT 82198.10 82198.10 0.00 686 7.12 119.8223 119.8223 0.0000 https://www.sqlskills.com/help/waits/XE_DISPATCHER_WAIT
All of these are benign waits happening on system threads. For instance, you can see that the lazy writer thread on my instance is waking up every 1 second or so to check for memory pressure in the buffer pool, finding none, and then sleeping again (that’s the LAZYWRITER_SLEEP wait type, showing an average of 0.95s average resource wait time in the AvgRes_S column). You can also see that nearly all of these waits have a total wait time of around 82, 200 seconds, which is how long my laptop has been running since its last reboot.
The point is that these waits always occur and if you don’t filter them out, they will show up as the most prevalent wait types on your instance, and they’re not related to performance issues.
When I put the filters back in, and re-run the script, I get the following output:
WaitType Wait_S Resource_S Signal_S WaitCount Percentage AvgWait_S AvgRes_S AvgSig_S Help/Info URL ----------------------------------- --------- ---------- -------- --------- ---------- --------- -------- -------- ----------------------------------------------------------------------- PAGELATCH_UP 3451.97 3312.34 139.63 502282 56.73 0.0069 0.0066 0.0003 https://www.sqlskills.com/help/waits/PAGELATCH_UP PAGELATCH_SH 2324.96 1449.37 875.60 2030686 38.21 0.0011 0.0007 0.0004 https://www.sqlskills.com/help/waits/PAGELATCH_SH LATCH_EX 217.89 214.96 2.94 7628 3.58 0.0286 0.0282 0.0004 https://www.sqlskills.com/help/waits/LATCH_EX
Now I can see an indication of a problem and know to go look at my sys.dm_os_waiting_tasks script (latest version always here) for further investigation (and note the automatically-generated URLs which will take you to the relevant page of my waits library for explanation of the wait types and troubleshooting advice).
Bottom line: always make sure you’re filtering out benign wait types so you’re not trying to troubleshoot a problem that you can’t do anything about.
2 thoughts on “SQLskills SQL101: Why do some wait types need to be ignored?”
As someone who is new to this : I still don’t know how to interpret the result from the scripts. So I don’t understand what I see.
The help info is very generic.
If you click on the link it takes you into the help page for that wait type, where some waits have more info than others. If you watch my Pluralsight course, you’ll have a much better idea of how to view waits overall.