Continuing my series on advanced performance troubleshooting – see these two posts for the scripts I’ll be using and an introduction to the series:
- Wait statistics, or please tell me where it hurts
- Advanced performance troubleshooting: waits, latches, spinlocks
In this blog post I’d like to show you an example of SOS_SCHEDULER_YIELD waits occurring and how it can seem like a spinlock is the cause.
I originally published this blog post and then had a discussion with good friend Bob Ward from Product Support who questioned my conclusions given what he’s seen (thanks Bob!). After digging in further, I found that my original post was incorrect, so this is the corrected version.
The SOS_SCHEDULER_YIELD wait means that an executing thread voluntarily gave up the CPU to allow other threads to execute. The SQL Server code is sprinkled with “voluntary yields” in places where high CPU usage may occur.
One such place where a thread will sleep but not explicitly yield is when backing off after a spinlock collision waiting to see if it can get access to the spinlock. A spinlock is a very lightweight synchronization mechanism deep inside SQL Server that protects access to a data structure (not database data itself). See the end of the second blog post above for more of an explanation on spinlocks.
When a thread voluntarily yields, it does not go onto the waiting tasks list – as it’s not waiting for anything – but instead goes to the bottom of the Runnable Queue for that scheduler. SOS_SCHEDULER_YIELD waits by themselves are not cause for concern unless they are the majority of waits on the system, and performance is suffering.
To set up the test, I’ll create a simple database and table:
USE [master]; GO DROP DATABASE [YieldTest]; GO CREATE DATABASE [YieldTest]; GO USE [YieldTest]; GO CREATE TABLE [SampleTable] ([c1] INT IDENTITY); GO CREATE NONCLUSTERED INDEX [SampleTable_NC] ON [SampleTable] ([c1]); GO SET NOCOUNT ON; GO INSERT INTO [SampleTable] DEFAULT VALUES; GO 100
Then I’ll clear out wait stats and latch stats:
DBCC SQLPERF (N'sys.dm_os_latch_stats', CLEAR); GO DBCC SQLPERF (N'sys.dm_os_wait_stats', CLEAR); GO
And then fire up 50 clients running the following code (I just have a CMD script that fires up 50 CMD windows, each running the T-SQL code):
USE [YieldTest]; GO SET NOCOUNT ON; GO DECLARE @a INT; WHILE (1=1) BEGIN SELECT @a = COUNT (*) FROM [YieldTest]..[SampleTable] WHERE [c1] = 1; END; GO
And the CPUs in my laptop are jammed solid (snapped from my desktop and rotated 90 degrees to save space):
Looking at perfmon:
The CPU usage is not Windows (% User Time), and of all the counters I usually monitor (I’ve cut them all out here for clarity) I can see a sustained very high Lock Requests/sec for Object and Page locks (almost 950 thousand requests per second for both types! Gotta love my laptop :-)
So let’s dig in. First off, looking at wait stats (using the script in the wait stats post referenced above):
WaitType Wait_S Resource_S Signal_S WaitCount Percentage AvgWait_S AvgRes_S AvgSig_S ------------------- ------- ---------- -------- --------- ---------- --------- -------- -------- SOS_SCHEDULER_YIELD 4574.77 0.20 4574.57 206473 99.43 0.0222 0.0000 0.0222
SOS_SCHEDULER_YIELD at almost 100% of the waits on the system means that I’ve got CPU pressure – as I saw from the CPU graphs above. The fact that nothing else is showing up makes me suspect this is a spinlock issue.
Checking in the sys.dm_os_waiting_tasks output (see script in the second blog post referenced above), I see nothing waiting, and if I refresh a few times I see the occasional ASYNC_NETWORK_IO and/or PREEMPTIVE_OS_WAITFORSINGLEOBJECT wait type – which I’d expect from the CMD windows running the T-SQL code.
Checking the latch stats (again, see script in the second blog post referenced above) shows no latch waits apart from a few BUFFER latch waits.
So now let’s look at spinlocks. First off I’m going to take a snapshot of the spinlocks on the system:
-- Baseline IF EXISTS (SELECT * FROM [tempdb].[sys].[objects] WHERE [name] = N'##TempSpinlockStats1') DROP TABLE [##TempSpinlockStats1]; GO SELECT * INTO [##TempSpinlockStats1] FROM sys.dm_os_spinlock_stats WHERE [collisions] > 0 ORDER BY [name]; GO
(On 2005 you’ll need to use DBCC SQLPERF (‘spinlockstats’) and use INSERT/EXEC to get the results into a table – see the post above for example code.)
Then wait 10 seconds or so for the workload to continue… and then grab another snapshot of the spinlocks:
-- Capture updated stats IF EXISTS (SELECT * FROM [tempdb].[sys].[objects] WHERE [name] = N'##TempSpinlockStats2') DROP TABLE [##TempSpinlockStats2]; GO SELECT * INTO [##TempSpinlockStats2] FROM sys.dm_os_spinlock_stats WHERE [collisions] > 0 ORDER BY [name]; GO
Now running the code I came up with to show the difference between the two snapshots:
-- Diff them SELECT '***' AS [New], [ts2].[name] AS [Spinlock], [ts2].[collisions] AS [DiffCollisions], [ts2].[spins] AS [DiffSpins], [ts2].[spins_per_collision] AS [SpinsPerCollision], [ts2].[sleep_time] AS [DiffSleepTime], [ts2].[backoffs] AS [DiffBackoffs] FROM [##TempSpinlockStats2] [ts2] LEFT OUTER JOIN [##TempSpinlockStats1] [ts1] ON [ts2].[name] = [ts1].[name] WHERE [ts1].[name] IS NULL UNION SELECT '' AS [New], [ts2].[name] AS [Spinlock], [ts2].[collisions] - [ts1].[collisions] AS [DiffCollisions], [ts2].[spins] - [ts1].[spins] AS [DiffSpins], CASE ([ts2].[spins] - [ts1].[spins]) WHEN 0 THEN 0 ELSE ([ts2].[spins] - [ts1].[spins]) / ([ts2].[collisions] - [ts1].[collisions]) END AS [SpinsPerCollision], [ts2].[sleep_time] - [ts1].[sleep_time] AS [DiffSleepTime], [ts2].[backoffs] - [ts1].[backoffs] AS [DiffBackoffs] FROM [##TempSpinlockStats2] [ts2] LEFT OUTER JOIN [##TempSpinlockStats1] [ts1] ON [ts2].[name] = [ts1].[name] WHERE [ts1].[name] IS NOT NULL AND [ts2].[collisions] - [ts1].[collisions] > 0 ORDER BY [New] DESC, [Spinlock] ASC; GO
New Spinlock DiffCollisions DiffSpins SpinsPerCollision DiffSleepTime DiffBackoffs ---- ------------------------------- -------------- ---------- ----------------- ------------- ------------ LOCK_HASH 6191134 4005774890 647 686 1601383 OPT_IDX_STATS 1164849 126549245 108 57 7555 SOS_OBJECT_STORE 73 305 4 0 0 SOS_WAITABLE_ADDRESS_HASHBUCKET 115 44495 386 0 3 XDESMGR 1 0 0 0 0
This is telling me that there is massive contention for the LOCK_HASH spinlock, further backed up by the OPT_IDX_STATS spinlock (which controls access to the metrics tracked by sys.dm_db_index_usage_stats). The LOCK_HASH spinlock protects access to the hash buckets used by the lock manager to efficiently keep track of the lock resources for locks held inside SQL Server and to allow efficient searching for lock hash collisions (i.e. does someone hold the lock we want in an incompatible mode?). In this case the contention is so bad that instead of just spinning, the threads are actually backing off and letting other threads execute to allow progress to be made.
And that makes perfect sense because of what my workload is doing – 50 concurrent connections all trying to read the same row on the same page in the same nonclustered index.
But is that the cause of the SOS_SCHEDULER_YIELD waits? To prove it one way or the other, I created an Extended Event session that would capture call stacks when a wait occurs:
-- Note that before SQL 2012, the wait_type to use is 120, and -- on 2014 SP1 the wait_type to use is 123. You MUST verify the -- map_value to use on your build. -- On SQL 2012 the target name is 'histogram' but the old name still works. CREATE EVENT SESSION [MonitorWaits] ON SERVER ADD EVENT [sqlos].[wait_info] (ACTION ([package0].[callstack]) WHERE [wait_type] = 124) -- SOS_SCHEDULER_YIELD only ADD TARGET [package0].[asynchronous_bucketizer] ( SET filtering_event_name = N'sqlos.wait_info', source_type = 1, -- source_type = 1 is an action source = N'package0.callstack') -- bucketize on the callstack WITH (MAX_MEMORY = 50MB, max_dispatch_latency = 5 seconds) GO -- Start the session ALTER EVENT SESSION [MonitorWaits] ON SERVER STATE = START; GO -- TF to allow call stack resolution DBCC TRACEON (3656, -1); --DBCC TRACEON (2592, -1); -- SQL Server 2019+ requires this too GO -- Let the workload run for a few seconds -- Get the callstacks from the bucketizer target -- Are they showing calls into the lock manager? SELECT [event_session_address], [target_name], [execution_count], CAST ([target_data] AS XML) FROM sys.dm_xe_session_targets [xst] INNER JOIN sys.dm_xe_sessions [xs] ON ([xst].[event_session_address] = [xs].[address]) WHERE [xs].[name] = N'MonitorWaits'; GO -- Stop the event session ALTER EVENT SESSION [MonitorWaits] ON SERVER STATE = STOP; GO
I also made sure to have the correct symbol files in the \binn directory (see How to download a sqlservr.pdb symbol file). After running the workload and examining the callstacks, I found the majority of the waits were coming from voluntary yields deep in the Access Methods code. An example call stack is:
IndexPageManager::GetPageWithKey+ef [ @ 0+0x0 GetRowForKeyValue+146 [ @ 0+0x0 IndexRowScanner::EstablishInitialKeyOrderPosition+10a [ @ 0+0x0 IndexDataSetSession::GetNextRowValuesInternal+7d7 [ @ 0+0x0 RowsetNewSS::FetchNextRow+12a [ @ 0+0x0 CQScanRangeNew::GetRow+6a1 [ @ 0+0x0 CQScanCountStarNew::GetRowHelper+44 [ @ 0+0x0 CQScanStreamAggregateNew::Open+70 [ @ 0+0x0 CQueryScan::Uncache+32f [ @ 0+0x0 CXStmtQuery::SetupQueryScanAndExpression+2a2 [ @ 0+0x0 CXStmtQuery::ErsqExecuteQuery+2f8 [ @ 0+0x0 CMsqlExecContext::ExecuteStmts&lt;1,1&gt;+cca [ @ 0+0x0 CMsqlExecContext::FExecute+58b [ @ 0+0x0 CSQLSource::Execute+319
[Edit: check out how to do this in the spinlock whitepaper]
This is clearly (to me) nothing to do with the LOCK_HASH spinlock, so that’s a red herring. In this case, I’m just CPU bound. When a thread goes to sleep when backing off from a spinlock, it directly calls Windows Sleep() – so it does not show up as a SQL Server wait type at all, even though the sleep call is made from the SQL OS layer. Which is a bummer.
How to get around that and reduce CPU usage? This is really contrived workload, but this can occur for real. Even if I try using WITH (NOLOCK), the NOLOCK seek will take a table SCH_S (schema-stability) lock to make sure the table structure doesn’t change while the seek is occurring, so that only gets rid of the page locks, not the object locks and doesn’t help CPU usage. With this (arguably weird) workload, there are a few things I could do (just off the top of my head):
- Enable read_commited_snapshot for the database, which will reduce the LOCK_HASH spinlock backoffs
- Scale-out the connections to a few copies of the database, updated using replication
- Do some mid-tier or client-side caching, with some notification mechanism of data changes (e.g. a DDL trigger firing a Service Broker message)
So this was an example of where wait stats lead to having to look at spinlock stats, but that the spinlock, on even *deeper* investigation, wasn’t the issue at all. Cool stuff.
Next time we’ll look at another cause of SOS_SCHEDULER_YIELD waits.
Hope you enjoyed this – let me know your thoughts!
PS In the latest SQLskills Insider Quick Tips newsletter from last week, I did a video demo of looking at transaction log IOs and waits – you can get it here.