Continuing my series on advanced performance troubleshooting – see these two posts for the scripts I'll be using and an introduction to the series:
In this blog post I'd like to show you an example of SOS_SCHEDULER_YIELD waits occurring and how it can seem like a spinlock is the cause.
I originally published this blog post and then had a discussion with good friend Bob Ward from Product Support who questioned my conclusions given what he's seen (thanks Bob!). After digging in further, I found that my original post was incorrect, so this is the corrected version.
The SOS_SCHEDULER_YIELD wait means that an executing thread voluntarily gave up the CPU to allow other threads to execute. The SQL Server code is sprinkled with "voluntary yields" in places where high CPU usage may occur.
One such place where a thread will sleep but not explicitly yield is when backing off after a spinlock collision waiting to see if it can get access to the spinlock. A spinlock is a very lightweight synchronization mechanism deep inside SQL Server that protects access to a data structure (not database data itself). See the end of the second blog post above for more of an explanation on spinlocks.
When a thread voluntarily yields, it does not go onto the waiting tasks list – as it's not waiting for anything - but instead goes to the bottom of the RUNNABLE queue. SOS_SCHEDULER_YIELD waits by themselves are not cause for concern unless they are the majority of waits on the system, and performance is suffering.
To set up the test, I'll create a simple database and table:
USE master;
GO
DROP DATABASE YieldTest;
GOCREATE DATABASE YieldTest;
GOUSE YieldTest;
GOCREATE TABLE SampleTable (c1 INT IDENTITY);
GO
CREATE NONCLUSTERED INDEX SampleTable_NC ON SampleTable (c1);
GOSET NOCOUNT ON;
GO
INSERT INTO SampleTable DEFAULT VALUES;
GO 100
Then I'll clear out wait stats and latch stats:
DBCC SQLPERF ('sys.dm_os_latch_stats', CLEAR);
GODBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
GO
And then fire up 50 clients running the following code (I just have a CMD script that fires up 50 CMD windows, each running the T-SQL code):
USE YieldTest;
GOSET NOCOUNT ON;
GODECLARE @a int
WHILE (1=1)
BEGIN
SELECT @a = COUNT (*) FROM YieldTest..SampleTable WHERE c1 = 1;
END;
GO
And the CPUs in my laptop are jammed solid (snapped from my desktop and rotated 90 degrees to save space):
Wow!
Looking at perfmon:
The CPU usage is not Windows (% User Time), and of all the counters I usually monitor (I've cut them all out here for clarity) I can see a sustained very high Lock Requests/sec for Object and Page locks (almost 950 thousand requests per second for both types! Gotta love my laptop :-)
So let's dig in. First off, looking at wait stats (using the script in the wait stats post referenced above):
WaitType Wait_S Resource_S Signal_S WaitCount Percentage AvgWait_S AvgRes_S AvgSig_S
——————- ——- ———- ——– ——— ———- ——— ——– ——–
SOS_SCHEDULER_YIELD 4574.77 0.20 4574.57 206473 99.43 0.0222 0.0000 0.0222
SOS_SCHEDULER_YIELD at almost 100% of the waits on the system means that I've got CPU pressure – as I saw from the CPU graphs above. The fact that nothing else is showing up makes me suspect this is a spinlock issue.
Checking in the sys.dm_os_waiting_tasks output (see script in the second blog post referenced above), I see nothing waiting, and if I refresh a few times I see the occasional ASYNC_NETWORK_IO and/or PREEMPTIVE_OS_WAITFORSINGLEOBJECT wait type – which I'd expect from the CMD windows running the T-SQL code.
Checking the latch stats (again, see script in the second blog post referenced above) shows no latch waits apart from a few BUFFER latch waits.
So now let's look at spinlocks. First off I'm going to take a snapshot of the spinlocks on the system:
– Baseline
DROP TABLE ##TempSpinlockStats1;
GO
SELECT * INTO ##TempSpinlockStats1 FROM sys.dm_os_spinlock_stats
WHERE [collisions] > 0
ORDER BY [name];
GO
(On 2005 you'll need to use DBCC SQLPERF ('spinlockstats') and use INSERT/EXEC to get the results into a table – see the post above for example code.)
Then wait 10 seconds or so for the workload to continue… and then grab another snapshot of the spinlocks:
– Capture updated stats
DROP TABLE ##TempSpinlockStats2;
GO
SELECT * INTO ##TempSpinlockStats2 FROM sys.dm_os_spinlock_stats
WHERE [collisions] > 0
ORDER BY [name];
GO
Now running the code I came up with to show the difference between the two snapshots:
– Diff them
SELECT
'***' AS [New],
ts2.[name] AS [Spinlock],
ts2.[collisions] AS [DiffCollisions],
ts2.[spins] AS [DiffSpins],
ts2.[spins_per_collision] AS [SpinsPerCollision],
ts2.[sleep_time] AS [DiffSleepTime],
ts2.[backoffs] AS [DiffBackoffs]
FROM ##TempSpinlockStats2 ts2
LEFT OUTER JOIN ##TempSpinlockStats1 ts1
ON ts2.[name] = ts1.[name]
WHERE ts1.[name] IS NULL
UNION
SELECT
'' AS [New],
ts2.[name] AS [Spinlock],
ts2.[collisions] – ts1.[collisions] AS [DiffCollisions],
ts2.[spins] – ts1.[spins] AS [DiffSpins],
CASE (ts2.[spins] – ts1.[spins]) WHEN 0 THEN 0
ELSE (ts2.[spins] – ts1.[spins]) / — > 0 spins = > 0 collisions
(ts2.[collisions] – ts1.[collisions]) END AS [SpinsPerCollision],
ts2.[sleep_time] – ts1.[sleep_time] AS [DiffSleepTime],
ts2.[backoffs] – ts1.[backoffs] AS [DiffBackoffs]
–, ts2.*
FROM ##TempSpinlockStats2 ts2
LEFT OUTER JOIN ##TempSpinlockStats1 ts1
ON ts2.[name] = ts1.[name]
WHERE ts1.[name] IS NOT NULL
AND ts2.[collisions] – ts1.[collisions] > 0
ORDER BY [New] DESC, [Spinlock] ASC;
GONew Spinlock DiffCollisions DiffSpins SpinsPerCollision DiffSleepTime DiffBackoffs
—- ——————————- ————– ———- —————– ————- ————
LOCK_HASH 6191134 4005774890 647 686 1601383
OPT_IDX_STATS 1164849 126549245 108 57 7555
SOS_OBJECT_STORE 73 305 4 0 0
SOS_WAITABLE_ADDRESS_HASHBUCKET 115 44495 386 0 3
XDESMGR 1 0 0 0 0
This is telling me that there is massive contention for the LOCK_HASH spinlock, further backed up by the SOS_WAITABLE_ADDRESS_HASHBUCKET spinlock. The LOCK_HASH spinlock protects access to the hash buckets used by the lock manager to efficiently keep track of the lock resources for locks held inside SQL Server and to allow efficient searching for lock hash collisions (i.e. does someone hold the lock we want in an incompatible mode?). In this case the contention is so bad that instead of just spinning, the threads are actually backing off and letting other threads execute to allow progress to be made.
And that makes perfect sense because of what my workload is doing - 50 concurrent connections all trying to read the same row on the same page in the same nonclustered index.
But is that the cause of the SOS_SCHEDULER_YIELDs? To prove it one way or the other, I created an XEvent session that would capture callstacks when a wait occcured:
CREATE EVENT SESSION MonitorWaits ON SERVER
ADD EVENT sqlos.wait_info
(ACTION (package0.callstack))
ADD TARGET package0.asynchronous_bucketizer (
SET filtering_event_name = 'sqlos.wait_info',
source_type = 1, source = 'package0.callstack')
WITH (MAX_MEMORY=50MB, MEMORY_PARTITION_MODE = PER_NODE,
max_dispatch_latency=5 seconds)
GO
I also made sure to have the correct symbol files in the \binn directory (see How to download a sqlservr.pdb symbol file). After running the workload and examining the callstacks, I found the majority of the waits were coming from voluntary yields deep in the Access Methods code. An example call stack is:
IndexPageManager::GetPageWithKey+ef [ @ 0+0x0
GetRowForKeyValue+146 [ @ 0+0x0
IndexRowScanner::EstablishInitialKeyOrderPosition+10a [ @ 0+0x0
IndexDataSetSession::GetNextRowValuesInternal+7d7 [ @ 0+0x0
RowsetNewSS::FetchNextRow+12a [ @ 0+0x0
CQScanRangeNew::GetRow+6a1 [ @ 0+0x0
CQScanCountStarNew::GetRowHelper+44 [ @ 0+0x0
CQScanStreamAggregateNew::Open+70 [ @ 0+0x0
CQueryScan::Uncache+32f [ @ 0+0x0
CXStmtQuery::SetupQueryScanAndExpression+2a2 [ @ 0+0x0
CXStmtQuery::ErsqExecuteQuery+2f8 [ @ 0+0x0
CMsqlExecContext::ExecuteStmts<1,1>+cca [ @ 0+0x0
CMsqlExecContext::FExecute+58b [ @ 0+0x0
CSQLSource::Execute+319
[Edit: check out how to do this in the spinlock whitepaper]
This is clearly (to me) nothing to do with the LOCK_HASH spinlock, so that's a red herring. In this case, I'm just CPU bound. When a thread goes to sleep when backing off from a spinlock, it directly calls Windows Sleep() – so it does not show up as a SQL Server wait type at all, even though the sleep call is made from the SQLOS layer. Which is a bummer.
How to get around that and reduce CPU usage? This is really contrived workload, but this can occur for real. Even if I try using WITH (NOLOCK), the NOLOCK seek will take a table SCH_S (schema-stability) lock to make sure the table structure doesn't change while the seek is occurring, so that only gets rid of the page locks, not the object locks and doesn't help CPU usage. With this (arguably weird) workload, there are a few things I could do (just off the top of my head):
-
Scale-out the connections to a few copies of the database, updated using replication
- Do some mid-tier or client-side caching, with some notification mechanism of data changes (e.g. a DDL trigger firing a Service Broker message)
So this was an example of where wait stats lead to having to look at spinlock stats, but that the spinlock, on even *deeper* investigation, wasn't the issue at all. Cool stuff.
Next time we'll look at another cause of SOS_SCHEDULER_YIELD waits.
Hope you enjoyed this – let me know your thoughts!
Cheers
PS In the latest SQLskills Insider Quick Tips newsletter from last week, I did a video demo of looking at transaction log IOs and waits – you can get it here.
12 Responses to SOS_SCHEDULER_YIELD waits and the LOCK_HASH spinlock
Very good post Paul. Thanks much.
Regards
Meher
Hi Paul,
thanks for this. There’s so much in depth stuff in this post, it is very interesting to get deep into the internals. I’d like to see more like this.
Thanks again
Paul.
Hi Paul,
Thanks again for the awesome info…. you really do give a good insight into SQL internals.
Look forward to many more posts.
Steve
Paul, any way to do this in 2005? This is very relevant to some issues we’ve had over the past week.
Thanks,
Eric Humphrey
Hey Eric – everything here should work on 2005 except XEvents.
And you’ll need to use DBCC SQLPERF(spinlockstats) instead and change the snapshot code to use INSERT/EXEC into two tables with the right column names/types
Did my exercise (Thanks for the help):
– Baseline
IF OBJECT_ID(‘tempdb..#TempSpinlockStats1′) IS NOT NULL
DROP TABLE #TempSpinlockStats1;
GO
CREATE TABLE #TempSpinlockStats1(
[name] NVARCHAR(30) NOT NULL,
[collisions] BIGINT NOT NULL,
[spins] BIGINT NOT NULL,
[spins_per_collision] FLOAT NOT NULL,
[sleep_time] BIGINT NOT NULL,
[backoffs] BIGINT NOT NULL
)
INSERT INTO #TempSpinlockStats1
EXEC (‘DBCC SQLPERF(”spinlockstats”)’)
GO
Excellent – I updated both posts with credit to you – thanks Eric!
Hi Paul,I have never heard about spinlocks till you blogged about it, thanks for sharing the knowledge.
Thanks
Thanks Paul Randals :)
Hi Paul,
Very interesting. There is one thing I don’t understand and that may want to explain. 50 ongoing processes on 8 schedulers, isn’t that exactly why processes should yield? Because otherwise only 8 of the 50 could do any work? So the I don’t understand your: “The fact that nothing else is showing up makes me suspect this is a spinlock issue.” In my understanding there would be scheduler_yield waits also without spinlock contention.
Thanks for your reply!
Paul
Yes indeed – which is the point of the article – demonstrating that spinlock contention is very rarely the root cause of SOS_SCHEDULER_YIELDs.