[Edit 2016: Check out my new resource – a comprehensive library of all wait types and latch classes – see here.]
SQL Server 2014 (and Azure SQL Database V12) added some cool new functionality for online index operations to allow you to prevent long-term blocking because of the two blocking locks that online index operations require.
At the start of any online index operation, it acquires a S (share) table lock. This lock will be blocked until all transactions that are changing the table have committed, and while the lock is pending, it will block any transactions wanting to change the table in any way. The S lock is only held for a short amount of time, then dropped to an IS (Intent-Share) lock for the long duration of the operation. At the end of any online index operation, it acquires a SCH-M (schema modification) table lock, which you can think of as a super-exclusive lock. This lock will be blocked by any transaction accessing or changing the table, and while the lock is pending, it will block any transactions wanting to read or change the table in any way.
The new syntax allow you to specify how long the online index operation will wait for each of these locks, and what to do when the timeout expires (nothing: NONE, kill the online index operation: SELF, or kill the blockers of the online index operation: BLOCKERS – see Books Online for more info). While the online index operation is blocked, it shows a different lock wait type than we’re used to seeing, and any lock requests are allowed to essentially jump over the online index operation in the lock pending queues – i.e. the online index operation waits with lower priority than everything else on the system.
To demonstrate this, I’ve got a table called NonSparseDocRepository, with a clustered index called NonSparse_CL, and 100,000 rows in the table.
First, I’ll kick off an online index rebuild of the clustered index, specifying a 1 minute wait, and to kill itself of the wait times out:
ALTER INDEX [NonSparse_CL] ON [nonsparsedocrepository] REBUILD WITH (FILLFACTOR = 70, ONLINE = ON ( WAIT_AT_LOW_PRIORITY ( MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = SELF) ) ); GO
I let it run for ten seconds or so, so make sure it got past the initial table S lock required. Now, in another connection, I’ll start a transaction that takes an IX table lock, which will block the final SCH-M lock the online index operation requires:
BEGIN TRAN; GO UPDATE [NonSparseDocRepository] SET [c4] = '1' WHERE [DocID] = 1; GO
And then I’ll wait until the drive light on my laptop goes off, which lets me know that the online index rebuild is stalled. If I look in sys.dm_os_waiting_tasks (using the script in this post), I’ll see the rebuild is blocked (script output heavily edited for clarity and brevity):
session_id exec_context_id scheduler_id wait_duration_ms wait_type blocking_session_id resource_description 57 0 4 7786 LCK_M_SCH_M_LOW_PRIORITY 58 objectlock
Look at the wait type: LCK_M_SCH_M_LOW_PRIORITY. The _LOW_PRIORITY suffix indicates that this is a special lock wait attributable to the online index operation being blocked.
This also neatly proves that the wait-at-low-priority feature applies to both the blocking locks that online index operations require, even if the first one isn’t blocked.
And eventually the online index operation fails, as follows:
Msg 1222, Level 16, State 56, Line 1 Lock request time out period exceeded.
If I leave that open transaction in the other connection (holding its IX table lock), and try the index rebuild again, with the exact same syntax, it’s immediately blocked and the sys.dm_os_waiting_tasks script shows:
session_id exec_context_id scheduler_id wait_duration_ms wait_type blocking_session_id resource_description 57 0 4 8026 LCK_M_S_LOW_PRIORITY 58 objectlock
This shows that the initial blocking lock is blocked, and is waiting at low priority.
So if either of these wait types show up during your regular wait statistics analysis, now you know what’s causing them.
10 thoughts on “Low priority locking wait types”
I am a little confused, like that’s news:) Anyway, I have not had the opportunity to work with Enterprise and the Online option has always confused me a bit.
I think you are saying
1). that the initial Shared Table Lock is maintained throughout the Online index operation which means the Online index operation will be blocked until any pending DDL or Insert/Update/Delete operations on the table have completed.
2.) Once the Online index operation commences, Reads on the table will proceed but any other operations will block
3.) When the Index Operation completes it acquires the Sch-M which blocks any other operations.
The BOL article seems to say the initial Shared Lock is temporary and that only an Intent Shared is held during the operation.
So I assume the index operation is building a ‘shadow’ of the new index and then doing a meta-data swap at completion. Is that more or less correct?
Also, is it possible to specify different different timeout options for the initial Share lock and the final Sch-M lock? It seems like the option for the first might be SELF but I think the final option would always be NONE or BLOCKERS. Otherwise, we’d have wasted a lot of processing and I/O.
No problem :-)
1) No – that’s not how it works. The S lock is only held for a short time then dropped to an IS lock. I updated the post to make that clear.
2) No – because of the IS, updates are allowed too
Yes, the operation builds a non-visible, in-build index that is then swapped at the metadata level for the old index.
No, it’s not possible to specify different timeouts/options for the two blocking locks. The thinking behind that is that it’s more likely that it’ll be blocked by the Sch-M than the S.
Does it mean threads rebuilding index got OS level low priority than others so its easy to kill by OS
Please help me understand
No, nothing to do with OS.
On a related subject, I have a situation where the auto update stats background task is blocking readers.
1 – Index reorganize starts from a SQL Agent job.
2 – Two minutes later, a background task is blocked by the index reorganize, waiting to acquire a schema modification lock to update the statistics for the same index.
3 – Dozens of sessions running read-only queries wait to acquire a schema stability lock, pilling up behind this background task. The pile-up lasts until the index reorganize finishes after 8 minutes. It had been holding exclusive locks for the duration of the ALTER INDEX statement execution. So much for “index reorganize holds only short lived locks”.
We know that an index reorganize will not block readers directly but it makes sense that it will block an attempted schema modification. That in turn will block readers.
Shouldn’t the auto update stats background task wait to acquire a schema modification lock at low priority by default? This would avoid blocking readers!
Absolutely. It’s a SQL Server ‘bug’ that it doesn’t wait at low priority for a loooong time for the schema-mod lock.
Hi Paul, could you please confirm that if the setting were MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = SELF and the index rebuild runs for say 5 hours, then at the end of any online index operation, if it cant acquire the SCH-M (schema modification) table lock, the killing of the online index operation would not cause a rollback? I.e, the non-visible, in-build index can just disappear quietly into the night.
That is my understanding, yes.
I just happened upon this while investigating a prod issue I’m having related to this. My automated reindex was running and got to table A and had been running for 20 some minutes. A lot of blocking was noted on the server around table A but the SPID of the reindex job was NEVER seen as a blocker (only as being blocked). When I stopped the job, the blocking IMMEDIATELY went away. This has happened intermittently when the reindex was working on several of our “hot” tables. I’m running SQL 2016 Enterprise in an AG. RCSI is turned on for this particular DB. The rebuild command being executed had the following – REBUILD WITH (ONLINE = ON (WAIT_AT_LOW_PRIORITY(MAX_DURATION = 1 MINUTES , ABORT_AFTER_WAIT = SELF )) , SORT_IN_TEMPDB = ON, MAXDOP = 2). This has “worked as advertised” on other occasions but in certain instances causes the blocking. Have you ever encountered this behavior?
I haven’t encountered this. If/when it happens, can you let me know what lock the rebuild is waiting for?