The DBMIRROR_DBM_MUTEX wait type is undocumented. Or rather, it is documented in books online as being “Identified for informational purposes only. Not supported. Future compatibility is not guaranteed.” Paul Randal had some additional information on this wait type in his “Wait statistics, or please tell me where it hurts” post, defining the wait type as “… contention for the send buffer that database mirroring shares between all the mirroring sessions on a server. It could indicate that you’ve got too many mirroring sessions.”
I saw one forum entry where someone said “you can ignore” the DBMIRROR_DBM_MUTEX wait type because, basically, it is undocumented. This assumes that Microsoft only documents wait types that we should care about – but this really isn’t the case.
A few months ago I had a discussion with a DBA about whether or not this wait type was associated with mirroring contention issues as Paul described or whether this really just represented a background process that naturally accumulated time (idle/spin waits). The DBA said this was an idle wait type that can be ignored. I said that it was indeed associated with activity and coupled with other statistics, may point to mirroring related contention.
As with anything undocumented, while you cannot point to official documentation, there is nothing to stop you from setting up a test scenario and evaluating what happens under specific conditions. The “show, don’t just tell” method is always something I prefer, even when I’m confident that a stated fact is correct.
Question 1: Does DBMIRROR_DBM_MUTEX accumulate during inactive periods?
I set up synchronous mirroring with automatic failover for this test (Credit database) and executing the following simple before/after WAITFOR capture of overall accumulated statistics to see which database mirroring related wait stats incremented during inactive (user inactive) periods. SQL Server version was 10.50.2500. The script runs in SQLCMD mode against CAESAR (principal) and AUGUSTUS (mirror)):
— Principal database instance
:CONNECTCAESAR
SELECT wait_type, wait_time_ms, signal_wait_time_ms
INTO tempdb.dbo.before_waits
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE ‘DBM%’;
GO
— Mirror database instance
:CONNECTAUGUSTUS
SELECT wait_type, wait_time_ms, signal_wait_time_ms
INTO tempdb.dbo.before_waits
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE ‘DBM%’;
GO
:CONNECTCAESAR
USE Credit;
WAITFOR DELAY ’00:00:30′
GO
— Principal
:CONNECTCAESAR
SELECT wait_type, wait_time_ms, signal_wait_time_ms
INTO tempdb.dbo.after_waits
FROM sys.dm_os_wait_stats;
SELECTb.wait_type,
a.wait_time_ms – b.wait_time_ms wait_time_ms,
a.signal_wait_time_ms – b.signal_wait_time_ms signal_time_ms
FROM tempdb.dbo.before_waits b
INNER JOIN tempdb.dbo.after_waits a ON
b.wait_type = a.wait_type
WHERE a.wait_time_ms > b.wait_time_ms
ORDER BY a.wait_time_ms – b.wait_time_ms DESC;
DROP TABLE tempdb.dbo.before_waits;
DROP TABLE tempdb.dbo.after_waits;
GO
:CONNECTAUGUSTUS
— Secondary replica
SELECT wait_type, wait_time_ms, signal_wait_time_ms
INTO tempdb.dbo.after_waits
FROM sys.dm_os_wait_stats;
SELECTb.wait_type,
a.wait_time_ms – b.wait_time_ms wait_time_ms,
a.signal_wait_time_ms – b.signal_wait_time_ms signal_time_ms
FROM tempdb.dbo.before_waits b
INNER JOIN tempdb.dbo.after_waits a ON
b.wait_type = a.wait_type
WHERE a.wait_time_ms > b.wait_time_ms
ORDER BY a.wait_time_ms – b.wait_time_ms DESC;
DROP TABLE tempdb.dbo.before_waits;
DROP TABLE tempdb.dbo.after_waits;
GO
Over a 30 second inactive period, the following DBM related wait stats accumulated on the principal:
wait_type | wait_time_ms | signal_time_ms |
DBMIRROR_EVENTS_QUEUE | 30043 | 5 |
And here is what accumulated on the mirror:
wait_type | wait_time_ms | signal_time_ms |
DBMIRROR_EVENTS_QUEUE | 30966 | 6 |
No DBMIRROR_DBM_MUTEX to be found.
Question 2: Will DBMIRROR_DBM_MUTEX accumulate during a large data load in synchronous mode?
In my second test, I inserted 1,600,000 rows into the dbo.charge table. Granted – we have all kinds of reasons why there could be contention in a mirroring session including network considerations, synchronous versus asynchronous settings, disk contention for the databases (principal/mirror), memory constraints, 32-bit/64-bit considerations, performance overhead of non-mirrored databases on the same instance or server, number of concurrent mirroring sessions and more.
But in a simple test environment with minimal concurrent activity, 4 scheduler and 2 GB of RAM for the principal and the same for the mirror, can I get DBMIRROR_DBM_MUTEX to appear? I ran the following test:
:CONNECTCAESAR
USE Credit;
INSERT dbo.charge
(member_no, provider_no, category_no, charge_dt, charge_amt, statement_no, charge_code)
SELECT member_no, provider_no, category_no, charge_dt, charge_amt, statement_no, charge_code
FROM dbo.charge;
GO
These are the database mirroring related wait statistics that accumulated on the principal SQL Server instance:
wait_type | wait_time_ms | signal_time_ms |
DBMIRRORING_CMD | 253578 | 2083 |
DBMIRROR_EVENTS_QUEUE | 83949 | 23775 |
DBMIRROR_SEND | 5017 | 25 |
DBMIRROR_DBM_EVENT | 12 | 0 |
And here is what accumulated on the mirror:
wait_type | wait_time_ms | signal_time_ms |
DBMIRROR_DBM_MUTEX | 482105 | 222 |
DBMIRROR_SEND | 258917 | 5338 |
DBMIRROR_EVENTS_QUEUE | 84676 | 3848 |
Changing to the mirroring session to asynchronous mode instead of synchronous, I saw the following on the principal:
wait_type | wait_time_ms | signal_time_ms |
DBMIRROR_EVENTS_QUEUE | 75359 | 22974 |
DBMIRRORING_CMD | 64219 | 1655 |
DBMIRROR_SEND | 8740 | 60 |
And I saw the following on the mirror:
wait_type | wait_time_ms | signal_time_ms |
DBMIRROR_DBM_MUTEX | 113730 | 105 |
DBMIRROR_EVENTS_QUEUE | 78699 | 3051 |
DBMIRROR_SEND | 69729 | 4057 |
So I saw the DBMIRROR_DBM_MUTEX wait type appear (on the mirror side) for both the synchronous and asynchronous modes during an INSERT of 1,600,000 rows. There are plenty of other tests I could have executed – but I wanted to show just a couple of examples to demonstrate viewing what gets accumulated over a specific period of time (and differentiating between idle waits versus activity-related waits).
Does this necessarily point to a database mirroring performance issue? As I’ve mentioned before, I use wait statistics as an initial pointer so that I’m prioritizing my efforts appropriately. We don’t have enough information to define root cause at this point just based on this data, but what we can derive through testing and observation is which wait types may be seen in conjunction with user activity in a mirrored partnership.