At the beginning of the year, I wrote an article titled Retrieving Deadlock Graphs with SQL Server 2008 Extended Events that detailed how to use the default system_health session to retrieve deadlock graphs from SQL Server 2008 without having to use SQL Profiler, SQL Trace or enabling a Trace flag on the SQL Server.  In writing that article I happened upon a bug that is covered in the article with a work around in the Deadlock XML that is output by the Extended Events Engine.  That bug was filed with Microsoft and is covered in the following connect feedback:

https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=404168&wa=wsignin1.0

When I first encountered the bug, I traded emails with Jerome Halmans, one of the developers for Extended Events, who confirmed the bug, and helped me validate the workaround to generate valid XML.  During this exchange it was brought up that the Deadlock Monitor in SQL Server was reworked to add support for Multi-Victim Deadlocks, and the new XML output in Extended Events can be used to identify such a monster.

This sparked my interests and I tried for a few days tirelessly to actually create a multi-victim deadlock to no avail.  I was warned that they can be difficult to actually produce, but it is possible to do.  I eventually gave up trying a few days later, due mostly to the fact that I was going to be gone for two months, but it was something in my list of things to figure out.

Fast forward to last week when I got an email from Yuxi Bai, a member of the SQL Development team who actually worked on the Deadlock Monitor.  He had read my article on SQL Server Central and noted that there was a discrepancy in the explanation of the bug in the XML when compared to the actual work around that corrects the bug.  The explanation of the defect in the article is over simplified, and was written based on my first findings when looking at the bug.  It is actually slightly more complex than the article makes out, which I figured out before the article was published and I changed the code for the workaround so that it correctly generated a deadlock graph.  A small difference in wording makes the article correct, and I have submitted a correction to the article.

Since I had his attention anyway, I decided to ask again about creating a Multi-Victim Deadlock, and this time I got the missing hint that I needed to actually trigger one.  Set one SPID to a higher Deadlock Priority than the other two, and voila, I was able to trigger a multi-victim deadlock.  All was cool, and I thought I was set to write a neat article about the new Deadlock XML, only to find that the XML from my workaround was now once again invalid.  It took a few minutes to figure out, but an invalid end tag is generated when multiple victims exist in the new output.  So I shot an email back to Yuxi to get validation, and to determine if a new connect item needed to be submitted.  Since it falls within the scope of the existing feedback, invalid XML in the Deadlock Event for Extended Events, which has not been released as fixed in a Cumulative Update or Service Pack, it was filed as a part of the existing connect item and will be fixed in the same release.

So what exactly does a multi-victim deadlock look like?  Using the attached scripts in the following order, a multi-victim deadlock can be triggered in SQL Server 2008:

  1. Run the setup script
  2. Run Transaction 1 to the — Stop Here tag.
  3. Run Transaction 3 to the — Stop Here tag.
  4. Run Transaction 2 to the — Stop Here tag.
  5. Run the remainder of Transaction 1.
  6. Run the remainder of Transaction 3
  7. Run the remainder of Transaction 2.   Deadlock should be triggered and rollback Transactions 1 and 3.

The deadlock graph will look as follows if you use the work around code from the SQL Server Central Article, excluding the outer CAST as xml of the deadlock graphs which will raise an exception:

<deadlock-list>
  <deadlock>
    <victim-list>
        <victimProcess id="processb1b390">
        <victimProcess id="process5ac6718"/>
      </victim-list>
    <process-list>
      <process id="processb1b390" taskpriority="5" logused="0" waitresource="OBJECT: 2:37575172:0 " waittime="8810" 
ownerId="1467" transactionname="user_transaction" lasttranstarted="2009-05-26T23:31:06.170" XDES="0x57f1b30" 
lockMode="IX" schedulerid="1" kpid="60524" status="suspended" spid="54" sbid="0" ecid="0" priority="-5" 
trancount="2" lastbatchstarted="2009-05-26T23:31:35.467" lastbatchcompleted="2009-05-26T23:31:06.170" 
clientapp="Microsoft SQL Server Management Studio - Query" hostname="LT-JKEHAYIAS" hostpid="2724" 
loginname="OUTBACKNT\JKehayias" isolationlevel="read committed (2)" xactid="1467" currentdb="2" 
lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
        <executionStack>
          <frame procname="" line="3" stmtstart="38" sqlhandle="0x020000006f183c102e5e14b67437dc81884f5c2bc608d7d1"> 
</frame>
        </executionStack>
        <inputbuf>  -- X LOCK ON R2 UPDATE r2 SET rowid = rowid +1 FROM r2 WITH (XLOCK, HOLDLOCK)      </inputbuf>
      </process>
      <process id="process5ac6718" taskpriority="5" logused="0" waitresource="OBJECT: 2:53575229:0 " waittime="13580" 
ownerId="1500" transactionname="user_transaction" lasttranstarted="2009-05-26T23:31:13.840" XDES="0x677c280" 
lockMode="IX" schedulerid="1" kpid="61076" status="suspended" spid="55" sbid="0" ecid="0" priority="-5" 
trancount="2" lastbatchstarted="2009-05-26T23:31:30.700" lastbatchcompleted="2009-05-26T23:31:13.840" 
clientapp="Microsoft SQL Server Management Studio - Query" hostname="LT-JKEHAYIAS" hostpid="2724" 
loginname="OUTBACKNT\JKehayias" isolationlevel="read committed (2)" xactid="1500" currentdb="2" 
lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
        <executionStack>
          <frame procname="" line="2" stmtstart="34" sqlhandle="0x02000000f52a3407c874ca4873bf6b08bffe3c14823c1f39">        </frame>
        </executionStack>
        <inputbuf> -- X LOCK ON R3 UPDATE r3 SET rowid = rowid +1 FROM r3 WITH (XLOCK, HOLDLOCK)      
</inputbuf>
      </process>
      <process id="process5ac61c0" taskpriority="-5" logused="284" waitresource="OBJECT: 2:21575115:0 " waittime="2691" 
ownerId="1531" transactionname="user_transaction" lasttranstarted="2009-05-26T23:31:24.153" XDES="0x677dbe0" 
lockMode="IX" schedulerid="1" kpid="3460" status="suspended" spid="56" sbid="0" ecid="0" priority="5" 
trancount="2" lastbatchstarted="2009-05-26T23:31:41.577" lastbatchcompleted="2009-05-26T23:31:24.153" 
clientapp="Microsoft SQL Server Management Studio - Query" hostname="LT-JKEHAYIAS" hostpid="2724" 
loginname="OUTBACKNT\JKehayias" isolationlevel="read committed (2)" xactid="1531" currentdb="2" 
lockTimeout="4294967295" clientoption1="671090784" clientoption2="390200">
        <executionStack>
          <frame procname="" line="2" stmtstart="34" sqlhandle="0x020000007107001868d80e0fb28607c5ab8111e7495ccb44">        </frame>
        </executionStack>
        <inputbuf> -- X LOCK ON R3 UPDATE r1 SET rowid = rowid +1 FROM r1 WITH (XLOCK, HOLDLOCK) 
</inputbuf>
      </process>
    </process-list>
    <resource-list>
      <objectlock lockPartition="0" objid="37575172" subresource="FULL" dbid="2" objectname="" id="lock48b7300" mode="X" 
associatedObjectId="37575172">
        <owner-list>
          <owner id="process5ac61c0" mode="X"/>
        </owner-list>
        <waiter-list>
          <waiter id="processb1b390" mode="IX" requestType="wait"/>
        </waiter-list>
      </objectlock>
      <objectlock lockPartition="0" objid="53575229" subresource="FULL" dbid="2" objectname="" id="lock48b6440" mode="X" 
associatedObjectId="53575229">
        <owner-list>
          <owner id="process5ac61c0" mode="X"/>
        </owner-list>
        <waiter-list>
          <waiter id="process5ac6718" mode="IX" requestType="wait"/>
        </waiter-list>
      </objectlock>
      <objectlock lockPartition="0" objid="21575115" subresource="FULL" dbid="2" objectname="" id="lock48b6d00" mode="S" 
associatedObjectId="21575115">
        <owner-list>
          <owner id="process5ac6718" mode="S"/>
          <owner id="processb1b390" mode="S"/>
        </owner-list>
        <waiter-list>
          <waiter id="process5ac61c0" mode="IX" requestType="wait"/>
        </waiter-list>
      </objectlock>
    </resource-list>
  </deadlock>
</deadlock-list>

Note that the first victim is missing the closing back slash for the XML node:

<victimProcess id="processb1b390">

it should actually be:

<victimProcess id="processb1b390"/>

This should be fixed in a coming update/service pack for SQL Server 2008.  Now what is really important here is that the only place you can get a multi-victim deadlock graph is Extended Events.  If you run a profiler trace or enable trace flag 1222 on the server, you will get two single deadlock graphs, and it is up to you to correlate that the two deadlocks are actually one deadlock.  The same thing occurs in SQL Server 2005.

While writing this blog post, I happened to notice another difference between the two deadlock graphs.  If you look at the <executionStack> in the Extended Events Deadlock graph, the <frame> information is only partially populated when compared to a Deadlock graph from SQL Server profiler.  I sent this back to Jerome and Yuxi as a potential additional problem and learned that this was intentionally missing from the deadlock graph and with good reason.  To retrieve this information requires that additional locks be taken inside the Database Engine which can delay publishing of the Event.  The information is also readily available by querying sys.dm_exec_text() or sys.fn_get_sql() with the provided sqlhandle in the <frame> attributes.