Tracking Compiles with Extended Events

This post comes from a question by Tara Kizer on DBA StackExchange related to Compilations/Sec in PerfMon and using Extended Events to track the compilations that were happening in SQL Server for one of their clients.  There are a number of events in Extended Events that relate to plans in SQL Server, and Tara was using the sqlserver.query_pre_execution_showplan event as a way of tracking compiles with Extended Events.  When I read the post on StackExchange, this stood out to me as not being the correct event for tracking an actual compilation; I would have selected the query_post_compilation_showplan event instead.  These two events have similar descriptions, but they differ by one specific word:

query_pre_execution_showplan

Occurs after a SQL statement is compiled. This event returns an XML representation of the estimated query plan that is generated when the query is optimized. Using this event can have a significant performance overhead so it should only be used when troubleshooting or monitoring specific problems for brief periods of time.

query_post_compilation_showplan

Occurs after a SQL statement is compiled. This event returns an XML representation of the estimated query plan that is generated when the query is compiled. Using this event can have a significant performance overhead so it should only be used when troubleshooting or monitoring specific problems for brief periods of time.

NOTE: Both of these event descriptions state “Using this event can have a significant performance overhead so it should only be used when troubleshooting or monitoring specific problems for brief periods of time.” There are only a handful of events in Extended Events that include this warning in their description and each can introduce significant performance impacts to production workloads, so using them in live workloads is not something I’d recommend. (query_post_execution_showplan impact post)
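
If you want to see which events carry this warning on your own build, a quick query against the Extended Events metadata DMVs will list them. This is just a convenience sketch; the exact set of events returned varies by version.

-- List the events whose description includes the performance overhead warning
SELECT p.name AS package_name,
       o.name AS event_name,
       o.description
FROM sys.dm_xe_objects AS o
INNER JOIN sys.dm_xe_packages AS p
    ON o.package_guid = p.guid
WHERE o.object_type = 'event'
  AND o.description LIKE '%significant performance overhead%';
GO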

The two events don't have identical descriptions, and they fire at different times when running the repro example from the StackExchange post. Using a much larger event session definition that includes additional events, it is easy to see where compilations are actually happening.  The event session definition used to prove this out is:

CREATE EVENT SESSION [Compiles] ON SERVER
ADD EVENT sqlserver.query_post_compilation_showplan(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.query_pre_execution_showplan(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sp_cache_hit(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sp_cache_insert(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sp_cache_miss(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sp_cache_remove(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sp_statement_completed(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sp_statement_starting(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sql_batch_completed(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sql_batch_starting(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sql_statement_recompile(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.sql_statement_starting(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0))),
ADD EVENT sqlserver.uncached_sql_batch_statistics(
    ACTION(sqlserver.client_app_name,sqlserver.sql_text)
    WHERE ([sqlserver].[is_system]=(0)))
WITH (MAX_MEMORY=4096 KB,
      EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
      MAX_DISPATCH_LATENCY=30 SECONDS,
      MAX_EVENT_SIZE=0 KB,
      MEMORY_PARTITION_MODE=NONE,
      TRACK_CAUSALITY=ON,
      STARTUP_STATE=OFF)
GO
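
The session as scripted has no target; you can watch it with the Live Data viewer in SSMS, or, if you'd rather persist the captured events, add a ring_buffer target. A minimal sketch of those extra steps (my additions, not part of the original session definition):

-- Optional: add a ring_buffer target to hold the captured events
ALTER EVENT SESSION [Compiles] ON SERVER
ADD TARGET package0.ring_buffer;
GO
-- Start the session before running the repro
ALTER EVENT SESSION [Compiles] ON SERVER STATE = START;
GO
-- ...run the repro, then stop the session when finished
ALTER EVENT SESSION [Compiles] ON SERVER STATE = STOP;
GO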

Using part of the repro example from the StackExchange post, with slight changes to clear the plan cache and to break the steps down into individual batches and then individual statements, we can show how the plans were reused from cache and where compiles actually occur.

CREATE TABLE t1 (transaction_id int, Column2 varchar(100));
CREATE TABLE t2 (Column1 varchar(max), Column2 varchar(100));
GO

CREATE TRIGGER t2_ins
ON t2
AFTER INSERT
AS

INSERT INTO t1
SELECT (SELECT TOP 1 transaction_id FROM sys.dm_tran_active_transactions), Column2
FROM inserted;
GO

DBCC FREEPROCCACHE
GO

--Both of these show compilation events
INSERT INTO t2 VALUES ('row1', 'value1');
INSERT INTO t2 VALUES ('row2', 'value2');
GO

--Both of these show compilation events
INSERT INTO t2 VALUES ('row1', 'value1');
INSERT INTO t2 VALUES ('row2', 'value2');
GO

--Both of these show compilation events
INSERT INTO t2 VALUES ('row1', 'value1');
GO

INSERT INTO t2 VALUES ('row2', 'value2');
GO

Here you can see the first compilation happening for the INSERT statements as prepared plans being auto-parameterized in the green box. The trigger is compiled in the red box, and its plan is inserted into the cache, as shown by the sp_cache_insert event. Then, in the orange box, the trigger execution gets a cache hit and reuses the trigger plan for the second INSERT statement in the batch, so the trigger is not compiling for every execution of the INSERT command, and the plan does get reused, as shown by the sp_cache_hit event for the trigger.  The uncached_sql_batch_statistics event also fires for the initial batch, since it doesn't have a plan in cache and has to compile.

If we run the two INSERT statements individually again after the first execution, the trigger doesn't compile again, as shown in the events below:

Here the first statement gets a cache hit for the prepared, auto-parameterized version of the statement, but a miss for the adhoc batch that was submitted. The trigger gets a cache hit and doesn't compile again, as shown in the red block of events. The green block of events repeats this behavior for the second INSERT statement run as a separate batch. However, in every case you still see the query_pre_execution_showplan event firing, which I can only attribute to the "optimized" versus "compiled" difference in the event descriptions; the trigger isn't compiling for every execution, as shown by this series of events.  The query_post_compilation_showplan event only fires when the plan is actually being compiled for the statements and/or trigger.
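
One way to confirm this (my own addition, not part of the original repro) is to mark the trigger for recompilation and then run another INSERT; on the next execution the trigger plan should compile again, so you would expect query_post_compilation_showplan and sp_cache_insert to fire for the trigger once more.

-- Force the trigger's plan to be recompiled on its next execution
EXEC sp_recompile N't2_ins';
GO
INSERT INTO t2 VALUES ('row3', 'value3');
GO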

Implicit Conversions that cause Index Scans

Implicit conversions seem to be a growing trend in the SQL Server performance tuning work that I’ve been engaged in recently, and I’ve blogged in the past about ways to identify when implicit conversions are occurring using the plan cache.

For those of you who don't know, implicit conversions occur whenever data of two different data types are compared, based on data type precedence.  The precedence establishes the hierarchy of the types, and lower precedence data types are always implicitly converted up to the higher precedence type.  These conversions increase CPU usage for the operation, and when the conversion occurs on a table column it can also result in an index scan where an index seek would have been possible without the implicit conversion.
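
As a simple illustration, consider a hypothetical dbo.Customers table with an indexed EmailAddress varchar(50) column compared against an nvarchar parameter; nvarchar has the higher precedence, so the column side is converted, which is exactly the kind of column-side conversion that can turn a seek into a scan.

-- Hypothetical example: EmailAddress is varchar(50) and indexed, the parameter is nvarchar
DECLARE @Email nvarchar(50) = N'someone@example.com';
SELECT CustomerID
FROM dbo.Customers
WHERE EmailAddress = @Email; -- CONVERT_IMPLICIT is applied to the EmailAddress column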

For a long time I've wanted to map out the most common data types and whether a column-side implicit conversion results in an index seek or an index scan, and recently I finally got around to mapping it all out.

Setting up the tests

To map out the implicit conversion effects, I created two databases using different collations, one using SQL_Latin1_General_CP1_CI_AS and the other using Latin1_General_CI_AS, and then created the following table in each of the databases.
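
Creating the two databases with explicit collations is straightforward; a minimal sketch (the database names here are my own), followed by the table definition:

CREATE DATABASE ImplicitConvertsSQL COLLATE SQL_Latin1_General_CP1_CI_AS;
GO
CREATE DATABASE ImplicitConvertsWin COLLATE Latin1_General_CI_AS;
GO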

CREATE TABLE dbo.TestImplicitConverts
(
	RowID int NOT NULL IDENTITY (1, 1),
	BigIntCol bigint NOT NULL,
	BitCol bit NOT NULL,
	CharCol char(10) NOT NULL,
	DateCol date NOT NULL,
	DateTimeCol datetime NOT NULL,
	DateTime2Col datetime2(7) NOT NULL,
	DateTimeOffsetCol datetimeoffset(7) NOT NULL,
	DecimalCol decimal(10, 2) NOT NULL,
	FloatCol float(53) NOT NULL,
	IntCol int NOT NULL,
	MoneyCol money NOT NULL,
	NCharCol nchar(10) NOT NULL,
	NumericCol numeric(10, 2) NOT NULL,
	NVarchrCol nvarchar(50) NOT NULL,
	RealCol real NOT NULL,
	SmallDateTimeCol smalldatetime NOT NULL,
	SmallIntCol smallint NOT NULL,
	SmallMoneyCol smallmoney NOT NULL,
	TimeCol time(7) NOT NULL,
	TinyIntCol tinyint NOT NULL,
	GUIDCol uniqueidentifier NOT NULL,
	VarcharCol varchar(50) NOT NULL,
	CONSTRAINT PK_TestImplicitConverts PRIMARY KEY CLUSTERED (RowID)
);
GO

I then created a nonclustered index on each of the columns in the test table so that a single column query filtering on each column could then generate an execution plan with a single index seek when using the matching data type for the filtering.

-- Create nonclustered indexes on all columns to test implicit conversion effects
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_BigIntCol ON dbo.TestImplicitConverts (BigIntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_BitCol ON dbo.TestImplicitConverts (BitCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_CharCol ON dbo.TestImplicitConverts (CharCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateCol ON dbo.TestImplicitConverts (DateCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTime2Col ON dbo.TestImplicitConverts (DateTime2Col);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTimeCol ON dbo.TestImplicitConverts (DateTimeCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTimeOffsetCol ON dbo.TestImplicitConverts (DateTimeOffsetCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DecimalCol ON dbo.TestImplicitConverts (DecimalCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_FloatCol ON dbo.TestImplicitConverts (FloatCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_GUIDCol ON dbo.TestImplicitConverts (GUIDCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NVarcharCol ON dbo.TestImplicitConverts (NVarchrCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_RealCol ON dbo.TestImplicitConverts (RealCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallDateTimeCol ON dbo.TestImplicitConverts (SmallDateTimeCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallIntCol ON dbo.TestImplicitConverts (SmallIntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_IntCol ON dbo.TestImplicitConverts (IntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_MoneyCol ON dbo.TestImplicitConverts (MoneyCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NCharCol ON dbo.TestImplicitConverts (NCharCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NumericCol ON dbo.TestImplicitConverts (NumericCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallMoneyCol ON dbo.TestImplicitConverts (SmallMoneyCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_TimeCol ON dbo.TestImplicitConverts (TimeCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_TinyIntCol ON dbo.TestImplicitConverts (TinyIntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_VarcharCol ON dbo.TestImplicitConverts (VarcharCol);
GO

Finally, each of the tables was loaded with 467,000 rows of random data, and the indexes were rebuilt to remove any fragmentation before testing started.

INSERT INTO dbo.TestImplicitConverts
(	BigIntCol, BitCol, CharCol, DateCol, DateTimeCol, DateTime2Col, DateTimeOffsetCol,
	DecimalCol, FloatCol, IntCol, MoneyCol, NCharCol, NumericCol, NVarchrCol, RealCol,
	SmallDateTimeCol, SmallIntCol, SmallMoneyCol, TimeCol, TinyIntCol, GUIDCol, VarcharCol)
SELECT a.number, a.number%1, CAST(b.name AS CHAR(10)), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()),
	a.number, a.number, a.number, a.number, CAST(b.name AS NCHAR(10)), a.number, b.name, a.number,
	DATEADD(ms, -1*a.number, GETDATE()), a.number, a.number, DATEADD(ms, -1*a.number, GETDATE()), a.number%255, NEWID(), b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.type = N'P'
  AND a.number < 1000
  AND b.name IS NOT NULL;
GO
ALTER INDEX ALL ON TestImplicitConverts REBUILD;
GO

The Test Harness

The test harness first connects to the instance of SQL Server and retrieves the column list for the TestImplicitConverts table, storing the columns in an ArrayList that is then used to iterate over the tests, generating a dynamic statement for each test with a different parameter type.  The types are selected from the .NET SqlDbType enumeration, filtering out the types that are not being tested as part of this investigation.  Each test is executed with SET STATISTICS XML ON to collect the actual execution plan, which is then processed to determine whether the implicit conversion caused a scan to occur.
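
Each generated test looks roughly like the following T-SQL (a sketch only; the actual harness builds these dynamically and substitutes a different parameter SqlDbType for every column and type combination):

SET STATISTICS XML ON;
DECLARE @p bigint = 1000;
SELECT RowID
FROM dbo.TestImplicitConverts
WHERE IntCol = @p; -- bigint parameter against the int column, so the column side converts
SET STATISTICS XML OFF;
GO

The returned showplan XML is then parsed to detect whether a column-side implicit convert resulted in a scan: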

#Load the Showplan as a XML document for parsing
$xml = New-Object System.Xml.XmlDocument;
$xml.LoadXml($String);

#Setup the XmlNamespaceManager and add the ShowPlan Namespace to it.
$nsMgr = New-Object 'System.Xml.XmlNamespaceManager' $xml.NameTable;
$nsMgr.AddNamespace("sm", "http://schemas.microsoft.com/sqlserver/2004/07/showplan");

#Search for a Index Scan or Clustered Index Scan operation that is an Implicit Conversion
$HasImplictConvert = $xml.SelectNodes("//sm:RelOp[@LogicalOp='Index Scan' or @LogicalOp='Clustered Index Scan']/sm:IndexScan/sm:Predicate//sm:Convert[@Implicit='1']/sm:ScalarOperator/sm:Identifier/sm:ColumnReference", $nsMgr).Count -gt 0;

The results of each test are then output to the host as a comma-separated list that can be pasted into Excel and used to generate a chart of the implicit conversion effects.

Note: The full test harness and scripts are attached to this post.

Results

The results of the tests are mapped out in the two charts below.  The green blocks show where an implicit conversion occurs but still results in an index seek operation.  The yellow blocks show where the implicit conversion causes an index scan to occur.  The red blocks show where an operand clash occurs and implicit conversion is not actually supported by SQL Server for the data types, and the grey blocks are the same data type, so no conversion was required.


Figure 1 – Implicit Conversions using SQL_Latin1_General_CP1_CI_AS

There is a difference between the SQL_Latin1_General_CP1_CI_AS collation and the Latin1_General_CI_AS collation for the char and varchar data types when converting to nchar or nvarchar.  In the SQL collation the conversion results in an index scan as shown above, but in the Windows collation the scan does not occur and an index seek is still used to execute the query.
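
A quick way to see this difference is to run the same query against the test table in each database; the only variable is the database collation.

-- Unicode literal against the varchar column: index scan under the SQL collation,
-- index seek under the Windows collation
SELECT RowID
FROM dbo.TestImplicitConverts
WHERE VarcharCol = N'somevalue';
GO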


Figure 2 – Implicit Conversions using Latin1_General_CI_AS

Conclusion

While implicit conversions occur in SQL Server to allow data to be evaluated across different data types, they can introduce performance problems for the specific conversions that result in an index scan during execution.  Good design practices and code reviews can easily prevent implicit conversion issues from ever occurring in your design or workload.

Download scripts here!

Finding Key Lookups inside the Plan Cache

This is actually a blog post I thought I'd written more than two years ago, but this morning, when I went looking for it after receiving a question by email, I realized that I'd never blogged about this before.  At PASS Summit 2010 I presented a session on performance tuning SQL Server by digging into the plan cache, and since then I have blogged a number of other plan cache parsing scripts.  One of the demo scripts I showed at PASS was a query to find plans and statements in the plan cache that perform a Key Lookup operation.  An example statement that performs a Key Lookup is shown below.

USE [AdventureWorks];
GO

SELECT NationalIDNumber, HireDate, MaritalStatus
FROM HumanResources.Employee
WHERE NationalIDNumber = N'14417807';
GO


Taking a look at the execution plan XML, we can see that there isn't an actual Key Lookup operator; instead, the IndexScan element associated with the Clustered Index Seek operator has a Lookup attribute set to a value of 1, as shown in the XML snippet below:

<RelOp NodeId="3" PhysicalOp="Clustered Index Seek" LogicalOp="Clustered Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="17" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="290" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
  <OutputList>
    <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="MaritalStatus" />
    <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="HireDate" />
  </OutputList>
  <IndexScan Lookup="1" Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" ForceScan="0" NoExpandHint="0" Storage="RowStore">
    <DefinedValues>
      <DefinedValue>
        <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="MaritalStatus" />
      </DefinedValue>
      <DefinedValue>
        <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="HireDate" />
      </DefinedValue>
    </DefinedValues>
    <Object Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Index="[PK_Employee_EmployeeID]" TableReferenceId="-1" IndexKind="Clustered" />
    <SeekPredicates>
      <SeekPredicateNew>
        <SeekKeys>
          <Prefix ScanType="EQ">
            <RangeColumns>
              <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="EmployeeID" />
            </RangeColumns>
            <RangeExpressions>
              <ScalarOperator ScalarString="[AdventureWorks].[HumanResources].[Employee].[EmployeeID]">
                <Identifier>
                  <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="EmployeeID" />
                </Identifier>
              </ScalarOperator>
            </RangeExpressions>
          </Prefix>
        </SeekKeys>
      </SeekPredicateNew>
    </SeekPredicates>
  </IndexScan>
</RelOp>

As a side note, ShowplanXML uses the IndexScan element for both seeks and scans.  An Index Seek is identified by the SeekPredicates child element, whereas an Index Scan has a Predicates child element instead. Another important note when dealing with ShowplanXML is that the attribute values of a cached execution plan differ from those of an actual execution plan collected from SSMS.  In the above XML from the plan cache, the Lookup attribute has a value of 1.  However, in an actual plan collected by SSMS, the value would be true, as shown in the XML below:

<IndexScan Lookup="true" Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false">

When building an XML parser for querying the plan cache, the parser must check for a value of 1, not true, when looking for the Lookup attribute in the cached plan XML.  Given this information, we can build a parser to search the plan cache and find the statements and columns associated with Key Lookup operations.
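
Before building the full parser, a quick existence check against the plan cache shows which cached plans contain a Key Lookup at all; this is just a simplified sketch of the same idea used in the full query below.

WITH XMLNAMESPACES
   (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT cp.plan_handle, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE qp.query_plan.exist('//IndexScan[@Lookup="1"]') = 1;
GO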


SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

WITH XMLNAMESPACES
   (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
    n.value('(@StatementText)[1]', 'VARCHAR(4000)') AS sql_text,
    n.query('.'),
    i.value('(@PhysicalOp)[1]', 'VARCHAR(128)') AS PhysicalOp,
    i.value('(./IndexScan/Object/@Database)[1]', 'VARCHAR(128)') AS DatabaseName,
    i.value('(./IndexScan/Object/@Schema)[1]', 'VARCHAR(128)') AS SchemaName,
    i.value('(./IndexScan/Object/@Table)[1]', 'VARCHAR(128)') AS TableName,
    i.value('(./IndexScan/Object/@Index)[1]', 'VARCHAR(128)') as IndexName,
    i.query('.'),
    STUFF((SELECT DISTINCT ', ' + cg.value('(@Column)[1]', 'VARCHAR(128)')
       FROM i.nodes('./OutputList/ColumnReference') AS t(cg)
       FOR  XML PATH('')),1,2,'') AS output_columns,
    STUFF((SELECT DISTINCT ', ' + cg.value('(@Column)[1]', 'VARCHAR(128)')
       FROM i.nodes('./IndexScan/SeekPredicates/SeekPredicateNew//ColumnReference') AS t(cg)
       FOR  XML PATH('')),1,2,'') AS seek_columns,
    i.value('(./IndexScan/Predicate/ScalarOperator/@ScalarString)[1]', 'VARCHAR(4000)') as Predicate,
	cp.usecounts,
    query_plan
FROM (  SELECT plan_handle, query_plan
        FROM (  SELECT DISTINCT plan_handle
                FROM sys.dm_exec_query_stats WITH(NOLOCK)) AS qs
        OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) tp
      ) as tab (plan_handle, query_plan)
INNER JOIN sys.dm_exec_cached_plans AS cp 
    ON tab.plan_handle = cp.plan_handle
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/*') AS q(n)
CROSS APPLY n.nodes('.//RelOp[IndexScan[@Lookup="1"] and IndexScan/Object[@Schema!="[sys]"]]') as s(i)
OPTION(RECOMPILE, MAXDOP 1);

The first CROSS APPLY above breaks out the individual statements so that we can tie the Key Lookup operation to the specific statement that is causing it.  The second CROSS APPLY breaks out the actual IndexScan element in the XML and filters out any Key Lookups that are happening in the 'sys' schema, since many of the system tables frequently incur Key Lookup operations.  Using the two XML fragments created by the CROSS APPLY operations, we can extract the statement, the table and index the lookup operation occurs against, the output columns of the lookup operation, and the seek predicate being used to perform the lookup.  With this information we can look at the indexing on our tables and determine whether changing a nonclustered index to make it covering is warranted to improve performance.
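
For example, for the AdventureWorks query shown earlier, adding the output columns of the lookup as included columns on a nonclustered index removes the Key Lookup entirely. This is only an illustration; the index name is my own, and whether the extra index width is worth it depends on the workload.

CREATE NONCLUSTERED INDEX IX_Employee_NationalIDNumber_Covering
ON HumanResources.Employee (NationalIDNumber)
INCLUDE (HireDate, MaritalStatus);
GO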

Hope this is helpful!