New Article – How expensive are column-side Implicit Conversions?

Last week I blogged about Implicit Conversions that cause Index Scans, and showed two charts for where implicit conversions could result in an index scan operation instead of an index seek during query execution.  As a part of writing that blog post I thought it would be interesting to also look at the performance overhead of the column-side implicit conversion occurring, given that I’ve seen it’s effects a number of times in the last few weeks.  Today my article How expensive are column-side Implicit Conversions? went live on the site to show using Performance Monitor information on CPU usage and batch requests per second, the effect of the performance overhead when column-side implicit conversions occur that cause index scans during query execution.

Implicit Conversions that cause Index Scans

Implicit conversions seem to be a growing trend in the SQL Server performance tuning work that I’ve been engaged in recently, and I’ve blogged in the past about ways to identify when implicit conversions are occurring using the plan cache.

For those of you who don’t know, implicit conversions occur whenever data with two different data types are being compared, based on the data type precedence.  The precedence establishes the hierarchy of of the types, and lower precedence data types will always be implicitly converted up to the higher precedence type.  These conversions increase CPU usage for the operation, and when the conversion occurs on a table column can also result in an index scan where an index seek would have been possible without the implicit conversion.

For a long time I’ve wanted to map out the most common data types and the effect of a column-side implicit conversion for creating an index seek versus an index scan and recently I finally got around to mapping it all out.

Setting up the tests

To map out the implicit conversion affects I created two databases using different collations, one using SQL_Latin_General_CP1_CI_AS and the other using Latin_General_CI_AS, and then created the following table in each of the databases.

CREATE TABLE dbo.TestImplicitConverts
	RowID int NOT NULL IDENTITY (1, 1),
	BigIntCol bigint NOT NULL,
	BitCol bit NOT NULL,
	CharCol char(10) NOT NULL,
	DateCol date NOT NULL,
	DateTimeCol datetime NOT NULL,
	DateTime2Col datetime2(7) NOT NULL,
	DateTimeOffsetCol datetimeoffset(7) NOT NULL,
	DecimalCol decimal(10, 2) NOT NULL,
	FloatCol float(53) NOT NULL,
	IntCol int NOT NULL,
	MoneyCol money NOT NULL,
	NCharCol nchar(10) NOT NULL,
	NumericCol numeric(10, 2) NOT NULL,
	NVarchrCol nvarchar(50) NOT NULL,
	RealCol real NOT NULL,
	SmallDateTimeCol smalldatetime NOT NULL,
	SmallIntCol smallint NOT NULL,
	SmallMoneyCol smallmoney NOT NULL,
	TimeCol time(7) NOT NULL,
	TinyIntCol tinyint NOT NULL,
	GUIDCol uniqueidentifier NOT NULL,
	VarcharCol varchar(50) NOT NULL,

I then created a nonclustered index on each of the columns in the test table so that a single column query filtering on each column could then generate an execution plan with a single index seek when using the matching data type for the filtering.

-- Create nonclustered indexes on all columns to test implicit conversion affects
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_BigIntCol ON dbo.TestImplicitConverts (BigIntCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_BitCol ON dbo.TestImplicitConverts (BitCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_CharCol ON dbo.TestImplicitConverts (CharCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateCol ON dbo.TestImplicitConverts (DateCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTime2Col ON dbo.TestImplicitConverts (DateTime2Col);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTimeCol ON dbo.TestImplicitConverts (DateTimeCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTimeOffsetCol ON dbo.TestImplicitConverts (DateTimeOffsetCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DecimalCol ON dbo.TestImplicitConverts (DecimalCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_FloatCol ON dbo.TestImplicitConverts (FloatCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_GUIDCol ON dbo.TestImplicitConverts (GUIDCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NVarcharCol ON dbo.TestImplicitConverts (NVarchrCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_RealCol ON dbo.TestImplicitConverts (RealCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallDateTimeCol ON dbo.TestImplicitConverts (SmallDateTimeCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallIntCol ON dbo.TestImplicitConverts (SmallIntCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_IntCol ON dbo.TestImplicitConverts (IntCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_MoneyCol ON dbo.TestImplicitConverts (MoneyCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NCharCol ON dbo.TestImplicitConverts (NCharCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NumericCol ON dbo.TestImplicitConverts (NumericCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallMoneyCol ON dbo.TestImplicitConverts (SmallMoneyCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_TimeCol ON dbo.TestImplicitConverts (TimeCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_TinyIntCol ON dbo.TestImplicitConverts (TinyIntCol);
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_VarcharCol ON dbo.TestImplicitConverts (VarcharCol);

Finally each of the tables was loaded with 467000 rows of random data and the indexes were rebuilt to remove any fragmentation before testing was started.

INSERT INTO dbo.TestImplicitConverts
(	BigIntCol, BitCol, CharCol, DateCol, DateTimeCol, DateTime2Col, DateTimeOffsetCol,
	DecimalCol, FloatCol, IntCol, MoneyCol, NCharCol, NumericCol, NVarchrCol, RealCol,
	SmallDateTimeCol, SmallIntCol, SmallMoneyCol, TimeCol, TinyIntCol, GUIDCol, VarcharCol)
SELECT a.number, a.number%1, CAST( AS CHAR(10)), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()),
	a.number, a.number, a.number, a.number, CAST( AS NCHAR(10)), a.number,, a.number,
	DATEADD(ms, -1*a.number, GETDATE()), a.number, a.number, DATEADD(ms, -1*a.number, GETDATE()), a.number%255, NEWID(),
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.type = N'P'
  AND a.number < 1000

The Test Harness

The test harness first connects to the instance of SQL Server and retrieves the column list from the database for the TestImplicitConverts table and stores them into an ArrayList that is then used to iterate over the tests, generating dynamic statements and at the same time using a different type for each test.  The types are selected from the SqlDbType enumeration in .NET and filter out the types that are not being tested as a part of this investigation.  Each test is executed with SET STATISTICS XML ON to collect the actual execution plan for each of the tests for processing to determine if the implicit conversion caused a scan to occur or not.

#Load the Showplan as a XML document for parsing
$xml = New-Object System.Xml.XmlDocument;

#Setup the XmlNamespaceManager and add the ShowPlan Namespace to it.
$nsMgr = New-Object 'System.Xml.XmlNamespaceManager' $xml.NameTable;
$nsMgr.AddNamespace("sm", "<a href=";);">");</a>

#Search for a Index Scan or Clustered Index Scan operation that is an Implicit Conversion
$HasImplictConvert = $xml.SelectNodes("//sm:RelOp[@LogicalOp='Index Scan' or @LogicalOp='Clustered Index Scan']/sm:IndexScan/sm:Predicate//sm:Convert[@Implicit='1']/sm:ScalarOperator/sm:Identifier/sm:ColumnReference", $nsMgr).Count -gt 0;

Then the results of each test is output to the host as a comma separated list that can then be placed in Excel and used to generate a chart of the implicit conversion effects.

Note: The full test harness and scripts are attached to this post.


The results of the tests are mapped out in the two charts below.  The green blocks show where an implicit conversion occurs but still results in an index seek operation.  The yellow blocks show where the implicit conversion causes an index scan to occur.  The red blocks show where an operand clash occurs and implicit conversion is not actually supported by SQL Server for the data types, and the grey blocks are the same data type, so no conversion was required.


Figure 1 – Implicit Conversions using SQL_Latin_General_CP1_CI_AS

There is a difference between using the SQL_Latin_General_CP1_CI_AS collation and the Latin_General_CI_AS for the char and varchar data types, when converting to nchar or nvarchar.  In the SQL collation the conversion results in an index scan as shown above, but in the Windows collation, the scan does not occur and an index seek is still used to execute the query.


Figure 2 – Implicit Conversions using Latin_General_CI_AS


While implicit conversions occur in SQL Server to allow data evaluations against different data types, they can introduce performance problems for specific data type conversions that result in an index scan occurring during the execution.  Good design practices and code reviews can easily prevent implicit conversion issues from ever occurring in your design or workload.

Download scripts here!

Finding Key Lookups inside the Plan Cache

This is actually a blog post I thought I’d written more than two years ago, but this morning when I went looking for it after receiving a question by email  I realized that I’ve never blogged about this before.  At PASS Summit 2010 I presented a session on performance tuning SQL Server by digging into the plan cache, and since then I have blogged a number of other plan cache parsing scripts.  One of the demo scripts I showed at PASS was a query to find plans and statements in the plan cache that perform a Key Lookup operation.  An example statement that performs a Key Lookup is shown below.

USE [AdventureWorks];

SELECT NationalIDNumber, HireDate, MaritalStatus
FROM HumanResources.Employee
WHERE NationalIDNumber = N'14417807';


Taking a look at the execution plan XML we can see that there isn’t an actual Key Lookup operator, instead the Index Scan operator associated with the Clustered Index Seek will have a Lookup attribute that is set to a value of 1, as shown in the XML snip-it below:

<RelOp NodeId="3" PhysicalOp="Clustered Index Seek" LogicalOp="Clustered Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="17" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="290" Parallel="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row">
    <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="MaritalStatus" />
    <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="HireDate" />
  <IndexScan Lookup="1" Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" ForceScan="0" NoExpandHint="0" Storage="RowStore">
        <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="MaritalStatus" />
        <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="HireDate" />
    <Object Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Index="[PK_Employee_EmployeeID]" TableReferenceId="-1" IndexKind="Clustered" />
          <Prefix ScanType="EQ">
              <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="EmployeeID" />
              <ScalarOperator ScalarString="[AdventureWorks].[HumanResources].[Employee].[EmployeeID]">
                  <ColumnReference Database="[AdventureWorks]" Schema="[HumanResources]" Table="[Employee]" Column="EmployeeID" />

As a side note, the ShowplanXML uses the IndexScan XML element for both Seeks and Scans.  An Index Seek is identified by the SeekPredicates child element, whereas an Index Scan would have a Predicates child element instead. Another important note when dealing with ShowplanXML is that the attribute values of a cached execution plan differ from those of an Actual execution plan collected from SSMS.  In the above XML from the plan cache, the Lookup attribute has a value of 1.  However, in an Actual plan collected by SSMS, the value would be true as shown in the Actual XML below:

<IndexScan Lookup="true" Ordered="true" ScanDirection="FORWARD" ForcedIndex="false" ForceSeek="false" ForceScan="false" NoExpandHint="false">

When building an XML parser for querying the plan cache, the parser must check for a value of 1, not true, when looking for a Lookup attribute in the cached plan XML.  Given this information we can build a parser to search the plan cache and find the statements and columns associated with Key Lookups occurring.


   (DEFAULT '')
    n.value('(@StatementText)[1]', 'VARCHAR(4000)') AS sql_text,
    i.value('(@PhysicalOp)[1]', 'VARCHAR(128)') AS PhysicalOp,
    i.value('(./IndexScan/Object/@Database)[1]', 'VARCHAR(128)') AS DatabaseName,
    i.value('(./IndexScan/Object/@Schema)[1]', 'VARCHAR(128)') AS SchemaName,
    i.value('(./IndexScan/Object/@Table)[1]', 'VARCHAR(128)') AS TableName,
    i.value('(./IndexScan/Object/@Index)[1]', 'VARCHAR(128)') as IndexName,
    STUFF((SELECT DISTINCT ', ' + cg.value('(@Column)[1]', 'VARCHAR(128)')
       FROM i.nodes('./OutputList/ColumnReference') AS t(cg)
       FOR  XML PATH('')),1,2,'') AS output_columns,
    STUFF((SELECT DISTINCT ', ' + cg.value('(@Column)[1]', 'VARCHAR(128)')
       FROM i.nodes('./IndexScan/SeekPredicates/SeekPredicateNew//ColumnReference') AS t(cg)
       FOR  XML PATH('')),1,2,'') AS seek_columns,
    i.value('(./IndexScan/Predicate/ScalarOperator/@ScalarString)[1]', 'VARCHAR(4000)') as Predicate,
FROM (  SELECT plan_handle, query_plan
        FROM (  SELECT DISTINCT plan_handle
                FROM sys.dm_exec_query_stats WITH(NOLOCK)) AS qs
        OUTER APPLY sys.dm_exec_query_plan(qs.plan_handle) tp
      ) as tab (plan_handle, query_plan)
INNER JOIN sys.dm_exec_cached_plans AS cp 
    ON tab.plan_handle = cp.plan_handle
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/*') AS q(n)
CROSS APPLY n.nodes('.//RelOp[IndexScan[@Lookup="1"] and IndexScan/Object[@Schema!="[sys]"]]') as s(i)

The first CROSS APPLY above breaks out the individual statements so that we can tie the Key Lookup operation to the specific statement that is causing it to occur.  The second CROSS APPLY breaks out the actual IndexScan element in the XML and filters out any Key Lookups that are happening in the ‘sys’ schema, since many of the system tables have frequent Key Lookup operations that occur.  Using the two XML fragments created by the CROSS APPLY operations we can extract the statement, the table and index the lookup operation occurs against, the output columns of the lookup operation and the seek predicate being used to perform the lookup.   With this information we can look at the indexing on our tables and determine if a change to a non-clustered index to make it covering is warranted to improve performance or not.

Hope this is helpful!