Implicit Conversions that cause Index Scans

Implicit conversions seem to be a growing trend in the SQL Server performance tuning work that I’ve been engaged in recently, and I’ve blogged in the past about ways to identify when implicit conversions are occurring using the plan cache.

For those of you who don’t know, implicit conversions occur whenever data with two different data types are being compared, based on the data type precedence.  The precedence establishes the hierarchy of of the types, and lower precedence data types will always be implicitly converted up to the higher precedence type.  These conversions increase CPU usage for the operation, and when the conversion occurs on a table column can also result in an index scan where an index seek would have been possible without the implicit conversion.

For a long time I’ve wanted to map out the most common data types and the effect of a column-side implicit conversion for creating an index seek versus an index scan and recently I finally got around to mapping it all out.

Setting up the tests

To map out the implicit conversion affects I created two databases using different collations, one using SQL_Latin_General_CP1_CI_AS and the other using Latin_General_CI_AS, and then created the following table in each of the databases.

CREATE TABLE dbo.TestImplicitConverts
(
	RowID int NOT NULL IDENTITY (1, 1),
	BigIntCol bigint NOT NULL,
	BitCol bit NOT NULL,
	CharCol char(10) NOT NULL,
	DateCol date NOT NULL,
	DateTimeCol datetime NOT NULL,
	DateTime2Col datetime2(7) NOT NULL,
	DateTimeOffsetCol datetimeoffset(7) NOT NULL,
	DecimalCol decimal(10, 2) NOT NULL,
	FloatCol float(53) NOT NULL,
	IntCol int NOT NULL,
	MoneyCol money NOT NULL,
	NCharCol nchar(10) NOT NULL,
	NumericCol numeric(10, 2) NOT NULL,
	NVarchrCol nvarchar(50) NOT NULL,
	RealCol real NOT NULL,
	SmallDateTimeCol smalldatetime NOT NULL,
	SmallIntCol smallint NOT NULL,
	SmallMoneyCol smallmoney NOT NULL,
	TimeCol time(7) NOT NULL,
	TinyIntCol tinyint NOT NULL,
	GUIDCol uniqueidentifier NOT NULL,
	VarcharCol varchar(50) NOT NULL,
	CONSTRAINT PK_TestImplicitConverts PRIMARY KEY CLUSTERED (RowID)
);
GO

I then created a nonclustered index on each of the columns in the test table so that a single column query filtering on each column could then generate an execution plan with a single index seek when using the matching data type for the filtering.

-- Create nonclustered indexes on all columns to test implicit conversion affects
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_BigIntCol ON dbo.TestImplicitConverts (BigIntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_BitCol ON dbo.TestImplicitConverts (BitCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_CharCol ON dbo.TestImplicitConverts (CharCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateCol ON dbo.TestImplicitConverts (DateCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTime2Col ON dbo.TestImplicitConverts (DateTime2Col);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTimeCol ON dbo.TestImplicitConverts (DateTimeCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DateTimeOffsetCol ON dbo.TestImplicitConverts (DateTimeOffsetCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_DecimalCol ON dbo.TestImplicitConverts (DecimalCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_FloatCol ON dbo.TestImplicitConverts (FloatCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_GUIDCol ON dbo.TestImplicitConverts (GUIDCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NVarcharCol ON dbo.TestImplicitConverts (NVarchrCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_RealCol ON dbo.TestImplicitConverts (RealCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallDateTimeCol ON dbo.TestImplicitConverts (SmallDateTimeCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallIntCol ON dbo.TestImplicitConverts (SmallIntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_IntCol ON dbo.TestImplicitConverts (IntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_MoneyCol ON dbo.TestImplicitConverts (MoneyCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NCharCol ON dbo.TestImplicitConverts (NCharCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_NumericCol ON dbo.TestImplicitConverts (NumericCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_SmallMoneyCol ON dbo.TestImplicitConverts (SmallMoneyCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_TimeCol ON dbo.TestImplicitConverts (TimeCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_TinyIntCol ON dbo.TestImplicitConverts (TinyIntCol);
GO
CREATE NONCLUSTERED INDEX IX_TestImplicitConverts_VarcharCol ON dbo.TestImplicitConverts (VarcharCol);
GO

Finally each of the tables was loaded with 467000 rows of random data and the indexes were rebuilt to remove any fragmentation before testing was started.

INSERT INTO dbo.TestImplicitConverts
(	BigIntCol, BitCol, CharCol, DateCol, DateTimeCol, DateTime2Col, DateTimeOffsetCol,
	DecimalCol, FloatCol, IntCol, MoneyCol, NCharCol, NumericCol, NVarchrCol, RealCol,
	SmallDateTimeCol, SmallIntCol, SmallMoneyCol, TimeCol, TinyIntCol, GUIDCol, VarcharCol)
SELECT a.number, a.number%1, CAST(b.name AS CHAR(10)), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()), DATEADD(ms, -1*a.number, GETDATE()),
	a.number, a.number, a.number, a.number, CAST(b.name AS NCHAR(10)), a.number, b.name, a.number,
	DATEADD(ms, -1*a.number, GETDATE()), a.number, a.number, DATEADD(ms, -1*a.number, GETDATE()), a.number%255, NEWID(), b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.type = N'P'
  AND a.number < 1000
  AND b.name IS NOT NULL;
GO
ALTER INDEX ALL ON TestImplicitConverts REBUILD;
GO

The Test Harness

The test harness first connects to the instance of SQL Server and retrieves the column list from the database for the TestImplicitConverts table and stores them into an ArrayList that is then used to iterate over the tests, generating dynamic statements and at the same time using a different type for each test.  The types are selected from the SqlDbType enumeration in .NET and filter out the types that are not being tested as a part of this investigation.  Each test is executed with SET STATISTICS XML ON to collect the actual execution plan for each of the tests for processing to determine if the implicit conversion caused a scan to occur or not.

#Load the Showplan as a XML document for parsing
$xml = New-Object System.Xml.XmlDocument;
$xml.LoadXml($String);

#Setup the XmlNamespaceManager and add the ShowPlan Namespace to it.
$nsMgr = New-Object 'System.Xml.XmlNamespaceManager' $xml.NameTable;
$nsMgr.AddNamespace("sm", "<a href="http://schemas.microsoft.com/sqlserver/2004/07/showplan&quot;);">http://schemas.microsoft.com/sqlserver/2004/07/showplan");</a>

#Search for a Index Scan or Clustered Index Scan operation that is an Implicit Conversion
$HasImplictConvert = $xml.SelectNodes("//sm:RelOp[@LogicalOp='Index Scan' or @LogicalOp='Clustered Index Scan']/sm:IndexScan/sm:Predicate//sm:Convert[@Implicit='1']/sm:ScalarOperator/sm:Identifier/sm:ColumnReference", $nsMgr).Count -gt 0;

Then the results of each test is output to the host as a comma separated list that can then be placed in Excel and used to generate a chart of the implicit conversion effects.

Note:  The full test harness and scripts are attached to this post.

Results

The results of the tests are mapped out in the two charts below.  The green blocks show where an implicit conversion occurs but still results in an index seek operation.  The yellow blocks show where the implicit conversion causes an index scan to occur.  The red blocks show where an operand clash occurs and implicit conversion is not actually supported by SQL Server for the data types, and the grey blocks are the same data type, so no conversion was required.

image

Figure 1 – Implicit Conversions using SQL_Latin_General_CP1_CI_AS

There is a difference between using the SQL_Latin_General_CP1_CI_AS collation and the Latin_General_CI_AS for the char and varchar data types, when converting to nchar or nvarchar.  In the SQL collation the conversion results in an index scan as shown above, but in the Windows collation, the scan does not occur and an index seek is still used to execute the query.

image

Figure 2 – Implicit Conversions using Latin_General_CI_AS

Conclusion

While implicit conversions occur in SQL Server to allow data evaluations against different data types, they can introduce performance problems for specific data type conversions that result in an index scan occurring during the execution.  Good design practices and code reviews can easily prevent implicit conversion issues from ever occurring in your design or workload.

Download scripts here!

13 thoughts on “Implicit Conversions that cause Index Scans

  1. Great post! I’ve seen this first hand at my current client in a bad way. Fixing the config file for the jTDS driver to NOT use unicode on all of the web servers switched all index scans to seeks as the tables all used varchar instead of nvarchar. Any advice as to how to tune queries that use sp_prepare and sp_execute? Running a trace through DETA gets me nowhere- all queries have to be manually tuned. Thanks! –Todd

    1. Todd, exactly what did you change in the config file ? We are running into this problem at my present employer with the jTds driver.

  2. Great post Jonathan!

    I’ve been using your script to regularly query implicit conversions in the plan cache, and have seen a reduction of index scans and, by extension, query execution times, as a result.

    Thanks!

    Andre Ranieri

  3. Nice buildup of the case, well explained.
    And it contained the Powershell code I happened to be looking for at this moment.

    Thank you for another great post!

  4. I have similar script but that is pure sql script and that looks at the Index scans due to Implicit conversion or Index scan due to usage of function. It provides other details like function or conversion data types, Possible impact as well like how many number of rows are being processed each time during index scan etc. There is a column in output “MostImportantFactor_total_row_processed” which could be used to determine which sql’s needs to be fixed first. The worst thing of the Index scan(due to conversion of function) is that number of CPU operation it has to perform to compare the data and could kill the CPU. One small query caused my 24 cpu machine to have cpu usage of more than 80% just because of the Index scan due to conversion (it was using function as well). Fixing this brought CPU to around 2%.

    It might run for long time if the cache plan is quite big. Thus if running on Prod be careful. In case cacheplan is quite big then I usually prefer to get the column,index,table ,function, predicate details in one go from query_plan dmv and then join this data with the other tables to get more information.

    Sometime, I go to query_stats dmv and then find queries with highest CPU time and then look whether index scan is happening for that query. This Makes sure that I query minimal data from these dmv’s.

    declare @databasename sysname
    set @databasename= quotename(db_name())
    select @databasename
    –find all the plans and query which are using the function call on the column which has index and this in turn is causing the index scan instead of index seek
    ;WITH XMLNAMESPACES
    (DEFAULT ‘http://schemas.microsoft.com/sqlserver/2004/07/showplan’)
    select
    ColDetails.value(‘(@Schema)[1]’,’sysname’) as SchemaName
    ,ColDetails.value(‘(@Table)[1]’,’sysname’) as TableName
    ,ColDetails.value(‘(@Column)[1]’,’sysname’) as ColumnName
    ,ColDetails.value(‘(@Alias)[1]’,’sysname’) as TableAliasInQuery
    ,PhysicalOperator.value(‘(.//Object/@Index)[1]’,’sysname’) as IndexName
    ,Intrinsic.value(‘(@FunctionName)[1]’,’varchar(200)’) as FunctionNameUsed
    ,PhysicalOperator.value(‘(.//IndexScan/Predicate/ScalarOperator/@ScalarString)[1]’,’nvarchar(max)’) as PredicateHavingFunction
    ,qs.execution_count * ps.total_row_count as MostImportantFactor_total_row_processed
    ,ps.total_row_count
    ,qs.execution_count * used_page_count as MostImportantFactor_possible_number_of_logical_reads
    ,ps.used_page_count
    ,cast(substring(cast(st.text as nvarchar(max)),qs.statement_start_offset/2 + 1,(case when qs.statement_end_offset = -1 then datalength(cast(st.text as nvarchar(max))) else qs.statement_end_offset end – qs.statement_start_offset)/2) as nvarchar(max)) as QueryText
    ,datalength(st.text) SQLtextLength
    ,st.text FullSqltext
    ,qp.query_plan as QueryPlan
    ,cp.usecounts,cp.refcounts,cp.objtype
    ,qs.*
    from
    sys.dm_exec_cached_plans cp
    cross apply sys.dm_exec_query_plan(cp.plan_handle) qp
    cross apply query_plan.nodes(‘/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple’) as Statements(IndividualStatement)
    cross apply IndividualStatement.nodes(‘.//RelOp[@PhysicalOp=”Index Scan” or @PhysicalOp=”Clustered Index Scan”]’) as RelOp(PhysicalOperator)
    cross apply PhysicalOperator.nodes(‘.//ScalarOperator/Intrinsic’) as Conversion(Intrinsic)
    cross apply Intrinsic.nodes(‘./ScalarOperator/Identifier/ColumnReference’) as Cols(ColDetails)
    inner join sys.tables t on quotename(t.name) = ltrim(rtrim(ColDetails.value(‘(@Table)[1]’,’sysname’))) and t.type = ‘U’
    inner join sys.indexes id on t.object_id = id.object_id
    and quotename(id.name) = ltrim(rtrim(PhysicalOperator.value(‘(.//Object/@Index)[1]’,’sysname’)))
    left join sys.dm_exec_query_stats qs on cp.plan_handle= qs.plan_handle
    left join (select object_id,index_id,sum(used_page_count) as used_page_count, sum(row_count) as total_row_count from sys.dm_db_partition_stats group by object_id,Index_id) as ps
    on id.object_id = ps.object_id and id.index_id = ps.index_id
    outer apply sys.dm_exec_sql_text(cp.plan_handle) st
    where
    1=1
    and ColDetails.exist(‘.[@Database=sql:variable(“@databasename”)]’) =1
    and cp.cacheobjtype = ‘Compiled Plan’

    –find all the plans and query which are being compared with wrong data type in queries and thus casuing the conversion implicit or explicit on column which has index and this in turn is causing the index scan instead of index seek

    ;WITH XMLNAMESPACES
    (DEFAULT ‘http://schemas.microsoft.com/sqlserver/2004/07/showplan’)
    select

    ColDetails.value(‘(@Schema)[1]’,’sysname’) as SchemaName
    ,ColDetails.value(‘(@Table)[1]’,’sysname’) as TableName
    ,ColDetails.value(‘(@Column)[1]’,’sysname’) as ColumnName
    ,ColDetails.value(‘(@Alias)[1]’,’sysname’) as TableAliasInQuery
    ,PhysicalOperator.value(‘(.//Object/@Index)[1]’,’sysname’) as IndexName
    ,Cnvrt.value(‘(@DataType)[1]’,’sysname’) as ConvertedDataType
    ,ty.name as ColumnDataType
    ,Cnvrt.value(‘(@Implicit)[1]’,’sysname’) as IsImplicit
    ,qs.execution_count * ps.total_row_count as MostImportantFactor_total_row_processed
    ,ps.total_row_count
    ,qs.execution_count * used_page_count as MostImportantFactor_possible_number_of_logical_reads
    ,ps.used_page_count
    ,PhysicalOperator.value(‘(.//Predicate/ScalarOperator/@ScalarString)[1]’,’nvarchar(max)’) as PredicateHavingFunction
    ,cast(substring(cast(st.text as nvarchar(max)),qs.statement_start_offset/2 + 1,(case when qs.statement_end_offset = -1 then datalength(cast(st.text as nvarchar(max))) else qs.statement_end_offset end – qs.statement_start_offset)/2) as nvarchar(max)) as QueryText
    ,datalength(st.text) SQLtextLength
    ,st.text FullSqltext
    ,qp.query_plan as QueryPlan
    ,cp.usecounts,cp.refcounts,cp.objtype
    ,qs.*
    from
    sys.dm_exec_cached_plans cp
    cross apply sys.dm_exec_query_plan(cp.plan_handle) qp
    cross apply query_plan.nodes(‘/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple’) as Statements(IndividualStatement)
    cross apply IndividualStatement.nodes(‘.//RelOp[@PhysicalOp=”Index Scan” or @PhysicalOp=”Clustered Index Scan”]’) as RelOp(PhysicalOperator)
    cross apply PhysicalOperator.nodes(‘.//ScalarOperator/Convert’) as Conversion(Cnvrt)
    cross apply Cnvrt.nodes(‘./ScalarOperator/Identifier/ColumnReference’) as Cols(ColDetails)
    inner join sys.tables t on quotename(t.name) = ltrim(rtrim(ColDetails.value(‘(@Table)[1]’,’sysname’))) and t.type = ‘U’
    inner join sys.indexes id on t.object_id = id.object_id
    and quotename(id.name) = ltrim(rtrim(PhysicalOperator.value(‘(.//Object/@Index)[1]’,’sysname’)))
    inner join sys.all_columns ac on t.object_id = ac.object_id and ac.name = ltrim(rtrim(ColDetails.value(‘(@Column)[1]’,’sysname’)))
    left join sys.types ty on ac.user_type_id = ty.user_type_id
    left join sys.dm_exec_query_stats qs on cp.plan_handle= qs.plan_handle
    left join (select object_id,index_id,sum(used_page_count) as used_page_count, sum(row_count) as total_row_count from sys.dm_db_partition_stats group by object_id,Index_id) as ps on id.object_id = ps.object_id and id.index_id = ps.index_id
    outer apply sys.dm_exec_sql_text(cp.plan_handle) st
    where
    1=1
    and ColDetails.exist(‘.[@Database=sql:variable(“@databasename”)]’) =1
    and cp.cacheobjtype = ‘Compiled Plan’

  5. Very nice explaination.
    I have a similar issue when i use SQL_Latin_General_CP1_CI_AS collation.
    Please suggest a code fix for this for my indexes to work.

    1. The only code fix is to match your filter parameter types to the column data types. Or change your underlying data types to match what the parameterization is going to pass through in your code for filtering parameters. You can’t do anything other than these two things.

  6. Great post, thank you, helped a lot with a recent issue I was having.

    1 question: How come varchar in Y Axis paired with nvarchar in X axis results in an implicit conversion, but nvarchar in Y axis paired with varchar in AX axis does not? What’s the difference between the 2 scenarios ?

    Thanks !

    1. If the column is varchar and the parameter is nvarchar it requires a column conversion up to unicode. If the column is nvarchar and the parameter is varchar, the parameter side get’s converted up to unicode, so it doesn’t cause a scan in the second case.

    1. If the column is non-indexed then you would be getting a scan anyway unless there was a supporting index for one of the other filter criteria that would allow an index seek with residual predicate filtering

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Bitten by SSD Bit Rot

Back in 2012, I wrote a post titled Looking at External Disk Performance using USB 3.0 and eSATA with SSD, where I tested a number

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.