I had a question from a reader about contradiction detection.
The basic idea is to determine that this kind of query:
SELECT * FROM TABLE WHERE col1 > 5 and col1 < 0;
…is utterly meaningless and requires no work to be done. It will always return zero rows. In fact, in many cases the SQL Server QP will detect cases like this and actually remove the table from the query completely. In its place, you have this magical “constant scan” operator, which is really just an in-memory row generator. So, you can read from it instead of from a table.
This optimization is done in many places in the SQL Server QP, and it is great because it requires no I/O, no locks, and less CPU than actually reading from a table.
The question I had from a user had to do with cases when this optimization does NOT happen.
Since the reader sent me a perfect example, I’ll just copy it up here (note – I am happy to credit readers if they would like credit, but I don’t post their name unless they want me to do so).
–drop table #tmp
create table #tmp (id Int Identity(1,1) Primary key, name VarChar(250))
go
insert into #tmp(name) values(NEWID())
insert into #tmp(name) values(NEWID())
go
set statistics profile on
go
— Execution plan create a Constant Scan
select * from #tmp
where id = 1 and id = 5
go
set statistics profile off
GO
select * from #tmp where id = 1 and id = 5
|–Constant Scan
–drop table tmp
create table tmp (id Int Identity(1,1) Primary key, name VarChar(250))
go
insert into tmp(name) values(NEWID())
insert into tmp(name) values(NEWID())
go
set statistics profile on
— Why execution plan does not create a
Constant Scan for this case?
select * from tmp
where id = 1 and id = 5
go
set statistics profile off
SELECT * FROM [tmp] WHERE [id]=@1 AND [id]=@2
|–Clustered Index Seek(OBJECT:([t1].[dbo].[tmp].[PK__tmp__3213E83F15502E78]), SEEK:([t1].[dbo].[tmp].[id]=CONVERT_IMPLICIT(int,[@1],0)), WHERE:([t1].[dbo].[tmp].[id]=CONVERT_IMPLICIT(int,[@2],0)) ORDERED FORWARD)
So this is interesting – in one case the constant scan is done, while in the other case it is not…. hrm. what’s up with that?
Well, some of you are familiar with autoparameterization – a feature that can take unparameterized queries, try to make them parameterized, and then avoid compilation for multiple similar simple queries. Autoparameterization is tied to the “trivial plan” feature, meaning that the optimizer feels that there isn’t really a plan choice and thus going through the complete cost-based optimization process is unnecessary.
The difference you see is based on autoparameterization and whether it happens or not. Now, the autoparameterization rules are fairly arcane at this point – they’ve been around, in some form, for various releases dating back to SQL 7.0. TempDB has a different set of rules about whether a query should be autoparameterized since these tables are globally shared. So, in the #tmp case the table isn’t being autoparameterized, the rest of the logic to simplify the query is run, and this rule detects the contradiction.
In the latter case, the query IS auto-parameterized. The optimizer doesn’t see two parameters as always being a contradiction (you could pass the same value for both parameters), and then the query is not simplified.
For what it is worth, a somewhat internal knowledge of what causes a plan to be considered trivial or not can influence whether you see one plan or the other. For example, query hints make a plan not trivial (why hint if there is no plan choice?), and as such you could do something like this:
select * from tmp
where id = 1 and id = 5 option (maxdop 1)
This will get you the constant scan in all databases. Adding enough rows to get the cost above the threshold for parallelism (in a parallel-enabled machine + SKU of SQL) would also do it.
Remember, trivial plan was developed before contradiction detection, so this should be viewed in that context. It’s a bit odd, but the ecosystem developed this way and you should be aware of that when developing SQL Server apps.
I enjoy the questions – keep them coming!
Thanks,
Conor