SQL 2008 CTP6 Filtered Indexes in Action – disjunction of ranges and index choice

After fighting off a cold all week, I’ve had some more time to go play with CTP6 on my secondary machine.

Today we’ll do some experiments to see how the Optimizer picks plans for filtered indexes.  This will help you figure out how to build your database schemas and queries to take advantage of this new kind of index.

I was involved in the inception of the idea to build filtered indexes into SQL Server, but I didn’t have anything to do with the implementation.  So, what I’ll cover today is just me playing with a feature to see what it can do. 

I thought that it would be good to play with disjoint ranges and see whether they can be “covered” by a filtered index.  In my last blog post, I covered how a basic, single range case *is* covered by a filtered index when the predicates match up correctly.  In this post, I’ll try multiple ranges to see what happens.

Same database setup as before (just a table with wide rows and enough rows to make index choice obvious to the optimizer):

use t1

create database t1
use t1
drop table t1
create table t1(col1 int, col2 nchar(2000), col3 time)

create index i1 on t1(col1) where col1 > 5 and col1 < 20



create index i1 on t1(col1) where col1 > 5 and col1 < 20
declare @p int
set @p =0
while @p < 20000
begin
insert into t1(col1) values (rand()*10000)
set @p=@p+1
end
create index i2 on t1(col1) where col1 > 25 and col1 < 40


OK I added a second index over a different range in the same column. This is not a case that the
SQL Server 2005 optimizer had to handle.

query 1: a query with a disjunction (OR) of ranges (BETWEEN/AND).
select col1 from t1 where (col1 > 5 and col1 < 20) or (col1 > 25 and col1 < 40)

|--Table Scan(OBJECT:([t1].[dbo].[t1]), WHERE:([t1].[dbo].[t1].[col1]>(5) AND [t1].[dbo].[t1].[col1]<(20) OR [t1].[dbo].[t1].[col1]>(25) AND [t1].[dbo].[t1].[col1]<(40)))

Hrm. Well, there doesn't seem to be code in CTP that does index unions for cases like this, unfortunately.
Hinting isn't an option either because the index hint code returns an error if the hinted index does not
completely cover the index. On my machine, I get:

(43 row(s) affected)

SQL Server Execution Times:
CPU time = 47 ms, elapsed time = 2661 ms.

That elapsed time is mostly reading pages off my IDE drive. To be clear, this is the _cold cache_ time,
which means that I ran "DBCC dropcleanbuffers" before my run. This throws out all of the buffer pool
pages and forces a read from disk. One assumption in the costing model for the SQL Server QP is that the
pages are _not_ in memory already (and there are cases where this becomes interesting). For this example,
it just means that we have a big table scan with lots of IOs to do (and is therefore slow).

Second example: IN lists

select col1 from t1 where col1 in (6, 7, 8, 26)

|--Table Scan(OBJECT:([t1].[dbo].[t1]), WHERE:([t1].[dbo].[t1].[col1]=(6) OR [t1].[dbo].[t1].[col1]=(7) OR [t1].[dbo].[t1].[col1]=(8) OR [t1].[dbo].[t1].[col1]=(26)))

Well, this is consistent.  IN is generally just a list of ORs as far as the QP is concerned.  Runtime for this query should be very similar to what we saw in our last example:

(7 row(s) affected)

 SQL Server Execution Times:
   CPU time = 63 ms,  elapsed time = 2708 ms.

OK, so what’s an enterprising SQL developer to do?  Well, you can rewrite the query as a UNION ALL as long as you know that the ranges are disjoint and cover the predicate properly.

select col1 from t1 with (index=i1) where (col1 > 5 and col1 < 20) 
union all 
select col1 from t1 with (index=i2) where (col1 > 25 and col1 < 40)

SQL Server parse and compile time:
   CPU time = 0 ms, elapsed time = 0 ms.

(43 row(s) affected)

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

That’s a lot nicer.

I have no idea if this is something that they intend to add for this release of the product.  I’m not really even sure how common this is.  How often do you want to apply a disjunction of ranges to multiple partial indexes where you don’t want to create an index over the whole table?  No clue.  If you have such an example, I’d love to hear about it – drop me a line at conor@sqlskills.com

In the meantime, this is just a post on something interesting I learned while playing with the new CTP6 for SQL Server 2008 that I thought you might like to know.

Happy Querying!

Conor Cunningham

Other articles

New blog location

Here is the new blog  – please update those readers. http://blogs.msdn.com/conor_cunningham_msft/default.aspx I’m getting settled into working for Microsoft again – it’s basically like drinking from

Explore

The Trouble with Triggers

(Apologies to Star Trek). I received a question about trigger performance, especially the do’s and dont’s about how they are used. Let’s first start by

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.