(I got CTP 6 running on another machine, so I’m working except for parallel plans now since it’s only a single-proc. You guys will have to wait on that operator of the day article for now
Kudos to my former teammembers for getting filtered indexes into CTP6. It’s pretty darn neat, and I’ll show you a few tidbits that are interesting.
So, filtered indexes are not really a new “feature” in some sense. You can do everything in a filtered index with an indexed view. Of course, indexed view matching is only supported in the Enterprise edition of the software, so perhaps not everyone has seen those benefits.
Indexed views are a tricky feature – generalized tree pattern matching is hard (read: CPU expensive), and if you’ve looked in the BOL the list of restrictions is so long that reminds me of filling out tax forms. However, the other side effect that occurs when you have indexed views is that the optimizer has to go through additional phases of its search in order to apply them. The optimizer has a couple of buckets of rules it can run, and most plans find a nice plan in the first or second round of rules. Generalized indexed view matching is restricted to the last set of buckets, which usually means that there are a lot more rules that have to run before that view can be matched. The bottom line is that the compilation cost isn’t bad for a single query (usually), but it’s the sort of thing that can make the difference between an application that can be used in UI response-time requirements or not.
Enter the filtered index. This is a recognition that there are a lot of indexed views that don’t need tons and tons of fancy equations, joins, etc. These are single-tabled views that have simple predicates. Once you get into this ballpark, you can bolt this on to the super-efficient index matching code and you can enable a whole new class of application from what you could build previously. This is why I am excited about this feature.
I haven’t looked to see where this feature will fall into the SKU matrix yet, and I’m sure that they’re still pondering that very question. However, you guys should play with this on CTP6 – it’s nifty!
So, first things first. Let’s build one of these guys and see if we can match it:
create database t1
use t1
drop table t1
create table t1(col1 int, col2 nchar(2000), col3 time)
create index i1 on t1(col1) where col1 > 5 and col1 < 20
declare @p int
set @p =0
while @p < 20000
begin
insert into t1(col1) values (rand()*10000)
set @p=@p+1
end
ok so I’ve filled this table up with a lot of useless data and created a nice little filtered index.
select * from t1 where col1 > 5 and col1 < 20
|–Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1000], [Expr1007]) OPTIMIZED WITH UNORDERED PREFETCH)
|–Compute Scalar(DEFINE:([Expr1006]=BmkToPage([Bmk1000])))
| |–Index Scan(OBJECT:([t1].[dbo].[t1].[i1]))
|–RID Lookup(OBJECT:([t1].[dbo].[t1]), SEEK:([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)
select * from t1 where col1 > 5 and col1 < 10
|–Nested Loops(Inner Join, OUTER REFERENCES:([Bmk1000]) OPTIMIZED)
|–Compute Scalar(DEFINE:([Expr1006]=BmkToPage([Bmk1000])))
| |–Index Seek(OBJECT:([t1].[dbo].[t1].[i1]), SEEK:([t1].[dbo].[t1].[col1] < (10)) ORDERED FORWARD)
|–RID Lookup(OBJECT:([t1].[dbo].[t1]), SEEK:([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)
so in the first example I’ve created a query against the table that directly matches the index condition. It matches and even generates an index scan (slightly faster since it doesn’t have to navigate down the b-tree to start returning rows).
The second example is interesting because is a proper subset of the index. The indexed view matching code can do subsumption as well. It generates a seek in this case and returns only part of the filtered index.
Both cases do RID lookups back to the heap since I did SELECT * and the index is not “covering”.
The other data point I’ll leave you with this evening is that the showplan_xml has something interesting in it:
StmtSimple StatementText=”select * from t1 where col1 > 5 and col1 < 10“ StatementId=”1“
StatementCompId=”1“ StatementType=”SELECT“ StatementSubTreeCost=”0.0474183“
StatementEstRows=”7“ StatementOptmLevel=”FULL“
StatementOptmEarlyAbortReason=”GoodEnoughPlanFound“
ParameterizedText=”(@1 tinyint,@2 tinyint)SELECT
* FROM [t1] WHERE [col1]>@1 AND [col1]<@2“>
Well, there are two interesting things. First, StatementOptmLevel=FULL means that we didn’t get a trivial plan. There is a parameterized text field, but I don’t think that is being used here. these two queries are showing up as different plans in the plan cache (and obviously they have different plans). So, absent forced autoparam, I don’t think that trivial plan is working on these guys. This makes sense since there is a cost-based plan choice.
That’s it for me tonight guys. Happy Querying!
Conor
PS: Thanks for all of the comments and posts trying to help me get back to a working state. I’ve tried many of the suggestions without luck, but I’m going to try playing with the orca tool, perhaps tomorrow or so, to see if I can get my main machine back up and working.
5 Responses to Filtered Indexes Make Me Drool…
test
Sorry for the test. I spent 15 minutes writing a lengthy reply and when I clicked Save Comment instead of posting I was taken back to your blog’s home page and the comment was gone. This is the second time that’s happened to me on your blog, so I’m betting there is some kind of timeout problem…
Anyway, the gist of my response was that I’m not quite as stoked about the feature as you are, for two reasons:
A) I don’t understand why the filter predicates need to be so limiting. For example, I can do X >= 1 AND X <= 10, but I can’t do X BETWEEN 1 AND 10. This makes no sense… I expect to be able to filter on pretty much any expression that would be valid for a CHECK constraint.
B) I was overly optimistic and hoping that the QP would have learned some new tricks to use for these indexes — namely, in-flight dynamic index selection, so that you could set up a filtered index, and then have the QP pick it at runtime based on, e.g., some value selected from another table. Similar to partition elimination. I think that without the ability to do that, the feature is not nearly as compelling. Any idea if that kind of functionality is slated for SQL11 or beyond?
I blame Paul for all blog issues
. Actually, I blame Paul for most things, including taxes and the weather. Apologies for the issue you’ve been seeing.
Re A:
I don’t have any special knowledge on why between is disallowed. I’ll keep playing with the feature and post up if I learn any more. I think that you should probably consider this to be closer to indexed views rather than CHECK constraints. CHECK constraints are just implemented and maintained on UPDATE/INSERT/DELETE. Filtered indexes also have to be matched, which is actually a far trickier problem in terms of the trade-offs. My only uninformed guess is that perhaps it is a parsing issue or a testing issue.
Re B:
Obviously I can’t comment on things I know about future product releases, and I don’t have any idea whether this is planned for the next release of SQL Server. I can tell you that SQL Server is a complex system and getting filtered indexes implemented was not a minor enterprise. I’m guessing that the first release of the feature covers the core use cases and that’s about it. Later releases will get fancier. If you look at partitioning, for example, this model holds.
As for dynamic plans with filtered indexes, it’s an interesting idea but there are often some complexities around this. For example, it will impact the startup cost and memory grant requirements. Think about the cost for starting up a distributed partitioned view query – that uses a form of dynamic plan selection. If your statistics are bad this kind of feature may be visibly slower than not doing dynamic index selection. I’m not saying that it can’t be done, but I do know that SQL Server’s statistics, while good, don’t have enough granularity to cover all of the cases to avoid all of the problems that can happen – it’s difficult to avoid causing 5% of the popluation’s queries to be slower than needed, and that can impact your thinking when you are working on a product this size.
I’ll leave it at that – I don’t have any more "public" thoughts to share.
Hmmm… When does your NDA expire?
I know this isn’t the focus of your post, but I wanted to chime in RE: indexed views and non-Enterprise skus. I’m working with the Standard and Express skus and always follow the nomenclature of appending the word ‘Base’ to any view with an index. Each of these indexed views receives a simple wrapper view without the ‘Base’ whose sole purpose is to stick in a WITH(NOEXPAND). E.g. I would create MyAggViewBase and MyAggView, so my team knows if they see a view ending in *Base they had better use the non-Base version of it
Another side benefit is the name tells you clearly that there is an index present so you had better not do an ALTER on the view since this will blow away my indexes. When a view definition tweak is needed we just create a new MyAggViewBase2, alter MyAggView to work off of the new base view, and then drop the old indexed view.
Just some observations from the trenches where the bean counters like us to stay in the cheap sql boxes