Along the same lines of improving database design and getting better performance on SQL Server (which [IMO] DOES take an experienced SQL Server database developer – but, we'll talk more about "whose job this really is" in many more posts and probably even a RunAs – which Richard and I just setup to record on Thursday (Mar 12)), I started thinking about how I could convince people of why they NEED a database developer. So, I thought I'd ask this VERY important question…
What percentage of data IS selective enough to use a nonclustered index which doesn't cover the query… in other words (just in case you're not entirely sure of what I mean :)), think of indexes in the back of a book… if you need to go to the back of the book to reference a bunch of data (this is called a [bookmark] lookup in SQL Server), there's a point where the randomness of the lookups (especially if you think in terms of many rows on a page) becomes too expensive. For example, imagine that the index is customer name and the data (the book) is customer orders – and, each page (of this rather weird book ;)), has 20 orders on it. Doing a query to lookup customer number 12's orders might be really easy (if they only have only a few orders) BUT, what if the query is "show me all of the orders for people that have an 'e' in their name". First, the number of people have have an 'e' in their name is probably better than 50% (that's TOTALLY a guess) and, if there's 20 orders per page then a lookup from the index into the book would require SQL Server to touch every page roughly 10 times. If the table has 50,000 pages (therefore 1 million rows – at 20 rows per page), then to find the 500,000 rows (remember, I'm estimating half), SQL Server would have to do 500,000 bookmark lookups. For a table with only 50,000 pages that's terribly expensive.
So, here's the question – what's the tipping point? When is a nonclustered index on customer name NOT going to be used to lookup rows of sales orders? I'm going to use a survey to see what you think and then within a week, I'll give the specific SQL Server math AND a query you can run within your own DBs to see EVERY one of your table's "tipping points". It's really interesting and I think will really help you to understand why SQL Server might not be using those nonclustered indexes………….
Cheers,
kt
2 thoughts on “Why aren’t those nonclustered indexes being used?”
I touched on this one a while back because I was tired of people telling me it was xx% (very high)
http://sqlinthewild.co.za/index.php/2009/01/09/seek-or-scan/