I've devoted the last few blog entries to describing what's in that spatial index analysis proc output and how to use it and interpret the results. Just wanted to describe the rest of the information and how the information relates to what you'd find in a query plan that uses a spatial index.

The procs return:
1. Information about the index
2. Information about the query sample
3. How efficient the index is when used against the query sample

The information you get about the index is:
-  Bounding Box dimensions - BOL says they are NULL for geography but they are always -180, -90, 180, 90 for geography
-  GridLevelSize - 4 levels
-  Cells_Per_Object - maximum tessellated cells per row, specified in DDL
-  Page and Row counts for the spatial index
-  Avg number of cells per base row (how many of the CellsPerObject were actually used)
-  Height, Width, Area of a cell - Geography is always the same, but its nice to know how big the cells are compared to the other levels. Not sure what the unit of measure is for geography (BOL says "depends on SRID"), by looking at the number it doesn't look like it could be meters.
-  CellAreaToBoundingBox percentage - 4 levels
-  Total SRIDs found

The most interesting is likely the info about the number of cells in each level (including level0 - off the bounding box). Any levels that are 0 rows don't appear in the XML output. You get info for:
-Number of Object Cells for Index
-Number of Interior Cells
-Number of Intersecting Cells
-Number of Border Cells
-also some numbers about cells normalized to the leaf grid

The cells are useful in how they relate to the primary filter. The interior, intersecting, and border cells are used by the internal filter.

For the query sample, you get the same cells numbers. No "cells normalized to the leaf" however. BTW, for the geography index ignore the BOL comment that indicates that if Level0 of the query sample is 1 then index is not useable for this query. For geography its always 1.

This relates to the following information that appears in a query plan that uses spatial indexes.
1. All of the query samples (in the spatial predicate) are tessellated using the same parameters (density/max cells)
2. The table is joined against all parent/children of the index (the index levels are actually a hierarchy)
3. Index ranges are calculated
4. A join of the tesselated samples on index is done. This results in the number of rows selected by the primary filter.
5. The internal and secondary filters are applied in a separate query iterator (of type Filter) later on in the plan.

The upshot of this is that, not only is the index dimensions used to tessellate the base table rows, the same dimensions as the index are used to tessellate the query sample for the spatial predicate (query sample, in the sproc). So, for example, I'd always thought that spatial indexes over points should specify all-HIGH density and maybe 2 max cells per object. But if your query sample contains large polygons, those same dimensions will be used against the query sample too. So maybe all-HIGH density for point-based indexes isn't a hard and fast rule.

Hopefully this should give you enough information to investigate your indexes and queries using the spatial analysis stored procs. As far as patterns for "which specifications of index is best", more patterns will show up as I investigate more different patterns of spatial data. For now, I'm dividing these into three main categories:

1. Spatial index over table of points, Query sample is polygon.
2. Spatial index over table of polygons, Query sample is point.
3. Spatial index over table of polygons, Query sample is polygon.

With the simplifying premise that linestrings are thin like a point (ie they won't fill a cell), but if you put a STBuffer around them, they're more like polygons.

Happy index hunting.

Categories:
SQL Server 2008

A little more about the query sample that gets fed into the spatial index procs.

Query sample is a singleton geometry or geography. It's not a query. So it's not as straightforward as "here's a spatial query that could use an index, run this query and show me the figures". So this procedure is not going to work any differently for a query that use STIntersects vs STDistance vs... It's going to show you how an index on table A would be utilized given a single operation.

It maps most closely when your query is something like:
select ... from tablea a where a.geog.STIntersects(@g)=1

Is this case @g is your query sample, and Number_Of_Output_Rows is the number of rows the query returns. But what if you have a conditional query or query that has more than one row as output, like

select ...
from tablea a,
     tableb b
where a.id = @someid and b.geog.STIntersects(a.geog)=1

In this case you could specify your query sample like this:
DECLARE @my_query_sample geography = (select geog from tablea a where a.id = @someid)

...And you're looking to an analyze usage on an geography index on tableb. But this will cause an error if "select geog from tablea a where a.id = @someid" returns more than one row. How do I use the sprocs in this case? Immediately you think:
DECLARE @my_query_sample geography;
SELECT @my_query_sample = geog from tablea a where a.id = @someid

or even:
select @my_query_sample = a.geog
from tablea a,
     tableb b
where a.id = @someid and b.geog.STIntersects(a.geog)=1

Let's analyze this way of specifying query sample. If your query sample looks like it's going to represent more than one row, note that the spatial index is always use in (at least I've only seen it used in) a SEEK. So if your query window-producing query represents N rows, you need to choose one row. Hopefully a representative row. This would work if the "top(1) without an order by" row was representative:

DECLARE @my_query_sample geography = (select top(1) geog from tablea a where a.id = @someid)

And realize that you're going to have to do this operation (repsented by the output) N times. Let's make this (hopefully) more clear.

select a.zipcode, b.point
from zipcodes a
     points b
where a.zipcode IN (select zipcode from zipcodes where state = 'OR')
and a.geog.STIntersects(b.geog)

There are 430 zipcodes in Oregon. There are 1000 points in your table. You have an index on zipcodes and one on points. Remember that STIntersects is commutative (a.geog.STIntersects(b.geog) = b.geog.STIntersects(a.geog)) which index is "better"? Notice that the query returns 1 column from zipcodes and 1 column from points, so you can't just STUnion all of the Oregon zipcodes together. You'd get the same points that way, but have no idea which zipcode goes with which point. So choices are:

1. Representative query sample for zipcodes in Oregon, analyze point index, realize that you're going to do this "query" 430 times.
DECLARE @my_query_sample geography = (select top(1) geog from zipcodes a
 where a.zipcode IN (select zipcode from zipcodes where state = 'OR') )
2. Representative query sample for point, analyze zipcode index, realize that you're going to do this "query" 1000 times.
DECLARE @my_query_sample geography = (select top(1) geog from points)

Be aware that doing 430 or 1000 seeks against the spatial index might "scare off" the query optimizer from choosing that plan. You could also declare you're query sample like this:

DECLARE @geog GEOGRAPHY
select @geog = a.geog --- or @geog = b.geog
from zipcodes a
     points b
where a.zipcode IN (select zipcode from zipcodes where state = 'OR')
and a.geog.STIntersects(b.geog)

But that's still going to give you one representative combination point or zipcode as a query sample (ie @geog is still a singleton). If all of your points are in Oregon, it's going to produce a similar analytic output, and if all your points are not in Oregon, it might be better to specify it that way. But then you would have to know how many points in Oregon to deduce how many seeks. And you can't check the ouput against "Number_Of_Rows_Returned" because you don't know which row it actually chose.

If you are just needing points, not zipcode-point combinations you could STUnion together all the zipcodes in Oregon and run the query that way. The spatial analysis procs give you a 1-1 mapping against that query. Or use a state table rather than a zipcode table. Of course this is predicated on the fact that zipcodes don't overlap multiple states.

Hope this answers the question, "what do I use as a query sample to troubleshoot query X and index Y?".

Categories:
SQL Server 2008

Last post discussed the filters. The procs report some raw numbers and some derived numbers. Here's a cheat sheet, although the info is all in the BOL.

N = Number of rows in the table
O = Number of rows output
P = Number of rows selected by primary filter (by the index)
S = Number of rows selected by internal filter (by the index optimizations)

Then,

P-S       = Number_Of_Times_Secondary_Filter_Is_Called (number of times they ran the expensive operation)
(N-P)/N = Percentage_Of_Rows_NotSelected_By_Primary_Filter
S/P       = Percentage_Of_Primary_Filter_Rows_Selected_By_internal_Filter
S/O       = Internal_Filter_Efficiency %
O/P       = Primary_Filter_Efficiency %

<Percentage_Of_Rows_NotSelected_By_Primary_Filter>9.942714746408677e+001
<Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter>9.095871716137836e+001
<Internal_Filter_Efficiency>9.593378913278158e+001
<Primary_Filter_Efficiency>9.481405663596043e+001

So how efficient is the index? Let's change this info into statements, using the values from the previous blog entry:

N = 511650, O = 2779, P = 2931, S = 2666

<Number_Of_Times_Secondary_Filter_Is_Called>265</Number_Of_Times_Secondary_Filter_Is_Called>
<Percentage_Of_Rows_NotSelected_By_Primary_Filter>9.942714746408677e+001
<Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter>9.095871716137836e+001
<Internal_Filter_Efficiency>9.593378913278158e+001
<Primary_Filter_Efficiency>9.481405663596043e+001

1. They had to run STIntersects 265 times, rather than 511650 times if they didn't use the index.
2. 99.4% of the rows were eliminated by using the index
3. The Internal_Filter (internal optimizations) eliminated 90% of the times they would have had to run Intersects, even if the used the index (primary_filter).
4. The Internal_Filter was able to "deduce" 95.9% of the rows in the answer.
5. The Primary_Filter was able to find 94.8% of the rows in the answer.

Pretty good index, huh? And pretty good internal filter. Probably is going to be used.

Let's look at one more, but this time from using a geoemtry index. The index is over in geometry column that contains only points. The query sample is a polygon.

<Base_Table_Rows>11267
<Number_Of_Rows_Selected_By_Primary_Filter>265
<Number_Of_Rows_Selected_By_Internal_Filter>149
<Number_Of_Times_Secondary_Filter_Is_Called>116
<Number_Of_Rows_Output>413
<Percentage_Of_Rows_NotSelected_By_Primary_Filter>9.764799857992367e+001
<Percentage_Of_Primary_Filter_Rows_Selected_By_Internal_Filter>5.622641509433962e+001
<Internal_Filter_Efficiency>3.607748184019371e+001
<Primary_Filter_Efficiency>1.558490566037736e+002

Note that in this case, the internal filter only elimates 56.2% of the rows that the primary filter finds. It only finds 36% of the output rows. But note that in this case, primary filter efficiency = 155%. How can that be? Well, this is geometry and the primary filter didn't find all the rows in the answer. There must be some outside the bounding box. Remember that Primary_Filter_Efficiency is defined as Output_Rows / Primary_Filter rows.

More coming...

Categories:
SQL Server 2008

The spatial index diagnostic procs' most enlightening pieces of information have to do with the filter counts and efficiencies reported at the end of the set of columns or XML document output. But what are these filters exactly and what do the results mean?

The proc output refers to three filters:
1. Primary Filter
2. Internal Filter
3. Secondary Filter

Remember that the spatial indexes are based on tessellation or tiling. Issac has some nice diagrams that show how the tiling works in theory. Please review his indexing posts before reading these and when/if you get confused by the terminology. Depending on the size and type of the spatial object, the object may completely cover a tile (think big polygon), partially cover (intersect) a tile (think point) or not cover a tile. It can also hit the border of a tile. The tiles are referred to in the spatial procs as cells.

Remember also that the point of a spatial index is to reduce the number of Intersect operations (or other operations covered by the spatial index) actually performed by using tiling. I liked Isaac's analogy that "if I want to find roads that intersect Madison WI, I only have to look at roads that intersect Wisconsin". So the index (primary filter) selects all roads that intersect Wisconsin. It can get false positives (roads that intersect Wisconsin but not Madison). The "Secondary Filter" is this case is the Intersects operation itself.

So what's the "internal filter"?  If you think of the primary filter as filtering out 100 polygons out of 10000, they'd still have to run Intersects on the 100 polygons to actually see if they qualify. But say that the index and query sample both *completely* cover a cell. Or index completely covers the cell and query sample touches it. We know that the cell qualifies for Intersect in that case, we don't need to run Intersect on the query sample for that combination. That's the Internal Filter.

So let's look at some sample output. In this case the index was over a geography column that contains only points. The query sample is a polygon. 

<Base_Table_Rows>511650</Base_Table_Rows>
<Number_Of_Rows_Selected_By_Primary_Filter>2931</Number_Of_Rows_Selected_By_Primary_Filter>
<Number_Of_Rows_Selected_By_Internal_Filter>2666</Number_Of_Rows_Selected_By_Internal_Filter>
<Number_Of_Times_Secondary_Filter_Is_Called>265</Number_Of_Times_Secondary_Filter_Is_Called>
<Number_Of_Rows_Output>2779</Number_Of_Rows_Output>

So: 511650 rows in the table. Using the index (primary filter) eliminates all but 2931 of them. The internal filter determines that 2666 are true hits, not false positive. They have to run Intersect 265 times (2931 - 2666) times. And the number of rows returned is 2779, so there were some false positives. Good thing they ran Intersect. BTW. Using the method Filter() instead of Intersect() just means they're returning all the rows selected by primary filter (2931).

Finally, remember if you're using a geography index, there is no bounding box, so no part of the rows or query sample can be outside the index. With geometry, some of the rows or query sample can be outside the bounding box. This means that with geography, Rows_Selected_By_Primary_Filter >= Number_Of_Rows_Output. With geometry, Number_Of_Rows_Output may be > Rows_Selected_By_Primary_Filter as well.

Categories:
SQL Server 2008

I've been looking at the spatial index stored procedures (sp_help_spatial_geography_index_xml and friends) a little harder recently, in an attempt to help answer two questions.

1. If I have a spatial query, why does/doesn't it use my spatial index?
2. If I have a specific spatial query in mind, what are the best values to choose for my spatial index density?

In addition to the diagnostic information presented in the procs, there are a few things to keep in mind when attempting to answer the first question. Currently, the query optimizer is estimating "high" when it comes to using the spatial index, so there are occasions were the spatial index should be used and isn't. Isaac mentioned this in his PDC talk, and mentioned that this would be changed in the next refreshed. As far as I could see, it isn't in SQL Server 2008 CU2.

Also, remember that SQL Server will never use cardinality estimates for a parameter set in a variable by the SET statement unless you use OPTION(RECOMPILE) on the statement. This is a general SQL Server behavior, not a spatial behavior because of when SQL Server creates the query plan. So if you want cardinality estimates for the variable @g in the query "select geog from table where geog.Intersects(@g)=1" you must either pass @g in as a stored procedure parameter or use sp_executesql. Realize that if you use a parameterized query in a client app, the client API stacks will change the parameterized query to invoke sp_executesql, so you're OK here. According to Michael Rys' SQLPASS talk, one spatial-specific behavior is that cardinality won't be used if @g if specified as a constant, that is "select...where geog.Intersects('POINT(1 1)')=1 " either. Finally SQL Server doesn't change the query plan in a parameter value changes, general behavior, not spatial-specific.

Back to the diagnostic stored procs.

The information that the spatial index procs provide consists of three main categories:
1. Information about the index itself
2. Information about the query sample
3. How efficient the query index is when used against the query sample

The query sample is specified as an input parameter to the diagnostic procs, and would be "@g" in the Intersects queries above.

If you've read my previous blog post on the diagnostic procs, read it again, I've revised it. Filters are the subject of the next post.

Categories:
SQL Server 2008

Sorry to appear after a blog drought with theory meta-type blog entries. Too much time at conferences pondering technologies, I guess.

The "relational database bigots" I hang out with don't like LINQ at all. They hope it would shrivel up in a corner and become part of the fad-technology graveyard. Or they're waiting to make big bucks fixing the performance problems they think will ensue.

For the life of me I can't figure out why. I think they're reacting to LINQ to SQL and Entity Framework, layers on top of relational data access that happen to use LINQ to obtain sets of data. Every layer of abstraction is going to make it easier to write less-performant SQL. Granted. But think of it a different way....

You, my SQL-centric friends, have spent much of your career specializing in this declarative query language. SQL. The best language so far (modulo C.J. Date's third manifesto "D" language) to query the relational data structures  at the center of your universe. LINQ is just "SQL for the rest of us(them)". Disguising iterators, everyone is now using SQL (well, at least the SELECT...FROM...WHERE...GROUP BY...ORDER BY part) to query everything and anything. You were right to invest such effort into this query language. With LINQ, it's become universal.

The impression I get (I could be wrong) is that LINQ might become the next OLE DB. OLE DB was supposed to go beyond relational data sources to provide extensions to the column-row paradigm and provide "data access" over non-relational systems like "file system" (OLE DB provider for WebDav), Active Directory, Search Server (fulltext search worked through OLE DB once), Exchange, etc, etc. (Mostly) similar access to every source of data. You wouldn't believe what data sources folks wrote OLE DB providers for in the good-ol days. Anything that could be cast to an API that could implement (or be cast to an implementation of) the Connection/Session/Command/Rowset co-types (or even just IRowset) was fair game.
 
LINQ is a system that provides a *SQL-like* (high-level) language for anything (including collections of objects/.NET types, XML, etc, etc). Anything that has any API that can implement IQueryable/IUpdateable. Yes, LINQ to SQL (or LINQ to Entities) doesn't always generate exactly the most optimal SQL that generates the best query plans. NO argument. But T-SQL/PL-SQL/etc doesn't always generate the most optimal query plans for the storage engine 100% of the time either. Although they will get closer than LINQ-to-database, because they (T-SQL, etc) are more tightly coupled to the relational model and query optimizer than LINQ to SQL is. Every level of abstraction (translation) pays a performance penalty; the goal is to minimize it, while making the abstraction similar. And all of the same kinds of providers that speak the LINQ language are appearing. Just like the OLE DB providers did.

So perhaps, the next version of SQL Server will support not only OLE DB-based Linked Servers, but also "LINQ-ed" servers (bad pun intended) as non-SQL Server rowset sources. Hmmm...

Categories:

I've been following the ADO.NET Entity Framework since its inception. Never did buy the idea about it being "more than an ORM", thought it was just marketing hype. After all, if it looks like an ORM and smells like an ORM, then... But, the more I've been wallowing around in it and thinking about its position in the world in general, the more I'd thought about what could drive someone to the "more than an ORM" conclusion. Here's three things that come to mind.

1. It's built on the ADO.NET provider model. This is actually interesting for two reasons. First, anyone with an ADO.NET data provider can plug in without having to change all the tools. More importantly, it provides the EF an entry to any tool that consumes ADO.NET data providers. Imagine being able to  do reporting against your object model with SSRS or Crystal Reports. Or use your CDS (conceptual data store) as an integration source.  Or integration with Office. Granted that currently these utilities don't know anything about IExtendedDataRecord currently, but Microsoft's Report Builder let's you build a user-friendly object model for reporting...why not use a model that already looks like your business objects?

2. Lack of IExtendedDataRecord support brings us to point two. The ESQL language extensions. You can use ESQL to provide a query over the data source, flattening the hierarchies out when needed. And that's just what most tools need, a connection string and a query string. That produces rowsets (even flat rowsets) to be consumed by tools that don't care a bit about object models. Without waiting for all these tools to add support for an object-oriented conceptual data source.

3. Finally, I came across this one. The writer was writing about PowerShell (specifically, why PowerShell in SQL Server when there already is T-SQL), but here's an interesting quote:

Within Microsoft we have a concept called Common Engineering Criteria (CEC). The goal of CEC is to establish consistency across IT products (Windows Server, Exchange, SQL Server, BizTalk, System Center, etc). [some text deleted] ...One of the latest CEC requirements is support for PowerShell. [more text deleted] ...Transact-SQL is a good scripting language but it’s not model driven. We will continue to invest in and evolve T-SQL. But we’ll also continue to invest in and evolve the models over DDL and DML. The model over DDL is SMO. The model over DML is EDM. (emphasis mine)

While this doesn't say (or even imply) that EDM is part of the Common Engineering Criteria, the idea of it a "model-driven DML" is intruiging.

So perhaps "more than an ORM" isn't just marketing hype. We'll just have to see.

Categories:

End of the conference season 2008. At least for me, my friends in Belgium are looking forward to their SQL Server Day, coming up soon.

I was able to catch up with a lot of friends at TechEd EMEA ITForum, TechEd EMEA for Developers, and SQLPASS, including some folks I hadn't seen in years. Just wanted to reiterate how good it was to see you, how much fun it was to hang out. I heard about of a lot of exciting new upcoming technologies and was able to discuss them with the industry's experts one-on-one.

My contribution was some hopefully useful information that made it easier to understand the zen of the new SQL Server 2008 features, the motivation for migration, and a lot of cool tools for your planning, deploying, and troubleshooting toolbox. All the sample code that I can give away is posted on the SQLskills website. The folks at the SQLPASS preconference even received DVDs. I hope you had as good a time as I did.

Categories:

I've been doing talks and demos on SQL Server 2008 Extended Events for a while now, it's one of my favorite parts of the product. Per session waitstats, SQL stack, built-in system health session, what's not to like?

Before my talk at TechEd EMEA for IT Professionals last week (which went really well), I discovered something that was very helpful in getting things to run more smoothly. Since the betas, I've always used multi-part names, not only for events, actions, and providers, but for predicates too. To be clearer. Lately, I'd been running into a weird problem. Consider the following event session:

create event session errorsession on server
add event sqlserver.error_reported
(
action
(   
sqlserver.session_id, 
sqlserver.sql_text    
)
where sqlserver.error_reported.error = 547
and package0.counter <= 3 
)
add target package0.ring_buffer
go

About one third of the time, running this DDL would produce:

Msg 25706, Level 16, State 8, Line 1
The event attribute or predicate source, "sqlserver.error_reported.error", could not be found.

Then, I'd move the DDL code to another window, execute it again, and it would work. Sometimes starting up the session:

alter event session errorsession on server state=start

would produce the error. And, of course that attribute DID exist, when I got past the weird error, it resultiing session worked like a charm. You can see the attribute/field in sys.dm_xe_object_columns. Hmmmm, scratches head...

After looking at the definition of the system_health session, I decided to try a one-part attribute name. Instead of:

...where sqlserver.error_reported.error = 547 and...
-- how about
...where error = 547 and...

Bingo! Now CREATE and ALTER SESSION work first time and every time. Don't know why this should happen, but at least there's a syntax correction that works. For the folks at the talk, all my event session code will be up on the SQLskills website shortly, under the "Past Conferences" section. With ONE-PART attribute names in predicates.

BTW, a few other things to remember to save you some head-scratching.

When using the ETW target, your SQL Server service account (which DOES have least-privileges...right?) needs to be a member of the Performance Monitor Users and Performance Log Users Windows groups.

When using a file-based target, the service account has to have permissions on the directory where you're writing the file. This sounds obvious, but folks forget that the SQL Server service account isn't all-powerful. Or at least *shouldn't be*, if you're running SQL Server as local system or administrator, FIX IT, using SQL Server Configuration Manager (not Control Panel/Services).

Use an asynchronous target with default event latency rather than a synchronous target, for less overhead, better throughput. However the ETW target is synchronous only.

All of the events, targets, etc are sharable, mix-n-match, across all packages WITH ONE EXCEPTION. The items in the SecAudit package are private. Folks always want to use SecAudit.asynchronous_security_audit_event_log_target, ... you can't. It's for the new Auditing feature only.

All that said, Happy extended eventing!

Categories:
SQL Server 2008

I remember hearing during the SQL Server 2008 beta that there would be diagnostic stored procedures that would produce information about spatial indexes and help in troubleshooting why an index was not being used. In addition, these procedures would help in determining the best spatial index. A few weeks ago I had the good fortune to run across these procedures.

sp_help_spatial_geometry_index and sp_help_spatial_geography_index put out information as columns
sp_help_spatial_geometry_index_xml and sp_help_spatial_geography_index_xml put out XML format

Each procedure takes a table name, index name, and sample query. There's an option for verbose or basic properties. The parameters and output are doc'd in BOL. Rather than repeat the doc, I'd like to walk through a problem and show how you'd interpret the output.

Say that you are trying to decide between an index with high and medium density to run a query similar to the following:

Declare @geometry geometry =
 Geometry::STGeomFromText('POLYGON((500000 300000,500000 340000,540000 340000,540000 300000,500000 300000))', 0)
exec sp_executesql N'select * from spatial_table where LOCATION.STIntersects(@g) = 1', N'@g geometry', @geometry

Create both indexes and run the procedure against each one, using the sample query. There is a lot of very detailed information about the index itself because I was running in verbose mode, including information about the number of cells at each level of tesselation. Before looking at this information, it would be good to review Isaac's excellent blog post on how tesselation works. The most useful info though, was contained in the "Number of rows selected by ... filter" properties and filter efficiency properties. In my case there they were

High density index:
Number_Of_Rows_Selected_By_Primary_Filter:  2931
Number_Of_Rows_Selected_By_Internal_Filter: 2399
Number_Of_Times_Secondary_Filter_Is_Called:  532

Medium density index:

Number_Of_Rows_Selected_By_Primary_Filter:  2931
Number_Of_Rows_Selected_By_Internal_Filter: 2666
Number_Of_Times_Secondary_Filter_Is_Called:  265

Note that "primary filter" in this case is the spatial index in question, the goal is to have the rows selected by the primary or internal filter rather than run the secondary filter (the actual Intersects operation). Note that there can be false positives, the actual number of rows returned was not 2931 but 2779.

The best way to judge the effectiveness of the index, however, is with the property Internal_Filter_Efficiency or Primary_Filter_Efficiency.  Internal_Filter_Efficiency, for example, showed the high density index at 86% and the medium denisty index at 96%. Clearly, the medium density index is a better choice for this query. In addition, looking at the statistics on the index itself shows that the medium density index is a smaller index with less rows and index row per base row. The figures were:

High density:
Total_Primary_Index_Rows: 5273534 rows
Average_Number_Of_Index_Rows_Per_Base_Row : 10

Medium density:
Total_Primary_Index_Rows: 1965865 rows
Average_Number_Of_Index_Rows_Per_Base_Row : 3

There is also a property, Total_Number_Of_ObjectCells_In_Level0_For_QuerySample, that will tell you if the index *can* be used. A non-zero value indicates this and properties that will assist in determining whether to adjust the bounding box on a geometry index.

Hope this will give you a good head start in working with these excellent informational and diagnostic stored procedures.

 

Categories:
SQL Server 2008

Theme design by Nukeation based on Jelle Druyts