There is a wonderful new DMF that is available in Service Pack 2 for SQL 2008R2: sys.dm_db_stats_properties.  There are many other DMFs and DMVs that include “stats” in their name – sys.dm_db_index_physical_stats, dm_exec_query_stats and sys.dm_os_wait_stats to name a few.But how many of those are about the statistics that describe data distribution in a column?  Up until now, not a one, and as someone who really likes statistics, you might start to understand why this is my new favorite DMF.

First, you will need SP2 to utilize this DMF, and note that it is not available in 2012 yet.  (Edit: After verifying with Microsoft, it will be available in SP1 of 2012.)  If you haven’t downloaded SP2 for 2008R2 yet, you can get it here.  Once you have that installed, we can review what information we can gather from sys.dm_db_stats_properties.  The basic syntax is as follows:

sys.dm_db_stats_properties
(object_id, stats_id)

Ok, so we need the ID of the object and the statistic.  I’m using a copy of an AdventureWorks database, and for this example I’ll use the Sales.SalesOrderDetail table.  We can get the IDs we need with a query against sys.objects and sys.stats:

SELECT sch.name +
‘.’ + so.name, so.object_id, ss.name, ss.stats_id

FROM sys.stats ss

JOIN sys.objects so ON ss.object_id = so.object_id

JOIN sys.schemas sch ON so.schema_id = sch.schema_id

WHERE so.name =  N’SalesOrderDetail’;

Let’s take object_id and stats_id and plug them into the DMF:

SELECT *

FROM sys.dm_db_stats_properties(642101328,1);

And my output:

stats New Statistics DMF in SQL Server 2008R2 SP2

You should recognize some of the information you could previously only find in DBCC SHOW_STATISTICS:

last_udpated: the
last time the statistic was updated

rows: the number
of rows in the table at the time of the update, or if there is a filter
applied, this number will be the number of rows that meet the filter criteria

rows_sampled: the
number of rows actually sampled

steps: the number
of steps in the histogram

unfiltered_rows: the
number of rows in the table at the time of the update

As with DBCC SHOW_STATISTICS, the only way you know whether a full scan was performed when the statistic was updated is to compare rows against rows_sampled. 

Then, there’s one more column in the table that could provide incredibly useful for some people:

modification_counter:
number of modifications for the column which leads the statistic, since the
last update

This means you no longer have to go to sys.sysrscols to monitor changes!  This is excellent if you want to do targeted optimizations for volatile tables.  I’ve talked about this many times with clients, and it has always been a challenge to easily find the statistics most affected by inserts, updates and deletes.  You can monitor index usage using  sys.dm_db_index_usage_stats, but using the modification_counter in sys.dm_db_stats_properties really provides the ability to drill down into specific statistics and target updates appropriately.  In addition, sys.dm_db_stats_properties allows you to quickly find statistics that are outdated.  Previously, the only way to do this was to roll through DBCC SHOW_STATISTICS for every statistic, logging the output to a table.  With sys.dm_db_stats_properties
we can capture this information with simpler methods.  Here’s a sample query you can use to find the statistics that haven’t been updated in over 30 days:

SELECT

    sch.name + ‘.’ + so.name AS
“Table”,

    ss.name AS
“Statistic”
,

      CASE

            WHEN ss.auto_Created = 0 AND ss.user_created = 0 THEN ‘Index Statistic’

            WHEN ss.auto_created = 0 AND ss.user_created = 1 THEN ‘User Created’

            WHEN ss.auto_created = 1 AND ss.user_created = 0 THEN ‘Auto Created’

            WHEN ss.AUTO_created = 1 AND ss.user_created = 1 THEN ‘Not Possible?’

      END AS
“Statistic Type”,

    CASE

            WHEN ss.has_filter = 1 THEN ‘Filtered Index’

            WHEN ss.has_filter = 0 THEN ‘No Filter’

      END AS
“Filtered?”,

    CASE

            WHEN ss.filter_definition
IS NULL THEN

            WHEN ss.filter_definition
IS NOT NULL THEN ss.filter_definition

      END AS “Filter
Definition”,

    sp.last_updated AS
“Stats Last Updated”,

    sp.rows AS “Rows”,

    sp.rows_sampled AS
“Rows Sampled”,

    sp.unfiltered_rows AS
“Unfiltered Rows”,

      sp.modification_counter AS
“Row Modifications”,

      sp.steps AS
“Histogram Steps”

FROM sys.stats ss

JOIN sys.objects so ON ss.object_id = so.object_id

JOIN sys.schemas sch ON so.schema_id = sch.schema_id

OUTER APPLY sys.dm_db_stats_properties(so.object_id, ss.stats_id) AS sp

WHERE so.TYPE = ‘U’

AND sp.last_updated <
getdate() - 30

ORDER BY sp.last_updated
DESC
;

You can take this query one step further and modify it to show statistics that have a row_modfications value above a specific threshold (WHERE sp.modification_counter > 10000) or statistics that don’t exist at all (WHERE sp.last_updated IS NULL).At its core, this DMF can make statistics management much easier, and in a future post I will share a script that can be used to update statistics based on different factors such as last updated date and number of modifications.  Until then, have fun playing with this in your environment!