Figuring Out When Wait Statistics Were Last Cleared

One of the topics covered during IE2 is Waits and Queues, and this week one of the attendees asked if there was a way to know when wait statistics were last cleared. It turns out there is a post about it on the SQLOS & Cloud Infrastructure Team Blog, and what you do is look for the SQLTRACE_INCREMENTAL_FLUSH_SLEEP wait_type in sys.dm_os_wait_stats. The value for wait_tims_ms is the time since wait stats were last cleared.

Since I talk about clearing wait stats in my DBCC session I decided this information would be useful to include, and then decided to just write a query to figure it out.  The query is below, as is a query to show the last time the instance was restarted. Edit: Note that this query is only for SQL Server 2008R2 and higher.  The SQLTRACE_INCREMENTAL_FLUSH_SLEEP wait_type did not exist in previous versions of SQL Server.  If I can find an alternate wait_type for use, I will post back.

Just to be clear, I am not advocating that you immediately clear wait stats on your server.  In fact, I’d be more interested to know whether wait stats have been cleared since the instance was last restarted.  Let me back up a bit…  If you’re not familiar with wait statistics, I suggest starting with Paul’s post, Wait statistics, or please tell me where it hurts.  Whether you have performance issues on your server or not, I highly recommend looking at wait stats regularly to understand what your “normal” waits are.  If you are having issues, waits can be a starting point to sorting out what’s causing the problem.  But wait stats should not be viewed in a vacuum – always use other counters or information to correlate findings.  Once you have a basic understanding of your wait stats, you may decide to capture them on a regular basis.  But wait stats are cumulative, and will only reset when you restart the instance, or if you clear them manually.  They can be cleared using:

DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);

There are a variety of reasons to clear wait stats, but it shouldn’t be done arbitrarily.  You may clear them because you’ve made a change to the system and want to understand if and how that change affects wait statistics.  You may clear wait stats every morning at 5 AM because backups and other maintenance tasks run at night, but you want to understand performance during business hours when users are the system and therefore decide to filter out (or investigate separately) waits from those tasks.  In the latter case, you probably have a scheduled job to clear the wait stats, and other DBAs may or may not know about this job.  If another DBA decides to clear wait stats for some reason, without telling you, they may affect the wait information you’ve been capturing.  As such, I would always verify with colleagues whether or not wait stats are cleared regularly and this script will give you an idea of whether are being cleared or not.  However, even if they haven’t been cleared since the last restart, I would still perform some due diligence and verify it won’t alter anyone else’s data.  I am interested to hear different ideas for how this information could be used, so feel free to post back or email me.  And if you’re not currently looking at your wait stats regularly, please start!

/* when were wait stats last cleared? */
SELECT
[wait_type],
[wait_time_ms],
DATEADD(ms,-[wait_time_ms],getdate()) AS [Date/TimeCleared],
CASE
WHEN [wait_time_ms] < 1000 THEN CAST([wait_time_ms] AS VARCHAR(15)) + ' ms'
WHEN [wait_time_ms] between 1000 and 60000 THEN CAST(([wait_time_ms]/1000) AS VARCHAR(15)) + ' seconds'
WHEN [wait_time_ms] between 60001 and 3600000 THEN CAST(([wait_time_ms]/60000) AS VARCHAR(15)) + ' minutes'
WHEN [wait_time_ms] between 3600001 and 86400000 THEN CAST(([wait_time_ms]/3600000) AS VARCHAR(15)) + ' hours'
WHEN [wait_time_ms] > 86400000 THEN CAST(([wait_time_ms]/86400000) AS VARCHAR(15)) + ' days'
END [TimeSinceCleared]
FROM [sys].[dm_os_wait_stats]
WHERE [wait_type] = 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP';

/* check SQL Server start time - 2008 and higher */
SELECT
[sqlserver_start_time]
FROM [sys].[dm_os_sys_info];

New Statistics DMF in SQL Server 2008R2 SP2

There is a wonderful new DMF that is available in Service Pack 2 for SQL 2008R2: sys.dm_db_stats_properties. There are many other DMFs and DMVs that include “stats” in their name – sys.dm_db_index_physical_stats, dm_exec_query_stats and sys.dm_os_wait_stats to name a few.But how many of those are about the statistics that describe data distribution in a column? Up until now, not a one, and as someone who really likes statistics, you might start to understand why this is my new favorite DMF.

First, you will need SP2 to utilize this DMF, and note that it is not available in 2012 yet. (Edit: After verifying with Microsoft, it will be available in SP1 of 2012.)  If you haven’t downloaded SP2 for 2008R2 yet, you can get it here. Once you have that installed, we can review what information we can gather from sys.dm_db_stats_properties. The basic syntax is as follows:

sys.dm_db_stats_properties
(object_id, stats_id)

Ok, so we need the ID of the object and the statistic. I’m using a copy of an AdventureWorks database, and for this example I’ll use the Sales.SalesOrderDetail table. We can get the IDs we need with a query against sys.objects and sys.stats:

SELECT sch.name +
‘.’ + so.name, so.object_id, ss.name, ss.stats_id

FROM sys.stats ss

JOIN sys.objects so ON ss.object_id = so.object_id

JOIN sys.schemas sch ON so.schema_id = sch.schema_id

WHERE so.name = N’SalesOrderDetail’;

Let’s take object_id and stats_id and plug them into the DMF:

SELECT *

FROM sys.dm_db_stats_properties(642101328,1);

And my output:

You should recognize some of the information you could previously only find in DBCC SHOW_STATISTICS:

last_udpated: the
last time the statistic was updated

rows: the number
of rows in the table at the time of the update, or if there is a filter
applied, this number will be the number of rows that meet the filter criteria

rows_sampled: the
number of rows actually sampled

steps: the number
of steps in the histogram

unfiltered_rows: the
number of rows in the table at the time of the update

As with DBCC SHOW_STATISTICS, the only way you know whether a full scan was performed when the statistic was updated is to compare rows against rows_sampled.

Then, there’s one more column in the table that could provide incredibly useful for some people:

modification_counter:
number of modifications for the column which leads the statistic, since the
last update

This means you no longer have to go to sys.sysrscols to monitor changes! This is excellent if you want to do targeted optimizations for volatile tables. I’ve talked about this many times with clients, and it has always been a challenge to easily find the statistics most affected by inserts, updates and deletes. You can monitor index usage using sys.dm_db_index_usage_stats, but using the modification_counter in sys.dm_db_stats_properties really provides the ability to drill down into specific statistics and target updates appropriately. In addition, sys.dm_db_stats_properties allows you to quickly find statistics that are outdated. Previously, the only way to do this was to roll through DBCC SHOW_STATISTICS for every statistic, logging the output to a table. With sys.dm_db_stats_properties
we can capture this information with simpler methods. Here’s a sample query you can use to find the statistics that haven’t been updated in over 30 days:

SELECT

   sch.name + ‘.’ + so.name AS
“Table”,

   ss.name AS
“Statistic”
,

     CASE

           WHEN ss.auto_Created = 0 AND ss.user_created = 0 THEN ‘Index Statistic’

           WHEN ss.auto_created = 0 AND ss.user_created = 1 THEN ‘User Created’

           WHEN ss.auto_created = 1 AND ss.user_created = 0 THEN ‘Auto Created’

           WHEN ss.AUTO_created = 1 AND ss.user_created = 1 THEN ‘Not Possible?’

     END AS
“Statistic Type”,

   CASE

           WHEN ss.has_filter = 1 THEN ‘Filtered Index’

           WHEN ss.has_filter = 0 THEN ‘No Filter’

     END AS
“Filtered?”,

   CASE

           WHEN ss.filter_definition
IS NULL THEN

           WHEN ss.filter_definition
IS NOT NULL THEN ss.filter_definition

     END AS “Filter
Definition”,

   sp.last_updated AS
“Stats Last Updated”,

   sp.rows AS “Rows”,

   sp.rows_sampled AS
“Rows Sampled”,

   sp.unfiltered_rows AS
“Unfiltered Rows”,

     sp.modification_counter AS
“Row Modifications”,

     sp.steps AS
“Histogram Steps”

FROM sys.stats ss

JOIN sys.objects so ON ss.object_id = so.object_id

JOIN sys.schemas sch ON so.schema_id = sch.schema_id

OUTER APPLY sys.dm_db_stats_properties(so.object_id, ss.stats_id) AS sp

WHERE so.TYPE = ‘U’

AND sp.last_updated <
getdate() 30

ORDER BY sp.last_updated
DESC
;

You can take this query one step further and modify it to show statistics that have a row_modfications value above a specific threshold (WHERE sp.modification_counter > 10000) or statistics that don’t exist at all (WHERE sp.last_updated IS NULL).At its core, this DMF can make statistics management much easier, and in a future post I will share a script that can be used to update statistics based on different factors such as last updated date and number of modifications. Until then, have fun playing with this in your environment!