RML Utilities and SQL Server 2012

RML Utilities is a free set of tools created by Microsoft and used by the SQL Server support team.  It’s available for download, and if you work with SQL Trace data at all, I’d recommend giving it a look.  I know, I know, Trace is deprecated in SQL Server 2012.  But not everyone is running 2012, so until then, trace files will still exist.

I first heard about RML Utilities years ago from Andrew Kelly ( b | t ) and have used it to analyze trace data from customers, which is really just a fraction of what it can do.  Today I’ve been working with it to compare two trace files.  Yes, that’s right…compare two trace files.  Imagine you run a load test in your production environment, then run that same test in your development environment, which might have different hardware or updated code.  You can capture PerfMon metrics to compare Memory, CPU, and I/O…but what about the performance of individual queries?  You can get that from the DMVs, but it’s really nice if you can look at the information side by side.  And I don’t mean two instances of Profiler open, I mean really compare the data.  RML Utilities will do that for you!

But, before I could get RML Utilities to do that, I had to get it to process a trace file, and this was my issue today.  I had two VMs: one with 2008R2 and 2012 installed, and one with only 2012 installed.  The readtrace.exe file is used to process trace files, and the command requires the trace file (input), an output directory, a SQL Server instance, and a database.
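For reference, here is the command I ran, reconstructed from the parameters echoed in the log below (your paths, instance, and database names will differ):

readtrace -IC:\PerfLogs\Trace\File_RML_2008R2.trc -oC:\SQLskills\RMLOutput -SWIN2008R2PS\SQL2012 -dBaselineData

I had everything set up, but the process would fail with this set of messages: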

09/27/12 15:49:44.262 [0X00000AA4] Readtrace a SQL Server trace processing utility.
Version 9.01.0109 built for x64.
Copyright (c) Microsoft Corporation 1997-2008. All rights reserved.
…extra text removed…
09/27/12 15:49:44.263 [0X00000AA4] -IC:\PerfLogs\Trace\File_RML_2008R2.trc
09/27/12 15:49:44.263 [0X00000AA4] -oC:\SQLskills\RMLOutput
09/27/12 15:49:44.263 [0X00000AA4] -SWIN2008R2PS\SQL2012
09/27/12 15:49:44.263 [0X00000AA4] -dBaselineData
09/27/12 15:49:44.263 [0X00000AA4] Using language id (LCID): 1024 [English_United States.1252] for character formatting with NLS: 0x00060101 and Defined: 0x00060101
09/27/12 15:49:44.263 [0X00000AA4] Attempting to cleanup existing RML files from previous execution
09/27/12 15:49:44.264 [0X00000AA4] Using extended RowsetFastload synchronization
09/27/12 15:49:44.264 [0X00000AA4] Establishing initial database connection:
09/27/12 15:49:44.264 [0X00000AA4] Server: WIN2008R2PS\SQL2012
09/27/12 15:49:44.264 [0X00000AA4] Database: BaselineData
09/27/12 15:49:44.265 [0X00000AA4] Authentication: Windows
09/27/12 15:49:44.266 [0X00000AA4] Unable to connect to the specified server.
09/27/12 15:49:44.266 [0X00000AA4] ERROR: Performance analysis failed to initialize.  See previous errors and correct the problem before retrying.
09/27/12 15:49:44.266 [0X00000AA4] *******************************************************************************
* ReadTrace encountered one or more ERRORS. An error condition typically      *
* stops processing early and the ReadTrace output may be unusable.            *
* Review the log file for details.                                            *
*******************************************************************************
09/27/12 15:49:44.266 [0X00000AA4] ***** ReadTrace exit code: -9

My first thought was that it was having an issue with SQL Server 2012.  The documentation does not list SQL Server 2012 as a supported version.  So I tried it on my local machine against a 2012 instance, and it worked.  I reviewed the error message: “Unable to connect to the specified server.”  This made no sense, so I verified that I could connect with sqlcmd.  Then I tried it against my other VM, thinking maybe it was something with VMware.  That worked.  Then I started comparing the output.  From the successful process, I had the following:

09/27/12 15:59:07.349 [0X00000CD4] Readtrace a SQL Server trace processing utility.
Version 9.01.0109 built for x64.
Copyright (c) Microsoft Corporation 1997-2008. All rights reserved.
…extra text removed…
09/27/12 15:59:07.350 [0X00000CD4] -IC:\PerfLogs\Trace\File_RML_2008R2.trc
09/27/12 15:59:07.350 [0X00000CD4] -oC:\SQLskills\RMLOutput
09/27/12 15:59:07.350 [0X00000CD4] -SWIN2008R2-1\SQL2012
09/27/12 15:59:07.350 [0X00000CD4] -dBaselineData
09/27/12 15:59:07.350 [0X00000CD4] Using language id (LCID): 1024 [English_United States.1252] for character formatting with NLS: 0x00060101 and Defined: 0x00060101
09/27/12 15:59:07.350 [0X00000CD4] Attempting to cleanup existing RML files from previous execution
09/27/12 15:59:07.351 [0X00000CD4] Using extended RowsetFastload synchronization
09/27/12 15:59:07.351 [0X00000CD4] Establishing initial database connection:
09/27/12 15:59:07.351 [0X00000CD4] Server: WIN2008R2-1\SQL2012
09/27/12 15:59:07.351 [0X00000CD4] Database: BaselineData
09/27/12 15:59:07.351 [0X00000CD4] Authentication: Windows
09/27/12 15:59:07.390 [0X00000CD4] Using SQL Client version 10
09/27/12 15:59:07.390 [0X00000CD4] Creating or clearing the performance database
09/27/12 15:59:07.892 [0X00000CD4] Processing file: C:\PerfLogs\Trace\File_RML_2008R2.trc (SQL 2008)
09/27/12 15:59:07.892 [0X00000CD4] Validating core events exist
09/27/12 15:59:07.892 [0X00000CD4] Validating necessary events exist for analysis
…more extra text removed…

My issue was that RML Utilities uses version 10 of the SQL Server Native Client.  On the VM where processing failed, I only had SQL 2012 installed, which includes version 11 of the client.  Once I downloaded and installed the 2008R2 Native Client (from here, you just need sqlncli_amd64.msi for an x64 machine), everything ran fine.  Thus, even though SQL 2012 is not listed as a supported version for RML Utilities, it looks like it will run as long as you have version 10 of the client installed.  And as for comparing those files?  That’s for another post!

A Poll: Recommendations for New Speakers

This morning I wrote a post on my original blog about presenting. I’m cross-posting here because I want to ask other speakers, and my SQLskills colleagues, for the number one piece of advice they provide to new speakers. Write a quick post, or leave a comment. New presenters want to hear from you, and I know that I can always get better. Don’t overthink it; just share what comes to mind first. Go!

Customizing the Default Counters for Performance Monitor

I am a huge fan of Performance Monitor (PerfMon). Yes, I know, that’s a geeky statement, but I don’t care. There is such a wealth of information available from PerfMon; you can use it to look at performance in real time, or to capture metrics about performance over time.  And the functionality is built into Windows. It’s there no matter what Windows server you’re working on – and when you work on a lot of different servers, having a tool you can consistently rely on is extremely useful.

But one challenge I had with PerfMon that took me a while to figure out was how to change the default settings. When I start PerfMon, it only shows the % Processor Time counter. Now that’s a useful counter, especially when there’s a performance issue going on, but I also want to look at other counters such as memory utilization and disk latency. When the system is having a problem, I dislike spending an extra minute or two to add all the counters I want to see. I want them to just be there when I open PerfMon.

For those of you running Windows 2008 and higher (and Windows Vista and higher for workstations), there is an easy solution.  If you’re on Windows XP or Windows 2003, I have a solution for you, too, it’s just a few extra steps.

Windows 2008+ and Windows Vista+

On your local machine or server, select Start | Run and then open Performance Monitor with the /sys command-line option: perfmon /sys

This opens Performance Monitor in a stand-alone mode (if you enter just perfmon, you get additional options such as Data Collector Sets and Reports).  Within PerfMon, add the counters you want to monitor.  You can either click on the green plus (+) to add counters, or right-click in the graph and select Add Counters…  Once you have added the counters, close PerfMon.  Trust me.  Just close it.

Go back to Start | Run and enter perfmon /sys again.  The counters you added should be selected.  Running Process Monitor on my machine showed that PerfMon saved a configuration file (Perfmon.PerfmonCfg) in Users\<username>\AppData\Local.  The location may vary depending on OS or roaming profiles (if you cannot find it, simply run Process Monitor and filter on perfmon.exe to find where it writes the file).

You can take this one step further by creating multiple .PerfmonCfg files – and they can be stored anywhere.  Once you have selected the counters in PerfMon, select File | Save Settings As… and create a new .PerfmonCfg file with the appropriate name, either locally or in a share.  Modify the counters as needed, then save the configuration as a different .PerfmonCfg file.  When you want to launch PerfMon for a specific .PerfmonCfg file, just double-click on the file.  A great benefit of the different files is that you can share them between servers.  However, take note of how you add the counters.  If you add a counter for a specific drive letter that doesn’t exist on every server, the counter will appear in the list, but no data will appear in the graph.

Windows 2003 and Windows XP

If you’re still running Windows XP or Windows 2003, don’t despair, I have another method.

On your local machine or server, open Performance Monitor (Start | Run | perfmon) and add the counters you want to monitor.  Once you have the counters added, right-click in the graphing area, select Save As…, and save it as a .html file. Then close PerfMon.

Start up PerfMon again; you should see that you only have the % Processor Time counter. Open the .html file you just saved in a text editor. Highlight everything (CTRL + A) and then copy it (CTRL + C). Switch back to PerfMon, click in the graph area, and paste (CTRL + V); you can also click on the paste icon in the toolbar. The counters will immediately show up.  As with the configuration file, you can create multiple .html files to save locally or share.

Many routes to the same goal

Having the ability to open PerfMon with specific counters already selected is one of those little things that just makes my life easier.  Because there are usually many different ways to accomplish the same task in Windows (think of how you open Windows Explorer – do you right-click on the Start button and select Open Windows Explorer, or do you use the Windows key + E, or something else?), I’m interested to know if there are other methods people have used to get counters to show up by default.  Feel free to leave a comment or send me an email if you know of other options.  I hope this helps!

Figuring Out When Wait Statistics Were Last Cleared

One of the topics covered during IE2 is Waits and Queues, and this week one of the attendees asked if there was a way to know when wait statistics were last cleared. It turns out there is a post about it on the SQLOS & Cloud Infrastructure Team Blog: you look for the SQLTRACE_INCREMENTAL_FLUSH_SLEEP wait_type in sys.dm_os_wait_stats, and its wait_time_ms value is the time since wait stats were last cleared.

Since I talk about clearing wait stats in my DBCC session, I decided this information would be useful to include, and wrote a query to figure it out.  The query is below, as is a query to show the last time the instance was restarted.  Edit: Note that this query is only for SQL Server 2008R2 and higher.  The SQLTRACE_INCREMENTAL_FLUSH_SLEEP wait_type did not exist in previous versions of SQL Server.  If I can find an alternate wait_type to use, I will post back.

Just to be clear, I am not advocating that you immediately clear wait stats on your server.  In fact, I’d be more interested to know whether wait stats have been cleared since the instance was last restarted.  Let me back up a bit…  If you’re not familiar with wait statistics, I suggest starting with Paul’s post, Wait statistics, or please tell me where it hurts.  Whether you have performance issues on your server or not, I highly recommend looking at wait stats regularly to understand what your “normal” waits are.  If you are having issues, waits can be a starting point to sorting out what’s causing the problem.  But wait stats should not be viewed in a vacuum – always use other counters or information to correlate findings.  Once you have a basic understanding of your wait stats, you may decide to capture them on a regular basis.  But wait stats are cumulative, and will only reset when you restart the instance, or if you clear them manually.  They can be cleared using:

DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);

There are a variety of reasons to clear wait stats, but it shouldn’t be done arbitrarily.  You may clear them because you’ve made a change to the system and want to understand if and how that change affects wait statistics.  You may clear wait stats every morning at 5 AM because backups and other maintenance tasks run at night, but you want to understand performance during business hours when users are on the system, and therefore decide to filter out (or investigate separately) waits from those tasks.  In the latter case, you probably have a scheduled job to clear the wait stats, and other DBAs may or may not know about this job.  If another DBA decides to clear wait stats for some reason, without telling you, they may affect the wait information you’ve been capturing.  As such, I would always verify with colleagues whether or not wait stats are cleared regularly, and this script will give you an idea of whether they are being cleared.  However, even if they haven’t been cleared since the last restart, I would still perform some due diligence and verify that clearing them won’t alter anyone else’s data.  I am interested to hear different ideas for how this information could be used, so feel free to post back or email me.  And if you’re not currently looking at your wait stats regularly, please start!
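As an aside, if you do decide to capture wait stats on a regular basis, a minimal sketch of the idea might look like this (the table name and column list are my own, and you would schedule the INSERT however suits your environment, e.g. via a SQL Agent job):

/* dbo.WaitStatsSnapshot is a hypothetical table for storing periodic snapshots */
CREATE TABLE [dbo].[WaitStatsSnapshot]
(
    [capture_date] DATETIME NOT NULL DEFAULT GETDATE(),
    [wait_type] NVARCHAR(60) NOT NULL,
    [waiting_tasks_count] BIGINT NOT NULL,
    [wait_time_ms] BIGINT NOT NULL,
    [signal_wait_time_ms] BIGINT NOT NULL
);
GO

/* capture the current cumulative values on a schedule */
INSERT INTO [dbo].[WaitStatsSnapshot]
    ([wait_type], [waiting_tasks_count], [wait_time_ms], [signal_wait_time_ms])
SELECT [wait_type], [waiting_tasks_count], [wait_time_ms], [signal_wait_time_ms]
FROM [sys].[dm_os_wait_stats];

Comparing the deltas between snapshots shows what accumulated in each interval, and a sudden drop in the cumulative values is a telltale sign that someone cleared them.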

/* when were wait stats last cleared? */
SELECT
    [wait_type],
    [wait_time_ms],
    /* use seconds rather than ms to avoid int overflow when the instance has been up more than ~24 days */
    DATEADD(SECOND, -[wait_time_ms]/1000, GETDATE()) AS [Date/TimeCleared],
    CASE
        WHEN [wait_time_ms] < 1000 THEN CAST([wait_time_ms] AS VARCHAR(15)) + ' ms'
        WHEN [wait_time_ms] BETWEEN 1000 AND 60000 THEN CAST(([wait_time_ms]/1000) AS VARCHAR(15)) + ' seconds'
        WHEN [wait_time_ms] BETWEEN 60001 AND 3600000 THEN CAST(([wait_time_ms]/60000) AS VARCHAR(15)) + ' minutes'
        WHEN [wait_time_ms] BETWEEN 3600001 AND 86400000 THEN CAST(([wait_time_ms]/3600000) AS VARCHAR(15)) + ' hours'
        WHEN [wait_time_ms] > 86400000 THEN CAST(([wait_time_ms]/86400000) AS VARCHAR(15)) + ' days'
    END AS [TimeSinceCleared]
FROM [sys].[dm_os_wait_stats]
WHERE [wait_type] = 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP';

/* check SQL Server start time - 2008 and higher */
SELECT [sqlserver_start_time]
FROM [sys].[dm_os_sys_info];
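Putting the two together, here is one way (my own variation on the queries above, not part of any Microsoft script) to check whether wait stats have been cleared since the instance last started:

/* have wait stats been cleared since the instance started? */
SELECT
    [si].[sqlserver_start_time],
    DATEADD(SECOND, -[ws].[wait_time_ms]/1000, GETDATE()) AS [Date/TimeCleared],
    CASE
        /* allow a few minutes of slack, since the wait only starts accruing after startup */
        WHEN DATEADD(SECOND, -[ws].[wait_time_ms]/1000, GETDATE()) > DATEADD(MINUTE, 5, [si].[sqlserver_start_time])
            THEN 'Cleared since startup'
        ELSE 'Not cleared since startup'
    END AS [ClearedStatus]
FROM [sys].[dm_os_wait_stats] AS [ws]
CROSS JOIN [sys].[dm_os_sys_info] AS [si]
WHERE [ws].[wait_type] = 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP';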

New Statistics DMF in SQL Server 2008R2 SP2

There is a wonderful new DMF available in Service Pack 2 for SQL 2008R2: sys.dm_db_stats_properties.  There are many other DMFs and DMVs that include “stats” in their name – sys.dm_db_index_physical_stats, sys.dm_exec_query_stats and sys.dm_os_wait_stats to name a few.  But how many of those are about the statistics that describe data distribution in a column?  Up until now, not a one, and as someone who really likes statistics, I think you can start to understand why this is my new favorite DMF.

First, you will need SP2 to utilize this DMF, and note that it is not available in 2012 yet.  (Edit: After verifying with Microsoft, it will be available in SP1 of 2012.)  If you haven’t downloaded SP2 for 2008R2 yet, you can get it here.  Once you have that installed, we can review what information we can gather from sys.dm_db_stats_properties.  The basic syntax is as follows:

sys.dm_db_stats_properties (object_id, stats_id)

Ok, so we need the ID of the object and the statistic.  I’m using a copy of an AdventureWorks database, and for this example I’ll use the Sales.SalesOrderDetail table.  We can get the IDs we need with a query against sys.objects and sys.stats:

SELECT sch.name + '.' + so.name, so.object_id, ss.name, ss.stats_id
FROM sys.stats ss
JOIN sys.objects so ON ss.object_id = so.object_id
JOIN sys.schemas sch ON so.schema_id = sch.schema_id
WHERE so.name = N'SalesOrderDetail';

Let’s take object_id and stats_id and plug them into the DMF:

SELECT *
FROM sys.dm_db_stats_properties(642101328, 1);

And my output:

[Screenshot: output of sys.dm_db_stats_properties]

You should recognize some of the information you could previously only find in DBCC SHOW_STATISTICS:

last_updated: the last time the statistic was updated

rows: the number of rows in the table at the time of the update, or, if there is a filter applied, the number of rows that meet the filter criteria

rows_sampled: the number of rows actually sampled

steps: the number of steps in the histogram

unfiltered_rows: the number of rows in the table at the time of the update

As with DBCC SHOW_STATISTICS, the only way you know whether a full scan was performed when the statistic was updated is to compare rows against rows_sampled. 
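For example, a quick way to eyeball the sampling rate (a convenience calculation of my own, not a column the DMF provides) is:

SELECT
    [rows],
    [rows_sampled],
    /* NULLIF avoids a divide-by-zero error for an empty table */
    CAST(100.0 * [rows_sampled] / NULLIF([rows], 0) AS DECIMAL(5,2)) AS [Sample Percent]
FROM sys.dm_db_stats_properties(642101328, 1);

A Sample Percent of 100.00 indicates the statistic was last updated with a full scan.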

Then there’s one more column in the output that could prove incredibly useful for some people:

modification_counter: the number of modifications for the column which leads the statistic, since the last update

This means you no longer have to go to sys.sysrscols to monitor changes!  This is excellent if you want to do targeted optimizations for volatile tables.  I’ve talked about this many times with clients, and it has always been a challenge to easily find the statistics most affected by inserts, updates, and deletes.  You can monitor index usage with sys.dm_db_index_usage_stats, but the modification_counter in sys.dm_db_stats_properties provides the ability to drill down into specific statistics and target updates appropriately.  In addition, sys.dm_db_stats_properties allows you to quickly find statistics that are outdated.  Previously, the only way to do this was to roll through DBCC SHOW_STATISTICS for every statistic, logging the output to a table.  With sys.dm_db_stats_properties we can capture this information with simpler methods.  Here’s a sample query you can use to find the statistics that haven’t been updated in over 30 days:

SELECT
    sch.name + '.' + so.name AS [Table],
    ss.name AS [Statistic],
    CASE
        WHEN ss.auto_created = 0 AND ss.user_created = 0 THEN 'Index Statistic'
        WHEN ss.auto_created = 0 AND ss.user_created = 1 THEN 'User Created'
        WHEN ss.auto_created = 1 AND ss.user_created = 0 THEN 'Auto Created'
        WHEN ss.auto_created = 1 AND ss.user_created = 1 THEN 'Not Possible?'
    END AS [Statistic Type],
    CASE
        WHEN ss.has_filter = 1 THEN 'Filtered Index'
        WHEN ss.has_filter = 0 THEN 'No Filter'
    END AS [Filtered?],
    CASE
        WHEN ss.filter_definition IS NULL THEN N'N/A'
        ELSE ss.filter_definition
    END AS [Filter Definition],
    sp.last_updated AS [Stats Last Updated],
    sp.[rows] AS [Rows],
    sp.rows_sampled AS [Rows Sampled],
    sp.unfiltered_rows AS [Unfiltered Rows],
    sp.modification_counter AS [Row Modifications],
    sp.steps AS [Histogram Steps]
FROM sys.stats ss
JOIN sys.objects so ON ss.object_id = so.object_id
JOIN sys.schemas sch ON so.schema_id = sch.schema_id
OUTER APPLY sys.dm_db_stats_properties(so.object_id, ss.stats_id) AS sp
WHERE so.type = 'U'
AND sp.last_updated < GETDATE() - 30
ORDER BY sp.last_updated DESC;

You can take this query one step further and modify it to show statistics that have a modification_counter value above a specific threshold (WHERE sp.modification_counter > 10000) or statistics that have never been updated at all (WHERE sp.last_updated IS NULL).  At its core, this DMF can make statistics management much easier, and in a future post I will share a script that can be used to update statistics based on different factors, such as last updated date and number of modifications.  Until then, have fun playing with this in your environment!