A quick and somewhat nuanced tip today… When evaluating sys.dm_io_virtual_file_stats against collected perfmon counter data, be careful not to erroneously invalidate the latency findings by measuring and comparing against the averages of already-averaged collected performance monitor data (e.g. Avg. Disk sec/Write, Avg. Disk sec/Read ).  I’ve seen scenarios where there is true I/O latency that has been identified in dm_io_virtual_file_stats, but when the DBA goes to other teams to discuss, they lose the argument due to bad comparisons.

To illustrate the point, I created a new database on one of my older, non-RAID USB drives:

CREATE DATABASE [SampleIOHammering]
CONTAINMENT = NONE
ON  PRIMARY
( NAME = N'SampleIOHammering',
FILENAME = N'F:\SampleIOHammering.mdf' ,
SIZE = 8388608KB , FILEGROWTH = 1048576KB )
LOG ON
( NAME = N'SampleIOHammering_log',
FILENAME = N'F:\SampleIOHammering_log.ldf' ,
SIZE = 1048576KB ,
FILEGROWTH = 1048576KB );
GO

ALTER DATABASE [SampleIOHammering] SET RECOVERY SIMPLE;
GO

I’ve intentionally used slow disk and the database creation operation took 7 minutes and 5 seconds (no IFI being used).

Immediately after creating the database, I executed the following query (concentrating on write activity):

SELECT  file_id,
io_stall_write_ms,
num_of_writes
FROM    sys.dm_io_virtual_file_stats(DB_ID('SampleIOHammering'), NULL);

This shows the following results – which don’t truly represent the full lifecycle of the file creation and the associated latency:

file_id io_stall_write_ms num_of_writes
1 76 10
2 25 15

How can I tell that the accumulated stats are representative of the actual database creation?  I did so through associated perfmon counters which were being collected via a data collector set concurrently… Here were the averages for the drive over the database creation period (and this is the average over the averages):

image thumb Avoid false negatives when comparing sys.dm io virtual file stats data to perfmon counter data

So we’re definitely seeing latency that is beyond the recommended number represented in average disk seconds per write, and if anything, that latency isn’t yet fully reflected in the virtual file stats DMV (yet).

Next I created a new table and loaded it with 1.6 million rows from another database:

USE [SampleIOHammering];
GO

CREATE TABLE [dbo].[charge]
(
[charge_no] INT NOT NULL,
[member_no] INT NOT NULL,
[provider_no] INT NOT NULL,
[category_no] INT NOT NULL,
[charge_dt] 
2014-12-20, 31
NOT NULL, [charge_amt] [money] NOT NULL, [statement_no] INT NOT NULL, [charge_code] CHAR(2) NOT NULL ); GO INSERT  [dbo].[charge] SELECT  1.[charge_no], 1.[member_no], 1.[provider_no], 1.[category_no], 1.[charge_dt], 1.[charge_amt], 1.[statement_no], 1.[charge_code] FROM    [Credit].[dbo].[charge] AS c; GO

This shows the following virtual stats results:

file_id io_stall_write_ms num_of_writes
1 8917 79
2 575633 3783

Unlike the database creation, we’re now seeing some real stat accumulations. If we divide the IO stall write ms by the number of writes, we see about 152 ms as the average latency for the transaction log file (file 2).

And now if we look at perfmon counters, we see the following associated latency data averages across the collection time period for my INSERT:

image thumb Avoid false negatives when comparing sys.dm io virtual file stats data to perfmon counter data

Focusing on Avg. Disk sec/Write – we see 52 milliseconds certainly isn’t ideal – but at this point you may have the storage or Windows admin say something like “This doesn’t match your average of 152 milliseconds…  Your DMV must be wrong.”

But what if you switched from using an average of  the perfmon data to the maximum?

image thumb Avoid false negatives when comparing sys.dm io virtual file stats data to perfmon counter data

Making this change shifts the numbers as follows:

image thumb Avoid false negatives when comparing sys.dm io virtual file stats data to perfmon counter data

Notice now that the maximum average disk seconds per write is 156 milliseconds at peak – which is more in the ballpark of our 152 millisecond average we saw from a virtual file stats perspective.  Granted it is the maximum measurement – but it brings us back into alignment with the two different data sources. Again, this is a nuanced point – but I’ve noticed that sometimes non-SQL folks are doubtful about the SQL perspective of things, and so it helps to make sure you are comparing data in a fair way.