How to examine IO subsystem latencies from within SQL Server

(New for 2020: we’ve published a range of SQL Server interview candidate screening assessments with our partner Kandio, so you can avoid hiring an ‘expert’ who ends up causing problems. Check them out here.)

(Check out our Pluralsight online training course: SQL Server: Improving Storage Subsystem Performance.)

Edit: I blogged an update script to capture a snapshot of I/O latencies in this blog post.

Over the last few months I’ve been lecturing at classes and conferences about getting SQL Server’s view of the I/O subsystem and what latencies it is experiencing, so time for a blog post to help everyone else.

Most SQL Server’s today are I/O bound – that’s a generally agreed-on statement by DBAs and consultants in the field. This means that the major factor in server performance is its ability to perform I/O operations quickly. If the I/O subsystem cannot keep up with the demand being placed on it, then the SQL Server workload will suffer performance problems.

Now, saying that, one trap that many people fall into is equating increased I/O subsystem latencies with poor I/O subsystem performance. This is often not the case at all. It’s usually the case that the I/O subsystem performs fine when the designed-for I/O workload is happening, but becomes the performance bottleneck when the I/O workload increases past the I/O subsystem design point. The I/O workload increase is what’s causing the problem, not the I/O subsystem – if you design an I/O subsystem to support 1000 IOPS (I/O operations per second – and making sure you’re using the right I/O size and the workload characteristics make sense to be defined in terms of the number of random IOPS) and SQL Server is trying to push 2000 IOPS, performance is going to suffer.

If you find that I/O latency has increased, look to a change in SQL Server behavior before blaming the I/O subsystem. For instance:

Query plan changes from out-of-date statistics, code changes, implicit conversions, poor indexing that cause table scans rather than index seeks
Additional indexes being created that cause increased index maintenance workload and logging
Access pattern/key changes that cause page splits and hence extra page reads/writes and logging
Adding change data capture or change tracking that causes extra writes and logging
Enabling snapshot isolation that causes tempdb I/Os, plus potentially page splits
Decreased server memory leading to a smaller buffer pool and increased lazy writer and read activity

And a whole host of other reasons can lead to an increased I/O workload where it’s not the I/O subsystem’s fault.

On the other hand, however, it may very well be the I/O subsystem that has an issue if the SQL Server workload is the same. A SAN administrator may have decided to give some of the space on one of the SQL Server LUNs to another server, which can lead to an overload.

Using performance counters, you can see at the physical disk level what the latencies are (looking at the Avg. Disk sec/Read and Avg. Disk sec/Write counters) but if you have your databases spread across a few LUNs, that doesn’t help you pinpointing which database files are experiencing latency issues and driving the most I/Os to the I/O subsystem.

sys.dm_io_virtual_file_stats

This is where sys.dm_io_virtual_file_stats comes in. It was introduced in SQL Server 2005 as a beefed-up replacement for fn_virtualfilestats and shows you how many I/Os have occurred, with latencies for all files. You can give it a database ID and a file ID, but I found it most useful to look at all the files on the server and order by one of the statistics.

Here’s the default output (with some column names changed slightly to make things fit nicely):

SELECT * FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

db_id file_id sample_ms reads bytes_read io_stall_read_ms writes bytes_written io_stall_write_ms io_stall size_on_disk_bytes file_handle
----- ------- --------- ----- ---------- ---------------- ------ ------------- ----------------- -------- ------------------ ------------------
1     1       14433009  52    3227648    58               10     90112         12                70       515964928          0x000000000000013C
1     2       14433009  55    1241088    36               28     122880        115               151      12648448           0x0000000000000474
2     1       14433009  22    1261568    22               4      147456        1                 23       8388608            0x000000000000065C
2     2       14433009  6     385024     2                7      154112        3                 5        524288             0x0000000000000660
3     1       14433009  31    1859584    18               1      8192          0                 18       1310720            0x0000000000000524
3     2       14433009  6     385024     5                3      16384         1                 6        524288             0x000000000000056C
4     1       14433009  139   8175616    369              13     106496        45                414      260767744          0x0000000000000958
4     2       14433009  136   712704     485              7      53248         25                510      104595456          0x00000000000009F0
5     1       14433009  20    1138688    204              0      0             0                 204      419430400          0x0000000000000980
5     2       14433009  13    411136     466              1      2048          13                479      104857600          0x0000000000000A44
6     1       14433009  15    811008     81               1      8192          8                 89       1310720            0x0000000000000954
6     2       14433009  7     386560     89               4      11264         80                169      516096             0x00000000000009EC
7     1       14433009  15    811008     80               1      8192          9                 89       1310720            0x0000000000000970
7     2       14433009  7     386560     130              3      9216          17                147      516096             0x00000000000009B4
8     1       14433009  16    876544     37               1      8192          103               140      136577024          0x0000000000000988
8     2       14433009  9     229888     11               4      10752         59                70       5242880            0x00000000000009E4
.
.

This isn’t that useful because a) I don’t have database IDs and file paths memorized, and b) it gives aggregate latencies (io_stall_read_ms and io_stall_write_ms).

Viewing Aggregate Information

What I usually do is use the script below – part of my standard set of scripts I use when doing a server health check for a client. It’s based in part on code from by good friend Jimmy May (blog|twitter), with a bunch of tweaks. It allows me to filter on read or write latencies and it joins with sys.master_files to get database names and file paths.

SELECT
    [ReadLatency] =
        CASE WHEN [num_of_reads] = 0
            THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END,
    [WriteLatency] =
        CASE WHEN [num_of_writes] = 0
            THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END,
    [Latency] =
        CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
            THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,
    [AvgBPerRead] =
        CASE WHEN [num_of_reads] = 0
            THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END,
    [AvgBPerWrite] =
        CASE WHEN [num_of_writes] = 0
            THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END,
    [AvgBPerTransfer] =
        CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
            THEN 0 ELSE
                (([num_of_bytes_read] + [num_of_bytes_written]) /
                ([num_of_reads] + [num_of_writes])) END,
    LEFT ([mf].[physical_name], 2) AS [Drive],
    DB_NAME ([vfs].[database_id]) AS [DB],
    [mf].[physical_name]
FROM
    sys.dm_io_virtual_file_stats (NULL,NULL) AS [vfs]
JOIN sys.master_files AS [mf]
    ON [vfs].[database_id] = [mf].[database_id]
    AND [vfs].[file_id] = [mf].[file_id]
-- WHERE [vfs].[file_id] = 2 -- log files
-- ORDER BY [Latency] DESC
-- ORDER BY [ReadLatency] DESC
ORDER BY [WriteLatency] DESC;
GO

ReadLatency WriteLatency Latency AvgBPerRead AvgBPerWrite AvgBPerTransfer Drive DB                      physical_name
----------- ------------ ------- ----------- ------------ --------------- ----- ----------------------- ------------------------------------------
2           103          8       54784       8192         52043           D:    HotSpot                 D:\SQLskills\HotSpot_data.mdf
7           48           13      63061       5632         54857           C:    DemoCorruptSystemTables C:\SQLskills\corrupt_log.LDF
32          41           33      43861       2048         39680           D:    DemoSystemIndex3        D:\SQLskills\DemoSystemIndex3_LOG.ldf
27          40           28      43861       1024         39577           D:    DemoSystemIndex5        D:\SQLskills\DemoSystemIndex5_LOG.ldf
23          39           24      43861       7168         40192           D:    DemoSystemIndex4        D:\SQLskills\DemoSystemIndex4_LOG.ldf
7           38           10      43861       4608         39936           D:    DemoSystemIndex1        D:\SQLskills\DemoSystemIndex1_LOG.ldf
11          25           12      43861       7168         40192           C:    FSWaits                 C:\SQLskills\FSWaits_log.ldf
12          20           13      55222       6656         49152           C:    newdatabase             C:\SQLskills\newdatabase_log.LDF
12          20           15      55222       2816         36165           C:    LockResDemo             C:\SQLskills\LockResDemo_log.LDF
14          17           15      45933       2816         30254           C:    TempdbTest              C:\SQLskills\TempdbTest_log.LDF
1           14           5       25543       2688         18510           C:    HotSpot                 C:\SQLskills\HotSpot_log.ldf
40          14           39      31625       7680         29915           D:    RecompileTest           D:\SQLskills\RecompileTest_log.ldf
9           13           10      51858       7680         46336           D:    DemoFatalCorruption1    D:\SQLskills\DemoFatalCorruption1_log.ldf
35          13           34      31625       2048         29513           D:    GUIDTest                D:\SQLskills\GUIDTest_log.ldf
12          13           12      16861       1024         15872           C:    FNDBLogTest             C:\SQLskills\FNDBLogTest_log.LDF
.
.

This is much more useful as it allows me to quickly see where the read and write hot spots are and then drill into a database to see what’s going on, and if nothing out of the ordinary, ask the SAN admin to move those hot spot files to dedicated and/or faster storage.

Making Use Of This Information

Beware: This information is the aggregate of all I/Os performed since the database was brought online, so the I/O times reported by the script are averages. The longer the time the data set covers, the more likely that transient latency spikes will be masked. The best way to use this data for analysis is to periodically capture snapshots of the sys.dm_db_io_virtual_file stats data and then examine the averages over that small time period. That way you’ll be able to see latency spikes. [Edit 2014: I blogged a script to capture a snapshot of I/O latencies here.]

One question I’m sure you’re going to ask is “what is an acceptable read or write latency?” and the answer is a big “it depends!”. It depends on the I/O subsystem and how it’s been configured. The key is producing a performance baseline for when things are running acceptably well and then seeing where the results from the DMV deviate from your baseline. On well-configured storage which isn’t being overloaded I’d expect to see single-digit ms for either read or write latency, but this will vary based on the rotation speed and technology of the drives (SCSI vs SATA vs SSD etc).

Enjoy!

72 thoughts on “How to examine IO subsystem latencies from within SQL Server”

Ray Herring says:

April 11, 2011 at 10:11 am

WriteLatency = CASE WHEN io_stall_write_ms = 0
THEN 0 ELSE (io_stall_write_ms / num_of_writes) END

I think you want to test num_of_writes = 0 in this case.

Reply
1. Robert says:
  
  October 18, 2019 at 1:02 pm
  
  I think the original condition is to avoid division by 0
  
  Reply
Manjot says:

April 11, 2011 at 2:30 pm

Hi,

Thanks for the script.
How can I find out "I/O subsystem configuration"? Do I need to ask the windows engineers who look after the server?

Reply
paul says:

April 11, 2011 at 10:15 am

Indeed – blame messing around with formatting. Thanks!

Reply
alex says:

April 11, 2011 at 5:38 pm

This will display the file name instead of outputting the full path:

SUBSTRING(mf.physical_name, len(mf.physical_name)-CHARINDEX(‘\’,REVERSE(mf.physical_name))+2, 100)

I find this script to be quite useful when capturing data in segments especially during a job execution and peak usage.

Reply
Wes Brown says:

April 11, 2011 at 2:22 pm

Hey Paul!
I blogged about this earlier in the year too. I’ve got a method to collect the data over time.

http://sqlserverio.com/2011/02/08/gather-virtual-file-statistics-using-t-sql-tsql2sday-15/

Reply
paul says:

April 11, 2011 at 5:14 pm

@Manjot Yes – you’ll need to ask whoever manages the I/O subsystem.

Reply
amosquera says:

April 12, 2011 at 3:47 am

Hi,

Could be operations like DBCC CEHCKDB, REBUILD, REORGANIZE REBUILD INDEX, UPDATE STATISTICS, … spoil this measures ?.

Reply
paul says:

April 12, 2011 at 3:51 am

Well they won’t spoil the measurement, but they will add a lot of extra I/O workload and that needs to be taken into account when designing your I/O subsystem. An alternative is to offload the CHECKDB workload to another system after restoring a full backup there.

Reply
Aukcje Alegro says:

April 16, 2011 at 1:14 am

I am quite interesting in this topic hope you will write more on it in future posts

Reply
scorellis says:

March 21, 2013 at 12:45 pm

hi Paul,

i am trying to estimate the total bytes/sec of throughput that i am getting from my sql server and am wondering if it is accurate to say that my total bytes/sec of throughput would be = num_of_bytes_written/io_stall_write_ms?

thank you
scorellis

Reply
1. Paul Randal says:
  
  March 23, 2013 at 12:48 pm
  
  No – dividing by sample_ms is the correct way, but that only gives you an average, not your peak throughput. You need to use a benchmarking tool to get that.
  
  Reply
  1. Jean-Philippe says:
    
    March 27, 2013 at 1:28 am
    
    Hi
    I often use the sys.dm_io_virtual_file_stats DMV to calculate Avg Service Time for a file and also for a drive (by aggregation).
    Recently, i have compared DMV latency results (ReadLatency and WriteLatency for each drive) and perfmon counters (Avg. Disk sec/Read and Avg. Disk sec/Write)
    It showed [Avg Time ms/Read] was more important in DMV results than perfmon counters.
    what is the reason for that ?
    
    Thanks
    
    Reply
    1. Paul Randal says:
      
      March 27, 2013 at 8:31 am
      
      Not sure what you mean by ‘more important’? If you mean that the avg time was higher in the DMV than in perfmon, could be that there are other uses of that drive doing I/Os that will be taken into account by perfmon, but not by the DMV, which is only measuring I/O latency for those files managed by SQL on that drive. Remember also that the DMV is aggregated data since the database came online whereas perfmon is a rolling average – that could also make perfmon lower if there have been some spikes of high I/O latency in the past that the DMV is still taking into account.
      
      Reply
      1. Jean-Philippe says:
        
        March 28, 2013 at 3:21 am
        
        Hello and thank you for your reply
        In fact, I have a process that collects data from the dmv dm_io_virtual_file_stats every minute and that makes the difference between each collection.
        I therefore compare the results of the dmv (select Drive, sum (io_stall_read_ms) / sum (num_of_reads) from group by dm_io_virtual_file_stats_diff Drive) with perfmon counters (Avg. Disk sec / Read)
        From a general way, we have similar time but when high latency disk, the time seen through my collection “DMV” are more important than the perfmon data (20-40%).
        I am sure there is no other activity on the monitored drive (just SQLServer activity)
        Is it possible to explain this phenomenon?
        Is that these 2 two types of counters (perfmon and DMV) is based on the same metric?
        I hope they are not stupid questions
        
        Thank you for your help
        Jean-Philippe
        
        NB : sorry for my english
      2. Paul Randal says:
        
        March 28, 2013 at 9:41 am
        
        They’re not actually the same metric. The DMV is measuring everything from SQL Server to the disk and back. The PerfMon counters are just measuring the time Windows sees for the I/O. There could things like filter drivers etc in the way. Any significant difference between them can be explained by something taking time between SQL Server and Windows issuing the I/O.
Karthik says:

April 2, 2013 at 12:01 am

Dear Paul,
I am seeing lot of negative values in Latency value of the given query. Is that the indication of an IO issue? Also I am seeing lot of events like this in eventlog and sql error log. SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file. Please advice.

Reply
1. Paul Randal says:
  
  April 2, 2013 at 7:51 am
  
  I don’t see how you can get negative values for latencies, unless the DMV is returning negative values for total latencies, which would be a bug. If you’re seeing I/O stall warnings in the error log, that’s usually a sign of problems with your I/O subsystem.
  
  Reply
2. Peter says:
  
  May 28, 2013 at 8:15 am
  
  Hello,
  
  Since the dmv gives cumulative numbers, I persist the results each hour and compare sequential results. This way we found out our latencies are “acceptable” during production hours, while they are terrible during night due to index rebuilds, integrity checks, database backups, system backups, etl processes etc.
  
  While creating the view (eg ([io_stall_read_ms] – previous.[io_stall_read_ms]) / ([num_of_reads] – previous.[number_of_reads]) I also saw negative values, caused by the fact the services had been restarted (counters are reset) between 2 measurements. After testing on service restart using (SELECT crdate FROM sys.sysdatabases WHERE lower(NAME) = ‘tempdb’) I still saw some negative numbers. It seems to me this is caused by database restore, so now I just test whether bytesread and byteswritten are higher in the previous measurement. In that case I assume the values represent the first measurement, I can compare with.
  
  Kind Regards
  Peter.
  
  Reply
  1. Allen McGuire says:
    
    March 20, 2014 at 7:40 am
    
    I just started doing the same Peter. Doing any sort of computation on the DMV results is very, very misleading. Where the DMV was telling me I had IO wait times for writes of >200ms for each of my TempDB data files, my new hourly/differential report indicates the actual to be <5ms throughout the day with very few exceptions thus far. Since I just set this up three days ago, I need to gather more data but I see nothing that would explain why the DMV is returning such a high value. I don't do any crazy maintenance on the weekends – time will tell I suppose.
    
    Reply
Eric M says:

July 1, 2013 at 2:19 pm

I’m seeing high avg. disk sec /read , and sec /write, but no high page waits in perfmon. Trying to figure out how to compare those values, vs dm_io_pending_io_pending requests, and sys.dm_io_virtual_file_stats. My issue is I have four data files two split across two LUNS each total 660 Gigs. Plan is to move off to four LUNS put one data file per drive on an upgraded o/s, with more memory, cpu etc. My question is there a rule of thumb as to when to add more data files to your databases. And during a rebuild operation does it split the load across the number of data files defined in your database?

Reply
1. Paul Randal says:
  
  July 3, 2013 at 9:38 am
  
  What do you mean by ‘page waits in perfmon’? Are you looking at wait statistics? You should be seeing PAGEIOLATCH_XX waits. Too complex a topic for a comment. Read through my benchmarking and IO subsystem series. Yes, during a rebuild the allocations will come from all files in the filegroup where the index is stored.
  
  Reply
Brian ODwyer says:

December 19, 2013 at 10:57 am

The sample_ms column has a bug http://connect.microsoft.com/SQLServer/feedback/details/795556 which does return negagive numbers.

There are two fixes depends on version of SQL server, 2005 uses one method 2008 and higher another.

Reply
Binny Mathew says:

January 27, 2014 at 2:26 am

Thank you for the query in the blog. I ran the above query and found that the write latency is quite high(532) for
tempdb and for another user database which stores images in this case 19. We have a clustered environment running
in sql2008R2. Unfortunately all the mdf ldf and tempdb had been installed in one lun(Drive:E). How can we improve the
I/o performance although the users have not complained so far.

Reply
1. Paul Randal says:
  
  January 27, 2014 at 8:25 am
  
  Move the files to different parts of the I/O subsystem, for instance.
  
  Reply
dan says:

March 6, 2014 at 3:39 pm

Dear Paul,
I ran your query on my laptop, got:
ReadLatency 29
wrtieLatency 422
Latency 57
AvgBPerRead 63666
AvgBPerWrite 8192
AvgBperTranfer 59745

I think wrtieLatency wrtieLatency 422 is high on AdventureWorks .mdf file. Now what should I do to fix it? Can you please help me to understand?

Reply
1. Paul Randal says:
  
  March 9, 2014 at 12:50 pm
  
  Yup – that’s very high. You’re going to have to follow the steps in the blog post to reduce the amount of writes or address a shortfall in the I/O subsystem.
  
  Reply
Nick says:

April 23, 2014 at 3:02 am

Paul, thanks for your ever useful information. I was wondering if you could suggest where to look in my situation. We have our 64bit Enterprise, 96GBRam DWH server, connected to an array of SAN disks. the disk IO hasn’t been great so I ran your query – the top 13 files:

ReadLatency WriteLatency Latency AvgBPerRead AvgBPerWrite AvgBPerTransfer Drive
29 4857 2598 65169 68134 66747 G:
29 4856 2598 65167 68138 66748 G:
28 4833 2585 65175 68132 66749 G:
28 4820 2576 65156 68127 66736 G:
726 535 694 427252 129080 376745 G:
548 4 525 316994 34130 305151 G:
403 5 401 199901 28582 199270 E:
74 680 376 63511 65044 64275 E:
375 16 375 265342 8192 265248 E:
356 10 350 285793 8199 281173 E:
331 11 330 217433 8192 216726 E:
272 7 261 180452 18909 173827 E:
289 28 260 197289 11691 176490 E:

Which look pretty awful.

However, I then used SQLIO to test performance straight on the disks (G: and E:). The performance was excellent for example:

IOs/sec: 106183.79
MBs/sec: 829.56
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 0
Max_Latency(ms): 785
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 94 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

There’s a mismatch between the actual performance of the SAN (which is excellent) and SQL Server’s experience of it and I cannot figure out what steps to take next. What would impact SQL Server’s use of the disks?

Thanks for your help !!!!

N

Reply
1. Paul Randal says:
  
  April 24, 2014 at 8:29 am
  
  Is your SQLIO test properly simulating your production workload? That’s all I can think of. You might also try taking two snapshots of the DMV data and then diff’ing them to see what I/O characteristics are over a period of time, rather than for all time. Could be that performance sucked earlier and now it’s fine.
  
  Reply
  1. Nick says:
    
    April 28, 2014 at 6:26 am
    
    Thanks, Paul. I guess it isn’t. Maybe SQLIOSim is a better tool to try.
    Cheers,
    Nick
    
    Reply
Juan Cabrera says:

July 11, 2014 at 12:34 pm

Paul, for calculating read(or write) latency you have the following:
[ReadLatency] =
CASE WHEN [num_of_reads] = 0
THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END

Why are we dividing by num_of_reads? Shouldn’t we dividing by sample_ms instead? Don’t we want to see the average number of read stalls in the span of time? I am sure there is an explanation, and it is possible that sleep depravity has caused brain latency in my head :)

thanks,

Juan Cabrera.

Reply
1. Paul Randal says:
  
  July 11, 2014 at 12:54 pm
  
  No – we’re looking for ms per read and ms per write – i.e. average read/write latency.
  
  Reply
  1. Juan Cabrera says:
    
    July 14, 2014 at 8:15 am
    
    Yup, sleep depravation alright. After a restful weekend it makes perfect sense.
    
    thanks,
    
    Juan.
    
    Reply
Pawel says:

September 23, 2014 at 10:31 am

Has anyone taken this script and turned it into a way to collect and aggregate the stats? Would be interested in how this data trends over time. As opposed to seeing just the total.

Reply
1. Paul Randal says:
  
  September 24, 2014 at 1:27 pm
  
  I don’t know – should be easy to do.
  
  Reply
eric81 says:

October 8, 2014 at 10:25 am

Paul, does the DMV io stats match to what we’re seeing in windows perfmon? for example I noticed 123 ms read latency but when I go to the DMV I don’t see anything that high. Just asking

Reply
1. Paul Randal says:
  
  October 8, 2014 at 10:34 am
  
  You have to remember that the DMV is the aggregate over the time the database has been online, so the average may mask any transient latency spikes.
  
  Reply
Robert says:

October 27, 2014 at 8:06 am

Just to confirm, these are cumulative stats.

So to get a true estimate of latency, if I sample every hour, subtract sample 1 (baseline) from sample 2 (an hour later). At least this will get me the latency at that point in time, correct?

Reply
1. Paul Randal says:
  
  October 28, 2014 at 9:49 pm
  
  Correct
  
  Reply
chung says:

October 30, 2014 at 1:55 pm

Does any know how to get Avg. Disk sec / Read from DMV instead of Perform Monitor? I don’t see this information in any DMVs.

Reply
1. Paul Randal says:
  
  October 30, 2014 at 2:04 pm
  
  That’s what the script in the blog post gives you – avg. read latency.
  
  Reply
chung says:

October 30, 2014 at 2:15 pm

Ah, I see. The name is confusing. Thanks Paul!!!!

Reply
chung says:

October 30, 2014 at 2:19 pm

Paul,

One more question. We are using AZURE DB and I find out these DMVs won’t work anymore. Do you have other script for AZURE? Thanks in advance.

Chung.

Reply
1. Paul Randal says:
  
  October 30, 2014 at 2:54 pm
  
  Sorry – I don’t do Azure at all so I don’t know how to do that in Azure.
  
  Reply
chung says:

October 31, 2014 at 7:28 am

Paul,

You mentioned that ” if you design an I/O subsystem to support 1000 IOPS (I/O operations per second) and SQL Server is trying to push 2000 IOPS, performance is going to suffer.” Is there a DMVs I can use to find the IOPS of my current DB is pushing? How do I find out if I push more IOPS than the system can support?

Thanks in advance.

Chung.

Reply
1. Paul Randal says:
  
  October 31, 2014 at 12:21 pm
  
  You can use the IO count fields in the DMV from this article to see how many IOs per second, but to get a really detailed view, you may need to move to using Extended Events and aggregating the IO data from there – but that’s likely overkill.
  
  Reply
Awadh says:

February 12, 2015 at 3:11 pm

Hi there,

when I run the script, I got two entries as follows:
1-
17 ,0 ,17, 515665, 8192, 515665 ,C: local C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\Research2.mdf

2-
8 ,0, 8, 5347, 3379, 5335, C: local C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\Research2_log.ldf
the query is read intensive one as you can see from entry 1 but why there is ave write latency when there is no write latency?

for entry 2 as you can see, latency time is shorter than entry 1 should I include its data in my analysis?

the query runs for 9 days before network failure occurs.

thank for your help

Reply
1. Paul Randal says:
  
  February 19, 2015 at 12:54 am
  
  I don’t see any avgwritelatency in the data you posted above…
  
  Reply
Harsh Jain says:

February 13, 2015 at 3:12 am

Hi Paul,

Is there any DMV/script by which we can know, which query is consuming high I/O of storage?

Reply
1. Paul Randal says:
  
  February 15, 2015 at 1:36 pm
  
  Look in one of the query stats DMVs for the I/O counts.
  
  Reply
dan says:

March 10, 2015 at 10:41 am

Hi Paul,
I have first row with:
Readlatency 623
Writelatency 266
Latency 569
AvgBPerRead 51856
AvgBperwrite 40577
AvgBperTransfer 50166
What do you see what’s happening? What are the suggestions?

Reply
1. dan says:
  
  March 10, 2015 at 10:43 am
  
  ++ Physical name
  G:\DATA\mydb.mdf
  
  Reply
2. dan says:
  
  March 10, 2015 at 10:43 am
  
  sorry, forgot to add below:
  Physical name
  G:\DATA\mydb.mdf
  
  Reply
Balaji Ram says:

July 29, 2015 at 1:30 pm

Hello Paul, Thanks for the very informative blogpost. We are trying to set up our new server. we configured RAID 10 SSDs(8 * 800GB) for tempdb. Even with SSDs, We are seeing slow write latency for LOG file. We did a single threaded test(no other user query was running) and did a

select * from into #t test(9GB table, table is already entirely on buffer cache).

We are consistenly seeing Write IO latency for temdb log file between 6ms and 7ms.

Can you please advise if this the best we can get out of SSD for LOG file write latency? what are we doing wrong? Would putting log file on seperate SSD LUN help(thought it wouldn’t as we don’t have spinning discs)? Any other suggestion to help improve this writelog latency? Thanks.

Reply
1. Paul Randal says:
  
  August 3, 2015 at 6:45 pm
  
  Are you sure you’ve configured the array correctly? I can get log write latency of 1ms or less on my laptop’s internal SSD. I would get an IO subsystem expert involved to help you configure things correctly.
  
  Reply
dan says:

May 9, 2016 at 6:56 am

HI Paul,
I ran the script on my PROD and got:
all 6 .mdf(.ndf) of TempDB files have ‘ReadLatency’ (6), WriteLatency (106), Latency (77)….. and a .ldf file has ‘ReadLatency (3), Write latency (6) and ‘Latency (6)….
Looks like it’s a problem but no complain from customer.
If I want to find cause where should I look into?
Thanks

Reply
1. Paul Randal says:
  
  May 9, 2016 at 7:47 am
  
  I’d be looking at why tempdb is being used so much (temp tables, memory spills, snapshot isolation), and probably also seeing whether there’s I/O subsystem contention.
  
  Reply
  1. dan says:
    
    May 9, 2016 at 10:12 am
    
    In My case should ‘snapshot isolation’ be turned ON or OFF (it’s off now) on TempDB? (just want to know your suggestion I am not changing on Production.)
    What’s the best way (script) to find ‘memory spills’?
    Thanks in advance.
    
    Reply
    1. Paul Randal says:
      
      May 9, 2016 at 10:17 am
      
      I don’t know anything about your workload, but usually people don’t use either flavor of snapshot isolation in tempdb. Lots of different ways to look at spills – Google ‘sql server memory spills’ and read some of the posts out there.
      
      Reply
Lakshmi Saripalli says:

June 15, 2016 at 1:47 pm

Hi Paul,
I used the query on this blog post to look at IO latency on my Prod server and found that my Write Latency is high.
Then I used your script that waits for 30 minutes and used that and it shows write latency as zero.
What Am I doing wrong?I was expecting fairly similar numbers from both queries.

Reply
1. Paul Randal says:
  
  June 15, 2016 at 1:51 pm
  
  How are you measuring write latency with the wait stats script? And did you capture a 30-minute snapshot of both at the same time?
  
  Reply
  1. Lakshmi Saripalli says:
    
    June 16, 2016 at 10:11 am
    
    Isnt this part of the query where it tells me write latencies?
    [WriteLatency] =
    CASE WHEN [num_of_writes] = 0
    THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END,
    
    I used this.
    The second query outputs the delta between two time periods(30 minutes apart ) and I shouldn’t be seeing too much delta am I correct?
    I first ran the query to give me all latencies and then I ran the 30 min delay query.
    
    Reply
    1. Paul Randal says:
      
      June 17, 2016 at 12:41 am
      
      Yes. But I asked ‘How are you measuring write latency with the wait stats script?’. And you didn’t run the queries at the same time – so you can’t correlate the I/O latencies with the waits, as they were measured at different times…
      
      Reply
      1. Lakshmi says:
        
        June 21, 2016 at 12:36 pm
        
        Paul,
        I’m confused.I thought your script was just doing that-using the DMV to measure write latencies. Looks like I’m really missing something big here. Could you please point me to an article or help me understand this better please?
      2. Paul Randal says:
        
        June 22, 2016 at 7:32 am
        
        Oh – I thought you said you measured latencies and then measured wait stats – my bad. Ok – so the script that just looks at the DMV output is showing average latencies since the server was started/database came online. The 30 minute script shows the I/Os that occurred just in those 30 minutes. If there were no I/Os, or no I/O contention, it’s entirely possible that the second script shows low latencies but the overall aggregate shows higher averages. Or vice-versa too.
Lakshmi Saripalli says:

June 23, 2016 at 6:24 am

Thank you Paul, that explains everything. The server latencies were high since it was up from a long time and I have Ola’s maintenance plan with index rebuilds that may have spiked it. over a period of 30 minutes at various time of the business day the delta of latencies are almost minimal(single digit), but my writes and sorts in Tempdb are way high for few of our stored proc’s in an application and now trying to test using your queries if putting TEmpdb on a separate LUN will reduce them.Thank you very much once again.

Reply
Nauman says:

September 9, 2016 at 2:24 am

Hello Paul,

I have separate tempdb data drive T and separate tempdb db Log drive, all ssd. DMV ‘sys.dm_io_virtual_file_stats’ (using your above query) showing T Drive tempdb data drive write-latency is 83. Perfmon and Resource monitor is not showing any disk latency at the same time. how can i prove that DMV results are correct. I’m arguing with system administrators. there is nothing else on Tempdb data drive T.

Thanks,
Nauman.

Reply
1. Paul Randal says:
  
  September 11, 2016 at 12:16 pm
  
  Remember that this is based on a metric that’s aggregated since SQL Server started. You’ll get more accurate results capturing the metrics for a period of time (e.g. a day). See https://www.sqlskills.com/blogs/paul/capturing-io-latencies-period-time/
  
  Reply
Murali says:

October 2, 2018 at 1:05 pm

Hi Paul,

I understand the Read and Write Latency columns in the script. What’s this “Latency” column mean. Thank you for your time.

Reply
1. Paul Randal says:
  
  October 2, 2018 at 1:08 pm
  
  Average latency over all reads and writes.
  
  Reply
Ken says:

December 5, 2018 at 6:04 am

CHECKDB – High tempdb file write latency when checking document table

Hi Paul,

I am taking over the management of a server (Windows 2016, SQL Server 2017 Standard) which contains a 108 GByte database of which about 76 GBytes is a table holding documents. (These are mainly PDFs within the range 6 Mbytes to 40 Mbytes.) This table has its own filegroup.

Using this script, the baseline write latency on the tempdb files is about 12ms. When I setup and run my standard maintenance routines, the write latency on the tempdb files increases to over 3100ms.

I have narrowed the issue done to CHECKDB. When I run CHECKDB WITH PHYSICAL_ONLY followed by CHECKFILEGROUP on all the filegroups, except the filegroup containing the document table, the write latency of the tempdb files remains about 12ms. When I run CHECKFILEGROUP on the filegroup containing the document table the write latency of the tempdb files increases to over 3100ms.

Have you come across this before?

I am unable to off load the CHECKDB process to another server as, for various reasons, the data needs to be encrypted when it is not on the production server. (ie The test database is on the production instance and my automated backup restore check is also done on the production instance.) There is no money for more hardware.

I am inclined to ignore the high write latency on tempdb as CHECKFILEGROUP on the filegroup containing the documents table takes less than 9 minutes and is done in the middle of the night when the system is barely, if ever, used. Am I missing something?

Thanks,
Ken

Reply
1. Paul Randal says:
  
  December 10, 2018 at 12:14 pm
  
  Yes – it’s not CHECKDB as such, it’s the fact that the large LOB columns need to materialized wholly in memory, and these are temporarily stored in tempdb (all operations on LOB columns do this). Solutions: offload the checks, as you have done, or move the LOBs into FILESTREAM storage.
  
  Reply

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

How to examine IO subsystem latencies from within SQL Server

sys.dm_io_virtual_file_stats

Viewing Aggregate Information

Making Use Of This Information

72 thoughts on “How to examine IO subsystem latencies from within SQL Server”

Leave a Reply Cancel reply

Other articles

2024: the year in books

2023 (not a typo!): the year in books

Black Friday Super Sale!

Summer School Savings SALE is live!

The Curious Case of… missing log information

The Curious Case of… missing 832 errors during memory corruption

Imagine feeling confident enough to handle whatever your database throws at you.