Capturing IO latencies for a period of time

In both my wait statistics pre-conference workshops at the PASS Summit and SQLintersection I promised to do a bunch of blog posts. The second one on the list is a simple script to allow you to capture all the reads, writes, and I/O latencies that occurred over a period of time.

The script does the following:

Creates two temporary tables
Captures the output from sys.dm_io_virtual_file_stats into the first table
Waits for a configurable delay (line 41 in the script – I made it 30 minutes in the example)
Captures the output from sys.dm_io_virtual_file_stats into the second table
Provides my usual virtual file stats output on the results

The original code and explanation about using the DMV are here.

Enjoy!

/*============================================================================
  File:     ShortPeriodIOLatencies.sql

  Summary:  Short snapshot of I/O latencies

  SQL Server Versions: 2005 onwards
------------------------------------------------------------------------------
  Written by Paul S. Randal, SQLskills.com

  (c) 2014, SQLskills.com. All rights reserved.

  For more scripts and sample code, check out http://www.SQLskills.com

  You may alter this code for your own *non-commercial* purposes (e.g. in a
  for-sale commercial tool). Use in your own environment is encouraged.
  You may republish altered code as long as you include this copyright and
  give due credit, but you must obtain prior permission before blogging
  this code.

  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================*/
 

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats1')
    DROP TABLE [##SQLskillsStats1];

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats2')
    DROP TABLE [##SQLskillsStats2];
GO

SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
	   [num_of_writes], [io_stall_write_ms], [io_stall],
	   [num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO ##SQLskillsStats1
FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

WAITFOR DELAY '00:30:00';
GO

SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
	   [num_of_writes], [io_stall_write_ms], [io_stall],
	   [num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO ##SQLskillsStats2
FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

WITH [DiffLatencies] AS
(SELECT
-- Files that weren't in the first snapshot
        [ts2].[database_id],
        [ts2].[file_id],
        [ts2].[num_of_reads],
        [ts2].[io_stall_read_ms],
		[ts2].[num_of_writes],
		[ts2].[io_stall_write_ms],
		[ts2].[io_stall],
		[ts2].[num_of_bytes_read],
		[ts2].[num_of_bytes_written]
    FROM [##SQLskillsStats2] AS [ts2]
    LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
        ON [ts2].[file_handle] = [ts1].[file_handle]
    WHERE [ts1].[file_handle] IS NULL
UNION
SELECT
-- Diff of latencies in both snapshots
        [ts2].[database_id],
        [ts2].[file_id],
        [ts2].[num_of_reads] - [ts1].[num_of_reads] AS [num_of_reads],
        [ts2].[io_stall_read_ms] - [ts1].[io_stall_read_ms] AS [io_stall_read_ms],
		[ts2].[num_of_writes] - [ts1].[num_of_writes] AS [num_of_writes],
		[ts2].[io_stall_write_ms] - [ts1].[io_stall_write_ms] AS [io_stall_write_ms],
		[ts2].[io_stall] - [ts1].[io_stall] AS [io_stall],
		[ts2].[num_of_bytes_read] - [ts1].[num_of_bytes_read] AS [num_of_bytes_read],
		[ts2].[num_of_bytes_written] - [ts1].[num_of_bytes_written] AS [num_of_bytes_written]
    FROM [##SQLskillsStats2] AS [ts2]
    LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
        ON [ts2].[file_handle] = [ts1].[file_handle]
    WHERE [ts1].[file_handle] IS NOT NULL)
SELECT
	DB_NAME ([vfs].[database_id]) AS [DB],
	LEFT ([mf].[physical_name], 2) AS [Drive],
	[mf].[type_desc],
	[num_of_reads] AS [Reads],
	[num_of_writes] AS [Writes],
	[ReadLatency(ms)] =
		CASE WHEN [num_of_reads] = 0
			THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END,
	[WriteLatency(ms)] =
		CASE WHEN [num_of_writes] = 0
			THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END,
	-- [Latency] =
		-- CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
			-- THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,
	[AvgBPerRead] =
		CASE WHEN [num_of_reads] = 0
			THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END,
	[AvgBPerWrite] =
		CASE WHEN [num_of_writes] = 0
			THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END,
	-- [AvgBPerTransfer] =
		-- CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
			-- THEN 0 ELSE
				-- (([num_of_bytes_read] + [num_of_bytes_written]) /
				-- ([num_of_reads] + [num_of_writes])) END,
	[mf].[physical_name]
FROM [DiffLatencies] AS [vfs]
JOIN sys.master_files AS [mf]
	ON [vfs].[database_id] = [mf].[database_id]
	AND [vfs].[file_id] = [mf].[file_id]
-- ORDER BY [ReadLatency(ms)] DESC
ORDER BY [WriteLatency(ms)] DESC;
GO

-- Cleanup
IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats1')
    DROP TABLE [##SQLskillsStats1];

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats2')
    DROP TABLE [##SQLskillsStats2];
GO

Posted Under:

Buffer Pool, Example Scripts, IO Subsystems, Performance Tuning, Wait Stats

27 thoughts on “Capturing IO latencies for a period of time”

LondonDBA says:

November 12, 2014 at 3:17 pm

Thanks, Paul, this and the waits stats script will be very useful. Just a quick simplication that I think works. In that WITH clause you can replace this

SELECT
— Files that weren’t in the first snapshot
[ts2].[database_id],
[ts2].[file_id],
[ts2].[num_of_reads],
[ts2].[io_stall_read_ms],
[ts2].[num_of_writes],
[ts2].[io_stall_write_ms],
[ts2].[io_stall],
[ts2].[num_of_bytes_read],
[ts2].[num_of_bytes_written]
FROM [##SQLskillsStats2] AS [ts2]
LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
ON [ts2].[file_handle] = [ts1].[file_handle]
WHERE [ts1].[file_handle] IS NULL
UNION
SELECT
— Diff of latencies in both snapshots
[ts2].[database_id],
[ts2].[file_id],
[ts2].[num_of_reads] – [ts1].[num_of_reads] AS [num_of_reads],
[ts2].[io_stall_read_ms] – [ts1].[io_stall_read_ms] AS [io_stall_read_ms],
[ts2].[num_of_writes] – [ts1].[num_of_writes] AS [num_of_writes],
[ts2].[io_stall_write_ms] – [ts1].[io_stall_write_ms] AS [io_stall_write_ms],
[ts2].[io_stall] – [ts1].[io_stall] AS [io_stall],
[ts2].[num_of_bytes_read] – [ts1].[num_of_bytes_read] AS [num_of_bytes_read],
[ts2].[num_of_bytes_written] – [ts1].[num_of_bytes_written] AS [num_of_bytes_written]
FROM [##SQLskillsStats2] AS [ts2]
LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
ON [ts2].[file_handle] = [ts1].[file_handle]
WHERE [ts1].[file_handle] IS NOT NULL

with this:

SELECT
— Diff of latencies in both snapshots
[ts2].[database_id],
[ts2].[file_id],
[ts2].[num_of_reads] – COALESCE([ts1].[num_of_reads],0) AS [num_of_reads],
[ts2].[io_stall_read_ms] – COALESCE([ts1].[io_stall_read_ms],0) AS [io_stall_read_ms],
[ts2].[num_of_writes] – COALESCE([ts1].[num_of_writes],0) AS [num_of_writes],
[ts2].[io_stall_write_ms] – COALESCE([ts1].[io_stall_write_ms],0) AS [io_stall_write_ms],
[ts2].[io_stall] – COALESCE([ts1].[io_stall],0) AS [io_stall],
[ts2].[num_of_bytes_read] – COALESCE([ts1].[num_of_bytes_read],0) AS [num_of_bytes_read],
[ts2].[num_of_bytes_written] – COALESCE([ts1].[num_of_bytes_written],0) AS [num_of_bytes_written]
FROM [##SQLskillsStats2] AS [ts2]
LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
ON [ts2].[file_handle] = [ts1].[file_handle]

Reply
Kevin says:

November 13, 2014 at 6:57 am

Thanks Paul!

Reply
thierry says:

November 17, 2014 at 3:02 am

very cool, just this morning I was thinking about setting up something like this. You saved me the work (again) and coming from sqlskills I can also safely assume it’s working as expected :) Thx again sir!

Reply
Yasser Salem says:

January 21, 2015 at 9:06 am

thanks Paul for the script. I ran it for 5 hours on one of my PROD instances to track an application job. Read latencies comes up extremely very much higher than what is reported from Perfmon counters of the same LUN and during the same time period. (i.e. Sec/Transfer, Sec/Write & Sec/Read). Is there a way to explain that .. or is there other factors that I can consider for more deep look at this?

Thanks and best regards ..

Reply
thierry says:

February 26, 2015 at 1:24 am

Hi Paul,

I was wondering if we could replace the delay statement by a t-sql statement which we wanted to know the i/o latency for (assuming I’m on an idle instance, no users, jobs… but me). Or is that a bit short-sighted?

Chrz
Thierry

Reply
1. Paul Randal says:
  
  February 26, 2015 at 6:47 pm
  
  You could, but only if you empty the buffer pool first using DBCC DROPCLEANBUFFERS. Note that this will not likely be a valid, run-time test as in a real workload some of the pages may already be in memory.
  
  Reply
  1. thierry says:
    
    March 2, 2015 at 2:30 am
    
    Thx Paul for the fast reply.
    The reason I asked is I wanted a quick way to see the difference between emc xtremeio cache enabled/disabled.
    I executed dropcleanbuffers in between each run as you mention (and checked using dm_os_buffer_descriptors)
    Regards
    Thierry
    
    Reply
Gaby A. says:

May 6, 2015 at 1:06 pm

Great script, and is a useful tool we can use to be a “Canary in the coalmine” script we can use to anticipate severe I/O issues before they actually impact our online environment. Question, I’m noticing your script does not do a division across 30 minutes to scale it. Does that mean if I ran this in 5 minute increments, the data would still be the same? Sorry about the question and hopefully not missing anything here.

Reply
1. Paul Randal says:
  
  May 7, 2015 at 5:10 am
  
  Yup – no scaling division necessary. It’s giving you the aggregate write latencies over (say) 30 minutes divided by the number of writes over that time.
  
  Reply
Gaby A. says:

May 7, 2015 at 10:29 am

Thanks very much. And also, to scale this into a trend, I was thinking about aggregating it by drive letter (e.g. use AVG on the values) but I’m thinking in OLTP, this may not be a good idea, and take the time to develop a good reporting mechanism to take that into account. Your thoughts on this?

Reply
1. Paul Randal says:
  
  May 7, 2015 at 10:34 am
  
  Totally depends what metrics you want. Be aware that if you’re doing AVG over all your collected results, you’re really just doing the same as looking at the output from the DMV without a time constraint.
  
  Reply
Gaby A. says:

May 7, 2015 at 10:49 am

Thanks again.

Reply
Ivan Yankulov says:

September 26, 2015 at 1:43 am

Hi Paul,

I cannot explain how good that script is!
I am SharePoint admin and I have used it with many customers.

I like powershell and I would like to make a PowerShell version of this script, so it will be easier for other SharePoint folks. I imagine it something like this .

I will publish the script in TechNet Gallery. Will include the copyright in the script help section and in the script description and will give credits and link to your blog when I publish it.

Can I have your permission to do so?

Thank you in advance!

Ivan

Reply
1. Paul Randal says:
  
  October 1, 2015 at 3:02 pm
  
  Thanks. Sure.
  
  Reply
  1. Ivan Yankulov says:
    
    October 17, 2015 at 4:28 am
    
    Finally found time and published the script http://spyankulov.blogspot.bg/2015/10/capture-sql-io-latencies-for-period-of.html
    
    Thanks,
    Ivan
    
    Reply
2. Keysp says:
  
  September 9, 2018 at 11:18 am
  
  Hi Ivan, I took a quick look at your PowerShell script, you had pointed out that it captures data for the duration of the specified “Minutes” variable.
  
  But it seems that it is fed to a “for” command that simply counts backwards, waits for the specified duration then runs only once.
  
  The CSV result file is equally large no matter the number of minutes specified. (in my environment approximately 4KB)
  Am I correct or am I missing something????
  
  Either which way, good job.
  
  Best Regards
  
  Reply
ReluctantDBA says:

January 3, 2016 at 10:01 am

I’m curious why you use the IO stall divided by the number of reads and writes as opposed to the number of bytes? Ozar does the same thing and when I asked him, the response I got was essentially, “because I know more about it than you.”

It’s not a criticism, just a lack of understanding on my part.

Reply
1. Paul Randal says:
  
  January 3, 2016 at 1:02 pm
  
  Yeah, that’s not a very useful/nice response. The io_stall is the total aggregate stall time for all reads and writes together. To get the average per IO stall time, divide total stall time by number of IOs. Using number of bytes would give the stall time per byte, which is meaningless. Hope that helps.
  
  Reply
  1. ReluctantDBA says:
    
    January 13, 2016 at 1:21 pm
    
    I’m pulling the differential data into a spreadsheet and calculating MB/Sec based on bytes read/write and read/write stall. Am I calculating that incorrectly? Is it even a valid metric?
    At the moment, I’m getting stellar read performance but my write rate is abysmal, so I’m trying to dig in and find out why.
    
    Reply
    1. Paul Randal says:
      
      January 15, 2016 at 2:05 pm
      
      Yes, I think so.
      
      Reply
EskeRahn says:

February 15, 2016 at 3:03 pm

Nice script! Thanks!

Slightly simplified a few things, and corrected handling when divisor is zero to display Null rather than Zero. (Strictly NaN – but that is splitting sticks).

If object_id(N’Tempdb..##SQLskillsStats1′) is not null
DROP TABLE [##SQLskillsStats1];

SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO ##SQLskillsStats1
FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

WAITFOR DELAY ’00:30:00′;
GO

SELECT
DB_NAME ([vfs].[database_id]) AS [DB],
LEFT ([mf].[physical_name], 2) AS [Drive],
[mf].[type_desc],
[num_of_reads] AS [Reads],
[num_of_writes] AS [Writes],
[ReadLatency(ms)] = [io_stall_read_ms] / NullIf([num_of_reads],0),
[WriteLatency(ms)] = [io_stall_write_ms] / NullIf([num_of_writes],0),
[Latency] = [io_stall] / NullIf([num_of_reads] + [num_of_writes],0),
[AvgBPerRead] = [num_of_bytes_read] / NullIf([num_of_reads],0),
[AvgBPerWrite] = [num_of_bytes_written] / NullIf([num_of_writes],0),
[AvgBPerTransfer] = ([num_of_bytes_read] + [num_of_bytes_written]) / NullIf([num_of_reads] + [num_of_writes],0),
[mf].[physical_name]
FROM (SELECT
— Diff of latencies
ts2.[database_id],
ts2.[file_id],
ts2.[num_of_reads] – Isnull(ts1.[num_of_reads],0) AS [num_of_reads],
ts2.[io_stall_read_ms] – Isnull(ts1.[io_stall_read_ms],0) AS [io_stall_read_ms],
ts2.[num_of_writes] – Isnull(ts1.[num_of_writes],0) AS [num_of_writes],
ts2.[io_stall_write_ms] – Isnull(ts1.[io_stall_write_ms],0) AS [io_stall_write_ms],
ts2.[io_stall] – Isnull(ts1.[io_stall],0) AS [io_stall],
ts2.[num_of_bytes_read] – Isnull(ts1.[num_of_bytes_read],0) AS [num_of_bytes_read],
ts2.[num_of_bytes_written] – Isnull(ts1.[num_of_bytes_written],0) AS [num_of_bytes_written]
FROM (
SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle]
FROM sys.dm_io_virtual_file_stats (NULL, NULL)
) AS ts2
LEFT OUTER JOIN [##SQLskillsStats1] AS ts1
ON ts2.[file_handle] = ts1.[file_handle]
) AS [vfs]
JOIN sys.master_files AS [mf]
ON [vfs].[database_id] = [mf].[database_id]
AND [vfs].[file_id] = [mf].[file_id]
— ORDER BY [ReadLatency(ms)] DESC
ORDER BY [WriteLatency(ms)] DESC, [ReadLatency(ms)] DESC, DB;
GO

— Cleanup
If object_id(N’Tempdb..##SQLskillsStats1′) is not null
DROP TABLE [##SQLskillsStats1];

Reply
Wellington Prado says:

February 17, 2016 at 12:23 pm

Hi,

Great Job!!!
I created a JOB that insert’s in the table log each 1 minute the information using your approach:

1) Create the table log:

SELECT getdate() as Collection_time, [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO tb_io_lat
FROM sys.dm_io_virtual_file_stats (NULL, NULL)
where 1=2

2) Create the procedure or t-sql in the sqlserver JOB with this command to execute each 1 minute:

INSERT INTO TB_LOG_LAT
SELECT getdate(),[database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle]
FROM sys.dm_io_virtual_file_stats (NULL, NULL);

When I want to see delta information, i execute the below query. This work fine for me.: ( Before someone asks, I put only one AND to exclude some initial non useful data.)

SELECT [Collection_time],
DB_NAME ([vfs].[database_id]) AS [DB],
LEFT ([mf].[physical_name], 2) AS [Drive],
[mf].[type_desc],
— [num_of_reads_T],
[num_of_reads] AS [Reads],
— [num_of_writes_T],
[num_of_writes] AS [Writes],
— [io_stall_read_ms_T],
[ReadLatency(ms)] =
ROUND(CASE WHEN [num_of_reads] = 0
THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END ,2,1),
— [io_stall_write_ms_T],
[WriteLatency(ms)] =
ROUND(CASE WHEN [num_of_writes] = 0
THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END ,2,1),
/*[Latency] =
CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,*/
[AvgBPerRead] =
ROUND(CASE WHEN [num_of_reads] = 0
THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END ,2,1),
[AvgBPerWrite] =
ROUND(CASE WHEN [num_of_writes] = 0
THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END,2,1),
[mf].[physical_name]
FROM (
SELECT [Collection_time],
database_id,
file_id ,
num_of_reads as num_of_reads_T,
case when num_of_reads = num_of_reads – LAG(num_of_reads, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) then -9999999999 else
num_of_reads – LAG(num_of_reads, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] )END AS [num_of_reads],
num_of_writes as num_of_writes_T,
case when num_of_writes = num_of_writes – LAG(num_of_writes, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) then -9999999999 else
num_of_writes – LAG(num_of_writes, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) END AS num_of_writes,
io_stall_read_ms as io_stall_read_ms_T,
case when io_stall_read_ms = io_stall_read_ms – LAG(io_stall_read_ms, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) then -9999999999 else
io_stall_read_ms – LAG(io_stall_read_ms, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) END AS io_stall_read_ms,
io_stall_write_ms as io_stall_write_ms_T,
case when io_stall_write_ms = io_stall_write_ms – LAG(io_stall_write_ms, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) then -9999999999 else
io_stall_write_ms – LAG(io_stall_write_ms, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) END AS io_stall_write_ms,
case when io_stall = io_stall – LAG(io_stall, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) then -9999999999 else
io_stall – LAG(io_stall, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) END AS io_stall,
case when num_of_bytes_read = num_of_bytes_read – LAG(num_of_bytes_read, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) then -9999999999 else
num_of_bytes_read – LAG(num_of_bytes_read, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) END AS num_of_bytes_read,
case when num_of_bytes_written = num_of_bytes_written – LAG(num_of_bytes_written, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) then -9999999999 else
num_of_bytes_written – LAG(num_of_bytes_written, 1,0) OVER (PARTITION BY [database_id],[file_id] ORDER BY [collection_time],[database_id],[file_id] ) END AS num_of_bytes_written
FROM [dbserv].[dbo].[tb_io_lat] ) AS [vfs]
JOIN sys.master_files AS [mf]
ON [vfs].[database_id] = [mf].[database_id]
AND [vfs].[file_id] = [mf].[file_id]
WHERE DB_NAME ([vfs].[database_id]) NOT IN (‘dbserv’,’model’,’FIMANAGERBD_CAOA_HOMOLOG’,’FIMANAGERBD_CAOA_TREINAMENTO’,’Caoa_Homologacao2′)
AND [num_of_reads] -9999999999
— ORDER BY [ReadLatency(ms)] DESC
ORDER BY [Collection_time],[WriteLatency(ms)] DESC;

Thank you..

Wellington Prado

Reply
Lee-Anthony says:

February 5, 2021 at 9:16 am

Hi Paul

Love the script, I too had a requirement to create a SQL job which runs every 5 minutes, I will be working on a more efficient version, but this one works for now.

I have not included the SQL job installation script, just the basics. I though xml would be the best way to store and retrieve the data.

/* This is part of the SQL installation job where it creates the new database called SQLDBioStats
This database is designed to capture the I/O latency of each of the database files
every five minutes.
It will contain 2 tables, one to hold the step1 data, second is to hold the processed latency data

This part of the script will be included as part of the SQL job installation script.
more info to add…..
*/

USE [master]
GO

–Make sure database and xml schema does not exits

–Drop XML Schema collection if exists

IF EXISTS (SELECT name FROM sys.xml_schema_collections
WHERE [name] = N’SQLServerFileIOLatencyCollection’)
DROP XML SCHEMA COLLECTION SQLServerFileIOLatencyCollection;
GO

–Drop Database if exists
IF EXISTS (SELECT name from [master].[sys].[databases]
WHERE [name] = N’SQLDBioStats’)
DROP DATABASE [SQLDBioStats];
GO

–Create SQL I/O Stats Databse – File locations will be configured as part of the SQL job installation sctip
CREATE DATABASE [SQLDBioStats]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N’SQLDBioStats’, FILENAME = N’F:\Program Files\Microsoft SQL Server\MSSQL12.MYSQLEXPRESS\MSSQL\DATA\SQLDBioStats.mdf’ , SIZE = 5120KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
LOG ON
( NAME = N’SQLDBioStats_log’, FILENAME = N’F:\Program Files\Microsoft SQL Server\MSSQL12.MYSQLEXPRESS\MSSQL\DATA\SQLDBioStats_log.ldf’ , SIZE = 2048KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO

—————————————Step 1——————————————

/* This script is to create the tables ioStatsTable to store the processed latency data – Step 1 of SQL Job */

–use error checking to handle if database is missing
USE [SQLDBioStats]
GO
SET ANSI_WARNINGS OFF
–drop ioStatsTable table if exists
IF EXISTS (SELECT * FROM [sys].[objects]
WHERE [name] = N’ioStatsTable’)
DROP TABLE [ioStatsTable];
GO

–remove FetchIODataT1 table if it exists
IF EXISTS (SELECT * FROM [sys].[objects]
WHERE [name] = N’FetchIODataT1′)
DROP TABLE [FetchIODataT1];
GO

–Create I/O Stats Table to store I/O stats
CREATE TABLE ioStatsTable(
i int IDENTITY(1,1) primary key,
Coldate dateTime,
x xml);
GO

— Create and populates FetchIODataT1 from current I/O Stats DMV Query
SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO FetchIODataT1
FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

/*
–displays the results of FetchIODataT1 table
select * from FetchIODataT1;
GO

–displays the results of ioStatsTable table
select * from ioStatsTable;
GO

*/

————————————————Step2—————————-

/*
This will be step 2 of the job, where it create temporary table to store the step 2 current I/O stats
Process it and then populates ioStatsTable
*/

USE [SQLDBioStats]– Database to capture stats
GO

SET ANSI_WARNINGS OFF

— Makes sure tempory tables do not exist
IF EXISTS (SELECT * FROM [sys].[objects]
WHERE [name] = N’##FetchIODataT2′)
DROP TABLE [##FetchIODataT2];
GO
IF EXISTS (SELECT * FROM [sys].[objects]
WHERE [name] = N’##IOStatsTable’)
DROP TABLE [##IOStatsTable];
GO

— Create and populates temporary table ##FetchIODataT2 from current I/O Stats DMV Query
SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
[num_of_writes], [io_stall_write_ms], [io_stall],
[num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO ##FetchIODataT2
FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

WITH [DiffLatencies] AS
(SELECT
— Files that weren’t in the first snapshot
[ts2].[database_id],
[ts2].[file_id],
[ts2].[num_of_reads],
[ts2].[io_stall_read_ms],
[ts2].[num_of_writes],
[ts2].[io_stall_write_ms],
[ts2].[io_stall],
[ts2].[num_of_bytes_read],
[ts2].[num_of_bytes_written]
FROM [##FetchIODataT2] AS [ts2]
LEFT OUTER JOIN [FetchIODataT1] AS [ts1]
ON [ts2].[file_handle] = [ts1].[file_handle]
WHERE [ts1].[file_handle] IS NULL
UNION
SELECT
— Diff of latencies in both snapshots
[ts2].[database_id],
[ts2].[file_id],
[ts2].[num_of_reads] – [ts1].[num_of_reads] AS [num_of_reads],
[ts2].[io_stall_read_ms] – [ts1].[io_stall_read_ms] AS [io_stall_read_ms],
[ts2].[num_of_writes] – [ts1].[num_of_writes] AS [num_of_writes],
[ts2].[io_stall_write_ms] – [ts1].[io_stall_write_ms] AS [io_stall_write_ms],
[ts2].[io_stall] – [ts1].[io_stall] AS [io_stall],
[ts2].[num_of_bytes_read] – [ts1].[num_of_bytes_read] AS [num_of_bytes_read],
[ts2].[num_of_bytes_written] – [ts1].[num_of_bytes_written] AS [num_of_bytes_written]
FROM [##FetchIODataT2] AS [ts2]
LEFT OUTER JOIN [FetchIODataT1] AS [ts1]
ON [ts2].[file_handle] = [ts1].[file_handle]
WHERE [ts1].[file_handle] IS NOT NULL)
SELECT
DB_NAME ([vfs].[database_id]) AS [DB],
LEFT ([mf].[physical_name], 2) AS [Drive],
[mf].[type_desc],
[num_of_reads] AS [Reads],
[num_of_writes] AS [Writes],
[ReadLatency(ms)] =
CASE WHEN [num_of_reads] = 0
THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END,
[WriteLatency(ms)] =
CASE WHEN [num_of_writes] = 0
THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END,
[Latency] =
CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,
[AvgBPerRead] =
CASE WHEN [num_of_reads] = 0
THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END,
[AvgBPerWrite] =
CASE WHEN [num_of_writes] = 0
THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END,
[mf].[physical_name]
INTO ##IOStatsTable

FROM [DiffLatencies] AS [vfs]
JOIN sys.master_files AS [mf]
ON [vfs].[database_id] = [mf].[database_id]
AND [vfs].[file_id] = [mf].[file_id]
— ORDER BY [ReadLatency(ms)] DESC
ORDER BY [WriteLatency(ms)] DESC;
GO

—Populate SqldbioStats
INSERT INTO [dbo].[ioStatsTable]
([Coldate]
,[x])
VALUES
(GETDATE(),
(SELECT *
FROM [dbo].[##IOStatsTable]
FOR XML AUTO, XMLSCHEMA (‘SQLServerFileIOLatencyCollection’)));

—Cleanup tasks

—Drop table

DROP TABLE [dbo].[FetchIODataT1]
GO

— recreate FetchIODataT1 from ##FetchIODataT2
SELECT *
INTO FetchIODataT1
FROM ##FetchIODataT2;
GO

/*
–displays the results of FetchIODataT1 table
select * from FetchIODataT1;
GO
–displays the results of Temporary table ##FetchIODataT2
select * from ##FetchIODataT2;
GO

–displays the results of ##IOStatsTable table
select * from ##IOStatsTable;
GO

–displays the results of ioStatsTable table
select * from ioStatsTable;
GO
*/

————————-Step 3————————

/*
Step 3 cleanup tasks
*/
— Clean-up tempory tables
IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
WHERE [name] = N’##FetchIODataT2′)
DROP TABLE [##FetchIODataT2];
GO
IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
WHERE [name] = N’##IOStatsTable’)
DROP TABLE [##IOStatsTable];
GO

————————————————————

Thank you for all the guidance over the years.

Lee-Anthony

Reply
Lee-Anthony says:

February 5, 2021 at 9:20 am

future enhancements include In-Memory OLTP tables instead of Temp tables for 2014 Enterprise and 2016 SP1 and above
Sorry for the formatting
Lee-Anthony

Reply
Peter Wright says:

November 24, 2022 at 8:59 am

This is an awesome script, thank you. Worked on it to capture the data over a period of time, and only THEN saw that others had…

Hope you don’t mind if I post my scripts! Also, I have a maintenance database on my servers, and all useful stuff goes into different schemas, like [ola] and [blitz].
Yours is [sqlskills] of course!

— ============================================
— Create table
— ============================================

use Util;
GO

IF NOT EXISTS ( SELECT name FROM sys.schemas WHERE name = N’sqlskills’ ) EXEC(‘CREATE SCHEMA [sqlskills]’);
GO

SELECT * INTO sqlskills.tbl_ShortPeriodIOLatencies_work FROM sqlskills.tbl_ShortPeriodIOLatencies;

IF OBJECT_ID(‘sqlskills.tbl_ShortPeriodIOLatencies’) IS NOT NULL DROP TABLE sqlskills.tbl_ShortPeriodIOLatencies;
GO

CREATE TABLE sqlskills.tbl_ShortPeriodIOLatencies (
LogID int
, DatabaseName sysname
, FileName sysname
, LogDate datetime
, Drive varchar(10)
, FileType varchar(24)
, Reads bigint
, Writes bigint
, ReadStallms bigint
, WriteStallms bigint
, ReadBytes bigint
, WriteBytes bigint
, PhysicalName nvarchar(520)
, CONSTRAINT tbl_ShortPeriodIOLatencies_ID PRIMARY KEY (LogID,DatabaseName,FileName)
);
GO

CREATE INDEX tbl_ShortPeriodIOLatencies_Date ON sqlskills.tbl_ShortPeriodIOLatencies (LogDate) INCLUDE (LogID)

— ============================================
— Run every minute (or 5 or whatever) from a SQL Job
— ============================================
use Util;

IF OBJECT_ID(‘sqlskills.prc_ShortPeriodIOLatencies’) IS NOT NULL DROP PROCEDURE sqlskills.prc_ShortPeriodIOLatencies
GO

CREATE PROCEDURE sqlskills.prc_ShortPeriodIOLatencies

/*============================================================================
File: ShortPeriodIOLatencies.sql

Summary: Short snapshot of I/O latencies

SQL Server Versions: 2005 onwards
——————————————————————————
Written by Paul S. Randal, SQLskills.com

(c) 2014, SQLskills.com. All rights reserved.

For more scripts and sample code, check out http://www.SQLskills.com

You may alter this code for your own *non-commercial* purposes (e.g. in a
for-sale commercial tool). Use in your own environment is encouraged.
You may republish altered code as long as you include this copyright and
give due credit, but you must obtain prior permission before blogging
this code.

THIS CODE AND INFORMATION ARE PROVIDED “AS IS” WITHOUT WARRANTY OF
ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
PARTICULAR PURPOSE.

Modified by Peter Wright for logging into a table 2022-11-23
============================================================================*/
AS

— General Settings
SET NOCOUNT ON;

— Variables
DECLARE
@now datetime
, @counter int

SELECT
@now = GETDATE()

SELECT @counter = ISNULL(MAX(LogID),0) + 1 FROM sqlskills.tbl_ShortPeriodIOLatencies

— Go and get current files stats
INSERT INTO sqlskills.tbl_ShortPeriodIOLatencies
SELECT
@counter — LogID
, DB_NAME(vf.database_id) — DatabaseName
, mf.name — FileName
, GETDATE() — LogDate
, LEFT(mf.physical_name, 2) — Drive
, mf.type_desc — FileType
, num_of_reads — Reads
, num_of_writes — Writes
, io_stall_read_ms — ReadStallms
, io_stall_write_ms — WriteStallms
, num_of_bytes_read — ReadBytes
, num_of_bytes_written — WriteBytes
, mf.physical_name — PhysicalName
FROM sys.dm_io_virtual_file_stats (NULL, NULL) vf
INNER JOIN sys.master_files mf ON vf.database_id = mf.database_id AND vf.file_id = mf.file_id ;

— ============================================
— Report to view data
— ============================================

use Util;

DROP PROCEDURE sqlskills.rpt_ShortPeriodIOLatencies
GO

CREATE PROCEDURE sqlskills.rpt_ShortPeriodIOLatencies
@p_startdate datetime = NULL
, @p_enddate datetime = NULL
, @p_period int = 1
, @p_sort int = 0

/*============================================================================
File: ShortPeriodIOLatencies.sql

Summary: Short snapshot of I/O latencies

SQL Server Versions: 2005 onwards
——————————————————————————
Written by Paul S. Randal, SQLskills.com

(c) 2014, SQLskills.com. All rights reserved.

For more scripts and sample code, check out http://www.SQLskills.com

You may alter this code for your own *non-commercial* purposes (e.g. in a
for-sale commercial tool). Use in your own environment is encouraged.
You may republish altered code as long as you include this copyright and
give due credit, but you must obtain prior permission before blogging
this code.

THIS CODE AND INFORMATION ARE PROVIDED “AS IS” WITHOUT WARRANTY OF
ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
PARTICULAR PURPOSE.

Modified by Peter Wright for logging into a table 2022-11-23
============================================================================*/
AS

— General Settings
SET NOCOUNT ON;

— Variables
DECLARE
@startlogid int
, @endlogid int;

IF @p_enddate IS NULL SELECT @p_enddate = GETDATE();
IF @p_startdate IS NULL SELECT @p_startdate = DATEADD(hh, 0 – @p_period, @p_enddate);

SELECT
@startlogid = MIN(LogID)
, @endlogid = MAX(LogID)
FROM sqlskills.tbl_ShortPeriodIOLatencies
WHERE LogDate BETWEEN @p_startdate AND @p_enddate;

— Work it out!
WITH cte_s AS (
SELECT
s2.DatabaseName
, s2.FileName
, s2.Reads – ISNULL(s1.Reads ,0) AS Reads
, s2.Writes – ISNULL(s1.Writes ,0) AS Writes
, s2.ReadStalls – ISNULL(s1.ReadStalls ,0) AS ReadStalls
, s2.WriteStalls – ISNULL(s1.WriteStalls ,0) AS WriteStalls
, s2.ReadBytes – ISNULL(s1.ReadBytes ,0) AS ReadBytes
, s2.WriteBytes – ISNULL(s1.WriteBytes ,0) AS WriteBytes
, s2.PhysicalName
FROM sqlskills.tbl_ShortPeriodIOLatencies s2
LEFT JOIN sqlskills.tbl_ShortPeriodIOLatencies s1
ON s2.DatabaseName = s1.DatabaseName
AND s2.FileName = s1.FileName
AND s1.LogID = @startlogid
WHERE s2.LogID = @endlogid
)
, cte_t AS (
SELECT
DatabaseName
, FileName
, Reads
, Writes
, CASE WHEN Reads = 0 THEN 0 ELSE ReadStalls / Reads END AS ReadLatency
, CASE WHEN Writes = 0 THEN 0 ELSE WriteStalls / Writes END AS WriteLatency
, CASE WHEN Reads = 0 THEN 0 ELSE ReadBytes / Reads END AS ReadByteAvg
, CASE WHEN Writes = 0 THEN 0 ELSE WriteBytes / Writes END AS WriteByteAvg
, PhysicalName
FROM cte_s
)
SELECT
*
FROM cte_t
ORDER BY
CASE WHEN @p_sort = 1 THEN DatabaseName
WHEN @p_sort = 2 THEN FileName
WHEN @p_sort = 9 THEN FileName
ELSE ” END
, CASE WHEN @p_sort = 3 THEN Reads
WHEN @p_sort = 4 THEN Writes
WHEN @p_sort = 5 THEN ReadLatency
WHEN @p_sort = 6 THEN WriteLatency
WHEN @p_sort = 7 THEN ReadByteAvg
WHEN @p_sort = 8 THEN WriteByteAvg
ELSE 0 END DESC
, DatabaseName
, FileName

Reply
Peter Wright says:

November 24, 2022 at 9:11 am

To add,

Creates a permanent table for the data.

Logs it with real names (no chance I will remember database id’s!)

Reports for :
– a datetime range, or the last x periods, defaulting to the last hour
– output sorted by specified column

Reply
James Smith says:

November 28, 2024 at 2:12 pm

The whole blog is filled with great info but this script perfect for understanding how the average can get dragged around by spikey utilization. Thanks

Reply

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

Capturing IO latencies for a period of time

Posted Under:

27 thoughts on “Capturing IO latencies for a period of time”

Leave a Reply Cancel reply

Other articles

Presenting a workshop at PASS Summit West in Seattle!

Summer Sale: $699 Blackbelt Bundle!

The SQL Server Transaction Log, Part 4: Log Records

SQL101: Application data consistency checking

SQL101: Phishing attacks

SQL101: Top Ten SQL Server Performance Tuning Best Practices

Imagine feeling confident enough to handle whatever your database throws at you.

Capturing IO latencies for a period of time

Check It Out!

SQL Server Jumpstart+ Bundle

Posted Under:

27 thoughts on “Capturing IO latencies for a period of time”

Leave a Reply Cancel reply

Other articles

Presenting a workshop at PASS Summit West in Seattle!

Summer Sale: $699 Blackbelt Bundle!

The SQL Server Transaction Log, Part 4: Log Records

SQL101: Application data consistency checking

SQL101: Phishing attacks

SQL101: Top Ten SQL Server Performance Tuning Best Practices

Imagine feeling confident enough to handle whatever your database throws at you.