The Curious Case of… does SQL Server use a read/write thread per LUN?

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or was asked in a random question from the community.)

I had a question in email this morning from someone that was basically: does SQL Server use a read/write thread per LUN?

The short answer is no.

My friend Bob Dorr from CSS actually debunked this publicly way back in 2007 in the blog post SQL Server Urban Legends Discussed, where he also explained how some of the wording in earlier versions of Books Online led to the myth that SQL Server uses one read/write thread per LUN or per data file. This myth leads people to believe that adding more data files will increase I/O performance because more threads will be used to do the writing. While it’s true that more data files can increase performance, depending on your I/O subsystem and the file/filegroup layout on that I/O subsystem, that has nothing to do with the myth.

Any thread in SQL Server that’s processing a query can issue an asynchronous read request. When a thread makes a call into the buffer pool to get a pointer to a data file page copy in memory, if the page isn’t already in memory then the thread has to issue a read and then wait for the read to complete (called a physical I/O). On the other hand, if the page is already in memory, the thread pops back out to where it came from (usually the Access Methods code) and that’s a logical I/O, with no accompanying wait. Pages can also be read in larger chunks by things like index seek/scan readahead and DBCC CHECKDB’s readahead.
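One way to see the difference between logical and physical I/Os for yourself is SET STATISTICS IO. The following is only a minimal sketch, assuming a test system; the query against sys.objects is an arbitrary choice for illustration, and any query of your own would work just as well.

```sql
-- Report I/O counts per table for each statement in the Messages output.
SET STATISTICS IO ON;

-- Arbitrary example query; sys.objects is used only because it exists in every database.
SELECT COUNT (*) FROM sys.objects;

SET STATISTICS IO OFF;
```

For each table touched, the Messages output lists logical reads, physical reads, and read-ahead reads. If the pages are already in the buffer pool, you should expect the physical reads to be zero while the logical reads are not.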

Pages can be written to disk in a variety of ways, including:

  • Through a regular checkpoint by a checkpoint background thread
  • Through lazy writing when there’s memory pressure on the buffer pool by a lazy writer background thread
  • Through eager writing, when performing a minimally-logged operation (guaranteeing that the full result of the operation is durable when it finishes)
  • Through database snapshot real-time maintenance: when a page is first changed after a database snapshot is created, the pre-change copy of the page is synchronously written to the snapshot
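Some of these write paths can be observed through the Buffer Manager performance counters. This is a rough sketch rather than a monitoring solution, and it assumes the standard counter names as exposed by sys.dm_os_performance_counters:

```sql
-- Show how many pages checkpoints and the lazy writer have written.
-- The object name is matched with LIKE because named instances prefix it differently.
SELECT [object_name], [counter_name], [cntr_value]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE N'%Buffer Manager%'
    AND [counter_name] IN
        (N'Checkpoint pages/sec', N'Lazy writes/sec', N'Page writes/sec');
```

Despite the "/sec" in their names, these values are cumulative in the DMV, so take two samples and subtract them to get the activity over an interval.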

There’s one exception though: backup and restore *will* use a separate reader or writer thread per device to aid in I/O parallelism, but backup and restore don’t use the buffer pool and open direct channels to the devices themselves.
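As an illustration of the backup case, a backup can be striped across several devices by listing more than one in the TO clause. The database name and paths below are hypothetical placeholders, and this is just a sketch of the syntax, not a tuning recommendation:

```sql
-- Hypothetical database name and file paths; substitute your own.
-- Each DISK entry is a separate backup device in the striped media set.
BACKUP DATABASE [SalesDB]
TO  DISK = N'D:\Backups\SalesDB_1.bak',
    DISK = N'E:\Backups\SalesDB_2.bak'
WITH CHECKSUM, STATS = 10;
```

Whether striping helps in practice depends on where the devices live in your I/O subsystem, so test it on your own hardware.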

Bottom line: for regular operations, the one-thread-per-file/LUN idea is a big myth that’s still around today.

 

New course: Improving Storage Subsystem Performance

Glenn’s latest Pluralsight course has been published – SQL Server: Improving Storage Subsystem Performance – and is just over two hours long. It’s based on Glenn’s very popular user group/conference sessions and workshops, plus extensive work with SQLskills consulting clients and in his previous roles.

The modules are:

  • Introduction
  • Measuring and Analyzing Storage Subsystem Performance
  • Testing and Benchmarking Storage Subsystems
  • Understanding Storage Types Suitable for SQL Server Workloads
  • Understanding the Effects of RAID Levels on SQL Server Workloads
  • Improving Storage Subsystem Performance

Check it out here.

We now have more than 145 hours of SQLskills online training available (see all our 48 courses here), all for as little as $29/month through Pluralsight (including more than 5,000 other developer and IT training courses). That’s unbeatable value that you can’t afford to ignore.

Enjoy!

Capturing IO latencies for a period of time

In both my wait statistics pre-conference workshops at the PASS Summit and SQLintersection I promised to do a bunch of blog posts. The second one on the list is a simple script to allow you to capture all the reads, writes, and I/O latencies that occurred over a period of time.

The script does the following:

  • Creates two temporary tables
  • Captures the output from sys.dm_io_virtual_file_stats into the first table
  • Waits for a configurable delay (the WAITFOR DELAY statement in the script; I made it 30 minutes in the example)
  • Captures the output from sys.dm_io_virtual_file_stats into the second table
  • Provides my usual virtual file stats output on the results

The original code and explanation about using the DMV are here.

Enjoy!

/*============================================================================
  File:     ShortPeriodIOLatencies.sql

  Summary:  Short snapshot of I/O latencies

  SQL Server Versions: 2005 onwards
------------------------------------------------------------------------------
  Written by Paul S. Randal, SQLskills.com

  (c) 2014, SQLskills.com. All rights reserved.

  For more scripts and sample code, check out http://www.SQLskills.com

  You may alter this code for your own *non-commercial* purposes (e.g. in a
  for-sale commercial tool). Use in your own environment is encouraged.
  You may republish altered code as long as you include this copyright and
  give due credit, but you must obtain prior permission before blogging
  this code.

  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================*/
 

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats1')
    DROP TABLE [##SQLskillsStats1];

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats2')
    DROP TABLE [##SQLskillsStats2];
GO

SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
	   [num_of_writes], [io_stall_write_ms], [io_stall],
	   [num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO ##SQLskillsStats1
FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

WAITFOR DELAY '00:30:00';
GO

SELECT [database_id], [file_id], [num_of_reads], [io_stall_read_ms],
	   [num_of_writes], [io_stall_write_ms], [io_stall],
	   [num_of_bytes_read], [num_of_bytes_written], [file_handle]
INTO ##SQLskillsStats2
FROM sys.dm_io_virtual_file_stats (NULL, NULL);
GO

WITH [DiffLatencies] AS
(SELECT
-- Files that weren't in the first snapshot
        [ts2].[database_id],
        [ts2].[file_id],
        [ts2].[num_of_reads],
        [ts2].[io_stall_read_ms],
        [ts2].[num_of_writes],
        [ts2].[io_stall_write_ms],
        [ts2].[io_stall],
        [ts2].[num_of_bytes_read],
        [ts2].[num_of_bytes_written]
    FROM [##SQLskillsStats2] AS [ts2]
    LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
        ON [ts2].[file_handle] = [ts1].[file_handle]
    WHERE [ts1].[file_handle] IS NULL
UNION
SELECT
-- Diff of latencies in both snapshots
        [ts2].[database_id],
        [ts2].[file_id],
        [ts2].[num_of_reads] - [ts1].[num_of_reads] AS [num_of_reads],
        [ts2].[io_stall_read_ms] - [ts1].[io_stall_read_ms] AS [io_stall_read_ms],
        [ts2].[num_of_writes] - [ts1].[num_of_writes] AS [num_of_writes],
        [ts2].[io_stall_write_ms] - [ts1].[io_stall_write_ms] AS [io_stall_write_ms],
        [ts2].[io_stall] - [ts1].[io_stall] AS [io_stall],
        [ts2].[num_of_bytes_read] - [ts1].[num_of_bytes_read] AS [num_of_bytes_read],
        [ts2].[num_of_bytes_written] - [ts1].[num_of_bytes_written] AS [num_of_bytes_written]
    FROM [##SQLskillsStats2] AS [ts2]
    LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
        ON [ts2].[file_handle] = [ts1].[file_handle]
    WHERE [ts1].[file_handle] IS NOT NULL)
SELECT
	DB_NAME ([vfs].[database_id]) AS [DB],
	LEFT ([mf].[physical_name], 2) AS [Drive],
	[mf].[type_desc],
	[num_of_reads] AS [Reads],
	[num_of_writes] AS [Writes],
	[ReadLatency(ms)] =
		CASE WHEN [num_of_reads] = 0
			THEN 0 ELSE ([io_stall_read_ms] / [num_of_reads]) END,
	[WriteLatency(ms)] =
		CASE WHEN [num_of_writes] = 0
			THEN 0 ELSE ([io_stall_write_ms] / [num_of_writes]) END,
	-- [Latency] =
		-- CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
			-- THEN 0 ELSE ([io_stall] / ([num_of_reads] + [num_of_writes])) END,
	[AvgBPerRead] =
		CASE WHEN [num_of_reads] = 0
			THEN 0 ELSE ([num_of_bytes_read] / [num_of_reads]) END,
	[AvgBPerWrite] =
		CASE WHEN [num_of_writes] = 0
			THEN 0 ELSE ([num_of_bytes_written] / [num_of_writes]) END,
	-- [AvgBPerTransfer] =
		-- CASE WHEN ([num_of_reads] = 0 AND [num_of_writes] = 0)
			-- THEN 0 ELSE
				-- (([num_of_bytes_read] + [num_of_bytes_written]) /
				-- ([num_of_reads] + [num_of_writes])) END,
	[mf].[physical_name]
FROM [DiffLatencies] AS [vfs]
JOIN sys.master_files AS [mf]
	ON [vfs].[database_id] = [mf].[database_id]
	AND [vfs].[file_id] = [mf].[file_id]
-- ORDER BY [ReadLatency(ms)] DESC
ORDER BY [WriteLatency(ms)] DESC;
GO

-- Cleanup
IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats1')
    DROP TABLE [##SQLskillsStats1];

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
    WHERE [name] = N'##SQLskillsStats2')
    DROP TABLE [##SQLskillsStats2];
GO
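If you would rather not edit the literal in the WAITFOR DELAY statement each time, the delay can be supplied through a variable instead. This is a small sketch of that variation; the variable name @Delay is my own assumption and is not part of the script above.

```sql
-- Hypothetical variable holding the capture period as hh:mm:ss (24 hours at most).
DECLARE @Delay CHAR (8);
SET @Delay = '00:05:00';

-- WAITFOR DELAY accepts a local variable in place of a string literal.
WAITFOR DELAY @Delay;
```

Because GO ends a batch and variables do not survive across batches, the DECLARE and the WAITFOR have to stay in the same batch if you fold this into the script.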