(The Curious Case of… used to be part of our bi-weekly newsletter, but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or was asked in a random question from the community.)
I had a question in email this morning from someone that was basically: does SQL Server use a read/write thread per LUN?
The short answer is no.
My friend Bob Dorr from CSS actually debunked this publicly way back in 2007 in the blog post SQL Server Urban Legends Discussed, where he also explains how some of the wording in earlier versions of Books Online led to the myth that SQL Server uses one read/write thread per LUN or per data file. This myth leads people to believe that adding more data files will increase I/O performance because more threads will be used to do the writing. While it’s true that more data files can increase performance, depending on your I/O subsystem and the file/filegroup layout on that I/O subsystem, it has nothing to do with that myth.
Any thread in SQL Server that’s processing a query can issue an asynchronous read request. When a thread makes a call into the buffer pool to get a pointer to a data file page copy in memory, if the page isn’t already in memory then it has to issue a read and then wait for the read to complete (called a physical I/O). On the other hand, if the page is already in memory, the thread pops back out to where it came from (usually the Access Methods code) and that’s a logical I/O, with no accompanying wait. Pages can also be read in larger chunks by things like index seek/scan readahead and DBCC CHECKDB‘s readahead.
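If you want to see the physical-vs-logical distinction for yourself, here’s a minimal sketch using SET STATISTICS IO (the table name dbo.SomeTable is hypothetical, and this should only be run on a test instance, as DBCC DROPCLEANBUFFERS empties the whole buffer pool):

```sql
-- Test instance only: empty the buffer pool so the next read must go to disk
DBCC DROPCLEANBUFFERS;
GO
SET STATISTICS IO ON;
GO
-- First run: pages come from disk, so STATISTICS IO reports physical reads
SELECT COUNT (*) FROM dbo.SomeTable;
GO
-- Second run: pages are now cached, so only logical reads are reported
SELECT COUNT (*) FROM dbo.SomeTable;
GO
```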
Pages can be written to disk in a variety of ways, including:
- Through a regular checkpoint by a checkpoint background thread
- Through lazy writing when there’s memory pressure on the buffer pool, by a lazy writer background thread
- Through eager writing, when performing a minimally-logged operation (guaranteeing that the full result of the operation is durable when it finishes)
- Through database snapshot real-time maintenance: when a page is first changed after a database snapshot is created, it is synchronously written to the snapshot
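As a sketch of how you can see two of these background writers on your own instance, the checkpoint and lazy writer tasks show up as system requests (exact command strings can vary by version, so treat this as an illustration):

```sql
-- Background tasks, including the checkpoint and lazy writer threads
SELECT session_id, command, status
FROM sys.dm_exec_requests
WHERE command IN (N'CHECKPOINT', N'LAZY WRITER');
```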
There’s one exception though: backup and restore *will* use a separate reader or writer thread per device to aid in I/O parallelism, but backup and restore don’t use the buffer pool and open direct channels to the devices themselves.
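To take advantage of that per-device parallelism, you can stripe a backup over multiple backup devices; a minimal sketch (the database name and paths are hypothetical, and the two files should sit on separate physical drives for any real benefit):

```sql
-- Striped backup: SQL Server uses one writer thread per backup device
BACKUP DATABASE [MyDatabase]
TO DISK = N'D:\Backups\MyDatabase_1.bak',
   DISK = N'E:\Backups\MyDatabase_2.bak'
WITH INIT;
```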
Bottom line: for regular operations, the one-thread-per-file/LUN is a big myth that’s still around today.
6 thoughts on “The Curious Case of… does SQL Server use a read/write thread per LUN?”
I understand your explanation of SQL’s I/O implementation. Where I get confused is the Windows implementation. I have a hard copy of a slide from a SQL Saturday titled “Storage Stack Simplified”. I don’t remember the presenter or the session but I often think the slide title is tongue-in-cheek if not outright sarcasm :).
There is a lot of 6 point text in the 2 dozen or so flow-boxes and the dozen call outs. I review it from time-to-time to see if any magical insight will come to me. Alas, nothing so far.
However, it does seem to me that this really is where separate files on separate drives provide a significant performance advantage.
Just one example: when I added 2 additional disk controllers, our I/O throughput increased dramatically.
Thanks for the background on SQL’s implementation.
It makes sense that in your scenario you’ve observed the speedup. But it’s not about the number of SQL Server threads servicing requests to these different drives. You simply increased the I/O bandwidth and hence the throughput. It doesn’t matter whether there is 1 data file or 10 data files on each of these drives. Forgetting GAM, SGAM, and PFS page contention for a moment, there’s no advantage to 10 files vs. one. There won’t be 10 dedicated threads to handle them, and even if there were, they would not increase I/O performance at all. That’s what I understand from Paul’s description and what intuition tells me as well. We increase the number of files to battle contention on internal data file structures.
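The classic example of battling that internal contention is adding equally-sized data files to tempdb to spread out GAM/SGAM/PFS allocation-bitmap activity; a sketch (the file path and sizes here are hypothetical and should match your existing tempdb files):

```sql
-- Add a second, equally-sized data file to tempdb to reduce
-- allocation bitmap (GAM/SGAM/PFS) contention
ALTER DATABASE tempdb
ADD FILE (NAME = N'tempdev2',
          FILENAME = N'T:\Data\tempdev2.ndf',
          SIZE = 8GB, FILEGROWTH = 512MB);
```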
So…why do we use multiple DATA files? I was a believer in the myth until now…
For the benefits of parallelism at the I/O subsystem level – YMMV – and multiple filegroups for availability, manageability, and performance, in that order of importance IMHO.
Do BACKUP/RESTORE commands always follow the sp_configure ‘max degree of parallelism’ setting, or is backup/restore parallelism decided by the number of data files and backup files?
Please shed some light!
MAXDOP is only for query processing. Backup/restore parallelism is determined by the number of distinct drives that SQL Server can deduce from the data files and backup files.