The Curious Case of… does SQL Server use a read/write thread per LUN?

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)

I had a question in email this morning from someone that was basically does SQL Server use a read/write thread per LUN?

The short answer is no.

My friend Bob Dorr from CSS actually debunked this publicly way back in  2007 in the blog post SQL Server Urban Legends Discussed and also explains how some of the wording in earlier versions of Books online led to the myth that SQL Server uses one read/write thread per LUN or per data file. This myth leads people to believe that adding more data files will increase I/O performance because more threads will be used to do the writing. While it’s true that more data files can increase performance, depending on your I/O subsystem and the file/filegroup layout on that I/O subsystem, it’s nothing to do with that myth.

Any thread in SQL Server that’s processing a query can issue an asynchronous read request. When a thread makes a call into the buffer pool to get  a pointer to a data file page copy in memory, if the page isn’t already in memory then it has to issue a read and then wait for the read to complete (called a physical I/O). On the other hand, if the page already is in memory, the thread pops back out to where it came from (usually the Access Methods code) and that’s a logical I/O, with no accompanying wait. Pages can also be read in larger chunks by things like index seek/scan readahead and DBCC CHECKDB‘s readahead.

Pages can be written to disk in a variety of ways, including:

  • Through a regular checkpoint by a checkpoint background thread
  • Through lazy writing when there’s memory pressure on the buffer pool bu a lazy writer background thread
  • Through eager writing, when performing a minimally-logged operation in (guaranteeing that the full result of the operation is durable when it finishes)
  • Through database snapshot real-time maintenance, when a page is first changed after a database snapshot is created, it is synchronously written to the snapshot

There’s one exception though: backup and restore *will* use a separate reader or writer thread per device to aid in I/O parallelism, but backup and restore don’t use the buffer pool and open direct channels to the devices themselves.

Bottom line: for regular operations, the one-thread-per-file/LUN is a big myth that’s still around today.

 

The Curious Case of… trying to find an MDF file in a RAW disk

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)

Continuing on from my theme last time of using a hex editor to find information, this morning I had an email from someone who was trying to find an MDF file in a RAW disk without a file system. Their client had suffered a catastrophic I/O subsystem failure and had ended up with disks that could only be accessed using the RAW file system, and the only valid backups were from 2018.

I explained that I was teaching so unable to consult, but as usual with nasty corruption cases, I’d be happy to swap a few emails to see if I could provide some help anyway.

The first thing is to find the start of the file. My thoughts on this were:

  • Look at the boot page from the old database (dbcc page (name, 1, 9, 3)), and find the offset of the string with the name of the database from the start of the boot page.
  • Above is an screenshot of part of a DBCC PAGE dump of the boot page for a database called salesdb. You can see that the name of the database starts on the third line with ‘7300’, where the cursor is. This is the byte reversed Unicode for lowercase s. So the name starts at offset 0x34 into the boot page record, which itself starts at offset 0x60 into the page. So the name string starts at offset 0x94, or 148, bytes from the start of the boot page.
  • Search from the start of the raw disk for the first occurrence of the pattern of bytes with the name of the database, then go back 148 bytes and look to see if you’ve found the boot page by checking the page ID (the 6 bytes starting at offset 32 from the start of the page; 4 byte page number, byte reversed; 2 byte file number, byte reversed). Now go back 73728 bytes in the disk and hopefully that should be the file header page, page zero.

Then you can continue looking for pages from there, searching for the first two bytes being ‘0100’ and then a valid page ID as described above.

You could even build some automation to look for specific allocation unit IDs in the page header m_objId field (4 bytes, byte reversed at offset 24 in a page) and m_indexId field (2 bytes, byte reversed at offset 6 in a page), using the following code (from a disaster recovery case I worked on a few years ago):

-- Plug in the allocation unit ID you're interested in
DECLARE @alloc BIGINT = requiredallocunitID;
DECLARE @index BIGINT;

SELECT @index =
    CONVERT (BIGINT,
    CONVERT (FLOAT, @alloc)
        * (1 / POWER (2.0, 48)) -- right shift, reciprocal of left shift
    );

SELECT
    CONVERT (BIGINT,
    CONVERT (FLOAT, @alloc - (@index * CONVERT (BIGINT, POWER (2.0, 48))))
        * (1 / POWER (2.0, 16)) -- right shift, reciprocal of left shift
    ) AS [m_objId],
    @index AS [m_indexId];
GO

You might run into false positives, but you should be able to find all the pages. I heard back from the original sender that he’d found several million pages from the MDF so far, using a similar method.

This reminds me of another heroic data recovery case I blogged about back in 2011 – see here – where someone did the same thing to salvage a database whose most recent valid backup was from *SIX* years previously.

Bottom line: up-to-date and valid backups will prevent someone having to do this to get your data back, but, with a little ingenuity and lots of time, nearly anything is possible :-)

The Curious Case of… how to find FILESTREAM info from an MDF

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)

A couple of weeks ago a question came in to the MCM email distribution list where someone wanted to know how to find FILESTREAM info from an MDF. They were working with a client who needed to attach a database and all they had was the MDF – no backup, and no access to the database attached anywhere else. The attach was failing because the database had a FILESTREAM filegroup defined, and so the CREATE DATABASE … FOR ATTACH statement needed to specify the FILESTREAM filegroup name and a new location for it.

Given that there was no SQL Server access to the database or a backup, the only way to find the information was through examining the MDF directly.

Here are the steps I recommended:

  • Download and install the freeware hex editor HxD from here.
  • Open the MDF file in HxD.
  • Hit Search, change the data type to hex values and search for the string ‘6C 00 64 00 66 00’, which is the Unicode byte-reversed hex for ‘ldf’.
  • Hit F3 until it can’t find any more. Now you’re positioned in the list of file for the database.
  • Just below that will be pathnames of other files, so scroll down until you come to one that doesn’t have ldf, mdf, or ndf at the end of the pathname.
  • This is the pathname for the FILESTREAM data container, and above that is the filegroup name.

As an example, the image below shows a portion of the MDF for a test database I created:

C:\SQLskills\Documents is the pathname and FSWaitsDocuments is the logical name.

Bottom line: you never know when a little hex-editor spelunking will come in handy!