PREEMPTIVE_OS_FLUSHFILEBUFFERS waits on Linux

Last week I blogged about Using the Wait Statistics Report in Azure Data Studio, using a Linux VM in Azure as an example host. As that was my first time running wait stats analysis on a Linux host, I found that the PREEMPTIVE_OS_FLUSHFILEBUFFERS waits on Linux will show up as the top wait on SQL Server 2017 CU6 and higher with default options.

The simple reason for this is that on some Linux servers where the underling I/O subsystem cannot guarantee durable writes when the power fails (i.e. it’s using a volatile write cache), a power failure could lead to data loss within SQL Server. To work around this, 2017 CU6 introduced forced-flush behavior for log and data writing, including things like backups and writes to FILESTREAM files. This means that the writes will be done as normal, and then a call is made to the Windows FlushFileBuffers function, which is translated via the SQLPAL/HostExtension wrapper on Linux into an fdatasync call. There’s a very deep explanation of all the file system internals in Bob Dorr’s excellent post SQL Server On Linux: Forced Unit Access (Fua) Internals.

If you know your I/O subsystem is resilient, you can disable this behavior, as described in KB 4131496.

However, as this will be by far the highest wait on many Linux systems, and as it’s benign, I’ve added it to the list of waits I ignore and updated the Server Reports extension for Azure Data Studio to filter it out as well (this is now live as v1.4).

Enjoy!

PS I also just added the VDI_CLIENT_OTHER wait to the ignore list as well (common from Availability Group seeding), and submitted GitHub changes to Server Reports yesterday, which will be v1.5.

The Curious Case of… copying FILESTREAM files between tables

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)

I was asked last week whether it’s possible to create a table with a FILESTREAM column and then populate that column by copying FILESTREAM files from another directory in the FILESTREAM data container.

The simple answer is no – you can’t use that method for copying FILESTREAM files between tables.

In the data container, each table with a FILESTREAM column (or partition of a partitioned table with a FILESTREAM column) has a directory with a GUID name, and then there’s a sub-directory for each FILESTREAM column that also have GUID names. Then the FILESTREAM files themselves have filenames in the format 8hexdigits-8hexdigits-4hexdigits. The hex digits are the Log Sequence Number (LSN) of the log record that created the FILESTREAM file, and the data record contains that LSN as part of the in-row portion of the FILESTREAM column. (You can read a full explanation of the storage format in my post FILESTREAM directory structure.)

So, simply copying the files from one directory to another wouldn’t work for two reasons:

  • The FILESTREAM column in the data record of the target table wouldn’t be populated
  • And even if you created empty FILESTREAM columns, the LSN in the column wouldn’t match the LSN of the copied-in files

And you can’t use SELECT-INTO either, because it will not create FILESTREAM files and instead copies them and stores them in the new table as varbinary(max) with a limit of 2GB.

Bottom line: You can use INSERT INTO <target table> SELECT * FROM <source table> and that *will* populate the FILESTREAM files, but that means reading the FILESTREAM data through SQL Server’s buffer pool, which may be very inefficient for large amounts of data. So your best bet may be to write a little application that will using NTFS streaming access to directly copy the FILESTREAM data from the source table to the target table.

Using the Wait Statistics Report in Azure Data Studio

Azure Data Studio, which used to be called SQL Operations Studio, has a bunch of extensions available, including one called Server Reports from Microsoft. Last year they took my wait stats query (from here) and made it into a report as part of Server Reports extension. In this quick post I’ll show you how to install that extension and look at the report.

I’ve been using Azure Data Studio a lot as part of working with SQL Server 2017 on Linux in Azure and I like it a lot. It’s obviously not as comprehensive as SSMS, but for simple stuff it does the job.

You can see the pretty big list of available extensions by hitting the Extensions icon:

In the list that appears, find the Server Reports extension and click the Install button:

Once an extension is installed, you need to activate it. Without reading any documentation, the first time I used Azure Data Studio I couldn’t figure out why the extension wasn’t there once I’d installed it. Click the Reload button to active the extension:

The extension will now show up in the list of enabled extensions:

If the Server Dashboard screen isn’t showing, right-click on one of the servers you’re connected to and select Manage. All the active extensions will show as tabs at the top of the screen, so select Server Reports and then click the heart-shaped icon and you’ll see the current aggregate wait stats for the instance:

Note that the x-axis is percentage of all waits, not wait count. You’ll see that PREEMPTIVE_OS_FLUSHFILEBUFFERS is the top wait on my Linux instance – that’s by design and I’ll blog about that next. I’ve also submitted a GitHub change to add that wait to the list of waits filtered out by script the extension uses.

Anyway, you can drill in to the details by clicking the ellipsis at the top-right of the graph and selecting ‘Show Details’. That’ll give all the waits and by selecting each one you can see the usual output from my waits script. To get more information on what each wait means, select the bottom cell, right-click on the URL to copy it, and paste into your favorite browser to go to my waits library. And of course, you can refresh the results via the ellipsis as well.

There are many more useful extensions that you should check out too – enjoy!