High-performance FILESTREAM tips and tricks

Over the last few weeks I’ve had lots of questions about FILESTREAM performance and how to coax NTFS into scaling nicely. I just finished writing a 30-page whitepaper about FILESTREAM for the SQL Server team, which should be published before PASS in November (I’ll blog the link when I have it). Although my whitepaper isn’t strictly about performance, there is a long section about setting up your system to get high-performance from FILESTREAM. What I want to do in this blog post is give a bullet list of things to do that will help you get good performance. All of these are explained in more detail in the whitepaper.

Here you go, in no particular order:

Make sure you’re storing the right-sized data in the right way. Jim Gray (et al) published a research paper a couple of years ago based called To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem?. To summarize the findings, BLOBs smaller than 256-KB should be stored in a database, and 1-MB or larger should be stored in the file-system. For those in-between, “it depends” – my favorite answer. The upshot of this is that you won’t get good performance if you store lots of small BLOBs in FILESTREAM.
Use an appropriate RAID level for the NTFS volume that will host the FILESTREAM data container (the NTFS directory structure corresponding to the FILESTREAM filegroup in the database). Don’t use RAID-5, for instance, for a write-intensive workload.
Use an appropriate disk technology. SCSI will be usually be faster than SATA/IDE, but more expensive. This is because SCSI drives usually have higher rotational speeds, so lower latency and seek times.
Whichever disk technology you choose, if SATA, ensure it supports NCQ, and if SCSI, ensure it supports CTQ. Both of these allow the drives to process multiple, interleaved IOs concurrently.
Separate the data containers. Separate them from each other, and separate them from other database data and log files. This avoids contention for the disk heads.
Defrag the NTFS volume if needed before setting up FILESTREAM, and periodically to maintain good scan performance
Turn off 8.3 name generation on the NTFS volume. This is an order-N algorithm that has to check that the new name generated doesn’t collide with any existing names in the directory. This slows insert and update performance down *a lot*. Do this using the command line fsutil utility.
Turn off tracking of last access time using fsutil.
Set the NTFS cluster size appropriately. For BLOBs 1-MB or large, use a cluster size of 64-KB. This will help to reduce fragmentation.
A partial update of FILESTREAM data creates a new file. Batch lots of small updates into one large update to reduce churn.
When streaming the data back to the client, use an ~60-KB SMB buffer size (or multiples thereof). This is so that the buffers don’t get overly fragmented as TCP/IP buffer are 64-KB.

Hope this helps!

9 thoughts on “High-performance FILESTREAM tips and tricks”

nora says:

November 27, 2008 at 4:32 am

Dear webmaster,
I am a computer programmer in Japan.

And I think them useful not only for English speaking people
but for Japanese people.

I would like to translate your page into
Japanese and publish as a web page (on my personal non-profit web site).
Can I possibly do that?

#This is my personal web site
#I’m sorry,this site is Japanese only…
#http://techbank.jp/Community/blogs/nora/

Reply
paul says:

November 28, 2008 at 10:23 am

Which pages – just this one or the whole blog? If it’s just this one, fine – but not the whole blog.

Thanks

Reply
nora says:

November 29, 2008 at 10:39 am

I understand and I think I want to translate just this page.
Thank you !

Reply
Mike Lawell says:

February 11, 2009 at 6:44 am

Paul, has any one taken a look at the FILESTREAM performance on Windows Server 2008? Especially as it relates to the Win32 API and/or SMB 2? Are there additional tips and tricks to account for in the new OS?

Reply
paul says:

February 14, 2009 at 11:42 am

Hi Mike – not as far as I know, and I haven’t. I’ll ping the team and post what I find.

Reply
Teun van den Biggelaar says:

July 3, 2019 at 1:36 am

Hi Paul,

Are above bullet points still valid for Windows 2016 / 2019 ?
Thanks in advance.

Reply
1. Paul Randal says:
  
  July 7, 2019 at 4:18 pm
  
  Yes.
  
  Reply
Henrik says:

November 22, 2022 at 10:09 am

Whats the Link for the paper

Reply
1. Paul Randal says:
  
  November 27, 2022 at 6:03 pm
  
  https://learn.microsoft.com/en-us/previous-versions/sql/sql-server-2008/hh461480(v=msdn.10)?redirectedfrom=MSDN
  
  Reply

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

High-performance FILESTREAM tips and tricks

9 thoughts on “High-performance FILESTREAM tips and tricks”

Leave a Reply Cancel reply

Other articles

2024: the year in books

2023 (not a typo!): the year in books

Black Friday Super Sale!

Summer School Savings SALE is live!

The Curious Case of… missing log information

The Curious Case of… missing 832 errors during memory corruption

Imagine feeling confident enough to handle whatever your database throws at you.