Well it’s the end of our first week in China – tomorrow we fly to Beijing for another week of teaching. Hopefully we’ll get to do some sightseeing on Sunday – the last time I was in Beijing (for TechEd ’06) the conference hotel was about 25 miles outside the city centre – so I haven’t really been to Beijing yet.

During this week I’ve been playing with FILESTREAM for some demos I’m writing for a Microsoft class I’ll be teaching when I get back to Redmond. The class is about SQL Server 2008 for DBAs and the attendees will be a bunch of Microsoft Field personnel and SQL Server MVPs. Anyway, as I was playing, I realized that I hadn’t blogged anything about FILESTREAM yet, so this is the start of a series of posts about the feature. First up – what is it?

One problem that SQL Server users face is how to store related structured and unstructured data (BLOBs) together while:

  • Providing a way to keep the data in sync (transactionally consistent)
  • Providing fast streaming access to the BLOBs
  • Keeping costs low
  • Enabling scalability
  • Providing ease of management

There’s also been a strong desire for a data type that supports BLOB values greate than 2GB for many years.

Before SQL Server 2008, the solutions centered around:

  1. Storing the BLOBs in the file system

    • Advantages: low cost per GB; great streaming performance
    • Disadvantages: difficult to integrate with structured data; complex app development/deployment

  2. Storing the BLOBs on a dedicated BLOB store

    • Advantages: good scale/expandability; cost decreases as scale increases
    • Disadvantages: difficult to integrate with structured data; complex app development/deployment; streaming support/performance is solution dependent

  3. Storing the BLOBs in a database

    • Advantages: integrated management; BLOBs integrated with structured data; easy app development/deployment
    • Disadvantages: high cost per GB; poor streaming performance; 2GB size limit per BLOB

Enter FILESTREAM. It provides the following:

  • BLOB data is stored in the file system but rooted in the database (in the table of which it is part)
  • BLOB data is kept transactionally consistent with structured data
  • BLOB data is accessible through T-SQL and the NTFS Streaming APIs – with great streaming performance
  • BLOB size is limited only by the NTFS volume size
  • Manageability is integrated into SQL Server

Sounds pretty good eh? Well, it is – mostly. There are a few drawbacks with v1 however:

  • Database mirroring cannot be configured on databases with FILESTEAM data
  • Database snapshots don’t snapshot FILESTEAM data
  • FILESTREAM data can’t be natively encrypted by SQL Server

Personally, I think the first of these could be a major barrier to adoption in the enterprise – hopefully this restriction will be lifted in v2.

So – there’s a taster. In the next few posts I’ll detail how to enable FILESTREAM and create FILESTREAM data.