Monday, April 07, 2008

Now we're back from Iceland and I have a week to catch up with some content development before the MVP Summit next week and then SQL Connections the following week.

One of the things I struggled with earlier in the year while writing a SQL Server 2008 training course for Microsoft was how to get FILESTREAM to work with partitioning. There wasn't (and still isn't) any information in Books Online that I could find so I had to play around to figure it out.

I should say that the CTP-6/February CTP version of Books Online *does* have a bunch of code examples around using FILESTREAM, so I'm not going to write a blog post about that. Look in the Getting Started with FILESTREAM Storage section (or paste this link into the Books Online URL: window).

Back to partitioning - first I created a test database:

CREATE DATABASE FileStreamTestDB
ON PRIMARY
   
(NAME = FileStreamTestDB_data,
      
FILENAME = N'C:\Metro Demos\FileStreamTestDB\FSTestDB_data.mdf'),
FILEGROUP FileStreamFileGroup CONTAINS FILESTREAM
   
(NAME = FileStreamTestDBDocuments,
      
FILENAME = N'C:\Metro Demos\FileStreamTestDB\Documents')
LOG ON
   
(NAME = 'FileStreamTestDB_log',
      
FILENAME = N'C:\Metro Demos\FileStreamTestDB\FSTestDB_log.ldf');
GO

Then I tried the obvious, knowing that I can't partition on the ROWGUIDCOL:

CREATE PARTITION FUNCTION MyPartFunction (INT) AS RANGE RIGHT FOR VALUES (1000, 2000);

CREATE PARTITION SCHEME MyPartScheme AS PARTITION MyPartFunction ALL TO ([PRIMARY]);

CREATE TABLE FileStreamTest (
   
TestId UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
   
Customer INT,
   
[
Name] VARCHAR (25),
   
[
Document] VARBINARY(MAX) FILESTREAM)
ON MyPartScheme (Customer);
GO

Partition scheme 'MyPartScheme' has been created successfully. 'PRIMARY' is marked as the next used filegroup in partition scheme 'MyPartScheme'.
Msg 1921, Level 16, State 4, Line 8
Invalid filegroup 'default' specified.

Eventually I worked out that you have to define a separate partitioning scheme just for FILESTREAM data. This is because the regular data is stored on non-FILESTREAM filegroups, so trying to use the regular partitioning scheme for FILESTREAM would mean telling the Engine to store the FILESTREAM data in non-FILESTREAM filegroups. Clearly a non-starter. Ok - try again with a separate partitioning scheme (the prior MyPartFunction partition function and MyPartScheme partition scheme already exist now remember):

CREATE PARTITION SCHEME MyFSPartScheme AS PARTITION MyPartFunction ALL TO ([FileStreamFileGroup]);

CREATE TABLE FileStreamTest (
   
TestId UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
   
Customer INT,
   
[Name] VARCHAR (25),
   
[Document] VARBINARY(MAX) FILESTREAM)
ON MyPartScheme (Customer)
FILESTREAM_ON MyFSPartScheme;
GO

Partition scheme 'MyFSPartScheme' has been created successfully. 'FileStreamFileGroup' is marked as the next used filegroup in partition scheme 'MyFSPartScheme'.
Msg 1908, Level 16, State 1, Line 1
Column 'Customer' is partitioning column of the index 'UQ__FileStreamTest__03317E3D'. Partition columns for a unique index must be a subset of the index key.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.

Hmm - I can't partition on Customer because there's already a unique index over TestId - UNLESS I specifically set the unique index on TestId to be non-partitioned by setting a filegroup for it:

CREATE TABLE FileStreamTest (
   
TestId UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE ON [PRIMARY],
   Customer INT,
   
[Name] VARCHAR (25),
   
[Document] VARBINARY(MAX) FILESTREAM)
ON MyPartScheme (Customer)
FILESTREAM_ON MyFSPartScheme;
GO

That works! Now - the BIG issue with this setup is that switching partitions won't work while the unaligned index is enabled. So how to disable it? First we need to find out what it's called:

SELECT * FROM sys.indexes WHERE [object_id] = OBJECT_ID ('FileStreamTest');
GO

And then disable it:

ALTER INDEX UQ__FileStre__8CC33161060DEAE8 ON FileStreamTest DISABLE;

Now you can do partition switching. Here's the catch - to re-enable the index you need to REBUILD it - which is a size of data operation! The upshot of all this is that partitioning can be made to work with FILESTREAM data but partition switching is no longer a metadata-only operation.

Hopefully this will be addressed for V2.

Monday, April 07, 2008 3:07:33 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, March 09, 2008

During the various courses I've been teaching, people are interested in how FILESTREAM performance compares with storing BLOBs in the database itself. I have some performance graphs based on measurements the dev team made - these have been presented publicly by myself and the dev team so I can share them with you here.

There are three graphs below, showing the relative performance for read, insert, and update of:

  • BLOB data stored in FILESTREAM format and accessed through the WIN32 streaming APIs. The times include getting a transaction context from SQL Server, getting the file path, doing the operation, closing the file, and committing the transaction in SQL Server.
  • BLOB data stored in FILESTREAM format and manipulated through T-SQL
  • BLOB data stored in varbinary(max) format (and obviously manipulated through T-SQL)

The data is the same in each test. The tests were performed on a 4-way box with a cold buffer pool. One interesting point to note is that for smaller data sizes, it's faster to manipulate them through T-SQL than through the file system - this is expected based on research Jim Gray did when putting together the original TerraServer.

Sunday, March 09, 2008 9:45:15 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Friday, March 07, 2008

Just over a month ago I posted on how to enable FILESTREAM in CTP-5 and the pre-CTP-6 build I was working with. Now that's all changed! CTP-6 is a hybrid of the CTP-5 method and what will eventually be the methodology in RTM. The changes were made to separate the OS-level configuration from the SQL-level configuration, so a SQL admin can't invoke OS-level changes. This makes sense to me.

In a nutshell, the new way of enabling FILESTREAM will be:

  • OS admin enables FILESTREAM when installing SQL Server or through the SQL Server Configuration Manager
  • SQL admin enables FILESTREAM in the instance using sp_configure. This will always succeed, but if this is done before the OS-level enabling, then FILESTREAM operations will fail.

I won't go into all the details as Joanna Omel (the Program Manager for FILESTREAM in the Storage Engine) has blogged about them on the Storage Engine PM team blog - see here for her post.

More on CTP-6 changes as I hear about them (or trip over them!)

Friday, March 07, 2008 11:10:52 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, February 14, 2008

Phew - last week Kimberly and I spent 3 days teaching the ins-and-outs of SQL Server 2008 for DBAs/IT-Pros to about 130 Microsoft SQL Server experts and MVPs (like Kalen Delaney, Adam Machanic and Ron Talmage). This was the (95% complete) Beta delivery of a course we've been developing for the last six months for Microsoft that they'll use to train their SQL experts around the world on the new release. It's been very interesting watching the features develop through the CTPs (especially since I left the fold last August) - and making demos work on pre-release builds of the CTPs.

Teaching the course was a *blast* - the thing I love about teaching a really geeky crowd is the plethora of great questions and opportunities for going deep with explanations. Our team actually wrote and delivered the concurrently presented Developer and BI tracks as well. As you can see from the list below (and this is just the features a DBA needs to use/know about), SQL Server 2008 isn't a dot release of Yukon at all, as some people have suggested. Over the three days we covered:

  • Database Mirroring (D)
  • Backup Compression
  • Peer-to-Peer Replication (D)
  • Transparent Data Encryption (D)
  • Extensible (Off-Box) Key Management
  • All Actions Audited (D)
  • Policy-Based Management
  • Resource Governor (D)
  • Extended Events (D)
  • Spatial Indexes
  • Integrated Full-Text Search
  • Sparse Columns (D)
  • Filtered Indexes
  • Change Tracking
  • Change Data Capture (D)
  • FILESTREAM (D)
  • Performance Data Collection
  • Query Optimizer Enhancements
  • Data Compression (D)
  • Service Broker
  • Partition-Level Lock Escalation (D)

The features marked with a (D) are ones I demo'd during the course (Kimberly demo'd a bunch of the others - especially the tools features). Some of the demos were challenging to make work in time as we only got a pre-CTP6 build mid-January just before we headed off to China.

So why am I posting this? Well, a bunch of these features are in CTP-6, which should be just around the corner, and I have some easy-to-understand demos of them that I'll be posting here over the next month or so. Also, if this course sounds interesting, Kimberly and I will be teaching it in various configurations over the next year - starting with SQL Connections in April, a soon-to-be-announced class in Iceland in March, and the ITPro portion of TechEd in June.

Watch this space starting next week (today's the last day of six straight weeks of teaching for us so this weekend's a break :-))

Thursday, February 14, 2008 4:38:30 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Monday, February 04, 2008

The second post in my series on FILESTREAM (see here) deals with how to enable FILESTREAM through T-SQL and using SSMS.

You can't just create FILESTREAM data - you have to specifically enable it at the instance level and then possibly restart the instance or host computer. The reason for this is that there's a file-system filter driver (called RsFx) installed to aid access to the data through the NTFS Streaming APIs. The rules for this are:

  • The computer needs to be restarted if this is the first time FILESTREAM has been enabled on any instance hosted on the machine
  • The instance needs to be restarted if you disable FILESTREAM after its been enabled

In the previous article I mentioned that FILESTREAM can't be enabled on a mirrored database - there's another restriction I forgot: FILESTREAM isn't supported for instances running on WOW64. I don't remember the architectural limitation that prevents this but I do remember the heated discussions in the dev team back in early 2007.

Anyway, I digress. You can enable FILESTREAM through T-SQL using the sp_filestream_configure stored procedure. It takes two parameters @enable_level and @share_name. The level of support is configured through the first and has the following options:

  • 0 - FILESTREAM is disabled for the instance
  • 1 - FILESTREAM is enabled for T-SQL access only
  • 2 - FILESTREAM is enabled for T-SQL AND local file-system access
  • 3 - FILESTREAM is enabled for T-SQL, local file-system, AND remote file-system access

If level 3 is specified, then the a share name for remote access must also be specified. For example:

EXEC sp_filestream_configure @enable_level = 3, @share_name = 'MyFilestreamSQLServerInstance';
GO

Note that once the share name is specified, it can't be changed without disabling and re-enabling FILESTREAM on the instance.

Now, if you don't want to have to remember the options you can use SSMS to do the enabling (with no change to the restart requirements listed above). To do this you right-click on the server name in the Object Explorer and select Properties. When the Server Properties window comes up, select the Advanced tab and right at the top is the FILESTREAM section. Here's what I mean:

Next up is creating a FILESTREAM filegroup and adding data.

Monday, February 04, 2008 11:57:33 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Friday, January 25, 2008

Well it's the end of our first week in China - tomorrow we fly to Beijing for another week of teaching. Hopefully we'll get to do some sightseeing on Sunday - the last time I was in Beijing (for TechEd '06) the conference hotel was about 25 miles outside the city centre - so I haven't really been to Beijing yet.

During this week I've been playing with FILESTREAM for some demos I'm writing for a Microsoft class I'll be teaching when I get back to Redmond. The class is about SQL Server 2008 for DBAs and the attendees will be a bunch of Microsoft Field personnel and SQL Server MVPs. Anyway, as I was playing, I realized that I hadn't blogged anything about FILESTREAM yet, so this is the start of a series of posts about the feature. First up - what is it?

One problem that SQL Server users face is how to store related structured and unstructured data (BLOBs) together while:

  • Providing a way to keep the data in sync (transactionally consistent)
  • Providing fast streaming access to the BLOBs
  • Keeping costs low
  • Enabling scalability
  • Providing ease of management

There's also been a strong desire for a data type that supports BLOB values greate than 2GB for many years.

Before SQL Server 2008, the solutions centered around:

  1. Storing the BLOBs in the file system
    • Advantages: low cost per GB; great streaming performance
    • Disadvantages: difficult to integrate with structured data; complex app development/deployment
  2. Storing the BLOBs on a dedicated BLOB store
    • Advantages: good scale/expandability; cost decreases as scale increases
    • Disadvantages: difficult to integrate with structured data; complex app development/deployment; streaming support/performance is solution dependent
  3. Storing the BLOBs in a database
    • Advantages: integrated management; BLOBs integrated with structured data; easy app development/deployment
    • Disadvantages: high cost per GB; poor streaming performance; 2GB size limit per BLOB

Enter FILESTREAM. It provides the following:

  • BLOB data is stored in the file system but rooted in the database (in the table of which it is part)
  • BLOB data is kept transactionally consistent with structured data
  • BLOB data is accessible through T-SQL and the NTFS Streaming APIs - with great streaming performance
  • BLOB size is limited only by the NTFS volume size
  • Manageability is integrated into SQL Server

Sounds pretty good eh? Well, it is - mostly. There are a few drawbacks with v1 however:

  • Database mirroring cannot be configured on databases with FILESTEAM data
  • Database snapshots don't snapshot FILESTEAM data
  • FILESTREAM data can't be natively encrypted by SQL Server

Personally, I think the first of these could be a major barrier to adoption in the enterprise - hopefully this restriction will be lifted in v2.

So - there's a taster. In the next few posts I'll detail how to enable FILESTREAM and create FILESTREAM data.

Friday, January 25, 2008 1:51:30 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Theme design by Jelle Druyts

Pick a theme: