SQL Server 2008 High Availability whitepaper published on MSDN

The 35-page whitepaper on high availability I wrote for the SQL team over the summer has been published on MSDN. It’s a 200-300 level whitepaper that describes the various high-availability technologies in SQL Server 2008 and how they can be used to mitigate disasters. It’s chock-full of links to other whitepapers, technical articles, and Books Online sections, and also presents my methodology for planning a high-availability strategy.

You can get it at http://msdn.microsoft.com/en-us/library/ee523927.aspx.


Here’s the table of contents:

  • Introduction
  • Causes of Downtime and Data Loss
    • Planned Downtime
    • Unplanned Downtime and Data Loss
  • Planning a High-Availability Strategy
    • Requirements
    • Limitations
    • Technology Evaluation
  • SQL Server 2008 High-Availability Technologies
    • Logging and Recovery
    • Backup, Restore, and Related Technologies
      • Partial Database Availability and Online Piecemeal Restore
      • Instant File Initialization
      • Mirrored Backups
      • Backup Checksums
      • Backup Compression
    • Other Single-Instance Technologies
      • Online Operations
      • Database Snapshots
      • Hot-Add Memory and CPU
      • Resource Governor
    • Multi-Instance Technologies
      • Log Shipping
      • Transactional Replication
      • Database Mirroring
      • Failover Clustering
      • Combining Multi-Instance Technologies
      • Virtualization
  • Mitigating the Causes of Downtime and Data Loss
  • High-Availability Features Supported by SQL Server 2008 Editions
  • Conclusion

SQL Server 2008 Data Compression whitepaper published

The SQLCAT team have published another excellent whitepaper – this time the long-awaited one on the SQL Server 2008 data compression feature. Thirteen people inside and outside Microsoft (including me) provided technical reviews, and the authors (Sanjay Mishra along with Marcel van der Holst, Peter Carlin, and Sunil Agarwal) did a great job. I remember leading an effort back in 2005 to see if we (the SQL team) could get some form of data compression into SQL Server 2005 RTM (no, obviously), so it's great to see data compression out there, and now with top-class prescriptive guidance on when and how to use it.

Bottom line: don't just go and turn it on without analyzing whether you'll get a decent compression ratio and whether your workload is suited to data compression.

You can get to the whitepaper at Data Compression: Strategy, Capacity Planning and Best Practices, and I'll add it to our Whitepaper Links page.


Are your disk partition offsets, RAID stripe sizes, and NTFS allocation units set correctly?

I've just been setting up some of our new hardware, and wanted to do some background reading to ensure I use the correct disk partition offset, RAID stripe size, and NTFS allocation unit size to enable the best possible performance for the volumes I'm creating.

You may not have heard about this (or your disk admins may not have heard about this), but on Windows Server 2003 and before, the default partition offset typically causes worse-than-optimal performance – and correcting it can yield gains of as much as 30% in terms of I/O latency and duration. The SQLCAT team have just published a *fantastic* whitepaper (written by Jimmy May and Denny Lee) which explains the issue simply and clearly and shows you how to correct it. You should check out the whitepaper at Disk Partition Alignment Best Practices for SQL Server.

The summary is that on Windows Server 2003 and before, the default partition offset is 31.5KB (63 x 512-byte disk sectors), which does not align with the common RAID stripe sizes of 64KB or 128KB, or with the optimal NTFS allocation unit size of 64KB. This means some reads and writes have to touch multiple stripes, causing a big perf drop. It can be fixed, as detailed in the whitepaper. For volumes *created* on Windows Server 2008, the problem does not exist, as it uses a default partition offset of 1024KB – although see the edit below…
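The alignment arithmetic above is easy to sanity-check. Here's a short Python sketch (the `is_aligned` helper is just for illustration; the numbers are the ones from the paragraph above):

```python
# A partition is aligned when its starting offset is an exact multiple of
# the RAID stripe size (and of the NTFS allocation unit size).
def is_aligned(partition_offset_bytes: int, unit_bytes: int) -> bool:
    return partition_offset_bytes % unit_bytes == 0

KB = 1024
old_default_offset = 63 * 512     # 32,256 bytes = 31.5KB (Windows Server 2003 and before)
new_default_offset = 1024 * KB    # 1,048,576 bytes = 1024KB (Windows Server 2008)
stripe = 64 * KB                  # a common RAID stripe size; 128KB behaves the same way

print(is_aligned(old_default_offset, stripe))   # False - 31.5KB is not a multiple of 64KB,
                                                # so some I/Os straddle two stripes
print(is_aligned(new_default_offset, stripe))   # True - 1024KB is a multiple of both 64KB and 128KB
```

The same check with a 128KB stripe or a 64KB allocation unit gives the same result, which is why the 1024KB default sidesteps the problem for the common configurations.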

In fact, Jimmy published a blog post just a few days ago to help you make the case to your disk admins/customers: Disk Partition Alignment (Sector Alignment): Make the Case: Save Hundreds of Thousands of Dollars.

Luckily I'm using Windows Server 2008, which correctly sets the disk partition offset for the vast majority of cases.

[Edit 06/30/2011: Beware! Even with Windows Server 2008, it's been reported that some I/O subsystem vendors intercept what Windows is trying to do and *still* create partitions with the incorrect offset. Create them and then check – just to be sure.]

Next thing I considered was RAID stripe size and NTFS allocation unit size (previously known as 'cluster size'). Kendal Van Dyke just published an *excellent* blog post series that provides a lot of empirical evidence as to what the best numbers are for the RAID level you're using. This saved me a lot of time. Check out his series at Disk Performance Hands On Series.

The Dell MD3000i units I'm using don't go any lower than 128KB for a RAID stripe size, so the default is fine. Unfortunately, I forgot to set the NTFS allocation unit size to 64KB when creating the partitions in Windows, so I need to recreate the partitions.

A massive thank-you to these guys for saving me a lot of time and hassle. You should go read this stuff too.