Over the next few months I’ll be blogging a lot about new features that are coming in SQL Server 2008 for DBAs and ITPros. First up is Backup Compression. This has been one of most heavily requested features since before I joined the SQL team back in 1999 – for me it’s really cool that it’s in this coming release. It’s widely recognized that database sizes are growing significantly, and that the growth is being forced by many factors, such as companies requiring more data to be available online for longer (e.g. to comply with government regulations) or an increasing amount of data being digitized for storage. Although the per/GB cost of storage is falling, the actual cost of keeping a database available is increasing. Here’s why: In addition to costs, the elapsed times of backup and restore operations need to be considered. Backup and restore operations are essentially IO-bound. For a backup in SQL Server 2005 and before, the whole database has to be read and then written to a new location – with the total IO size of the writes to the backup equalling that of the reads from the database. The same is true for a restore, with reads from the backup equalling writes to the restored database. (As an aside, a restore operation also has the added CPU cost of having to examine each page as it’s read from the backup to figure out where it should be written to in the restored database – this usually makes a restore operation take 10-20% more elapsed time than a backup.) The perfect solution to these problems, at the expense of sometimes-considerable CPU time, is to compress the database as it’s written into the backup. This reduces storage space per backup, required network bandwidth per backup copy, and elapsed time for backup and restore operations. The last point – reducing elapsed time for restore operations – is especially important in disaster recovery situations, where anything that can help reduce database downtime is a good thing. [Edit: I've added a little here based on comments I received - thanks Brent and Marc!] A less ideal solution I’ve seen is to take a regular backup and then compress it after-the-fact before copying to other locations. While this is a reasonable solution, it requires more disk space than compressing the database as its backed up, and it’s a more complicated procedure. It also increases the time to take the backup, as the compression is done in a seperate step. An alternative to compressing the backup at all is to make the backup location a compressed NTFS directory. While this achieves the compression goal, it doesn’t permanently compress the backup so doesn’t reduce the network bandwidth required to copy the backup or the space needed to archive the backup to tape. Up until SQL Server 2008, the only solutions for compression-during-backup have come from third-party software vendors. Although these solutions do the job of aleviating the problems I’ve described above, there are two major roadblocks to their adoption that I’ve heard from SQL Server customers: In SQL Server 2008, Microsoft will include a long-awaited and much-asked-for backup compression solution – eliminating the two roadblocks above. This is a fantastic first step improving backup/restore functionality – hopefully in the release after SQL Server 2008 we’ll see further innovations that will allow encrypted backups, table-level restore, and easier validation of the database stored in a backup. Some points to note: Once backup compression is available in a public CTP, I’ll blog some sample scripts and report on things like: In summary, backup bompression is a very exciting feature and should enable many customers to save money, either on additional software licenses or on storage/network costs, and time, especially in all-important disaster recovery situations. Thanks
4 Responses to SQL Server 2008: Backup Compression
Excellent news!
Why a proprietary format and not something commonly available (like GZip streams) so new third-party tools can be written against the new format?
Also, how will this compare (on disk only, of course) to the strategy of making the backup folder’s directory a compressed folder using NTFS compression. I would assume the ratios would be much better, but how much so?
Another advantage to using a native tool for compression is standardization. I’ve had times where I needed to send a database backup to another branch or division of the company who either wasn’t using a third party compression tool, or was using a different tool. Simply having a standard compression tool included with SQL gives an easy option to migrate data between branches & divisions.
The problem with making the backup folder’s directory a compressed folder with NTFS is that it doesn’t reduce the file’s volume when copied across the network or to tape. You have to actually compress the file if you want less network traffic and less tape volume.
[...] Remember also that SQL Server 2008 has native backup compression too – see my blog post here for more [...]