As I mentioned in a previous post, one of the new features for database mirroring in SQL Server 2008 is log stream compression. There’s a good explanation of what this is (along with some example workloads and performance graphs) on the SQL Customer Advisory Team blog so I’m not going to duplicate all that here.

Basically, the way mirroring works is by shipping all the transaction log from the principal to the mirror, and then replaying the log records on the mirror database. Network bandwidth can be a bottleneck for mirrored databases where the rate of transaction log generation is high enough to saturate the network. On synchronously mirrored databases (where a transaction cannot commit on the principal until all the transaction log for it has been hardened in the log of the mirror database), this can even lead to workload throughput degradation.

Log compression is a way to alleviate this problem. SS2008 Books Online states that compression rates of at least 12.5% are achieved – obviously the compression ratio could be much higher than that and is dependant on what’s being compressed – i.e. the data that is being processed by the application. One of the tests they did for the post above shows a 5x compression ratio – that’s pretty good.

The downside of this is that compression is not free – extra CPU must be used. The obvious CPU load increases comes from having to compress the transaction log on the principal and then decompress it again on the mirror before hardening it in the log. The non-obvious increase will come because the principal and mirror should end up being able to process more transactions/second, which means more CPU is needed.

This extra CPU could be a problem for some systems that are already pegged in terms of CPU load, and so they may actually see a drop in performance when log stream compression is enabled. Now, it’s on by default when you upgrade to SS2008 but there is a way to turn it off. The blog post above divulges that there is a trace flag, 1462, that turns off log stream compression and effectively reverts the behavior back to SS2005.

Apart from the obvious benefits of this feature for systems that can afford the extra CPU, I’m excited because it may allow more mirrored databases to perform effective database maintenance – in the form of fragmentation control and index maintenance. When database mirroring is enabled, the database has to be running in the full recovery mode. This means that operations such as index rebuilds are fully logged – that’s a lot of extra log being generated if you’re used to switching to bulk-logged recovery mode to perform index operations. Of course, this isn’t an issue for reorganizing an index (with ALTER INDEX … REORGANIZE) as this has been fully logged (and has to be that way because of how it works) since I wrote the old DBCC INDEXDEFRAG for SS2000.

Anyway, for some customers the fact that these two operations are fully logged means that running them produces so much additional log that, in conjunction with the log from the workload itself, the network becomes a bottleneck for mirroring and it slows down both the maintenance and reduces the workload throughput. The addition of log stream compression means that for these systems, the log may be compressed enough so that the network is no longer a bottleneck and more regular fragmentation control can take place. This in turn will increase workload throughput as well – a double benefit!