Conference Questions Pot-Pourri #9: Q&A around compression features

Today's post is based on a bunch of questions I've had around the various compression features in SQL Server 2008.

Does turning on data compression or backup compression compress the transaction log files on disk?

No. The transaction log is not compressed in any way under any circumstances. Rows from tables and indexes that have compression enabled will be logged in their row compressed forms, even if page compression is enabled.

Does data compression compress LOB columns?

No. There is no native compression support for any LOB columns (n/text, image, n/varchar(max), varbinary(max), XML), whether stored in-row or out-of-row. There's also no native compression support for FILESTREAM data.

Does log shipping use compression to compress the logs being shipped?

Log shipping does not ship transaction logs – it ships log *backups*. If backup compression is enabled for the instance hosting the log shipping primary database, or the log shipping job is changed to enable backup compression, then the log backups will be compressed and less data will be sent over the wire to the log shipping secondary(s).

Is backup compression the same as log stream compression with database mirroring?

No. Backup compression compresses backups (see my previous blog post here). Log stream compression with database mirroring compresses transaction log records before sending them between the principal and the mirror (see my previous blog post here).

Should I just turn on backup compression at the instance level?

Not necessarily. It depends whether the majority of database on the instance will benefit from backup compression. Backup compression (and any compression algorithm) uses CPU whether a decent compression ratio is achieved or not. Check what compression ratio is achieved first and then enable backup compression if its worth it. Otherwise, just enable it for individual databases.

Does data compression use the same algorithm as backup compression?

No. Backup compression uses a proprietary block-based compression algorithm that is part of Windows. Data compression uses up to 3 algorithms, depending on the level of compression configured. Row compression just makes all non-LOB columns into variable-length columns. Page compression does row compression, then common-prefix compression for each table columns, then common-value dictionary compression for each page. Details can be found at the following BOL sections: Row Compression Implementation  and Page Compression Implementation.

Hope this helps!

13 thoughts on “Conference Questions Pot-Pourri #9: Q&A around compression features

  1. Thanks Paul for the good stuff. I was discussing with my friend that I never come across person like Paul who so desperate to blog and get the information out without any material interst.
    Thanks again for all the great info blogs.
    Please request Kimberly to blog more often :)

  2. While developing applications for data reading in SQL Server backup files, I have encountered compressed backups appeared in SQL Server 2008. Where can I get the description of algorithm used to decompress backups?

  3. Hi Paul,

    If any Transaction running before Backup and continuing after backup completion , will those log records will also be compressed in Backup (backup compression feature i am using)?

    Regards,

  4. Hello Paul,

    Here
    “No. The transaction log is not compressed in any way under any circumstances. Rows from tables and indexes that have compression enabled will be logged in their row compressed forms, even if page compression is enabled.”

    above you mentioned row compressed will be the rows regardless page. In this case, log records will be magnitude larger than data records?

  5. Hi Paul,

    A small query…

    What will be impact on execution plan or query cost(execution plan-estimated & actual) if data is page compressed /Page compressed as less IO and more CPU Cost involved here?

    Regards,

  6. I have read that the variable max fields are compressed if in-row and page compression is used, but not for row compression. Is this the case?

    1. Kind of correct. If it’s a varchar(max) field, then nothing is done for row compression (as that just strips out empty space, and varchar(max) is already stored without extra space). If it’s nvarchar(max) and the character set in use doesn’t use the extra byte of each two-byte character, then those extra bytes are stripped out for row compression for 2008 R2 and onward.

  7. Great Paul,

    Log block size is max 60 KB so we can assume max log record size is also 60 KB while for data page it is 8 KB. What would happen if any transaction generates log record larger than this size

    1. Log records have to fit within a log block, but multiple log records can be used to log an operation (e.g. insert of a 100KB LOB value would need two log records to hold the value).

  8. Thank you for sharing your gift of teaching with us!

    I am completely speechless to express my gratitude to you.

    BIG Thank you Sir !!

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.