The Curious Case of… data compression caching

(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)

In the previous Curious Case post, I explained how to monitor how well page compression is working. That prompted a reader to ask me how to monitor the hit rate of lookups in the cache of decompressed column values. She was very surprised by my answer…

There is no cache of decompressed column values, but it’s a common misconception that there is.

There’s another common misconception that compressed data is uncompressed as it’s read into the buffer pool from disk (in the same way that encrypted pages are decrypted during reads when TDE is enabled) – they’re not.

Compressed columns are only decompressed when needed, and they’re decompressed *every* time they’re needed – there isn’t even a short-term cache of decompressed values. This means that for some workloads there might be a significant climb in CPU usage and degradation of workload throughput, especially for high-volume OLTP workloads with page compression enabled.

Make sure you do real-world load testing before enabling row or page compression, as you may find there’s unacceptable performance degradation.

4 thoughts on “The Curious Case of… data compression caching”

Vesa Juvonen says:

February 8, 2022 at 2:32 am

Hi
Regarding this:
“There’s another common misconception that compressed data is uncompressed as it’s read into the buffer pool from disk (in the same way that encrypted pages are decrypted during reads when TDE is enabled) – they’re not.”

I have read that data pages are even compressed in processor L3 and L2 cache, is this true?
That might also explain why compression makes certain queries run significantly faster as a lot more data pages can reside in those ultra fast caches near CPU core.

Reply
1. Paul Randal says:
  
  February 8, 2022 at 12:27 pm
  
  Correct – column values are only decompressed when needed.
  
  Reply
Pingback: Dew Drop – February 8, 2022 (#3616) – Morning Dew by Alvin Ashcraft
Pingback: Data Compression and Caching – Curated SQL

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

The Curious Case of… data compression caching

4 thoughts on “The Curious Case of… data compression caching”

Leave a Reply Cancel reply

Other articles

2024: the year in books

2023 (not a typo!): the year in books

Black Friday Super Sale!

Summer School Savings SALE is live!

The Curious Case of… missing log information

The Curious Case of… missing 832 errors during memory corruption

Imagine feeling confident enough to handle whatever your database throws at you.