The SQLCAT team have published another excellent whitepaper – this time the long-awaited one on the SQL Server 2008 data compression feature. Thirteen people inside and outside Microsoft (including me) provided technical reviews and the authors (Sanjay Mishra along with Marcel van der Holst, Peter Carlin, and Sunil Agarwal) did a great job. I remember leading an effort back in 2005 to see if we (the SQL team) could get some form of data compression into SQL Server 2005 RTM (no, obviously) so it's great to see data compression out there and now with top-class proscriptive guidance on when and how to use it.
Bottom line: don't just go an turn it on without analyzing whether you'll get a decent compression ratio and your workload is suited to data compression.
You can get to the whitepaper at: Data Compression: Strategy, Capacity Planning and Best Practices and I'll add it to our Whitepaper Links page.
Enjoy!
5 thoughts on “SQL Server 2008 Data Compression whitepaper published”
Hi Paul!
Why compressed_page_count is zero?
Drop Table If Exists RowOverFlow
Create Table RowOverFlow
(
ID Int Not Null Primary Key Identity,
Cl1 Nvarchar(3000) Not Null Default Replicate(‘a’, 3000),
Cl2 Nvarchar(3000) Not Null Default Replicate(‘a’, 3000),
Cl3 Nvarchar(3000) Not Null Default Replicate(‘a’, 3000)
)
GO
Set Nocount On
GO
Insert Into RowOverFlow Default Values
GO 10000
Exec sys.sp_spaceused N’RowOverFlow’ — 240400 KB
Alter Table RowOverFlow Rebuild With (Data_Compression = Page)
Exec sys.sp_spaceused N’RowOverFlow’ — 160880 KB
Select index_id, index_level, alloc_unit_type_desc, page_count, compressed_page_count from
sys.dm_db_index_physical_stats (DB_ID(), Object_ID(N’RowOverFlow’), Null, Null, N’Detailed’)
Because your data isn’t compressible. Only one row per page is possible with your defaults, compressed or not.
Yes. that is row overflow.
Why the volume gets low?
Exec sys.sp_spaceused N’RowOverFlow’ — 240400 KB
Alter Table RowOverFlow Rebuild With (Data_Compression = Page)
Exec sys.sp_spaceused N’RowOverFlow’ — 160880 KB
No, nothing to do with the 3000 bytes of row overflow per row. Your in-row data length is 6000 bytes per row, so it could be doing row compression. Without spending time on it, I can’t say, and I’m not spending time on it for contrived data like this.
You are right.
Thanks!