
A small-scale test measuring the impact of data compression on the transaction log

I received a question today about the impact of data compression on the transaction log. While we usually talk about data compression from a data page and memory utilization perspective, I hadn't directly tested its impact on transaction logging, and I wanted to see it for myself.

Here is a very small-scale test that inserts data from AdventureWorks.HumanResources.Employee into two tables – one with a clustered index that does not use compression, and one that uses page compression (I used a separate database just to keep things clean):

USE [CompressionTest];
GO

CREATE TABLE [dbo].[Employee](
      [EmployeeID] [int] IDENTITY(1,1) NOT NULL,
      [NationalIDNumber] [nvarchar](15) NOT NULL,
      [ContactID] [int] NOT NULL,
      [LoginID] [nvarchar](256) NOT NULL,
      [ManagerID] [int] NULL,
      [Title] [nvarchar](50) NOT NULL,
      [BirthDate] [datetime] NOT NULL,
      [MaritalStatus] [nchar](1) NOT NULL,
      [Gender] [nchar](1) NOT NULL,
      [HireDate] [datetime] NOT NULL,
      [SalariedFlag] [bit] NOT NULL,
      [VacationHours] [smallint] NOT NULL,
      [SickLeaveHours] [smallint] NOT NULL,
      [CurrentFlag] [bit] NOT NULL,
      [rowguid] [uniqueidentifier] NOT NULL,
      [ModifiedDate] [datetime] NOT NULL,
      [EncryptedNationalIDNumber2] [varbinary](128) NULL
) ON [PRIMARY];
GO

CREATE CLUSTERED INDEX [IX_Employee]
ON [dbo].[Employee]
(
      [EmployeeID] ASC
)
ON [PRIMARY];
GO

CREATE TABLE [dbo].[Employee_Compressed](
      [EmployeeID] [int] IDENTITY(1,1) NOT NULL,
      [NationalIDNumber] [nvarchar](15) NOT NULL,
      [ContactID] [int] NOT NULL,
      [LoginID] [nvarchar](256) NOT NULL,
      [ManagerID] [int] NULL,
      [Title] [nvarchar](50) NOT NULL,
      [BirthDate] [datetime] NOT NULL,
      [MaritalStatus] [nchar](1) NOT NULL,
      [Gender] [nchar](1) NOT NULL,
      [HireDate] [datetime] NOT NULL,
      [SalariedFlag] [bit] NOT NULL,
      [VacationHours] [smallint] NOT NULL,
      [SickLeaveHours] [smallint] NOT NULL,
      [CurrentFlag] [bit] NOT NULL,
      [rowguid] [uniqueidentifier] NOT NULL,
      [ModifiedDate] [datetime] NOT NULL,
      [EncryptedNationalIDNumber2] [varbinary](128) NULL
) ON [PRIMARY];
GO

CREATE CLUSTERED INDEX [IX_Employee_Compressed]
ON [dbo].[Employee_Compressed]
(
      [EmployeeID] ASC
)
WITH (DATA_COMPRESSION = PAGE) ON [PRIMARY];
GO
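As an aside, compression doesn't have to be decided at create time – an existing index can be rebuilt with compression enabled after the fact. A sketch against the uncompressed table above:

ALTER INDEX [IX_Employee] ON [dbo].[Employee]
REBUILD WITH (DATA_COMPRESSION = PAGE);
GO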

The test itself is very simple. First, I perform a full backup and an initial transaction log backup:

BACKUP DATABASE CompressionTest TO DISK = 'c:\temp\CompressionTest.bak';

BACKUP LOG CompressionTest TO DISK = 'c:\temp\throwaway.trn';
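(If you're recreating this test, note that log backups require the database to be in the FULL or BULK_LOGGED recovery model and to have a full backup taken first – so something like the following before the steps above, using this example's database name:)

ALTER DATABASE CompressionTest SET RECOVERY FULL;
GO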

Next, I insert 290,000 rows into the uncompressed table (the 290-row source table, inserted 1,000 times via the GO 1000 batch repeat):

SET NOCOUNT ON;

INSERT dbo.Employee
(NationalIDNumber, ContactID, LoginID, ManagerID, Title, BirthDate,
MaritalStatus, Gender, HireDate, SalariedFlag, VacationHours,
SickLeaveHours, CurrentFlag, rowguid, ModifiedDate, EncryptedNationalIDNumber2)
SELECT NationalIDNumber, ContactID, LoginID, ManagerID, Title, BirthDate,
MaritalStatus, Gender, HireDate, SalariedFlag, VacationHours,
SickLeaveHours, CurrentFlag, rowguid, ModifiedDate, EncryptedNationalIDNumber2
FROM AdventureWorks.HumanResources.Employee;
GO 1000

Data used is 74,288 KB:

EXEC sp_spaceused 'dbo.Employee';

Querying fn_dblog shows a total log record length of 120,503,216 bytes:

SELECT SUM([log record length])
FROM fn_dblog (NULL, NULL);
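Note that fn_dblog sums everything in the active portion of the log, including system activity, so the total slightly overstates what the inserts alone generated. One way to narrow the sum to just this table's log records is to filter on fn_dblog's AllocUnitName column (fn_dblog is undocumented, so treat this as a sketch rather than a guaranteed interface):

SELECT SUM([log record length])
FROM fn_dblog (NULL, NULL)
WHERE [AllocUnitName] LIKE 'dbo.Employee.%';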

Backing up the transaction log produces a file that is 121,964 KB in size:

BACKUP LOG CompressionTest TO DISK = 'c:\temp\CompressionTest_log1.trn';

Now to insert rows into the table with data compression:

INSERT dbo.Employee_Compressed
(NationalIDNumber, ContactID, LoginID, ManagerID, Title, BirthDate,
MaritalStatus, Gender, HireDate, SalariedFlag, VacationHours,
SickLeaveHours, CurrentFlag, rowguid, ModifiedDate, EncryptedNationalIDNumber2)
SELECT NationalIDNumber, ContactID, LoginID, ManagerID, Title, BirthDate,
MaritalStatus, Gender, HireDate, SalariedFlag, VacationHours,
SickLeaveHours, CurrentFlag, rowguid, ModifiedDate, EncryptedNationalIDNumber2
FROM AdventureWorks.HumanResources.Employee;
GO 1000

Data used is 36,800 KB (versus 74,288 KB):

EXEC sp_spaceused 'dbo.Employee_Compressed';

Querying fn_dblog shows a total log record length of 100,677,612 bytes, versus the uncompressed 120,503,216 bytes:

SELECT SUM([log record length])
FROM fn_dblog (NULL, NULL);

Backing up the transaction log produces a file that is 102,158 KB in size (versus 121,964 KB):

BACKUP LOG CompressionTest TO DISK = 'c:\temp\CompressionTest_log2.trn';

Scale this up, and you can imagine that the transaction log size reduction could be non-trivial. Mileage will vary based on the compression you can get from specific tables and data – but this is another potential benefit of data compression to be aware of. And as an aside, compressed rows are written to the transaction log in the ROW compression format, not the PAGE format.
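And if you want to preview what compression would buy you for a given table before enabling it, sp_estimate_data_compression_savings can estimate the savings (shown here against the uncompressed table from this test):

EXEC sp_estimate_data_compression_savings
      @schema_name = 'dbo',
      @object_name = 'Employee',
      @index_id = NULL,
      @partition_number = NULL,
      @data_compression = 'PAGE';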

