How can a log backup fail but a full backup succeed?

This is an interesting case that cropped up yesterday – the transaction log is damaged so a log backup doesn’t work (with the error below):

Backup detected log corruption in database FakeDBName. Context is FirstSector. LogFile: 2 ‘F:\SQLLOGS\XYZ\FakeDBName_Log.ldf’ VLF SeqNo: x502e VLFBase: x2ce40000 LogBlockOffset: x2d0a9000 SectorStatus: 2 LogBlock.StartLsn.SeqNo: x4ee3 LogBlock.StartLsn.Blk: x1348 Size: x200 PrevSize: x400
2009-03-06 10:00:02.61 Backup      Error: 3041, Severity: 16, State: 1.
2009-03-06 10:00:02.61 Backup      BACKUP failed to complete the command BACKUP LOG FakeDBName Check the backup application log for detailed messages.
2009-03-06 10:00:03.61 Backup      Error: 3041, Severity: 16, State: 1.

However a full backup succeeds, as does a DBCC CHECKDB. What’s going on?

The answer comes with understanding what portions of the transaction log are required for these operations. (For a good understanding of the transaction log itself, along with logging and recovery, see my article in the February TechNet Magazine.)

A transaction log backup, by its very nature, has to backup *all* transaction log generated since the last log backup – so it will try to backup the corrupt portion and fail.

A full database backup only has to backup enough transaction log to allow the database to be restored and recovered to a transactionally consistent point. In other words, it only requires the transaction log back to the beginning of the oldest active transaction at the point that the data-reading section of the full backup completes. This is a source of immense confusion – many people don’t believe that a full (or differential) backup needs to also backup some transaction log. For a more in-depth study of this, see my previous blog posts Debunking a couple of myths around full database backups and More on how much transaction log a full backup includes.

A DBCC CHECKDB operation uses a database snapshot to get a transactionally consistent view of the database on which to run consistency checks. When the database snapshot is created, crash recovery is run on it to make it transactionally consistent. That requires the same amount of log as if a full backup was taken – back to the beginning of the oldest active transaction at the time the database snapshot is created. See CHECKDB From Every Angle: Complete description of all CHECKDB stages for more info.

So – it’s entirely possible for the situation reported above to exist. The question then becomes, how to recover from it?

Assuming that the database files are intact, there is a simple solution. This solution will break the log backup chain, but given that the log is corrupt so a log backup cannot be taken, the log backup chain is *already* broken. Here’s what to do:

Stop all user activity in the database
Switch to the SIMPLE recovery model (breaking the log backup chain and removing the requirement that the damaged portion of log must be backed up)
Switch to the FULL recovery model
Take a full database backup (thus starting a new log backup chain)
Start taking log backups

You might want to manually shrink and grow the log file in between steps 2 and 3 too – in case the log file is on a damaged portion of disk – or maybe even shrink it right down and add another log file on an undamaged disk. You also will need to do some root-cause analysis to determine why the corruption occured in the first place, and to take preventative measures to stop it happening again.

Hope this helps

PS In my previous post, Testing a new survey method: backup validation, the answer with the largest number of responses so far is that people never verify their backups – very disturbing!

PPS If a subsequent log backup actually succeeds, you’ve likely got some kind of transient I/O subsystem problem. See here for more details.

Posted Under:

Backup/Restore, Corruption, Database Maintenance, Involuntary DBA, Transaction Log

19 thoughts on “How can a log backup fail but a full backup succeed?”

Kothan says:

April 16, 2009 at 8:13 am

Paul:

Understanding Logging and Recovery in SQL Server from http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx

Piece of the information from site is pasted below for your convenience…

As an example, consider what happens when a single table row is updated in an implicit transaction. Imagine a simple heap table with an integer column c1 and a char column c2. The table has 10,000 rows, and a user submits an update query as follows:
Copy Code
UPDATE SimpleTable SET c1 = 10 WHERE c2 LIKE ‘%Paul%’;

The following operations take place:
The data pages from SimpleTable are read from disk into memory (the buffer pool) so they can be searched for matching rows. It turns out that three data pages hold five rows that match the WHERE clause predicate.
1. The Storage Engine automatically starts an implicit transaction.
2. The three data pages and five data rows are locked to allow the updates to occur.
3. The changes are made to the five data records on the three data pages in memory.
4. The changes are also recorded in log records in the transaction log on disk.
5. The Storage Engine automatically commits the implicit transaction.

My Question:

Does the step 4 happens before step 5. Per BOL, WAL, writes Log records to disk when the transactions are committed.

Note: I did a small test, Begin Tran Insert a record in a table, did not commit and did not exec checkpoint, I found that open tran recorded in tlog.

As always I enjoy reading your blog :)

Reply
paul says:

April 16, 2009 at 9:23 pm

Hi Kothan,

Step 4 always has to happen before the transaction commits back to the user. Log records are constantly being flushed out to disk. When you query the log using DBCC LOG or fn_dblog, the log records are, of course, in the log, but you can’t tell whether the log reader got them from disk or from the in-memory log buffer as they haven’t yet been flushed out to disk. What is guaranteed is that before a transaction commit notification is returned to the user, all log records for the transaction are on disk, and before a data page is written from the buffer pool to disk, all log records up-to-and-including the one(s) affecting that page must have been flushed to disk too.

Hope this makes sense – follow-up if it doesn’t.

Cheers

Reply
Amos Livingston says:

May 9, 2013 at 12:55 pm

Is it ok to do a full backup once a week and transactional log backup every hour.

Reply
1. Paul Randal says:
  
  May 10, 2013 at 5:26 am
  
  I can’t give you a yes/no answer as it depends on a bunch of things. Most importantly, how much data can your business afford to lose? With log backups every hour, you’re looking at a maximum of 1 hour of data/work being lost. Figure out what restores you need to be able to do to meet your downtime and data loss SLAs, then plan your backup strategy from that.
  
  Reply
bala says:

May 16, 2013 at 3:02 am

Iam seeing a reverse issue this is a mirrored database and mirroring is currently suspended, checkdb doesn’t report errors

BACKUP DATABASE SharePoint_Config TO disk = ‘G:\Default\test.bak’ WITH COPY_ONLY–not to break the logchain hence used the copy only option, and strangely log backup works

Processed 18536 pages for database ‘SharePoint_Config’, file ‘SharePoint_Config’ on file 1.
Msg 3049, Level 16, State 1, Line 1
BACKUP detected corruption in the database log. Check the errorlog for more information.
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

BACKUP LOG SharePoint_Config TO disk = ‘G:\Default\test.bak’ WITH COPY_ONLY

Processed 52 pages for database ‘SharePoint_Config’, file ‘SharePoint_Config_log’ on file 1.
BACKUP LOG successfully processed 52 pages in 0.051 seconds (7.908 MB/sec).

Reply
Jinal Contractor says:

August 17, 2015 at 11:09 am

I’ve same issue going with me from last 2 weeks. I had an log file corruption this morning and this is happened 4th times in 2 weeks.
Is there anyway I can get to know. How to fix this issue? I’m really helpless to fix this issue at my work.
we are using following edition,.

We are using MICROSOFT WINDOWS Server 2003r2 – Standard X64 Edition service pack 2

We are using SSMS as following version

MICROSOFTSQL Server 2005 – 9.00.5324.00 (X64)
Aug 24 2012 18:28:47
Copyright (c) 1988-2005 MICROSOFT Corporation
Standard Edition (64-bit) on WINDOWS NT 5.2 (Build 3790: Service Pack 2)

Reply
1. Paul Randal says:
  
  August 17, 2015 at 1:43 pm
  
  If it keeps happening, it’s either bad memory chips or something wrong with your I/O subsystem. Get deep diagnostics run on both.
  
  Reply
Jinal Contractor says:

September 14, 2015 at 11:51 am

Hello Paul,
Thanks for your response.
We are using network backup by veeam network backup software and corruption happened right after network backup.
Do you have any idea to what to look in this case?

Reply
1. Paul Randal says:
  
  September 16, 2015 at 2:04 am
  
  The I/O subsystem.
  
  Reply
2. Preston says:
  
  April 4, 2019 at 7:52 am
  
  We were getting the same error as above, about 12 times over the course of a year.
  When the error always occurred, it happened during the 11:00 pm transaction log backup.
  We did some checking and found out that a veeam database backup was happening at 11:00 pm.
  
  4 months ago we shifted the timing of the transaction log backups so they didn’t happen at 11:00 pm, and the corruption has not happened since.
  
  Here are the times/dates for the previous log corruptions:
  Friday 11:00 pm 1/11/2019
  Monday 11:00 pm 1/7/2019
  Saturday 11:00 pm 12/19/2018
  Friday 11:00 pm 12/14/2018
  Friday 11:00 pm 10/19/2018
  Friday 11:00 pm 9/7/2018
  
  Reply
  1. Paul Randal says:
    
    April 17, 2019 at 12:58 pm
    
    Yup – likely because Veeam is pausing I/O and preventing the log backup from working correctly.
    
    Reply
Chris Page says:

February 24, 2016 at 1:00 am

Hi Paul,
Your article has proven really useful to me as I’ve been experiencing SAN issues recently resulting in this exact sort of corruption, however, one of the databases on an AlwaysOn availability group has experienced the same issue, it’s already on the secondary server so am wondering, if I go ahead and fix in the normal way am I going to break the availability group, so perhaps I need to drop the database out and recreate at the secondary – any pointers appreciated before I try to fix it! Thanks

Chris

Reply
1. Paul Randal says:
  
  March 10, 2016 at 1:56 pm
  
  Yes – you’ll need to break the AG to do that as you can’t alter the secondary replica in any way.
  
  Reply
Ishtiaq Ahmed says:

May 26, 2016 at 12:32 pm

Thank You Paul you saved my day. I am Thankful for your great support for community.

Ishtiaq

Reply
Alish says:

January 3, 2017 at 8:13 am

Hi paul,

i have similar problem in log shipping . and my question is does LSN chain breaks up if there is some DBCC shrink database running for particular db which is changing status to simple and again changing back to full. is that the root cause to break my log shipping.

And adding , after it was shrunk changing status back and forth , it did not backup any Trn backup, is it Becauseof changing status?

Reply
1. Paul Randal says:
  
  January 3, 2017 at 2:54 pm
  
  It’s because you switched to simple and back – that breaks the LSN chain. It has nothing to do with shrinking the database.
  
  Reply
Rizaan says:

March 9, 2020 at 4:34 am

Hi Paul

i have set a specific database from full recovery to simple to shrink the logs. i then set the database back to full recovery and since then my transactional log backups are not writing for that specific database, how do i go about resolving this ?

Reply
1. Paul Randal says:
  
  March 10, 2020 at 1:08 pm
  
  Please don’t do that – you can shrink in full recovery mode. You’ve broken your log backup chain. You need to perform another full backup before you can take log backups again.
  
  Reply
prasanna says:

March 18, 2020 at 2:28 am

i got below error
Backup detected log corruption in database . Context is FirstSector.

i followed below steps

Stop all user activity in the database
Switch to the SIMPLE recovery model (breaking the log backup chain and removing the requirement that the damaged portion of log must be backed up)
Switch to the FULL recovery model
Take a full database backup (thus starting a new log backup chain)
Start taking log backups

it’s working fine…Thank You

Reply

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

How can a log backup fail but a full backup succeed?

Posted Under:

19 thoughts on “How can a log backup fail but a full backup succeed?”

Leave a Reply Cancel reply

Other articles

SQL101: Application data consistency checking

SQL101: Phishing attacks

SQL101: Top Ten SQL Server Performance Tuning Best Practices

The Curious Case of… no buffer pool memory and no OS memory available

The SQL Server Transaction Log, Part 3: The Circular Nature of the Log

SQL101: Indexing Strategies for SQL Server Performance

Imagine feeling confident enough to handle whatever your database throws at you.

How can a log backup fail but a full backup succeed?

Check It Out!

SQLskills Complete Short Course Bundle

Posted Under:

19 thoughts on “How can a log backup fail but a full backup succeed?”

Leave a Reply Cancel reply

Other articles

SQL101: Application data consistency checking

SQL101: Phishing attacks

SQL101: Top Ten SQL Server Performance Tuning Best Practices

The Curious Case of… no buffer pool memory and no OS memory available

The SQL Server Transaction Log, Part 3: The Circular Nature of the Log

SQL101: Indexing Strategies for SQL Server Performance

Imagine feeling confident enough to handle whatever your database throws at you.