There’s still a widely held misconception that when properly in the FULL or BULK_LOGGED recovery models that full or differential backups can truncate the log. No. It *NEVER* happens. This is one of the reasons why I’m doing a whole spotlight session on this at PASS this year – the transaction log and its behavior is IMHO one of the most misunderstood parts of SQL Server.
Notice that I said ‘when properly in the FULL or BULK_LOGGED recovery models’. If you switch recovery models to FULL or BULK_LOGGED, until you take the first full backup, you are still essentially in the SIMPLE recovery model, and so the log will truncate on checkpoint. Once you take that first full backup, you are then in I-will-manage-the-size-of-the-log-through-log-backups mode. After that, the ONLY thing that will allow the log to clear/truncate is a log backup, as long as nothing else requires those transaction log records.
If you’re not familiar with the term ‘log clearing’ or ‘log truncating’, they mean exactly the same thing – part of the transaction log is marked as no longer needed and can be overwritten. Nothing is zeroed out, the log file size is not altered. Some background reading on this:
- My Understanding Logging and Recovery in SQL Server article in TechNet Magazine from February 2009. This also has a screencast demo that shows the log continues to truncate-on-checkpoint after switching to FULL and before a full backup has been taken.
- An editorial I wrote in April about Importance of proper transaction log size management.
- A pretty deep blog post I wrote about analyzing VLF sequences in the log: Inside the Storage Engine: More on the circular nature of the log
Earlier this week I was involved in a discussion about log backup size management and how to prevent a log backup following a maintenance operation to not contain details of the maintenance operation.
There’s a very simple answer: you can’t.
If you do an operation in the FULL or BULK_LOGGED recovery models, the next log backup will contain all information required to replay that operation. In the FULL recovery model, everything is fully logged, so the log backup will contain all the log records generated by the operation. In the BULK_LOGGED recovery model, you may perform a minimally-logged operation, which generates hardly any transaction log, but the next log *backup* will be about the same size as if the operation was fully logged – because the log backup will pick up all the data extents modified by the minimally-logged operation.
One point in the discussion was that if you’re running in the FULL or BULK_LOGGED recovery models, and you do a full backup after the maintenance operation, and before the log backup, the full backup will contain all the changes made by the maintenance operation, yes, and will clear the log.
A log backup is *ALL* the log generated since the last log backup. If this were not the case, how would log shipping work? You could take a full backup on the log shipping primary and suddenly you’ve broken the log backup chain and log shipping breaks. No, this is not how things work. A full backup contains only enough transaction log necessary to be able to restore that database to a transactionally consistent time – the time at which the data reading portion of the full backup completed. I blogged about this extensively previously:
- Debunking a couple of myths around full database backups (which also explains why the WITH STOPAT syntax exists for full and differential backups, but has no effect and is only there to allow you to put WITH STOPAT on all backups in a point-in-time restore sequence.
- More on how much transaction log a full backup includes
But you don’t have to believe me – it’s very simple to convince yourself. The following script will show you that a full backup has no effect on the transaction log. It does the following:
- Create a database and put it into the FULL recovery model, with a full backup.
- Create and populate and index.
- Take log backup 1 (just to clear things out)
- Rebuild the index.
- Take log backup 2.
- Rebuild the index.
- Take a full backup.
- Take log backup 3.
And we will see that log backup #3 is the same size as log backup #2. The full backup will make no difference whatsoever.
Here’s the script:
USE master; GO DROP DATABASE LogBackupTest; GO CREATE DATABASE LogBackupTest; GO USE LogBackupTest; GO ALTER DATABASE LogBackupTest SET RECOVERY FULL; GO BACKUP DATABASE LogBackupTest TO DISK = 'C:SQLskillsLogBackupTest_Full1.bak' WITH INIT; GO CREATE TABLE t1 (c1 INT IDENTITY, c2 CHAR (8000) DEFAULT 'a'); GO CREATE CLUSTERED INDEX t1c1 ON t1 (c1); GO SET NOCOUNT ON; GO INSERT INTO t1 DEFAULT VALUES; GO 1000 BACKUP LOG LogBackupTest TO DISK = 'C:SQLskillsLogBackupTest_Log1.bak' WITH INIT; GO -- Rebuild the index to generate some log and get a baseline ALTER INDEX t1c1 ON t1 REBUILD; GO BACKUP LOG LogBackupTest TO DISK = 'C:SQLskillsLogBackupTest_Log2.bak' WITH INIT; GO -- Now do it again, but take a full backup before the log backup ALTER INDEX t1c1 ON t1 REBUILD; GO BACKUP DATABASE LogBackupTest TO DISK = 'C:SQLskillsLogBackupTest_Full2.bak' WITH INIT; GO BACKUP LOG LogBackupTest TO DISK = 'C:SQLskillsLogBackupTest_Log3.bak' WITH INIT; GO
And here’s the result:
Log backup #3 is the same size as log backup #2. It contains all the log generated since log backup #2 was taken. The full backup had no affect whatsoever, because that would break the log backup chain.
If you don’t believe me, run the script yourself and you’ll see. A full backup does not and cannot affect the transaction log.
28 thoughts on “Misconceptions around the log and log backups: how to convince yourself”
Thanks for pointing this out. Another way to look at this… I learned this lesson many (many) years ago. We had a customer who did nightly FULL backups and Txlog backups every 30 minutes. They called one day to say that the server and disks had failed and needed help restoring from backup. The normal plan – restore from the last Full backup and then apply TXlogs until done.
Well it was soon discovered that several FULL backups were corrupt (SQL 2000 days when you could backup a corrupt db— but not restore it). Anyhow – after a bit of panic our DBA pointed out that the Full backup does not break the chain… we were able to come up with a good full backup over a week old and still had all txlog backups since. Previously I thought that the previous days files were "obsolete" because there was a Full backup in between – not so (I was asleep during that training topic).
It was a long set of txlogs to restore – but it worked (the log backups weren’t corrupt – only the FULL).
Several customers run out of space on the backup device after a large maint night. Many ask "how come the tx backup is larger than the full backup?" Well now you know! Of course they always want to know if there is a way to predict this before it happens (which I think your previous blog posts help answer).
"Knowledge is SQL power!"
That certainly makes it very clear. Also the comment above about "…the Full backup does not break the chain…" – I never realised that you could therefore then use a *any* full backup and apply logs, even if those logs occurred after another subsequent full backup(s).
(ps. why isn’t you explaination above part of SQL Books online?)
Books Online does kind of say all that but in a roundabout way unfortunately.
Hi. I’m relatively new to SQL Server, so I’m still working on best practices. If the transaction log is not dependent on the full backup (or vice versa), then how many days of transaction logs should I keep? (I was only keeping a few days worth, thinking I didn’t need them anymore after the full backup ran.)
You need to keep anough around that you can recover up to the current point even if the most recent full backup is corrupt. So, for instance if you have a weekly full backup and log backups every 4 hours (YMMV), you should keep at least 2 full backups and all log backups since the earliest full backup you have, IMHO.
Thanks this was very helpful.
For those interested in more information about backups after minimally-logged operations under BULK_LOGGED recovery model take a look at: http://msdn.microsoft.com/en-us/library/ms190692(v=sql.105).aspx “How the Bulk-Logged Recovery Model Works”
“A log backup is *ALL* the log generated since the last log backup.”
Does it also include the log records in the log buffer?
if that’s the case , does it need a checkpoint when doing the log backup to flush the log records in the log buffer to disk?
thank you so much.
No – it’s to the last flushed portion when the backup starts. Otherwise the log backup would be never-ending if log records could be generated as fast as they can be read into the backup.
Well said! I just implemented my first SQL maintenance as part-time DBA and I thought that the chain was broken by full backups.
Thank you so much!
For years, I too have used log chains with full backups in the middle, and I have never seen an issue. But why in the world does MS have this feature and make these comments?
“A copy-only backup is a SQL Server backup that is independent of the sequence of conventional SQL Server backups. Usually, taking a backup changes the database and affects how later backups are restored. However, occasionally, it is useful to take a backup for a special purpose without affecting the overall backup and restore procedures for the database. Copy-only backups serve this purpose. ”
Maybe its just how it affects MSDB?
A copy-only full backup doesn’t reset the differential bitmaps, so doesn’t ‘break’ the differential backups. A copy-only log backup isn’t part of any log backup chain. These are useful if you want to do a right-up-to-now backup for a dev restore, for instance, without messing up your log backup chain. Copy only is very useful, and necessary.
…And maybe their comments mostly apply to doing a copy only LOG backup.
At a lot of smaller companies I know of, the database and log files are on the local sql server machine itself, with both being backed up nightly. In the case of a total disk crash on that server you’re really no better off than if you had been in Simple mode at that point correct? I suppose you’d only be protected from some sort of database corruption that you could blow out and restore over with your last full and the log files on disk … am I correct in this? Trying to make sure I understand the paradigm.
If the server is destroyed, you’re only as good as your latest off-server backup.
Very nice Paul, you are best of the best.
Hi, in your article you state that a full backup needs to be made before a newly created database is properly in fully logged mode. I’ve discovered that a copy-only full backup isn’t good enough to put a newly created database into proper fully logged mode. Can you explain why that is? I thought copy-only full backups only ignored the differential chain?
It’s because any copy-only backup does not affect *anything* to do with the database, so the copy-only full backup doesn’t set the LSN field in the database metadata that is the pre-requisite for moving from pseudo-simple recovery model to full recovery model.
Do I need to keep the log files for database that are using the simple recovery mode?
I had an error on a recovery because we encrypt our database and when you remove the encryption and backup a simple recovery mode database it also backs up the log file but if you rebuild your server and try to restore it the restore fails because it tries to restore the log file that is still encrypted.
Yes, the log file is always required. Simple mode doesn’t mean no logging, just that some operations are logged more efficiently (e.g. index rebuilds, bulk loads).
I manage a 5-node SQL Server 2012 Availability Group. We don’t use differential backups (just tlog backups), and we take full nightly COPY_ONLY backups off of a secondary synchronous replica. Should a full backup (non-COPY_ONLY) be taken on a regular basis just to update the LSN’s? We do restores all the time (for verification and research reasons, or to update our staging environment) without any problems.
COPY_ONLY full backups only affect subsequent differential backups (they can’t be used as the start of a restore sequence involving differential backups). The COPY_ONLY option on a full backup has no effect whatsoever on log backups.
If we take trn log backup it truncates logs, now I am taking diff backup, how it has all the log records if the logs are truncated/cleared?
It doesn’t – it only has log records necessary to recover the database to point of diff backup. If there’s an active transaction, log backup can’t truncate those log records.
I have a transaction log file which is full . Can I take backup of that log file . Is there any other way to create space
Wow, was really useful. Thanks a lot.
I am a sysadmin of database, but i have totally no idea about technical aspect of keeping database clean (i create database, procedures, schemas, accesses, roles, tables, etc, but i do not touch technical aspects – server is created by 3rd party).
My database is on FULL backup mode – it was tiem when i was really scared, when my Log was rapidly increasing in size (when it was 32 mb, and then 64, … 1 Gb in just days) – now i have 22 GB, growing spd in MB is a bit decreased (stable, log was growing much faster when i was creating all schemas, procedures etc comparing a lot of data, drop/truncate tables, etc) and still increasing spd in GB is very slow.
1. How to get a point, when log should be cleared out? 22 GB seems not be really big based on current times, anyway: last backup was done on May. Should i even focus on log size or last backup time is more important?
2. I am using SQL via SSMS w/o access to local machine – Am I able to make a database& log backup or it should be made only by users with access to local machine (i dont have access to any other server, i don’t want to make backup from server to my local computer)?
3. If i will decie to make a ticket for this case, what i need to request for? ‘Database & log backup’ will be enough? Or i need also add ‘Database & log backup, log truncate & shrink’ ? I need to be fully clear in tickets
4. How backup and truncate/shrink works for database? I am not able to find a clear answer. Server can be online for users who work on database? There is any recomendation to make it outside of hours? I dont have access/possiblity to run/restart server, it can be done by 3rd party, so they need also to restart database if needed.
Best regards :)
Way too much to cover in a comment here. I highly recommend you get some help from a DBA. In the meantime, have a read through the backup sections of our accidental DBA blog series – see https://www.sqlskills.com/help/accidental-dba/