Paul and I started discussing a comment that came up regarding the many issues surrounding logging & recovery. It's one of our favorite topics and in fact was the title to an article that Paul recently wrote for TechNet here: http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx. And, as a sidenote, depending on how much you already know about the transaction log – you might want to review that article first!

The comment that came up was related to a common misunderstanding on what is and what is not required to make a backup transactionally consistent when restored. And, in my opinion, some of the confusion as to whether or not log backups are "required" is because many changes have occurred release to release. Also, a lot of us say "log backups are required for better recovery" and while restoring log backups is what allows features like up-to-the-minute recovery and point-in-time recovery, not all strategies or recovery procedures actual require additional and/or separate log backups (some backups actually backup part of the log during their backup – and this is actually something that has changed release to release). And so, this is the reason for this post, I want to try and clear up a few of the many misconceptions about what happens with regard to the log during backup and restore. What's really interesting is that some of the best features (seemingly minor) have been around exactly this – the behavior of the transaction log during other backups and the requirement during restore. So, I thought I'd give a play by play from 2000 to 2005/2008 to discuss the differences and what's changed and why those changes were significant. The biggest changes were between 2000 and 2005.

First and foremost, the log portion of a database is required to make that database transactionally consistent. The transaction log is the key to SQL Server's durability (data integrity even after power loss). Transaction log backups are the key to our being able to recover from more catastrophic failures (possibly even point-in-time recovery if the right backup strategy exists). Inside the database, SQL Server doesn't really need all of the transaction details after they've guaranteed a transaction's durability (or, more simply put, once the effect of the change has been reflected in the data portion of the database then the details of that change are no longer needed in the log portion of the database). As a result, you can have SQL Server clear the "inactive" portion of the log by setting the database's recovery model to the SIMPLE recovery model. Loosely translated the SIMPLE recovery model means "when SQL Server no longer requires the transaction information to guarantee durability – then the log information can be removed from the log". Setting the recovery model to SIMPLE limits your backup options and makes administration easier (i.e. simple :)); however, it does not offer any other protection in the event of a more catastrophic disaster (because the log is being regularly cleared then there's no transactional information to backup). For some development/test databases and databases where data loss is not a major concern, then this can be an easy choice because log management (i.e. backups) does not need to be performed. However, if you want to minimize data loss – you can't choose the SIMPLE recovery model; you must choose either the FULL (which is the default) or the BULK_LOGGED recovery model. However, the discussion on when/why to choose BULK_LOGGED is a lengthy one and it does NOT impact the rest of this blog post. However, I did write a chapter for a SQL Server 2000 HA book and I described in detail the best uses for the BULK_LOGGED recovery model as well as the benefits and concerns. While this was written for SQL Server 2000, most of it *still* applies (and there are a few timeline based examples as well). You can download a pdf of this chapter here: http://www.sqlskills.com/resources/SQLServerHAChapter9.pdf.

As for the main purpose of this post – there are basically a few key questions that I want to answer/clarify by version:

  • Is the log backed up as part of the other backups?
  • Is it cleared?
  • Is there anything else that's affected?
  • What happens to the log during other database, filegroup, file, database-differential, filegroup-differential and file-differential backups? And, since the behaviors and internals seem to be grouped into two groups, I will differentiate between these two different groups of backup strategies with the following types:   
    • Database-level backup strategies are backups that use database and optionally database-differential backups
    • Granular backup strategies are backup strategies that use file and/or filegroup backups and optionally file-differential/filegroup-differential backups

SQL Server 2000
Database-level backups cannot occur simultaneously with log backups. However, granular backups *CAN* occur concurrently with log backups.
If a log backup is attempted while a database-level backup is running, then the log backup is paused. This can have the following affects:

  • the transaction log may require auto-growth and become very large

  • the transaction log for a secondary server (i.e. through log shipping) can fall *very* far behind the primary server. And, this is a HUGE concern for high availability. If a full backup takes 4 hours to run, then logs cannot be shipped for 4 hours. As a result of this limitation, some chose to use a granular backup strategy. The reason why log backups CAN occur concurrently in SQL Server 2000 is because, in implementation, SQL Server does NOT to backup the log as part of these more granular backups. As a result, transactional integrity is not guaranteed until the appropriate log chain is rolled forward. This has the following affects:
    • Granular backups only support the BULK_LOGGED or FULL recovery models (somewhat negative but not really)
    • The transaction log backups could run and even clear the inactive portion of the log while these granular backups were running (this is a huge benefit because it limits the need to auto-grow during these backups)
    • Recovery during restore is required (for transactional integrity) which means that all logs need to be restored to cover the time of the granular backup (and then all of those up-to-the-minute or to the desired point-in-time). And, even if a filegroup is set to READ_ONLY – *all* transaction logs need to be restored (this is a big negative but there is a trick: perform periodic file/filegroup differentials (after setting the filegroup to READ_ONLY) so that you can avoid having to perform numerous transaction log restores).

So, to answer the questions for SQL Server 2000:

  • Is the log backed up as part of the other backups?
    • for Database-level backups: YES
    • for Granular backups: NO
  • Is it cleared?
    • When a transaction log backup is performed then the default behavior is to clear the inactive portion of the log
    • When a database-level backup is performed AND there's no corresponding log chain (meaning the first time you backup the database OR the first time you backup the database after the transaction log chain was broken), then the transaction log is cleared. NOTE: Breaking the continuity of the log is relatively easily done in SQL Server 2000 when someone executes a BACKUP LOG with NO_LOG or a BACKUP LOG with TRUNCATE_ONLY command. To disable these from execution (for the FULL or BULK_LOGGED recovery model, use TRACE FLAG 3231). This is a VERY COOL and *SAFE* trace flag. I blogged about this trace flag in a "MSDN webcast Q&A" here. An important side note here is that in SQL Server 2000, log backups can be performed AFTER the continuity of the log has been broken. So, if someone manually cleared the log (using NO_LOG or TRUNCATE_ONLY) and did NOT follow that with a database-level (or appropriate granular backups), then scheduled log backups could continue to run without failure or errors. However, log backups performed AFTER the continuity of the log has been broken CANNOT be restored. So, during recovery you might receive an error that a log backup cannot be applied because it's too "late" to apply. Using Trace Flag 3231 reduces this possibility. However, SQL Server 2005 fixes some of these issues.
  • Is there anything else that's affected?
    • Log backups are paused during database-level backups
    • When restored, database-level backups are transactionally consistent (and can be recovered directly – without restoring additional logs)
    • When restored, granular backups require transaction log backups to guarantee transactional integrity (note: this can be complex to determine the "minimum effective log sequence" and I wrote a series of articles for SQL Server Magazine on how to determine the appropriate log sequence here)
  • What happens to the log during other database, filegroup, file, database-differential, filegroup-differential and file-differential backups?
    • for Database-level backups: log backups cannot occur concurrently 
    • for Granular backups: log backups can occur concurrently and are required for recovery

SQL Server 2005
The biggest improvement in SQL Server 2005 was that log backups are no longer paused by database-level backups – they *can* occur simultaneously; however, this change to database-level backups also applied to granular backups. While 2000 did allow log backups at the same time as a granular backup, they did so by NOT maintaining transactional integrity in the backup. In SQL Server 2000, you need to restore logs to make the granular backup transactionally consistent. In SQL Server 2005, they changed ALL backup strategies to follow the same behavior – database-level and granular backup strategies ALL backup the required log information needed to recover the backup to a transactionally consistent point in time which is essentially when the backup completes (this is a lot more complex than it sounds but Paul wrote a comprehensive post on exactly what this means here). Simply put, this requirement means that transaction log backups CAN occur concurrently; however, the log CANNOT be cleared until the backup completes. The primary negative effect is that the transaction log may require auto-growth and become very large. However, the positives are that you can do granular backups in any recovery model (although there are still some limitations to how this works in the SIMPLE recovery model but they added a new option during backup to allow a backup of ALL of the READ_WRITE_FILEGROUPS as a unit – separately from the read-only file groups which could be backed up at any time after they are set to READ_ONLY).

So, to answer the questions for SQL Server 2005:

  • Is the log backed up as part of the other backups?
    • for Database-level backups: YES
    • for Granular backups: *YES*
  • Is it cleared?
    • When a transaction log backup is performed then the default behavior is to clear the inactive portion of the log
    • When a database-level backup is performed AND there's no corresponding log chain (meaning the first time you backup the database), then yes, the inactive portion of the log is cleared. As far as breaking the continuity of the transaction log… In SQL Server 2005, they significantly reduced the problems that occur after the log chain is broken by NOT allowing log backups to continue. If a log backup is attempted after the continuity of the log is broken then you will receive error: 
        Msg 4214, Level 16, State 1, Line 1
        BACKUP LOG cannot be performed because there is no current database backup.
      So, this means that you don't necessarily need the trace flags. However, I still recommend using the trace flag because it would be better to not break the continuity of the trace flag to begin with! And, in fact, in SQL Server 2005, there are two trace flags: 3231 and 3031. They are both safe and here's how the two differ:
      • Trace Flag 3231 (same as 2000): When set, BACKUP LOG with TRUNCATE_ONLY and BACKUP LOG with NO_LOG do not allow a log backup to run if the database's recovery model is FULL or BULK_LOGGED.
      • Trace Flag 3031 (new in 2005): When set, BACKUP LOG with TRUNCATE_ONLY and BACKUP LOG with NO_LOG run as a CHECKPOINT – regardless of recovery model.
  • Is there anything else that's affected?
    • Log backups are *NOT* paused during database-level backups
    • When restored, database-level backups are transactionally consistent (and can be recovered directly – without restoring additional logs)
    • When restored, granular backups are transactionally consistent (and can be recovered directly – without restoring additional logs). However, you must always remember that the database cannot be brought online until the entire database is at a single transactionally consistent point in time. All read-write-filegroups must be restored as a unit (if in the SIMPLE recovery model) OR you must use transaction log backups to recover the entire database up to the SAME point in time.
  • What happens to the log during other database, filegroup, file, database-differential, filegroup-differential and file-differential backups?
    • for Database-level backups: log backups *can* occur concurrently (but the log will not be cleared until the backup completes) 
    • for Granular backups: log backups can occur concurrently (but the log will not be cleared until the backup completes) 

SQL Server 2008
Almost everything is the same in SQL Server 2008 as it was in 2005 – they made the largest number of improvements in 2005. However, one thing did change. In SQL Server 2008, the BACKUP LOG with NO_LOG and BACKUP LOG with TRUNCATE_ONLY options are not allowed at all. There is no need for the trace flags (3231/3031) because breaking the continuity of the log is not allowed (well, there is still a way… I'll get to that in a moment :)). In SQL Server 2008, if BACKUP LOG with NO_LOG or BACKUP LOG with TRUNCATE_ONLY are attempted, you will receive this error:
     Msg 3032, Level 16, State 2, Line 1
   One or more of the options (no_log) are not supported for this statement. Review the documentation for supported options.
But, what if you really don't want to backup the log? Why? Take this scenario (from a real customer!)… You have a 10GB database that's been around for quite some time AND you're doing regular full database backups… then, all of a sudden you run out of disk space. In looking around for large files (to investigate why you ran out of space), you find that this 10GB database's log is 987GB… so, you wonder – what happened? A database that is in the FULL recovery model (remember, this is the default) requires transaction log management. The easiest way to manage the log is with regular log backups; however, you're only doing full database backups (which do NOT clear the log). As a result, the transaction log grows and grows and grows and grows – until you're out of disk space (Paul demo'ed this in a TechNet Podcast here). At this point, how do you get rid of this 987GB transaction log? In prior releases, you can "clear" the log by using TRUNCATE_ONLY or NO_LOG but in 2008, what do you do? Switch to the SIMPLE recovery model. And, if you only want to do full database backups, stay there. And, if you want to physically shrink down the size of the transaction log file to a reasonable size – check out these two related blog posts: 8 Steps to Better Transaction Log Throughput and Transaction Log VLFs – Too many or too few?. And, in related news, Linchi Shea posted a good post on some tests he ran related to too many VLFs here and a second post that shows that some workloads don't see any issues wrt to lots of VLFs here. But, the long story short is that you still want to be proactive about creating a reasonably sized transaction log (my two other previously mentioned posts). Significant auto-growth can cause problems and backup operations (and managment in general) can be more difficult with lots of VLFs.

Wow, that was much longer than I was expecting… and, in writing it all down – pretty complex (I had a hard time trying to section things but I think this works?!). Regardless, all the facts are there so this should help to clarify what happens the when, where and why – wrt to the transaction log. Let me know if you have more questions!

Thanks for reading,
kt