This month the SQLskills team is presenting a series of blog posts aimed at helping Accidental/Junior DBAs ‘keep the SQL Server lights on’. It’s a little taster to let you know what we cover in our Immersion Event for The Accidental/Junior DBA, which we present several times each year. You can find all the other posts in this series at http://www.SQLskills.com/help/AccidentalDBA. Enjoy!
In today’s post I want to introduce you to the three recovery models a database can use, and to the various kinds of backups that are possible with SQL Server.
Before that, I need to explain a little bit about the transaction log that exists in each database. Whenever a transaction makes a change to a database, a record of the change is made in the transaction log, using log records to describe the change. Each transaction that’s making changes can either commit, making those changes durable in the database, or it can roll back, which removes the effects of the changes from the database.
If the system crashes, databases undergo crash recovery, which makes sure that:
- For all transactions that were committed at the time of the crash, their effects are present in the database. This may mean that some log records are redone – re-applying the described changes to the database.
- For all transactions that were not committed at the time of the crash, their effects are not present in the database. This may mean that some log records are undone – applying changes that reverse the described changes in the database (essentially doing the anti-operations, like deleting a table row that was inserted by an uncommitted transaction).
Crash recovery makes sure that the database is transactionally-consistent. The same set of operations occur when a database is restored as part of disaster recovery, using the transaction log records present in your various backups.
The transaction log needs to be managed to ensure it doesn’t grow out of control, by periodically allowing the contents of the log to be cleared, so the space can be reused for more log records. “Clearing” the log is sometimes referred to as “truncating” the log and they both mean the same thing: some portion of the log will be marked as available for reuse.
Back in 2009 I wrote an long, introductory article for TechNet Magazine that discusses all of this in more detail – see Understanding Logging and Recovery in SQL Server – and if you want really comprehensive information on the transaction log, see my 7.5 hour, online Pluralsight course: SQL Server: Logging Recovery, and the Transaction Log.
The way this log clearing is triggered depends on the recovery model that is used for the database.
There are three recovery models:
- Full recovery model (the default and the most commonly used)
- All modifications in the database a fully logged
- The log will not clear until a transaction log backup is performed
- Bulk-logged recovery model
- Some modifications (like an index rebuild or a bulk load) can be minimally logged, which reduces the amount of log records generated
- The log will not clear until a transaction log backup is performed, just like the full recovery model
- Simple recovery model
- Some modifications can be minimally logged, just like the bulk-logged recovery model
- The log will not clear until a checkpoint occurs (usually automatically), and transaction log backups are not possible
The main thing to remember is that if your database uses the full or bulk-logged recovery model, you must perform periodic transaction log backups or the transaction log will grow forever.
There are three commonly used backup types in SQL Server:
- Full database backup
- Transaction log backup
- Differential database backup
There are also smaller-granularity data backups that are much less commonly used:
- Full filegroup backup
- Full file backup
- Differential filegroup backup
- Differential file backup
I’m going to focus on the three common backup types, as these are the ones you’ll be using as an Accidental DBA.
Full database backup
A full database backup (usually just referred to as a ‘full backup’, skipping the word ‘database’) backs up the entire database, and a full backup must be performed before you can perform other types of backups. When you’re performing a restore operation, the first backup you restore from must be a full backup, as that is the basis for other backups to be restored.
A full backup backs up all the data from the data files plus some transaction log records from the transaction log. This is to ensure that when restoring the full backup, the transaction log records allow the restored database to be brought to a transactionally-consistent point. Contrary to popular myth, full backups to not allow the transaction log to clear, and in fact do not affect the transaction log in any way whatsoever.
A full backup allows you to restore the database to a single point in time – the end of the backup – or to continue restoring more differential or log backups to bring the restored database more up-to-date.
Full backups are usually performed once per day or once per week.
Transaction log backup
A transaction log backup (usually just referred to as a ‘log backup’, skipping the word ‘transaction’) backs up all the transaction log records generated since the most recent log backup occurred. When a log backup completes, log clearing will be performed, which allows the log to be reused, helping to prevent log growth. A log backup cannot be performed until a full backup has been performed, but once a full backup has been performed, log backups and full backups do not interact or have any effect on each other at all.
A log backup allows to you continue a restore operation started by restoring a full backup, with each log backup that you restore bringing the restored database more up-to-date.
Log backups are usually performed once or multiple times per hour.
There is a special kind of log backup called a ‘tail-of-the-log’ or ‘tail-log’ backup that is performed when recovering from a disaster. The idea is that the very first thing you do after a disaster occurs is to take a final log backup, capturing all the transaction log that was generated since the most recent scheduled log backup before the disaster. During a restore sequence, this log backup will be the last one to be restored, bringing the database as close as possible to the point of the disaster.
Differential database backup
A differential database backup (usually just referred to as a ‘differential backup’ or ‘diff backup’, skipping the word ‘database’) backs up all the data that has changed since the most recent full backup. As more and more data changes in the database since the most recent full backup, successive differential backups will get larger and larger. Differential backups are cumulative, not incremental (which is a common misconception, as other database systems do have incremental data backups – SQL Server does not).
Differential backups are used to speed up restores by essentially replacing a number of log backups, as a differential backup is essentially the net effect of all the log backups since the most recent full backup.
Differential backups are not so commonly used, but when they are, they are usually performed once or multiple times per day.
Most of you will be using the full recovery model, which means you need to perform log backups regularly. But how regularly? A proper backup strategy takes into account the RTO and RPO requirements to determine what backup types to use, and how often to perform them. And that’s the topic for tomorrow’s post!