(Look in the Misconceptions blog category for the rest of the month’s posts and check out the 60-page PDF with all the myths and misconceptions blog posts collected together when you join our Insider list, plus my online Myths and Misconceptions training course on Pluralsight.)
The month is finally over so time for the grand finale!
Although it’s been fun debunking all these myths, it’s been a tad stressful making sure I come up with an interesting and useful myth to debunk every day.
To round out the month, I present to you 30 myths around backups – one for each day of the month of April. Last night I sat down to write this post and was a few myths short so reached out to the fabulous SQL community on Twitter (follow me!) for help – too many people to list (you know who you are) – I thank you!
A few folks have asked if I’ll pull the month’s posts into a PDF e-book – let me know if you’d like that.
I *really* hope you’ve enjoyed the series over the last month and have had a bunch of myths and misconceptions debunked once and for all – I know quite a few of you are going to use these explanations as ammunition against 3rd-party vendors, developers, and other DBAs who insist on incorrect practices.
Ok – here we go with the last one…
Myth #30: various myths around backups…
All are FALSE!!
For a good primer on understanding backups and how they work see my TechNet Magazine article Understanding SQL Server Backups. [Edit 2016: the article has been removed – check out the backup posts at https://www.sqlskills.com/help/accidental-dba/ for more info.]
30-01) backup operations cause blocking
No. Backup operations do not take locks on user objects. Backups do cause a really heavy read load on the I/O subsystem so it might *look* like the workload is being blocked, but it isn’t really. It’s just being slowed down. There’s a special case where a backup that has to pick up bulk-logged extents will take a file lock which could block a checkpoint operation – but DML is never blocked.
30-02) switching from the FULL recovery model to the BULK_LOGGED recovery model and back again breaks the log backup chain
No. It just doesn’t. Switching from either FULL or BULK_LOGGED to SIMPLE *does* break the log backup chain however.
30-03) breaking the log backup chain requires a full backup to restart it
No. You can restart the log backup chain with either a full or differential backup – anything that bridges the LSN gap from the point at which the log backup chain was broken. See my blog post A SQL Server DBA myth a day: (20/30) restarting a log backup chain requires a full database backup for more details.
30-04) concurrent log backups are not possible while a full or differential backup is in progress
No, this changed in SQL Server 2005. See my blog post Search Engine Q&A #16: Concurrent log and full backups.
30-05) a full or differential backup clears the log
No. A log backup includes all the log since the last log backup – nothing can change that – no matter whether that log was also backed up by a full or differential backup. I had a famous argument on Twitter last year and wrote this blog post as proof: Misconceptions around the log and log backups: how to convince yourself. In the FULL or BULK_LOGGED recovery models, the *only* thing that clears the log is a log backup.
30-06) using the BULK_LOGGED recovery model for minimally-logged operations reduces the size of the next transaction log backup
No. A minimally-logged operation is so-named because only the page allocations are logged. A log backup needs all the information necessary to resconstitute the transaction, so a log backup following a minimally-logged operation must backup the log plus all extents changed by the minimally-logged operation. This will result in the log backup being roughly the same size as if the operation was fully logged.
30-07) full and differential backups only contain the log generated while the backup was running
No. A full or differential backup contains enough log to be able to recover the database to a transactionally-consistent view of the database at the time the data-reading portion of the backup finished (or as far back as the oldest log record that transactional replication has not yet processed – to ensure that replication works properly after a restore). Check out these two blog posts for details:
30-08) backups always test existing page checksums
No. It only does it when you use the WITH CHECKSUM option – which you should.
30-09) backups read data through the buffer pool
No. The backup subsystem opens its own channels to the database files to avoid the performance hit of having to read everything into SQL Server’s memory and back out to the backup device (and also effectively flushing the buffer pool in the process). If you ask the for page-checksum checking, it uses it’s own small portion of memory.
30-10) backups perform consistency checks (a la DBCC CHECKDB)
No. Nothing else to say.
30-11) if the backup works, the restore will too
No. Please don’t fall into this trap. You must regularly validate your backups to give yourself a high level of confidence that they will work if a disaster occurs. See Importance of validating backups for more details.
30-12) a mirrored backup will succeed if the mirror location becomes unavailable
No. If any one of the paths to a mirrored backup fails, the entire mirrored backup operation fails. I’d really like it to work the other way around – where the local backup succeeds and the remote backups fail, but it doesn’t unfortunately.
30-13) a tail-of-the-log backup is always possible
No. A tail-of-the-log backup is one that backs up all the log generated since the last log backup, in an exceptional situation. If the data files are damaged, you can still do a tail-of-the-log backup EXCEPT if the un-backed-up log contains a minimally-logged operation. That would require reading data extents – which cannot be done if the data files are damaged. For this reason, the BULK_LOGGED recovery model should not be used on databases that have 24×7 user transactions.
30-14) you can use backups instead of DBCC CHECKDB
No. See A SQL Server DBA myth a day: (27/30) use BACKUP WITH CHECKSUM to replace DBCC CHECKDB.
30-15) you can backup a database snapshot
No. It’s not implemented, but would be great if you could.
30-16) you can use database snapshots instead of log backups
No. A database snapshot is only usable while the database on which it is based is usable and online. If the source database is corrupted, the database snapshot most likely is too. If the source database goes suspect, so does the database snapshot.
Also, having multiple database snapshots (equating to multiple log backups) incurs an increasing performance drain – as every page that changes in the source database may need to be synchronously written to all existing snapshots before it can be written to the source database data files, and all existing database snapshots will grow as more pages are pushed into them.
30-17) log backups will be the size of the log
No. The log has to accommodate the space necessary to roll back active transactions, the amount of space returned by DBCC SQLPERF (LOGSPACE) on a busy system doesn’t accurately refect the amount of log records in the log. This blog spot explains: Search Engine Q&A #25: Why isn’t my log backup the same size as my log? And apart from that, a log backup is just all the log generated since the last log backup – not the whole log file usually – and if it happens to be, the first part of the explanation comes into play.
30-18) you cannot backup a corrupt database
No. In most cases you can use the WITH CONTINUE_AFTER_ERROR option to back up the corrupt database. If that fails (maybe because of a damaged boot page or file-header page), there are no other options apart from OS-level file backups.
30-19) you cannot stop someone doing a BACKUP LOG .. WITH NO_LOG or TRUNCATE_ONLY operation
No. It’s not possible any more (yay!) and in 2005 and before, use trace flag 3231 which turns the operation into a no-op.
30-20) log backups always clear the log
If there’s no concurrent data backup running, a log backup will always *try* to clear the log, and only succeed in clearing the inactive portion of the log – the log that’s only considered ‘required’ by SQL Server because it hasn’t yet been backed up. If anything else is holding the log ‘required’, it cannot be cleared, even though it has been backed up. Subsequent log backups will check again and again until the time comes when that portion of the log can be cleared. The TechNet Magazine article Understanding Logging and Recovery in SQL Server I wrote last year explains a lot more about how the log works.
Also, if there is a concurrent data backup running, the log clearing will be delayed until the data backup finishes. See the blog post in myth 30-04 for more details.
30-21) differential backups are incremental
No. Differential backups are all the data extents that have changed since the last full backup – so they are cumulative. Log backups are incremental – all log generated since the last log backup. Many people call differential backups ‘incrementals’, when they’re not really.
30-22) once a backup completes, you can safely delete the previous one
No. No. No.
If you go to restore, and you find your full backup is corrupt, what do you do? Well, if you don’t have an older full backup, you most likely start updating your resume. You need to keep a rolling-window of backups around in case a disaster occurs and you need to restore from an older set of backups.
30-23) you can back up a mirror database
No. A mirror database is not accessible except through a database snapshot. And you can’t back up that either.
30-24) you can back up a single table
No. You can effectively back up single table if it happens to be wholly contained on a single filegroup, but there’s no way to say BACKUP TABLE.
30-25) SQL Server has to be shut down to take a backup
No. No idea how this myth started… [Edit: apparently this myth started with Oracle – and we all know how good Oracle is compared to SQL Server… :-)]
30-26) my transaction is guaranteed to be contained in the backup if it committed before the backup operation completed
No. The commit log record for the transaction has to have been written out before the data-reading portion of the backup finished. See my blog post Search Engine Q&A #6: Using fn_dblog to tell if a transaction is contained in a backup for more details.
30-27) you should shrink the database before a backup to reduce the backup size
No. Shrink just moves pages around so won’t make any difference. See my old blog post Conference Questions Pot-Pourri #10: Shrinking the database before taking a backup. And of course, shrink is evil. See A SQL Server DBA myth a day: (9/30) data file shrink does not affect performance. And what’s even worse as someone reminded me, is if you do the shrink *after* the full backup, the next differential backup may be huge, for no actual data changes!
30-28) backups are always the best way to recover from a disaster
No. Backups are usually the best way to recover with zero data-loss (as long as you have log backups up to the point of the disaster), but not necessarily the best way to recover with minimal downtime. It may be way faster to fail over, or to run repair and accept some data loss if the business requirements allow it.
30-29) you don’t need to back up master, msdb, model…
No. You should always back up the system databases. Master contains all the security info, what databases exist – msdb contains all the SSIS packages, Agent jobs, backup history – model contains the configuration for new databases. Don’t fall into the trap of only backing up user databases otherwise you’ll be in a world of hurt if you have to do a bare-metal install.
30-30) you should always plan a good backup strategy
No. Now you’re thinking ‘Huh?’…
You should plan a restore strategy. Use the business requirements and technical limitations to figure out what you need to be able to restore in what time, and then use that to figure out what backups you need to take to allow those restores to happen. See the blog posts:
The vast majority of the time people plan a backup strategy without testing or thinking about restores – and come a disaster, they can’t restore within their SLAs. Don’t let that be you.