(I’ve heard from many of you that the Comments feature of my blog isn’t working. I know – there’s an issue with our blog engine that we’re fixing. My apologies – I’ll post a quick note when it’s fixed.) After posting last week about a BACKUP feature that I don’t like (WITH NO_LOG – see here), I thought I’d do a quick post this week about a feature that was introduced in SS2005 for BACKUP that I DO like – the COPY_ONLY option to BACKUP DATABASE and BACKUP LOG. Here’s a situation I’ve seen several times that really screws people up. A savvy DBA of a busy web-fronted sales business has a rigorous backup schedule setup – daily full backups at midnight and differential backups every 4 hours. Everything’s working perfectly. One day a disaster strikes at 7pm and the storage for the database is destroyed. The DBA starts restoring the backups using WITH NORECOVERY, gets to the noon differential backup and gets the following message: RESTORE
GO
Msg 3136, Level 16, State 1, Line 1
This differential backup cannot be restored because the database has not been restored to the correct earlier state.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
Uh-oh. That’s not good. That says that the 12pm differential backup does not have the same differential base as the 4am and 8am ones. How can that have happened? The DBA takes a look in the backup history tables in msdb (stripping out all the log backups):
SELECT
name, backup_start_date, type, first_lsn, database_backup_lsn
FROM msdb.dbo.backupset WHERE database_name = ‘production’;
GO
name backup_start_date type first_lsn database_backup_lsn
—————————— ———————– —- ——————– ——————–
production Full 10/14/07 2007-10-14 00:00:00.000 D 88000000025300001 0
production Diff 4am 10/14/07 2007-10-14 04:00:00.000 I 118000000003000160 88000000025300001
production Diff 8am 10/14/07 2007-10-14 08:00:00.000 I 144000000070500160 88000000025300001
NULL 2007-10-14 10:29:50.000 D 161000000056100147 88000000025300001
production Diff 12pm 10/14/07 2007-10-14 12:00:00.000 I 161000000062800034 161000000056100147
production Diff 4pm 10/14/07 2007-10-14 16:00:00.000 I 173000000054100144 161000000056100147
Aha! Look at the highlighted date in the output – someone took a full database backup of the database at 10.29am. The DBA checks and finds that one of the developers wanted a copy of the production database to play with so took a database backup. He restored the backup and then deleted both it and the database. Looking at the database_backup_lsn field, we can see that all the backups up till the accidental backup (look at the LSNs highlighted blue) have the differential base equal to the first_lsn of the full backup from midnight. The two backups after that have the differential base equal to the first_lsn of the accidental full backup (the LSNs highlighted red).
Oops! That means that the production database cannot be rolled forward any further than the last log backup before the accidental full backup was taken – losing more than 8 hours of data completely. Even though all the subsequent backups are intact, the initial full backup for them no longer exists so they’re useless!
So how can a developer get a copy of the database without screwing up a recovery from a potential disaster? Using the new COPY_ONLY option. Taking a full backup with this option does not make the new backup a differential base – it does not clear any of the differential bitmaps and basically doesn’t interfere with the regularly scheduled backups. Apart from that, it’s a regular full backup of the database. One thing to bear in mind is that it’s a one-off – you can’t use one of these backups as a differential base, so you can’t take COPY_ONLY differential backups. If you specify COPY_ONLY with DIFFERENTIAL, the COPY_ONLY option is ignored.
One other cool thing is that you can specify this option for a BACKUP LOG command too. This behaves the same way – it takes a log backup, but does not change the transaction log at all (i.e. it doesn’t make any portion of the log inactive and permit log truncation). This is useful for doing online file restores without having the necessary backup of the tail of the log affect the log backup chain. More on that in a later post…
[Edit: In the initial version of this post, the DBA's backup strategy included log backups. I went through a couple of versions of this post before settling on full + diffs, but I forgot to remove the reference to log backups. In the first comment, Mark House correctly points out that an accidental full backup doesn't prevent a DBA with a complete log backup chain from recovering to any point in time. Apologies for the confusion!]
8 Responses to BACKUP WITH COPY_ONLY – how to avoid breaking the backup chain
The stealth backup is yet another reason to keep well-meaning developers, managers, and network admins out of production admin roles. :) If keeping people out isn’t possible, a job that checks for rogue backups could also be used.
However, in this case, if the DBA still has his transaction log backups, things should be OK. The developer’s full backup will ruin all subsequent differential backups, but won’t invalidate the transaction log backups. As long as the DBA has an uninterrupted chain of transaction log backups from any full backup, the data will be restorable up to the last available transaction log. In this case the developer would only owe the DBA a lunch, and not a job reference.
Mark
Hey Mark – absolutely correct. My mistake comes from changing my scenario to only have differential backups and forgetting to go back to my description of the backup strategy to remove the log backups. It’s fixed now :-) Thanks!
Found something that loos like a bug. If you back a database up with Copy_only, then try to use the management studio restore database command, the dialog lists no entries in the backup set list box.
Restoring via script works. This is management studio with SP2 installed.
Hi there,
I want to preface my question by saying that I’m very new to SQL Server and am first and foremost an Oracle DBA.
In this scenario you’ve provided, I’m not entirely sure why this is a problem. I will give you that it will increase the time needed to restore the database because you’d need to wait for the restore to fail, investigate why it failed and then fix the problem, but here’s my question: If you got this error, wouldn’t you simply restore the developer’s full backup from 10:29am, then the 12pm differential based off that full, the 4pm differential and then any log backups up to 7pm? Is this not a possibility in SQL Server?
And in regards to Mark’s comment, why will the developer’s full backup "ruin all subsequent differential backups"? Aren’t they simply based off of a different baseline?
Hopefully someone is still monitoring this thread since it’s almost 2 years old at this point, but I’d appreciate the clarification.
Thanks for the great post!
Hi Vanessa – yes, what you say is of course possible – but only if the developer still has their full backup. And, in a disaster recovery situation, what’s the likelihood that the developer is available at short notice to provide access to the backup? Possibly low. Thanks
Ah yes, I forgot that you had specified the developer had removed the full backup. Thanks for pointing that out!
[...] full backup – so an out-of-band full backup could screw-up your disaster recovery – see this blog post for more [...]
[...] No, no, no, no. A backup file is just like a data file – it sits on an I/O subsystem. And what causes most corruptions? I/O subsystems. You must periodically check that your backups are still valid otherwise you could be in for a nasty surprise when disaster strikes. See Importance of validating backups. The other thing to consider is that an out-of-band full or log backup could have been taken that breaks your restore sequence if it's not available. See BACKUP WITH COPY_ONLY – how to avoid breaking the backup chain. [...]