This month the SQLskills team is presenting a series of blog posts aimed at helping Accidental/Junior DBAs ‘keep the SQL Server lights on’. It’s a little taster to let you know what we cover in our Immersion Event for The Accidental/Junior DBA, which we present several times each year. If you know someone who would benefit from this class, refer them and earn a $50 Amazon gift card – see class pages for details. You can find all the other posts in this series at http://www.SQLskills.com/help/AccidentalDBA. Enjoy!
Kimberly and I have a saying: “You don’t have a backup until you’ve restored it!”. While this is not technically true, the sentiment is that you should be validating your backups, so that you have a good sense of confidence that when a disaster strikes and you go to use your backups, your disaster recovery will be successful.
What do I mean by ‘validating your backups’? I mean that at prescribed intervals you should emulate a disaster recovery response and perform a full restore sequence on a test server, involving your full, differential, and log backups. I usually advise doing this at least once a week, which also happens to be the most common interval between performing full backups. There’s really no frequency that’s too frequent for backup validation, but there are definitely those that are not frequent enough.
There are all kinds of excuses that people have for not regularly validating their backups, including:
- I don’t have time
- I don’t have the necessary disk space
- That’s not my job
- I don’t think it’s important
All of these excuses are invalid in my opinion. As a DBA, accidental or otherwise, it is part of y our job to protect the integrity of the database for which you’re responsible. Simply performing backups is not enough. You’re also responsible for making sure that those backups are valid, and not just at the time they were taken, but as time passes as well because an I/O subsystem can just as easily corrupt a backup on disk as it can corrupt data and log files on disk. It’s extremely important that, when a disaster strikes, you’re able to successfully use your backups – otherwise what’s the point of having them? You can’t ever get a guarantee (as I/O subsystem corruption can happen at any time) but you can get increased peace of mind.
I’ve lost count of the number of times I’ve heard of people with database corruption that also have corrupt backups, or the backups have been failing for months, or their restore sequence is broken because of a missing log backup, or some other variation of backups not working when they’re needed. You don’t want that to happen to you, and it’s really so easy to test your backups even just occasionally. If you think you don’t have enough disk space, you could even just restore to some external USB drives, anything will work as long as it lets you do the restore.
Now the question is, what RESTORE options do you want to use? At the very least, you want the restore system to verify the contents of the backup. Hopefully you’re using the CHECKSUM option for your backups (that I discussed on day 9) which validates page checksums as the pages are read from the data files and also creates a checksum of the entire backup file. If so, you at the very least you should do something like:
RESTORE VERIFYONLY FROM DISK = N'F:\SQLskills\20130609_SalesDB_Full.bck' WITH CHECKSUM; GO
That will re-validate all the page checksums of the pages in the database in the backup, and also recalculate the checksum of the entire backup and validate it against the checksum stored in the backup header. This doesn’t take any disk space at all, and will let you know if your backup is corrupt.
What’s even better is to actually perform the restore operations, using the CHECKSUM option at all times, and then running consistency checks (through DBCC CHECKDB that we’ll discuss later in the month) to make sure that all the backups you would use are valid, and the database itself is free of corruptions.
If you do find you’ve got corrupt backups, you know you’re going to have to take new backups or potentially go backup to older backups that you’ve kept around for just this eventuality. Which brings up another issue: backup retention. How long do you keep backups around for, and where do you keep them? I’ll talk about that tomorrow.