I came across an interesting blog post from someone who attended PASS, describing my Corruption Survival Techniques session as really interesting and fun, but basically useless. The advice was that there are only a handful of people in the world who can run things like single-page restore and emergency mode repair, and as soon as corruption is suspected, the DBA should just call Product Support for help.
Now, I’m very thick-skinned and I know there are always some people in a conference session who don’t agree with everything I say (that’s human nature, and I’m totally cool with that), but this one I just couldn’t pass up mentioning here on the blog as I *utterly* disagree with the advice in that post, and suspect that the poster didn’t “get” what I was trying to explain in the session.
The point of my session is to explain two things: that you should proactively be looking for corruption, and that you should know what to do when corruption occurs. Both of these enable your business to experience less downtime and data loss when corruption does occur. Turning on page checksums and running DBCC CHECKDB regularly are easy. So is planning a decent backup strategy (based on what you want to be able to restore; see my previous post, Planning a backup strategy – where to start?).
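To show how little is involved in the "easy" part, here is a minimal sketch. The database name SalesDB is a made-up placeholder, and the options shown are just one reasonable choice, not the only way to run the check.

```sql
-- SalesDB is a hypothetical database name; substitute your own.

-- Turn on page checksums. A page only gets a checksum the next time
-- it is written to disk, so existing pages are not protected right away.
ALTER DATABASE [SalesDB] SET PAGE_VERIFY CHECKSUM;

-- Run a full consistency check. NO_INFOMSGS suppresses informational
-- messages, and ALL_ERRORMSGS asks for every error to be reported.
DBCC CHECKDB (N'SalesDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

How often to schedule the check depends on your maintenance window and on how far back your backups go, so treat any schedule as something to work out for your own system.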
The trickier part is knowing what to do when corruption does occur. That’s why I discuss some of the output of DBCC CHECKDB, in terms of high-level tips and tricks rather than what each and every error means (see my previous post, Tips and tricks for interpreting CHECKDB output). I also recommend backups as the best way to limit data loss, but not necessarily downtime, depending on the backups you have available. The last part of the session shows some tricks for getting around worst-case scenarios, like someone detaching a suspect database or needing to run emergency mode repair. I don’t expect everyone to run off and start hacking the 2005 system tables on a server booted in single-user mode using the DAC (but if you do, see this post), but having some of this knowledge can make DBAs more confident to tackle problems themselves and increase their skills.
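As an illustration of restoring from backups to limit data loss, here is a rough sketch of a single-page restore. Everything specific in it is an assumption for the example: the database name SalesDB, the page ID 1:57, and the backup file paths are all made up, and the database is assumed to be using the full recovery model with an unbroken log backup chain.

```sql
-- Hypothetical names throughout: SalesDB, page 1:57, and the file paths.
-- Damaged page IDs come from the CHECKDB errors or msdb.dbo.suspect_pages.

-- 1. Restore just the damaged page from the most recent full backup.
RESTORE DATABASE [SalesDB]
    PAGE = '1:57'
    FROM DISK = N'C:\Backups\SalesDB_Full.bak'
    WITH NORECOVERY;

-- 2. Restore every log backup taken since that full backup, in order.
RESTORE LOG [SalesDB]
    FROM DISK = N'C:\Backups\SalesDB_Log1.trn'
    WITH NORECOVERY;

-- 3. Back up the tail of the log so the page can be brought fully up to date.
BACKUP LOG [SalesDB]
    TO DISK = N'C:\Backups\SalesDB_Tail.trn';

-- 4. Restore the tail-log backup and bring the page online.
RESTORE LOG [SalesDB]
    FROM DISK = N'C:\Backups\SalesDB_Tail.trn'
    WITH RECOVERY;
```

Try this on a test copy first. Whether the rest of the database stays online during the restore depends on your edition of SQL Server.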
Since I’ve been blogging about this stuff and presenting it at conferences, I’ve heard from *countless* people who’ve used these techniques themselves to recover from disasters, and learned a ton of information and good practices in the process. Any production DBA with half a brain (a great Scottish expression :-) should be able to run a restore, a single-page restore, or a repair. Otherwise, with all due respect, they shouldn’t be running a production system. Now, for “involuntary” DBAs, who (through no fault of their own) may not know anything about backups, restores, or repairs, it’s a totally different story, and help should be sought through Product Support or forums.
But to come out with a blanket statement that knowing how to run restores, repairs and do first-level interpretation of DBCC CHECKDB output is useless? And that potentially wasting time and money with front-line Product Support is the best course of action when corruption occurs, when you can work out most of it for yourself? That’s *bad advice* as far as I’m concerned.
Maybe I’m just cranky as I’m sitting here with a very sore mouth after getting a filling at the dentist this morning :-(
What do you think? Comments please!
(PS I’m not fishing for praise – I want to know what you think of the argument)