Corruption survival techniques – useless?

Now, I’m very thick-skinned and I know there are always some people in a conference session who don’t agree with everything I say (that’s human nature, and I’m totally cool with that) but this one I just couldn’t pass up mentioning here on the blog as I *utterly* disagree with the advice in that post, and suspect that the poster didn’t “get” what I was trying to explain in the session.

I came across an interesting blog post from someone who attended PASS, describing my Corruption Survival Techniques session as really interesting and fun, but basically useless. The advice was that there are only a handful of people in the world who can run things like single-page restore and emergency mode repair, and as soon as corruption is suspected, the DBA should just call Product Support for help.

The point of my session is to explain two things – that you should pro-actively be looking for corruption, and you should know what to do when corruption occurs. Both of these enable your business to experience less down-time and data-loss when corruption does occur. So turning on page checksums and running DBCC CHECKDB regularly are easy. So is planning a decent backup strategy (based on what you want to be able to restore – see my previous post on this – Planning a backup strategy – where to start?).

The more tricky part is knowing what to do when corruption does occur. That’s why I discuss some of the output of DBCC CHECKDB, in terms of high-level tips and tricks rather than what each and every error means (see my previous post on this – Tips and tricks for interpreting CHECKDB output). I also recommend backups as the best way to limit data-loss, but not necessarily down-time – depending on the backups you have available. The last part of the session shows some tricks for getting around worst-case scenarios, like someone detaching a suspect database or needing to run emergency mode repair. I don’t expect everyone to run off and start hacking the 2005 system tables with a single-user booted server and using the DAC (but if you do, see this post) but having some of this knowledge can make DBAs more confident to tackle problems themselves and increase their skills.

Since I’ve been blogging about this stuff and presenting it at conferences, I’ve heard from *countless* people who’ve used these techniques themselves to recover from disasters, and learned a ton of information and good practices in the process. Any production DBA with half a brain (a great Scottish expression :-) should be able to use restore, single-page restore, or run a repair – otherwise, with all due respect, they shouldn’t be running a production system. Now, for “involuntary” DBAs, who (through no fault of their own) may not know anything about backups, restores, or repairs – it’s a totally different story, and help should be sought through Product Support or forums.

But to come out with a blanket statement that knowing how to run restores, repairs and do first-level interpretation of DBCC CHECKDB output is useless? And that potentially wasting time and money with front-line Product Support is the best course of action when corruption occurs, when you can work out most of it for yourself? That’s *bad advice* as far as I’m concerned.

Maybe I’m just cranky as I’m sitting here with a very sore mouth after getting a filling at the dentist this morning :-(

What do you think? Comments please!

(PS I’m not fishing for praise – I want to know what you think of the argument)

9 thoughts on “Corruption survival techniques – useless?

  1. Paul,

    On only need look as far as the MSDN forums to find the success behind your Corruption series. I can’t tell you how many people I have helped with your tips and tricks get themselves out of a jamb. It really is not that uncommon for someone to have something critical in SQL, and they don’t have any backups running, and no budget to call Microsoft support because they are a small mom and pop shop that is barely scraping by in today’s economy. They just need enough pointers to get out of the mess they are in, before it makes them go completely under. I have had one person who emailed me privately after the fact to find out how to plan a backup strategy to protect them in the future. It generally only takes once, and people learn to protect their data better.

    To go beyond this, your attach a suspect database trick has saved more than one person who was developing a database in CTP6 or RC0 with a filestream, and then upgraded on migrated servers before realizing they needed the filestream data path. The binary data was an acceptable loss, but not the core data and structure of the database itself. These weren’t rookie DBA’s or coders, but people running on the cutting edge of technology, and there are always growing pangs along the way. They just experienced them for some of us.

    I think your stuff is great, and it has certainly made me a better DBA and a better Forums contributor, so keep it coming.

  2. Hi Paul,
    I was in your class, Corruption Survival Techniques, at the 2008 PASS conference and by no means was it a waste of time, matter of fact it was extremely useful and impressive!!! Anyone mentioning that it should not be attempted and leave it to MSFT PSS should not be running a production system. All your tips and tricks are well worth the time to study, and practice with. Every DBA supporting a production enterprise should be able to perform this level of work in his/her sleep.

    Thanks again….

  3. Hi Paul,

    I already told you at the conference how much I liked your session. And now, I can add that I also found it extremely useful. Not all was new to me, but some was. And for the stuff I already knew, the refresher was welcome as well.

    Describing your sessions as interesting but useless indeed totally misses the point. Even if you never intend to do any restore work yourself (which, in my case, is a sound plan if you’re uncertain in this field), then the session is still useful. For it teaches you how to monitor (so that problems have a better chance of being detected when repair is still relatively easy), and also (though more might be good here) how to plan your backups, so that if you have to call Product Support, they at least have the option to do a page restore if that’s called for.

    A great session, and one that every DBA, both voluntary and involuntary, should attend. Well, unless you happen to be that exceptional DBA who could have delivered the session him-/herself, :)

    Best, Hugo

  4. It was useful even for DBAs who will just call CSS with corruption errors, for the ‘what not to do’ portion and for the awareness of just how serious a problem it is.

    I certainly wouldn’t object to people calling CSS if they’re out of their depth. Better than fiddling when they’re not sure and making a bigger mess. Too many people I’ve encountered (on forums) don’t understand how serious corruption is and so ignore it for long periods of time)

  5. I think it is a matter of who you are. While it is true that (relatively speaking) there are a handful of people who can use these techniques, if you purport to be a Database Administrator, and your management is expecting you to safeguard the company’s data, then you should be one of them…

    Far too often I’ve seen people applying for a DBA position who do not have the requisite skills. Nothing against these folks – many are very capable people, but there is a pretty significant lack of understanding "out there" about what a DBA does…

    MDD

  6. Well, somewhat of a shocking comment… I was there as well and thoroughly enjoyed the session. As one commenter above stated, it was not all new but some of the demos were pretty cool and encouraged me to play around a bit more for that "just in case" scenario so that I can be more prepared. AND, I can honestly say as I read posts about corruption that there is still more need of training on this subject. So, keep on doing what you are doing!

    Thanks for all the great work!

    David

  7. I also attended your session, Paul, and thought it very useful – although hopefully we don’t need to use the techniques too often. Some people don’t like to go into the deep end of the pool, but still need to know how to swim. After all, isn’t the point of conducting a deep dive session like this so that there are more than a handful of people that can do it?

    Keep up the good work, Paul.

    Dale Hughes.

  8. Paul, I’d’ve attended your session, but we were speaking in the same time slot.

    Lo these many years ago on the very first day of my very first DBA job I was asked to do a task which my company hadn’t been able to do in six months, namely backup-&-restore a db from the Midwest to the Left Coast. Because of my self-didactic training I was able to do it with ease.

    My former day job was as an enterprise architect. In my years at that job there’s not a lot I wouldn’t’ve given to have been more proficient in recovery.

    My experience validates the believe a knowledge of fundamentals is critical to the care-&-feeding of high-performance databases. I’m eager to learn more.

  9. Paul, you are the best among the best to teach any kind of useful SQL Tips with a great purpose. All of your presentations would for sure have a meaningful meaning and should only be read by meaningful DBAs. For example, I am a big follower of SQLSkills blog and I could learn a lot from the DBAMyths white paper in my early stage. Just to keep my self updated, mostly I go to SQLSkills blog all the time. And I am sure that this is the case with many professional DBAs across the globe. We should admit that Paul never speaks any nonsense or useless or irrelevant with a whole heart.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.