Sample corrupt databases to play with

The very worst piece of advice I ever saw on the Internet was in response to someone asking on a SQL newsgroup ‘how can I create a corrupt database?’. The first response was:

When I want to corrupt a database to play with, I go into the data center, find a hard-drive and flick the power switch on-and-off quickly a few times.

This was closely followed by a bunch of replies (including mine) saying ‘Noooooooooooo!!!!’

So, for a few years now I’ve provided a zip file with a bunch of pre-corrupted databases in – so you can test your consistency checking jobs, and experiment with corrupt databases without having to create them yourself by destroying hard-drives, or less daft means. You can find the main zip file at the top of our Past Events resources page, along with a link to a blog post which explains the various databases and demo scripts.

Several times I’ve been asked to provide SQL Server 2008 versions of the two databases which demo system table corruption. Well, that task has finally bubbled to my long-list of blog posts and website updates. There’s now a second, smaller zip file which has 2008-only versions of the DemoFatalCorruption1 and DemoFatalCorruption2 databases, which showcase corruptions that prevent DBCC CHECKDB from running.

Let me know if you have any problems (playing with the backups :-)

Enjoy!

14 thoughts on “Sample corrupt databases to play with

  1. Whatever happened to the feature or claim for databases that support ACID that even if the power is turned off everything should still be ok?

  2. This has nothing to do with crash recovery – this is to do with the I/O subsystem corrupting data ‘at rest’. Crash recovery works fine (except in the case of running on drives with write-caches and insufficient battery-backup). ACID properties are absolutely maintained.

  3. I quote a part of TPC full disclosure reports

    "4.4.2 System Crash
    Guarantee the database and committed updates are preserved across an instantaneous interruption(system
    crash/system hang) in processing which requires the system to reboot to recover.
    1. Six streams of ACID transactions we restarted. Each stream executed a minimum of 100 transactions.
    2. While the streams of ACID transactions were running, the system was powered off.
    3. When power was restored, the system booted and the database was restarted.
    4. The database went through a recovery period.
    5. The success file and the HISTORY table counts were compared and were found to match."

    So you’re basically saying that hard drives will crash/cause physical corruption when power is suddenly interrupted? And that this portion of TPC is a hit and miss affair, and that those systems that passed this test were just lucky?

  4. You are completely misunderstanding the point, and making out that I don’t know what I’m talking about. No – a single power interruption will not cause problems, as long as the drive has enough battery-backup to flush out it’s cache – this used to be a big problem for drives on which transaction logs are placed. Flicking the power switch on and off on the drive many times quickly will break the drive.

    Seriously, what’s your problem?

  5. As paul points out. The issue is that hard drives are physical entities having them stop and start can cause them to physically break/damage data. In 13 years of working with SQL I think all my corruption issues are due to power downs and power ups, either planned or not, especially old drives.

    Components fail thats just a fact of life. turning things on and off causes surges of power that is often the cause of a failure. A huge amount is done to avoid that, but nothing is perfect.

  6. Appreciate this post is very late in the day but I do feel I have something to add to this thread. All the corruption I’ve dealt with recently (quite a bit unfortunately) relates to cross site failover tests performed on servers with SAN replicated drives. The impression the SAN providers give Windows Admins is the process is bullet proof, experience says otherwise.
    There are dirty pages in memory that are lost during the failover I think. All the corruption seems to be page link type errors (page xxx refers to page xxx which does not exist). It might work if SQL Server were shutdown cleanly before the failover (or stop transaction activity and checkpoint!?) but then management say (correctly in my opinion) that that isn’t a true failover test.

  7. Dirty pages would be rolled forward on crash recovery in a cross site, SAN replicated failover wouldn’t they? Our problem must be that the log writes, seen as committed before the failover, have not actually been SAN replicated to the DR site?
    I really enjoy reading your blog, it’s a goldmine.
    Thanks

  8. I asked the SAN admin whether write order is being maintained. Apparently it kills the performance during asynchronous SAN replication so they had it switched off.

    Thanks for your pointer and help.
    Regards

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Some thoughts on courage

(This is also the Ponderings – editorial – in today’s SQLskills newsletter.) I want to start out this post by sincerely thanking everyone who emailed

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.