Search Engine Q&A #26: Myths around causing corruption

Every so often I’ll see posts on the various data corruption forums discussing causes of corruption. In this post I want to debunk some of the myths around what can cause corruption. There are really two types of corruption to deal with, physical corruption and logical corruption.

Physical corruption

This is where something has altered the contents of a data or log file sector with no regard for what is being stored there. Possible causes of physical corruption are:

  • Problem with the I/O subsystem (99.8% of all cases I’ve ever seen – only three significant digits of precision, as I’d estimate I’ve seen around a thousand corruption cases). Remember the I/O subsystem is everything underneath SQL Server in the I/O stack – including the OS, 3rd-party file system filter drivers, device drivers, RAID controllers, SAN controllers, network hardware, the drives themselves, and so on. Millions of lines of code and lots of moving parts spinning very fast, very close to very fragile pieces of metal oxide (I once heard Jim Gray liken a disk drive head to a 747 jumbo jet flying at 500 mph at a height of 1/4 inch from the ground…)
  • Problem with the host machine hardware (0.1% of cases). Most of the time this is a memory error.
  • SQL Server bugs (0.1% of cases). Yes, there have been corruption bugs – every piece of software has bugs – and there are KB articles describing them.
  • Deliberate introduction of corruption using a hex editor or other means.

Physical corruption is what DBCC CHECKDB usually reports, and the majority of cases are caused by a physical failure of some kind, with the minority caused by humans – software bugs.

Logical corruption

This is where something has altered some data so that a data relationship is broken. Possible causes of logical corruption are:

  • Humans

:-) Okay…

  • Application bug. The application deletes one part of an inherent data relationship but not the other. Or the application designer doesn’t implement a constraint properly. Or the application designer doesn’t handle a transaction rollback properly. You get the idea.
  • Accidental update/delete. Someone deletes or updates some data incorrectly.
  • SQL Server bug. See above.
  • DBCC CHECKDB when using the REPAIR_ALLOW_DATA_LOSS option. As is documented in Books Online, and I’ve blogged about and mentioned when lecturing, if you run repair, it doesn’t take into account any inherent or explicit constraints on the data.
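To make that last point concrete, here’s a minimal sketch of running repair and then re-validating constraints afterwards. The database and table names are hypothetical; note that REPAIR_ALLOW_DATA_LOSS requires the database to be in single-user mode:

```sql
-- Hypothetical database name. Repair may delete data to fix physical
-- corruption, without regard for any constraints on that data.
ALTER DATABASE MyDatabase SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB (N'MyDatabase', REPAIR_ALLOW_DATA_LOSS);
ALTER DATABASE MyDatabase SET MULTI_USER;

-- After repair, re-validate constraints yourself to find any
-- logical corruption the repair introduced (hypothetical table name):
DBCC CHECKCONSTRAINTS (N'MyTable') WITH ALL_CONSTRAINTS;
```

The follow-up DBCC CHECKCONSTRAINTS step is the important part – CHECKDB repair makes the database physically consistent, but only you can verify the data still makes business sense.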

The point here is that a physical failure of a component causes physical corruption, not logical corruption. Conversely, application errors cause logical corruption, not physical corruption. DBCC CHECKDB errors are about physical corruption (okay, with the inclusion of the DBCC CHECKCATALOG code in SQL Server 2005, it will also find cases where a DBA has manually altered the system tables, causing logical corruption), and applications cannot cause physical corruption because they can only manipulate data through SQL Server. If an application hits a SQL Server bug that causes physical corruption, that’s still not the application causing the corruption, it’s SQL Server.

So – on to the myths.

  • Can an application cause physical corruption? No.
  • Can stopping a shrink operation cause corruption of any kind? No.
  • Can stopping an index rebuild cause corruption of any kind? No.
  • Can running DBCC CHECKDB without repair cause corruption of any kind? No.
  • Can creating a database snapshot cause corruption of any kind? No.
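On the CHECKDB myth in particular: a plain consistency check never writes to your data files, so it’s safe to run (and to interrupt) at any time. A minimal sketch, with a hypothetical database name:

```sql
-- Read-only consistency check; no repair option means nothing is
-- modified, so interrupting it cannot cause corruption of any kind.
DBCC CHECKDB (N'MyDatabase') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```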

Hope this helps.

8 thoughts on “Search Engine Q&A #26: Myths around causing corruption”

  1. Hey Paul, I also heard from another DBA that having a max file size on your t-logs can cause corruption when the max size is exceeded. I disagree, but I am unable to find any documentation either way.

    I believe an exception is created, and the transaction is backed out, but can you verify either way?

    1. Nope – it won’t cause corruption. If the log file runs out of space, all active transactions are rolled back (and there’s enough space reserved to be able to do that).
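    To illustrate the scenario being asked about, here’s a sketch (hypothetical file and database names) of capping a transaction log. When the log fills and can’t grow, the active transaction fails with error 9002 and is rolled back – the database is left consistent, not corrupt:

    ```sql
    -- Hypothetical names: cap the log file at 512 MB. A transaction
    -- that needs more log space than this will fail with error 9002
    -- and be rolled back; no corruption results.
    ALTER DATABASE MyDatabase
        MODIFY FILE (NAME = N'MyDatabase_log', MAXSIZE = 512MB);
    ```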

  2. Hi Paul,
    I didn’t find much about the issue of database corruption when using DBCC SHRINKFILE with the EMPTYFILE option. I moved all data from 4 files into only one, and back to 4 files, because I want to use trace flags 1117 and 1118 to ensure the best database performance. But during this move, data in random indexes was corrupted. Once it was also data in the change_track table. Do you know the reason for this issue? It is SQL Server 2014 SP4 used for MS SCCM.
    Thank you for your opinion.
    BR Page

      1. Thank you. I’ll test it in single-user mode, and if that helps, I won’t bother anyone. Otherwise, if it doesn’t help, I’ll open a case with Microsoft.
