Every so often I'll see posts on the various data corruption forums discussing causes of corruption. In this post I want to debunk some of the myths around what can cause corruption. There are really two types of corruption to deal with, physical corruption and logical corruption.

Physical corruption

This is where something has altered the contents of a data or log file sector with no regard for what is being stored there. Possible causes of physical corruption are:

  • Problem with the I/O subsystem (99.8% of all cases I've ever seen - only 3 nines as I'd estimate I've seen around about a thousand corruption cases). Remember the I/O subsystem is everything underneath SQL Server in the I/O stack - including the OS, 3rd-party file system filter drivers, device drivers, RAID controllers, SAN controllers, network hardware, drives themselves, and so on. Millions of lines of code and lots of moving parts spinning very fast, very close to very fragile pieces of metal oxide (I once heard Jim Gray liken a disk drive head to a 747 jumbo jet flying at 500 mph at a height of 1/4 inch from the ground...)
  • Problem with the host machine hardware (0.1% of cases). Most of the time this is a memory error.
  • SQL Server bugs (0.1% of cases). Yes, there have been corruption bugs. Every piece of software has bugs. There are KB articles describing bugs.
  • Deliberate introduction of corruption using a hex editor or other means.

Physical corruption is what DBCC CHECKDB usually reports and the majority of cases are caused by a physical failures of some kind, with the minority caused by humans - software bugs.

Logical corruption

This is where something has altered some data so that a data relationship is broken. Possible causes of logical corruption are:

  • Humans

:-) Okay...

  • Application bug. The application deletes one part of an inherent data relationship but not the other. Or the application designer doesn't implement a constraint properly. Or the application designer doesn't cope with a transaction roll-back properly. You get the idea.
  • Accidental update/delete. Someone deletes or updates some data incorrectly.
  • SQL Server bug. See above.
  • DBCC CHECKDB when using the REPAIR_ALLOW_DATA_LOSS option. As is documented in Books Online, and I've blogged about and mentioned when lecturing, if you run repair, it doesn't take into account any inherent or explicit constraints on the data.

The point here is that a physical failure of a component does not cause logical corruption, it causes physical corruption. Conversely, application errors cause logical corruption, not physical corruption. DBCC CHECKDB errors are about physical corruption (okay, with the inclusion of DBCC CHECKCATALOG code in 2005, it will find cases where the DBA has manually altered the system tables, causing logical corruption) and applications cannot cause physical corruption as they can only manipulate data through SQL Server. If an application hits a SQL Server bug which causes physical corruption, that's still not the application causing physical corruption, it's SQL Server.

So - on to the myths.

  • Can an application cause physical corruption? No.
  • Can stopping a shrink operation cause corruption of any kind? No.
  • Can stopping an index rebuild cause corruption of any kind? No.
  • Can running DBCC CHECKDB without repair cause corruption of any kind? No.
  • Can creating a database snapshot cause corruption of any kind? No.

Hope this helps.

I woke up this morning and someone had replaced my wife with someone who likes to blog :-). Kimberly's turned over a new leaf and is going to blog much more often - in fact she's blogged 4 times today already. Check out her blog here.

Continuing on the transaction log theme of the last few Search Engine Q&A posts, this one addresses a question I've heard a few times, most recently on an MVP discussion group. Let me paraphrase:

If I have a transaction that inserts a huge amount of data, the transaction log grows to 50-GB. I then rollback the transaction. When I take the next log backup, it's way smaller than 50-GB. What's going on?

Let's see if we can repro the scenario. I've created a database with a 500-MB data file and a 1-MB log file, with 100-MB and 1-MB auto-growth intervals. I want the log to be as small as possible and to grow in small chunks so I can see just how much it *has* to grow by, rather than having a large growth size. Then I set the recovery mode to full and took a database backup to make sure the log won't truncate until it's backed up.

CREATE DATABASE LogSizeTest ON
   
(NAME = N'LogSizeTest',
   
FILENAME = N'C:\SQLskills\LogSizeTest.mdf',
   
SIZE = 512MB,
   
FILEGROWTH = 100MB)
LOG ON 
   
(NAME = N'LogSizeTest_log',
   
FILENAME = N'C:\SQLskills\LogSizeTest_log.ldf',
   SIZE = 1MB,
   
FILEGROWTH = 1MB);
GO

ALTER DATABASE LogSizeTest SET RECOVERY FULL;
GO

BACKUP DATABASE LogSizeTest TO DISK = 'C:\SQLskills\LogSizeTest.bak';
GO

Let's check the size of the log:

DBCC SQLPERF (LOGSPACE);
GO

Database Name  Log Size (MB) Log Space Used (%) Status
-------------- ------------- ------------------ -----------
LogSizeTest    0.9921875     36.66339           0

This gives back info for all databases, I've trimmed down the output just for the LogSizeTest database.

Now I'm going to create a table, start an explicit transaction and add about 500-MB of info to the table.

USE LogSizeTest;
GO
SET NOCOUNT ON;
GO
CREATE
TABLE Test (c1 INT IDENTITY, C2 CHAR (8000) DEFAULT (REPLICATE ('a', 8000)));
GO

BEGIN TRAN;
GO

DECLARE @count INT;
SELECT @count = 0;
WHILE (@count < 64000)
BEGIN
   
INSERT INTO Test DEFAULT VALUES;
   
SELECT @count = @count + 1;
END;
GO

Checking the log file size again gives:

DBCC SQLPERF (LOGSPACE);
GO

Database Name  Log Size (MB) Log Space Used (%) Status
-------------- ------------- ------------------ -----------
LogSizeTest    703.9922      99.98737           0

The log size has grown to about 700-MB, way more than the size of the data I was inserting, and it's completely full. Now let's rollback the transaction and check the log size again.

ROLLBACK TRAN;
GO

DBCC SQLPERF (LOGSPACE);
GO

Database Name  Log Size (MB) Log Space Used (%) Status
-------------- ------------- ------------------ -----------
LogSizeTest    703.9922      85.21268           0

The size of the log file is the same, but the percentage used has actually gone down! How can that happen? Let's take a backup and checkout it's size:

BACKUP LOG LogSizeTest TO DISK = 'C:\SQLskills\LogSizeTest_log.bck';
GO
RESTORE HEADERONLY FROM DISK = 'C:\SQLskills\LogSizeTest_log.bck';
GO

The BackupSize in the output from the RESTORE HEADERONLY is 631454208, which is 602.2-MB. Taking the numbers from the DBCC SQLPERF output above, 85.21268% of 703.9922-MB is 599.89-MB - so the backup is roughly the same size as the used transaction log. That's what I'd expect, but why is it smaller than the total size of the transaction log?

So what's going on? Why did the transaction log need to grow so much larger than it needed to, and why did the percentage used actually *drop* after the transaction rolled back?

The answer is in the way the transaction log works. Whenever a logged operation occurs in a transaction, there is some transaction log space reserved in case the transaction rolls back. The idea is that there's always enough space available in the transaction log for a transaction to roll back, without having to grow the transaction log and potentially have that fail. If a transaction could not roll back successfully because the log didn't have enough space, the database would become transactionally inconsistent, would be taken offline and the state changed to SUSPECT.

The behavior we saw was the Storage Engine reserving transaction log space for a potential roll back. When the roll back occured, the transaction log records necessary to undo the effects of the transaction (called compensation log records) are created and written to the log. The issue is that they usually don't take up as much space as the Storage Engine reserved, as it tends to be very conservative in its estimates of how much log space to reserve, to avoid the potential for SUSPECT databases. This explains the difference between the various sizes and percentages we saw above.

The Storage Engine code to do the reservations is quite interesting - I remember fixing a couple of bugs in it during SQL Server 2000 development in 1999 while I was getting to know the internals of the logging and recovery system before tackling some of the (since removed) log-reading code in DBCC CHECKDB in SQL Server 2000.

Anyway, there you have it. Log space reservation is the answer, and is also one of the reasons why it can be tricky to estimate how large a transaction log should be when a database is created.

This is a quickie in response to a blog comment from my previous post on instant initialization - How to tell if you have instant initialization enabled?. The comment was:

I must say, I love instant initialization. It's awesome. But I always wondered why it's not available for the log file. I assume there's a technical reason... but what is it? Does it depend on having the rest of the file be zeroed out? Doesn't it already know where it's start and stop points are anyways, since the log is circular?

I couldn't remember the exact answer so I discussed with Peter Byrne on the Storage Engine dev team and now I have the answer to share. There is a lot of metadata kicking around in the Storage Engine about the transaction log (mostly in the boot page - see my post Search Engine Q&A #20: Boot pages, and boot page corruption), including where to start reading the log during crash recovery. However, there's nothing that can be used after a crash occurs to determine where the active transaction log ends (i.e. where should crash recovery stop processing log records).

The way this is done is to have each log sector have parity bits stamped on it. When the log is first created, it is zero-initialized (with zero being an illegal log sector parity value). As the log is written, each sector has parity bits in it. When the end of the log is reached, and it wraps around to the start of the log file, the parity bits are flipped, so that overwritten log sectors have the opposite parity from when they were last written. When a crash occurs, log sectors are read and processed until a log sector with an out-of-sequence parity is found.

This entire process will not work if there's already random data in the space used by the log file - some of the random data could just look like a valid set of parity bits and cause the recovery system to try to process a log sector full of garbage, leading to a suspect database, at best.

So - it's not just a "there wasn't time" - there really is a good, architectural reason why instant initialization cannot be done with the transaction log.

There's been a spate of problems on the forums and in my inbox with people's transaction logs filling up. With the increase in "involuntary DBAs", there are more people being thrust into the role of database administration without being given the chance to learn the skills, tips, and tricks necessary. One of the most common problems I see is the tranaction log filling up and the database grinding to a halt, and the most common cause of this is the lack of log backups when in the FULL recovery mode.

The very first post in my Search Engine Q&A series last September dealt with this problem - and showed two examples of filling up a transaction log plus the command to see why a transaction log is full (examining the log_reuse_wait_desc column in master.sys.databases). See here for more details. One thing I don't go into too much detail about in that post is the way that transactional replication or database mirroring can cause the log to grow. Both of these technologies rely on reading the transaction log for an unbroken stream of changes to be sent to the secondary systems (but with slightly different mechanisms) - but both will cause the log to grow if there is transaction log that hasn't yet been read (usually because of a performance issue). The same thing applies to the new Change Data Capture technology in SQL Server 2008 - it uses the transactional replication log reader to harvest changes from the log. The addition of any of these technologies could cause your log to grow excessively.

So it's grown too big - now what? Let's deal with the case where you've taken a full database backup in the FULL recovery mode but haven't taken any log backups. There are a number of options:

  1. Take a log backup! This might be easier said than done if you're like the customer I once had who had a 10GB database with an (I kid you not) 987GB log file.
  2. Switch to the SIMPLE recovery model (either permanantly or temporarily). If you're not interested in point-in-time recovery, there's no need for you to be in the FULL recovery mode (unless you're using a feature that requires it, like database mirroring, in which case you have no choice but to start taking log backups)
  3. In SQL Server 2005 and before, you can manually throw away the log contents - see my blog post here. This is not advisable.
  4. Grow the log file. This really just addresses the symptom but not the cause - the log will fill up again eventually.

Note that options #1-3 will just remove the restriction that the log contents are still needed (called "clearing" or "truncating" the log). None of these will cause the log file to shrink. The ONLY way to change the size of the log file is to use DBCC SHRINKFILE. After doing one of #1-3, you must manually shrink the log file using DBCC SHRINKFILE. If you do #2, and then go back into the FULL recovery mode, you should immediately take a full database backup. After shrinking the log file, you should manually grow it again using ALTER DATABASE so that it doesn't suffer repeated auto-grows and lead to performance issues with the log (see Kimberly's post here).

This topic is really a two-hour lecture in the database maintenance class I teach, but at least here (and with the blog post links) I've given you a bit of insight into why it can happen and how to solve the problem.

I've had a few follow-ups on my two posts about boot page and file header page corruption - asking if its possible to do single-page restore operations for these pages. Let's try:

CREATE DATABASE BootPageTest;
GO

-- Single page restore is only possible using the FULL recovery model
ALTER DATABASE BootPageTest SET RECOVERY FULL;
GO

BACKUP DATABASE BootPageTest TO DISK = 'C:\sqlskills\BootPageTest.bck';
GO
BACKUP LOG BootPageTest TO DISK = 'C:\sqlskills\BootPageTest.trn';
GO

RESTORE DATABASE BootPageTest PAGE = '1:9' FROM DISK = 'C:\sqlskills\BootPageTest.bck';
GO

Msg 3111, Level 16, State 1, Line 2
Page (1:9) is a control page which cannot be restored in isolation. To repair this page, the entire file must be restored.
Msg 3013, Level 16, State 1, Line 2
RESTORE DATABASE is terminating abnormally.

The answer is no. The following page types cannot be restored using single-page restore:

  • File header pages (see here)
  • Boot page (see here)
  • GAM, SGAM, DIFF map, ML map pages (see here)

Following on from my previous post on boot pages and boot page corruption, I've been asked about file header pages - and I was already planning this post as the next in the series.

So what's a file header page? Every data file in a database has the very first 8kb page (i.e. page 0 in the file) set aside as the place to store all the metadata info about the file. As with the boot page, you can look at the contents with DBCC PAGE and it will interpret all the fields for you, or you can use the DBCC FILEHEADER command, which does a better job. This is undocumented and unsupported, just like DBCC DBINFO for looking at the database boot page, but it's been discussed and posted about on the Internet before so it's existence is no secret.

The command take a database name or database ID plus the file ID to dump. Here I've created a database called FileHeaderTest and used SSMS with results-to-text, plus a bunch of editing of the results to make it blog-able:

DBCC FILEHEADER ('FileHeaderTest', 1);
GO

FileId                : 1
LogicalName           : FileHeaderTest
BindingId             : D30AE3EF-14A6-47D5-B267-96F38238D882
FileGroup             : 1
Size                  : 152
MaxSize               : -1
MinSize               : 152
UserShrinkSize        : -1
Growth                : 128
BackupLSN             : 0
RedoStartLSN          : 0
FirstLSN              : 0
MaxLSN                : 0
FirstUpdateLSN        : 0
CreateLSN             : 0
SectorSize            : 512
RecoveryForkGUID      : 00000000-0000-0000-0000-000000000000
RecoveryForkLSN       : 0
DifferentialBaseLsn   : 19000000048800037
DifferentialBaseGuid  : 279A8EF4-4431-4CA5-8939-F613E5BC3033
Status                : 2
RestoreStatus         : 0
ReadOnlyLsn           : 0
ReadWriteLsn          : 0
MaxLsnBranchId        : 00000000-0000-0000-0000-000000000000
RedoTargetPointLsn    : 0
RedoTargetPointGuid   : 00000000-0000-0000-0000-000000000000
RestoreDiffBaseLsn    : 0
RestoreDiffBaseGuid   : 00000000-0000-0000-0000-000000000000
RestorePathOriginLsn  : 0
RestorePathOriginGuid : 00000000-0000-0000-0000-000000000000
OldestRestoredLsn     : 0

Lots of interesting stuff in here, such as:

  • BindingId: used to make sure a file is really part of this database
  • SectorSize: the disk sector size
  • Status: what kind of file and what state is it in (e.g. 2 = regular disk file)
  • Various sizes in number-of-8kb-pages (e.g. MaxSize of -1 means file growth is unlimited)
  • Growth: the number of pages to grow the file by if the 0x100000 bit is NOT set in the Status field. If it is set, the Growth is in percent.

And you can watch things change. For instance, if I change the file growth to 10%:

ALTER DATABASE FileHeaderTest MODIFY FILE (NAME = FileHeaderTest, FILEGROWTH = 10%);
GO

And then dump the file header page contents again, the Status and Growth fields have changed to:

.
.
Growth                : 10
.
.
Status                : 1048578
.
.

So what if a file header page is corrupt? I corrupted the file header page of my database and then started up SQL Server.

USE FileHeaderTest;
GO

Msg 945, Level 14, State 2, Line 1
Database 'FileHeaderTest' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.

Let's try EMERGENCY mode:

ALTER DATABASE FileHeaderTest SET EMERGENCY;
GO

Msg 5172, Level 16, State 15, Line 1
The header for file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\FileHeaderTest.mdf' is not a valid database file header. The PageAudit property is incorrect.

In this case, there's nothing to do except restore from backups, and the database is inaccessible because the PRIMARY filegroup could not be brought online. If the corruption were in a file in a secondary filegroup, things are a little bit different. Now I've added a secondary filegroup with a single file (called CorruptFile) and corrupted it's file header page. After starting up SQL Server we get the same behavior - but this time we can set the file to be offline and access the rest of the database. This is called partial database availability and works in Enterprise (and Developer) Edition only.

ALTER DATABASE FileHeaderTest MODIFY FILE (NAME = CorruptFile, OFFLINE);
GO

Note that the only way to bring an offline file back online is to restore it from a backup - see this post from my old Storage Engine blog for more details. Another corruption that can only be repaired using backups...

 

Now that I've done all the business-related blog posts, back to the good stuff to stop people complaining!

Something that's cropped up a few times over the summer so far is people trying to repair boot page corruptions.

First off, what's a boot page? Every database has a single page that stores critical information about the database itself. It's always page 9 in file 1 (the first file in the PRIMARY filegroup). You can examine the page using DBCC PAGE and it will interpret all the fields for you, but there's another command, DBCC DBINFO, that also dumps all this info (in fact the DBCC PAGE code calls the same underlying dumping code). This command is undocumented and unsupported but widely known and 'documented' in lots of places on the web - given that it uses the same code as DBCC PAGE, it's just as safe to use IMHO.

So what's on the boot page?

DBCC DBINFO ('BootPageTest');
GO

DBINFO STRUCTURE:


DBINFO @0x5BF6EF84

dbi_dbid = 19                        dbi_status = 65536                   dbi_nextid = 2073058421
dbi_dbname = BootPageTest            dbi_maxDbTimestamp = 2000            dbi_version = 611
dbi_createVersion = 611              dbi_ESVersion = 0                   
dbi_nextseqnum = 1900-01-01 00:00:00.000                                  dbi_crdate = 2008-07-10 15:53:18.843
dbi_filegeneration = 0              
dbi_checkptLSN

m_fSeqNo = 41                        m_blockOffset = 29                   m_slotId = 55
dbi_RebuildLogs = 0                  dbi_dbccFlags = 2                   
dbi_dbccLastKnownGood = 1900-01-01 00:00:00.000                          
dbi_dbbackupLSN

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0

dbi_oldestBackupXactLSN

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0
dbi_LastLogBackupTime = 1900-01-01 00:00:00.000                          
dbi_differentialBaseLSN

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0

dbi_createIndexLSN

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0

dbi_versionChangeLSN

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0
dbi_familyGUID = a4e88c13-b4cf-4320-834e-92b237244d4b                    
dbi_recoveryForkNameStack


entry 0

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0
m_guid = a4e88c13-b4cf-4320-834e-92b237244d4b                            

entry 1

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0
m_guid = 00000000-0000-0000-0000-000000000000                            
dbi_differentialBaseGuid = 00000000-0000-0000-0000-000000000000           dbi_firstSysIndexes = 0001:00000014
dbi_collation = 872468488            dbi_category = 0                     dbi_maxLogSpaceUsed = 231936
dbi_localState = 0                   dbi_roleSequence = 0                
dbi_failoverLsn

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0

dbi_dbmRedoLsn

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0

dbi_dbmOldestXactLsn

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0
dbi_dbMirrorId = 00000000-0000-0000-0000-000000000000                    
dbi_pageUndoLsn

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0
dbi_disabledSequence = 0            
dbi_dvSplitPoint

m_fSeqNo = 0                         m_blockOffset = 0                    m_slotId = 0
dbi_CloneCpuCount = 0                dbi_CloneMemorySize = 0             
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

There's all kinds on interesting things in there, for instance:

  • dbi_version and dbi_createversion: the physical version number of the database (and when it was created). See question 1 in the August 2008 SQL Q&A column in TechNet Magazine for an explanation (see here).
  • dbi_RebuildLogs: a count of the number of times the transaction log has been rebuilt for the database. PSS can use this to tell whether corruption problems could have been caused by DBAs rebuilding the log
  • dbi_dbccLastKnownGood: the completion time of the last 'clean' run of DBCC CHECKDB
  • a bunch of different LSNs related to checkpoint, backups, database mirroring
  • dbi_LastLogBackupTime: self-explanatory
  • dbi_differentialBaseGuid: the GUID generated by the last full database backup. Differential backups can only be restored on top of a matching full backup - so an out-of-band full backup could screw-up your disaster recovery - see this blog post for more info.

Now, what about if this page is corrupt in some way? I corrupted the BootPageTest database to have a corrupt boot page. Let's see what happens:

USE BootPagetest;
GO

Msg 913, Level 16, State 4, Line 1
Could not find database ID 19. Database may not be activated yet or may be in transition. Reissue the query once the database is available. If you do not think this error is due to a database that is transitioning its state and this error continues to occur, contact your primary support provider. Please have available for review the Microsoft SQL Server error log and any additional information relevant to the circumstances when the error occurred.

Okay - let's try setting the database into EMERGENCY mode:

ALTER DATABASE BootPageTest SET EMERGENCY;
GO

Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xcdee22fa; actual: 0xcb6ea2fa). It occurred during a read of page (1:9) in database ID 19 at offset 0x00000000012000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\BootPageTest.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

Hmm. What about running DBCC CHECKDB?

DBCC CHECKDB ('BootPageTest') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO

Msg 922, Level 14, State 1, Line 1
Database 'BootPageTest' is being recovered. Waiting until recovery is finished.

It's not looking good. Obviously the change to EMERGENCY mode couldn't complete properly. What's the database status?

SELECT [state_desc] FROM sys.databases WHERE [name] = 'BootPageTest';
GO

state_desc
------------------------------------------------------------
RECOVERY_PENDING

The boot page is inaccessible so in effect the database is inaccessible too, this is what the database state means in this case.

So what does this mean? If the boot page is corrupt, you can't run DBCC CHECKDB so you can't possibly run repair, and you can't put the database into EMERGENCY mode so you can't extract data into a new database. It means that there's NO WAY to recover from a corrupt boot page EXCEPT to restore from backups. One more reason to have backups...

Over the last few weeks I've seen (and helped correct) quite a few myths and misconceptions about index rebuild operations. There's enough now to make it worthwhile doing a blog post (and it's too hot here in Orlando for us to go sit by the pool so we're both sitting here blogging)...

Myth 1:  index rebuild pre-allocates the necessary space

This myth has two variations:

  1. The space for the new copy of the index is pre-allocated
  2. The space for the sort portion of the rebuild is pre-allocated

Neither of these are true. Index rebuild (whether online or offline, and at least as far back as 7.0) will create a new copy of the index before dropping the old copy. The pages and extents required to do this will always be allocated as needed, as with any other operation in SQL Server. The sort phase of an index rebuild, if required (in certain cases it is skipped in 2005), will adhere to the same allocation behavior.

Myth 2: indexes are rebuilt within a single file in a multi-file filegroup

This is a new one that I just heard yesterday - (paraphrasing) "In a two-file filegroup, an index in file 1 will be rebuilt into file 2. The next time it is rebuilt, it will be built in file 1. And so on".

This is untrue. Any time any allocations are done in a multi-file filegroup, the allocations are spread amongst all the files using the allocation system's proportional fill algorithm. In a nutshell, this says that space will be allocated more frequently from larger files with more free space than from smaller files with less free space. There is no concept in SQL Server of limiting allocations to a particular file in a multi-file filegroup.

Myth 3: non-clustered indexes are always rebuilt when a clustered index is rebuilt

This is untrue. The rules are a little complex here but can be summed up as follows:

  • In 2005+, rebuilding a unique or non-unique clustered index (without changing its definition) will NOT rebuild the non-clustered indexes
  • In 2000:
    • Rebuilding a non-unique clustered index WILL rebuild the non-clustered indexes
    • Rebuilding a unique clustered index will NOT rebuild the non-clustered indexes

The first few service packs of 2000 had bugs that changed the behavior of rebuilding unique clustered indexes back and forth - this is the source of much of the confusion around this myth.

For a much more detailed discussion of this, see my blog post from last Fall - Indexes From Every Angle: What happens to non-clustered indexes when the table structure is changed?.

Myth 4: BULK_LOGGED recovery mode decreases the size of the transaction log and log backups for an index rebuild

This myth is partly true.

Switching to the BULK_LOGGED recovery mode while doing an index rebuild operation WILL reduce the amount of transaction log generated, which is very useful for limiting the size of the transaction log file (note I say 'file', not 'files' - you only need one log file).

Switching to the BULK_LOGGED recovery mode while doing an index rebuild will NOT reduce the size of the transaction log BACKUP. Although the operation will be minimally-logged, the next transaction log backup will read all the transaction log since the last backup plus all the extents that were changed by the minimally-logged index rebuild. This will result in a log backup that's almost exactly the same size as for a fully-logged index rebuild. The ONLY time a log backup will contain data extents is when a minimally-logged operation has taken place since the last log backup - see here on MSDN for more info.

If you're considering using the BULK_LOGGED recovery mode, beware that you lose the ability to do point-in-time recovery to ANY point covered by a transaction log backup that contains even a single minimally-logged operation. Make sure that there's nothing else happening in the database that you may need to effectively roll-back with P.I.T. recovery. The operations you should perform if you're going to do this are:

  • In FULL recovery mode, take log backup immediately before switching to BULK_LOGGED
  • Switch to BULK_LOGGED and do the index rebuild
  • Switch back to FULL and immediately take a log backup

This limits the time period in which you can't do P.I.T. recovery.

Myth 5: online index rebuild doesn't take any locks

This myth is untrue. The 'online' in 'online index operations' is a bit of a misnomer.  Online index operations need to take two very short-term table locks. An S (Shared) table lock at the start of the operation to force all write plans that could touch the index to recompile, and a SCH-M (Schema-Modification - think of it as an Exclusive) table lock at the end of operation to force all read and write plans that could touch the index to recompile.

The most recent time this came up on the forums was someone noticing insert queries timing out after an online index rebuild operation had just started. The problem is that the  table lock that online index rebuild needs has to be entered into the grant queue in the lock manager until it can be acquired - and it will stay there until existing transactions that are holding conflicting locks either commit or roll-back. Any transaction that requires a conflicting lock AFTER the index rebuild lock has been queued but not acquired (and then released) will wait behind it in the lock grant queue. If the query timeout is reached before the transaction can get it's lock, it will timeout.

This is still much better than the table lock being held for the entire duration of the index rebuild operation. For more info, checkout this whitepaper on Online Index Operations in SQL Server 2005.

Here's a quickie just before we head off to SQL Connections in Orlando (see here for all out pre/post cons and sessions).

On one of the internal MS forums was the question - how can I tell through T-SQL the last time SQL Server restarted (i.e. the current 'uptime')? The answer relies on the fact that all the background tasks that start when SQL Server starts must record a 'login time'.

You can get this from:

SELECT [login_time] FROM sysprocesses WHERE spid = 1;
GO

Or more simply:

SELECT MIN ([login_time]) FROM sysprocesses;
GO

Pretty neat trick!

As with the last few conferences, I'll try to blog every day during SQL Connections under the Conference Questions Pot-Pourri category.

Hope to see a bunch of you there!

PS Some people have suggested that checking the creation date of tempdb will also do the trick. That's not a *guaranteed* method as PSS could have used T3609 to recover tempdb instead of recreating it (if they're troubleshooting some tempdb issue). In that case the creation date of tempdb will *not* be the time the server started. Checking the time in sysprocesses is the only infallible method.

Here's an interesting question that came in to our questions line (questions@SQLskills.com - no guarantee of an answer - I check it every so often):

I have seen demonstrations where a large database being broken down into smaller ones using synonym names. I think it was used on a data warehouse and allowed smaller database backups instead of doing one large one and simulating file group backups. Is there somebody who has worked with this variation and can identify when this would be an advantage over file group backups if there is even an advantage.

My answer will always be to keep the VLDB (Very Large DataBase) as a single unit and go with filegroups if you need to. Breaking the VLDB into smaller databases has some serious issues:

  • Queries become more complicated as they're now potentially cross-database. This means you need to keep all the security settings in all the databases synchronized.
  • Referential integrity becomes a big problem as you can't create foreign key constraints across databases
  • You have multiple transaction logs to manage instead of one. This means you need to be doing log backups of ALL the databases, vastly increasing the number of backup files to manage.
  • Point-in-time recovery becomes very hard as you have to restore ALL the databases to a single point-in-time. Now, this may not be too much of a problem if the data in the VLDB is essentially read-only, and gets updated en-masse every so often from your OLTP system - but for changing data it's a nightmare.
  • Implementing a high-availability solution becomes very challenging. As soon as you start to think of multiple databases that need to be in sync, you can pretty much forget about log shipping and database mirroring. You're going to need whole-instance failure protection - which means failover clustering. Then if you want to mitigate the single-point-of-failure in a failover cluster (the shared disks), you're going to need SAN replication to a remote failover cluster too - expensive!!!

These are just the ones that spring to mind in 5 minutes - I'm sure there are more if I sat and thought about it longer (e.g. how to create a database snapshot, run a consistency check, ...)

So - IMHO it's always going to be easier to backup and restore a single VLDB split into filegroups than a VLDB split into multiple databases.

PS If there's something you'd like to see me do a blog post on, shoot me an email here.

Here's a question that came in - what changed in SQL Server 2005 that allows concurrent log and full backups?

First a little background, in case you didn't know about the change in behavior. In SQL Server 2000, a concurrent log backup with either a full or diff backup (I'll just say 'full' from now on but take it to mean 'full or diff') was not permitted. The reason is that a log backup would clear the inactive portion of the log once it's been backed up, but a full backup may still need some of that log so it can't be cleared (see this post and this post for an explanation). The simple route was taken of disallowing concurrent log backups with fulls.

In SQL Server 2005, the restriction was lifted, but there's a subtle twist. You can do concurrent log backups with fulls BUT the log is not cleared when the log backup ends. The clearing of the inactive portion of the log is delayed until the full backup completes. This could cause you to have disk space problems if your log generation rate is huge and you're relying on very frequent backups to manage the log size.

So - what changed that allowed the SS2000 restriction to be lifted? Nothing - just the code was changed to delay the log clearing and allow the concurrent backups.

Pretty cool change - but watch out for the twist.

This post is based on one from my old MSDN blog but the topic has come up a few times in recent days so I want to revamp it and re-post.

There are two things that confuse people about mirrored backups - can you mix-n-match backup devices from the mirrors, and what exactly do the various sizes mean?

1) Single-device backup, no mirror

The code below creates a single-device backup with no mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
GO

The BackupSize in the HEADERONLY output is 168,899,072 bytes and the on-disk size of the file mediaset1device1.bck is 161MB.

2) Single-device backup, mirrored

The code below creates a single-device backup with a mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck'
MIRROR TO DISK = N'C:\SQLskills\mediaset2device1.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset2device1.bck';
GO

The BackupSize in the HEADERONLY output of both files is 337,798,144 bytes. This is double the size of the backup in case #1 above - and it because there are now two copies of the backup. The on-disk size of both files is 161MB, which is what we'd expect as mediaset2device1.bck is a copy of mediaset1device1.bck.

3) Two-device backup, no mirror

The code below creates a two-device backup with no mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck',
DISK = N'C:\SQLskills\mediaset1device2.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
GO

The BackupSize in the HEADERONLY output is 169,959,424 bytes. This is nearly exactly the same as for the single-device backup in case #1, but includes a bit more to account for the extra metadata in the second device. This time, the on-disk size of the file mediaset1device1.bck is 81MB. This is half of the on-disk size from the single-device case #1 as the backup is now split between the two files.

4) Two-device backup, mirrored

The code below creates a single-device backup with a mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck',
DISK = N'C:\SQLskills\mediaset1device2.bck'
MIRROR TO DISK = N'C:\SQLskills\mediaset2device1.bck',
DISK = N'C:\SQLskills\mediaset2device2.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset2device1.bck';
GO

The BackupSize in the HEADERONLY output of both files is 339,918,848 bytes - again, double the size of the non-mirrored backup in case #3. The on-disk size of each file is 81MB, as each file is one half of a copy of the backup.

Restoring

Now let's try to mix devices from the two backup media sets and see if it's possible:

RESTORE DATABASE AdventureWorks
FROM DISK = N'C:\SQLskills\mediaset1device1.bck'
,
DISK = N
'C:\SQLskills\mediaset2device2.bck'
WITH REPLACE, STATS
;
GO

And it works fine - excellent! That's the whole point of having mirrored backups.

One other question is - can backup device types can differ between media sets in the same backup. The answer to this is no - as documented in Books Online. All the backup devices involved in a single backup, regardless of whether they're part of a mirror media set or not, must be of the same type and have similar characteristics.

Hope this is useful.