Tuesday, February 19, 2008

A few short notes this morning regarding the blogs and other stuff.

We had a big outage over the weekend, which rather embarrassingly manifested itself as 'out-of-disk-space' errors for anyone trying to get to any of our blogs. As you all know we preach about pro-active monitoring of data and log file space, so this didn't look good IMHO. All I can say is that it was the website and blogs log drive on the hosting company's server that filled up, not something we have control over. Needless to say, their process has been fixed so that it shouldn't happen again. Sorry about that (and thanks to all of you who dropped me mail to let me know).

Now Kimberly and I have recovered from six straight weeks of teaching, we'll be making progress on other projects. I've had a bunch of people ask where the annotated slide decks are (see this post). We've been a little busy with our teaching projects the last few months (see the previous post) but we're working on getting the first deck ready - it'll be Disaster Recovery: From Planning to Practice to Post-Mortem.

As far as products go, I've had some good feedback from some people who've bought the DDM we have available. If anyone's interested in writing a review (or has already posted one - good, bad, or ugly) please let me know. I'd also like suggestions for new features for V2 as well.

Thanks!

Tuesday, February 19, 2008 1:10:31 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, February 14, 2008

Phew - last week Kimberly and I spent 3 days teaching the ins-and-outs of SQL Server 2008 for DBAs/IT-Pros to about 130 Microsoft SQL Server experts and MVPs (like Kalen Delaney, Adam Machanic and Ron Talmage). This was the (95% complete) Beta delivery of a course we've been developing for the last six months for Microsoft that they'll use to train their SQL experts around the world on the new release. It's been very interesting watching the features develop through the CTPs (especially since I left the fold last August) - and making demos work on pre-release builds of the CTPs.

Teaching the course was a *blast* - the thing I love about teaching a really geeky crowd is the plethora of great questions and opportunities for going deep with explanations. Our team actually wrote and delivered the concurrently presented Developer and BI tracks as well. As you can see from the list below (and this is just the features a DBA needs to use/know about), SQL Server 2008 isn't a dot release of Yukon at all, as some people have suggested. Over the three days we covered:

  • Database Mirroring (D)
  • Backup Compression
  • Peer-to-Peer Replication (D)
  • Transparent Data Encryption (D)
  • Extensible (Off-Box) Key Management
  • All Actions Audited (D)
  • Policy-Based Management
  • Resource Governor (D)
  • Extended Events (D)
  • Spatial Indexes
  • Integrated Full-Text Search
  • Sparse Columns (D)
  • Filtered Indexes
  • Change Tracking
  • Change Data Capture (D)
  • FILESTREAM (D)
  • Performance Data Collection
  • Query Optimizer Enhancements
  • Data Compression (D)
  • Service Broker
  • Partition-Level Lock Escalation (D)

The features marked with a (D) are ones I demo'd during the course (Kimberly demo'd a bunch of the others - especially the tools features). Some of the demos were challenging to make work in time as we only got a pre-CTP6 build mid-January just before we headed off to China.

So why am I posting this? Well, a bunch of these features are in CTP-6, which should be just around the corner, and I have some easy-to-understand demos of them that I'll be posting here over the next month or so. Also, if this course sounds interesting, Kimberly and I will be teaching it in various configurations over the next year - starting with SQL Connections in April, a soon-to-be-announced class in Iceland in March, and the ITPro portion of TechEd in June.

Watch this space starting next week (today's the last day of six straight weeks of teaching for us so this weekend's a break :-))

Thursday, February 14, 2008 4:38:30 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

Last year I posted on my old blog about the active SQL Server team blogs. I just happened to post on February 14th and so in every class Kimberly teaches, she makes fun of how romantic I was to post that on Valentine's Day. So what better thing to post this year than an update to that old post!

Again, this is a list of 'active' blogs, not just one-post wonders, or blogs that are inactive but have a ton of fantastic info archived on them. I've grouped them by rough area and updated the list from last year, removing some that have been inactive for many months. I've also added a list of non-SQL team blogs that I follow too. Eventually I'll put this on our blogs page too - http://www.SQLskills.com/blogs.asp.

Enjoy (and Happy Valentine's Day again Kimberly! )

General SQL Server

Compact/Express

Data Programmability

Storage Engine

Service Broker

Relational Engine

Analysis Services / Data Mining

Reporting Services

Sync Services / Replication

SSIS / DTS

Manageability / Tools

SQLskills.com Team

Select MVPs I Read

Thursday, February 14, 2008 9:58:20 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, February 12, 2008

Almost a year ago to the day I asked a question on my old blog - how long does *your* CHECKDB take to run? Well, I had a bunch of replies but I never got around to posting the results. I was reminded of this recently so threw together some graphs of the results. I've plotted database size against run-time of DBCC CHECKDB, for a number of different numbers of CPUs, and mostly on SQL Server 2000. I've also added a linear trend line too.

Now - bear in mind this doesn't take into account the multitude of other factors that can affect how long DBCC CHECKDB takes to run (see my previous post here for the list).

Hopefully these are interesting to some of you!

Tuesday, February 12, 2008 12:35:20 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, February 07, 2008

After many reminders (thanks Adam Machanic!) I've added Conor and Simon to the two aggregated feeds over all the SQLskills.com blogs.

There are two feeds:

  1. SQL Server 2008 Category Posts
  2. All posts

The amount of content is really growing as Simon and Conor also seem to not sleep like me :-)

Enjoy!

Thursday, February 07, 2008 5:03:30 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, February 05, 2008

This post is based on one from my old MSDN blog but the topic has come up a few times in recent days so I want to revamp it and re-post.

There are two things that confuse people about mirrored backups - can you mix-n-match backup devices from the mirrors, and what exactly do the various sizes mean?

1) Single-device backup, no mirror

The code below creates a single-device backup with no mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
GO

The BackupSize in the HEADERONLY output is 168,899,072 bytes and the on-disk size of the file mediaset1device1.bck is 161MB.

2) Single-device backup, mirrored

The code below creates a single-device backup with a mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck'
MIRROR TO DISK = N'C:\SQLskills\mediaset2device1.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset2device1.bck';
GO

The BackupSize in the HEADERONLY output of both files is 337,798,144 bytes. This is double the size of the backup in case #1 above - and it because there are now two copies of the backup. The on-disk size of both files is 161MB, which is what we'd expect as mediaset2device1.bck is a copy of mediaset1device1.bck.

3) Two-device backup, no mirror

The code below creates a two-device backup with no mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck',
DISK = N'C:\SQLskills\mediaset1device2.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
GO

The BackupSize in the HEADERONLY output is 169,959,424 bytes. This is nearly exactly the same as for the single-device backup in case #1, but includes a bit more to account for the extra metadata in the second device. This time, the on-disk size of the file mediaset1device1.bck is 81MB. This is half of the on-disk size from the single-device case #1 as the backup is now split between the two files.

4) Two-device backup, mirrored

The code below creates a single-device backup with a mirror, and then examines it.

BACKUP DATABASE AdventureWorks TO
DISK
= N'C:\SQLskills\mediaset1device1.bck',
DISK = N'C:\SQLskills\mediaset1device2.bck'
MIRROR TO DISK = N'C:\SQLskills\mediaset2device1.bck',
DISK = N'C:\SQLskills\mediaset2device2.bck'
WITH FORMAT, STATS;
GO

RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset1device1.bck';
RESTORE HEADERONLY FROM DISK = N'C:\SQLskills\mediaset2device1.bck';
GO

The BackupSize in the HEADERONLY output of both files is 339,918,848 bytes - again, double the size of the non-mirrored backup in case #3. The on-disk size of each file is 81MB, as each file is one half of a copy of the backup.

Restoring

Now let's try to mix devices from the two backup media sets and see if it's possible:

RESTORE DATABASE AdventureWorks
FROM DISK = N'C:\SQLskills\mediaset1device1.bck'
,
DISK = N
'C:\SQLskills\mediaset2device2.bck'
WITH REPLACE, STATS
;
GO

And it works fine - excellent! That's the whole point of having mirrored backups.

One other question is - can backup device types can differ between media sets in the same backup. The answer to this is no - as documented in Books Online. All the backup devices involved in a single backup, regardless of whether they're part of a mirror media set or not, must be of the same type and have similar characteristics.

Hope this is useful.

Tuesday, February 05, 2008 9:39:45 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Monday, February 04, 2008

The second post in my series on FILESTREAM (see here) deals with how to enable FILESTREAM through T-SQL and using SSMS.

You can't just create FILESTREAM data - you have to specifically enable it at the instance level and then possibly restart the instance or host computer. The reason for this is that there's a file-system filter driver (called RsFx) installed to aid access to the data through the NTFS Streaming APIs. The rules for this are:

  • The computer needs to be restarted if this is the first time FILESTREAM has been enabled on any instance hosted on the machine
  • The instance needs to be restarted if you disable FILESTREAM after its been enabled

In the previous article I mentioned that FILESTREAM can't be enabled on a mirrored database - there's another restriction I forgot: FILESTREAM isn't supported for instances running on WOW64. I don't remember the architectural limitation that prevents this but I do remember the heated discussions in the dev team back in early 2007.

Anyway, I digress. You can enable FILESTREAM through T-SQL using the sp_filestream_configure stored procedure. It takes two parameters @enable_level and @share_name. The level of support is configured through the first and has the following options:

  • 0 - FILESTREAM is disabled for the instance
  • 1 - FILESTREAM is enabled for T-SQL access only
  • 2 - FILESTREAM is enabled for T-SQL AND local file-system access
  • 3 - FILESTREAM is enabled for T-SQL, local file-system, AND remote file-system access

If level 3 is specified, then the a share name for remote access must also be specified. For example:

EXEC sp_filestream_configure @enable_level = 3, @share_name = 'MyFilestreamSQLServerInstance';
GO

Note that once the share name is specified, it can't be changed without disabling and re-enabling FILESTREAM on the instance.

Now, if you don't want to have to remember the options you can use SSMS to do the enabling (with no change to the restart requirements listed above). To do this you right-click on the server name in the Object Explorer and select Properties. When the Server Properties window comes up, select the Advanced tab and right at the top is the FILESTREAM section. Here's what I mean:

Next up is creating a FILESTREAM filegroup and adding data.

Monday, February 04, 2008 11:57:33 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, February 03, 2008

Here's an issue that I thought was a one-off but it just popped up again over the weekend so I want to publicize it.

DBA runs a DBCC CHECKDB and gets output like the following:

Msg 5172, Level 16, State 15, Line 1
The header for file 'E:\Data\namechanged.mdf:MSSQL_DBCC14' is not a valid database file header. The PageAudit property is incorrect.
Msg 5120, Level 16, State 9, Line 1
Unable to open the physical file "E:\Data\namechanged.mdf:MSSQL_DBCC14". Operating system error 0: "(null)".
Msg 1823, Level 16, State 2, Line 1
A database snapshot cannot be created because it failed to start.
Msg 7928, Level 16, State 1, Line 1
The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.

etc

Wow! Looks like something's seriously wrong with that database. Until you find that this happens on *all* the databases on the server. And on all the other servers in the data center too. What on earth's going on? I saw a similar issue once before, back in 2004 or 2005 while customers were testing pre-RTM IDW builds of SQL Server 2005. After running a test and doing some investigation for me, the person who just hit this issue over the weekend confirmed my hypothesis that he'd hit the problem I saw a few years ago.

The problem is a 3rd-party file encryption solution. It installs a file-system filter driver that filters all reads and writes to the database files so it can perform the encryption/decryption. Unfortunately it doesn't cope with NTFS alternate streams - which is how the internal database snapshots that DBCC CHECKDB uses are created - and so returns garbage for all reads from the database snapshot. The simple workaround is to create your own database snapshot (whose constituent files will exist in their own right, rather than being alternate streams of the source database's files) and then run DBCC CHECKDB against that.

Now - I don't want to cry 'wolf' here, but be careful of attributing massive corruption problems to the real IO subsystem if there's any kind of file-system filter driver installed.

Sunday, February 03, 2008 10:36:34 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, February 02, 2008

Now we're back at home and working very strange hours as we deal with jet-lag. The flight back from Beijing to Vancouver was a great flight but with weird seats in business class - individual pods instead of actual seats. Nice to be able do lie down flat but not very wide - not the most comfortable for either of us given we both over 6-feet tall. Anyway, we didn't fly out until 5.30pm so we booked a final sightseeing tour for Friday morning.

Friday was bitterly cold and windy and unfortunately everything we went to see was outside. We started off at the 600-year old Temple of Heaven where the Emperors used to pray for a good harvest every year. It was built by the Yongle Emperor who also built the Forbidden City which we visited later. We got there early to see the locals practising Tai Chi and calligraphy. The calligraphists were painting with water that was freezing within 30 seconds or so - very cool (ha ha). As we got up to the temple, we could see some sort of ceremony taking place. It turned out to be a dress-rehearsal for Chinese New Year of a ceremony involving an Emperor and about 500 courtiers. It reminded me strongly of an early scene from one of my favorite movies - Bertolucci's The Last Emperor - where the young Emperor Puyi is crowned.

Here are a few photos - click them for larger versions (or get the zip here).

 

 

 

 

After the Temple of Heaven we went to Tiananmen Square, the largest open urban square in the world. The wind chill walking across the square was incredible - leading Kimberly to buy a very (not) tasteful hat which she refused to let me post a photo of (but I have below anyway :-)). The guide gave us a good lecture on it's history but didn't mention the protests of 1989 - big surprise. From there we went into the Forbidden City (this Wikipedia page has a ton of background info), which was amazing. It's the largest surviving Imperial palace in the world. There was a lot of refurbishment going on, partly in preparation for the Olympics later this year, but all part of an 18-year project started back in 2002. With only a couple of hours before we had to head back to the hotel to pack for the flights home, we couldn't cover everything but we still saw an amazing amount. Here are some photos - again click for larger versions (or get the zip here).

 

 

 

 

 

 

The tree in the final photo is two trees that have fused and are said to embody all that is good about being a Couple.

In summary, I think this was one of the best international trips we've taken together - a nice mix of work (mostly) and play. It was just fantastic to visit two places I've been wanting to see since I was a child - the Great Wall and the Forbidden City. We're planning to post about stuff we see on our trips in a new category Where In The World Are Paul And Kimberly - if this is interesting please let me know.

Thanks

Saturday, February 02, 2008 8:50:28 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 
Thursday, January 31, 2008

In a previous post (see here) I debunked a myth about how much transaction log a full backup would include. I had a question in the blog post comments that asked (paraphrasing):

The full backup has to include all the transaction log from the begin LSN of the oldest active transaction at the time the data read portion of the backup ends, until the LSN at which the data read portion ends. If that begin LSN is later in time than the LSN of the checkpoint that backup does initially, why does the full backup need to include all thr transaction log between the checkpoint and the begin LSN? What is it used for?

I replied in the comments with a quip that it would be easier to reply with a whiteboard and a timeline - so I got all enthusiastic and created a picture in Powerpoint to help explain better.

Consider the timeline in the picture above for a full backup (the red numbers match the list below):

  1. The backup operation take a checkpoint to force all dirty pages in the buffer pool to disk - both those containing changes from transactions that have committed and those containing changes from transactions that are still in-flight. The backup operation then starts reading the allocated pages in the database.
  2. The read operation reads page X
  3. Transaction A starts
  4. Transaction A makes a change to page X. The copy in the backup is now out-of-date. Note that the backup will not read page X again - it's already passed that point in the database.
  5. Transaction B starts. It won't complete before the data read operation completes so it's begin LSN is the oldest active transaction begin LSN.
  6. Transaction A commits. This commits the changes to page X.
  7. The backup data read operation completes and transaction log reading starts.

Now, the reason that the transaction log is read is so that the restore operation can recover the database so it is transactionally consistent as of the point in time when the read data operation completed.

If the transaction log was only included from the oldest active transaction begin LSN (point 5), then the copy of page X that was restored from the backup (read at point 2) would not be updated with the changes from transaction A (that happened at point 4). This means that it would not be transactionally consistent with the rest of the database as of the time the read data operation completed (point 7).

So, (ignoring replication) the minimum LSN of the transaction log that's included in the full backup is MIN (LSN of last checkpoint, LSN of oldest active transaction). This ensures that recovery can REDO log records to bring pages up-to-date and UNDO log records for transactions that had not committed.

Much easier to explain with aid of a picture than without! :-)

Thursday, January 31, 2008 12:01:58 AM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 
Tuesday, January 29, 2008

Well, I bet that title grabbed a bunch of people's attention :-)

In the Beijing Advanced Research Centre offices of Microsoft where we're teaching this week, the bathroom stalls have print-outs of jokes on them. The ones in the women's bathroom are pretty good - here are a couple of them that made us laugh (thanks to Kimberly for doing the research!)

#1: A man has a girlfriend, Lorraine, who's he's been dating for a while. He's pretty happy. Into his life suddenly comes a new woman, called Clearly, who is better than Lorraine in many ways - she knows the best place in town to visit, to eat, and to dance. Unfortunately he doesn't want to give up on Lorraine though. One day they're walking along the banks of a river when Lorraine falls in and is swept away. The man starts singing 'I can see Clearly now Lorraine has gone...'

#2: A man and his wife are lying in bed, just about to fall asleep. The conversation goes like this:

Wife: 'If I were to die, would you re-marry?'

Husband: 'Of course not!'

Wife: 'Why? Don't you like being married?'

Husband: 'Well, yes - I suppose I would re-marry'

Wife: 'Would you take down all the pictures of me and replace them with her?'

Husband: 'Well yes, that would be the proper thing to do'

Wife: 'Would you let her use my golf-clubs?'

Husband: 'No, she's left handed... oh shit!'

:-)

Tuesday, January 29, 2008 9:17:51 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

This is a post I've been meaning to do for a while - detailing some of the problems you can run into when using database snapshots.

Reverting to a snapshot

Many people use snapshots as a way of protecting against mistakes during a complicated set of changes to a database - you can just revert to the snapshot if something goes wrong. However, reverting to a snapshot has a little-known (but documented) problem. Let's see. I'm going to create a database plus a snapshot of it and then start taking backups.

CREATE DATABASE SnapshotTest;
GO
ALTER DATABASE SnapshotTest SET RECOVERY FULL;
GO
BACKUP DATABASE SnapshotTest TO DISK = 'C:\SQLskills\SnapshotTest.bak' WITH INIT;
GO

CREATE TABLE SnapshotTest.dbo.MyTable (c1 INT);
GO

BACKUP LOG SnapshotTest TO DISK = 'C:\SQLskills\SnapshotTest_log.bak' WITH INIT;
GO

-- Imagine a bunch of things happen here

-- Create the snapshot database, by first checking which files exist
SELECT * FROM SnapshotTest.sys.database_files;
GO

CREATE DATABASE ST_Snap ON
(NAME = SnapshotTest, FILENAME = 'C:\SQLskills\SnapshotTest_snap.snp')
AS SNAPSHOT OF SnapshotTest;
GO

Now I'm going to do some stuff that's a mistake:

DROP TABLE SnapshotTest.dbo.MyTable;
GO

Not a problem as I can revert to my database snapshot:

RESTORE DATABASE SnapshotTest FROM DATABASE_SNAPSHOT = 'ST_Snap';
GO

Oops! I forgot to take a log backup to capture everything that happened since the last log backup...

BACKUP LOG SnapshotTest TO DISK = 'C:\SQLskills\SnapshotTest_log.bak';
GO
Msg 4214, Level 16, State 1, Line 1
BACKUP LOG cannot be performed because there is no current database backup.
Msg 3013, Level 16, State 1, Line 1
BACKUP LOG is terminating abnormally.

Reverting to a snapshot rebuilds the transaction log and breaks the log backup chain. The only thing I can do is take a full or differential backup and then start taking log backups again. So - I lost the ability to do point-in-time restores in the period from the last log backup to the time when I reverted to the snapshot.

So how can I tell when the database was reverted? Books Online documents that the restorehistory table in msdb should have an entry with restore_type = 'R'. Let's try:

SELECT * FROM msdb.dbo.restorehistory WHERE [destination_database_name] = 'SnapshotTest';
GO

Nope - nothing. That functionality seems to be broken. The only way I could find to trace reverting was in the error log:

2008-01-30 11:09:21.73 spid53      Reverting database 'SnapshotTest' to the point in time of database snapshot 'ST_Snap' with split point LSN 26000000013800001 (0x0000001a:0000008a:0001). This is an informational message only. No user action is required.
2008-01-30 11:09:21.74 spid53      Starting up database 'SnapshotTest'.
2008-01-30 11:09:21.74 spid53      The database 'SnapshotTest' is marked RESTORING and is in a state that does not allow recovery to be run.
2008-01-30 11:09:21.85 spid53      Starting up database 'SnapshotTest'.
2008-01-30 11:09:21.87 spid53      Starting up database 'ST_Snap'.

Note that the snapshot database is still there and continues to work.

Running out of space in a snapshot

The next problem that can happen is if you create a database snapshot on a volume that doesn't have much disk space. In that case you can actually run out of space and the snapshot will go suspect and be unusable until it's dropped and recreated. Let's see what happens:

-- Create snapshot of large database on a volume with not much space
SELECT * FROM SalesDB.sys.database_files;
GO

CREATE DATABASE SalesDB_Snap ON
(NAME = SalesDBData, FILENAME = 'D:\sqlskills\SalesDB_snap.snp')
AS SNAPSHOT OF SalesDB;
GO

ALTER INDEX ALL ON SalesDB.dbo.Sales REBUILD;
GO
Msg 5128, Level 17, State 2, Line 1
Write to sparse file 'D:\sqlskills\SalesDB_snap.snp' failed due to lack of disk space.

The query that causes the sparse file to run out of space doesn't fail, but the user running the query will see this error. Note that it doesn't say the snapshot has gone suspect. If you try to use it then it will tell you, plus there info written to the error log:

2008-01-30 11:50:29.14 spid53      Error: 17053, Severity: 16, State: 1.
2008-01-30 11:50:29.14 spid53      D:\sqlskills\SalesDB_snap.snp: Operating system error 112(There is not enough space on the disk.) encountered.
2008-01-30 11:50:29.15 spid53      Error: 3420, Severity: 21, State: 1.
2008-01-30 11:50:29.15 spid53      Database snapshot 'SalesDB_Snap' has failed an IO operation and is marked suspect.  It must be dropped and recreated.

Running out of space while running DBCC CHECKDB

One thing that isn't common knowledge is that DBCC CHECKDB uses database snapshots as it's mechanism for running online in SQL Server 2005. You can't control where the snapshot is created (it's actually created in the same location as the files comprising the database being checked - see my previous post on the DBCC CHECKDB stages here for more info) but you can create your own snapshot and check that. Anyway - it's possible to run out of space in the snapshot while DBCC CHECKDB is running. I setup a situation with a large database on a drive with only 1MB of space left, started a large index rebuild in the database and then tried to run DBCC CHECKDB. Let's see what happened:

The index rebuild statement:

ALTER INDEX ALL ON SalesDB2.dbo.Sales REBUILD;
GO
Msg 5128, Level 17, State 2, Line 1
Write to sparse file 'd:\sqlskills\salesdbdata.mdf:MSSQL_DBCC20' failed due to lack of disk space.

So some poor unsuspecting user gets this error and has no idea why. For DBAs its a bit obscure too. The syntax is describing an alternate stream on the existing data file and the snapshot in question can't be accessed at all.

The DBCC CHECKDB statement:

DBCC CHECKDB (salesdb2) WITH NO_INFOMSGS;
GO
Msg 1823, Level 16, State 1, Line 1
A database snapshot cannot be created because it failed to start.
Msg 1823, Level 16, State 2, Line 1
A database snapshot cannot be created because it failed to start.
Msg 7928, Level 16, State 1, Line 1
The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.
Msg 5128, Level 17, State 2, Line 1
Write to sparse file 'd:\sqlskills\salesdbdata.mdf:MSSQL_DBCC20' failed due to lack of disk space.
Msg 3313, Level 21, State 2, Line 1
During redoing of a logged operation in database 'salesdb2', an error occurred at log record ID (1628:252:1). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command.  The results, if any, should be discarded.

Wow - major amounts of errors - and it all boils down to running out of space.

Tuesday, January 29, 2008 8:26:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

This has been causing some problems on the various groups and forums over the last few days so I thought I'd repost this from my old Storage Engine blog. The questions have been around attaching 2005 databases to 2000 servers - even databases that are in 80 compat mode - and it doesn't work. Why?

The confusion is between database compatibility level and database version. Here's a quick explanation of the difference.

Database version

The database version is a number stamped in the boot page of a database that indicates the SQL Server version of the most recent SQL Server instance the database was attached to. The database version number does not equal the SQL Server version. For example, doing the following:

SELECT @@version;
GO

on one SQL Server instance on my laptop returns:

Microsoft SQL Server 2005 - 9.00.3054.00 (Intel X86)   Feb 13 2007 23:02:48   Copyright (c) 1988-2005 Microsoft Corporation  Developer Edition on Windows NT 5.1 (Build 2600: Service Pack 2)

However, the database version is 611. You can see the database version but if you attach a database from an earlier version of SQL Server, you'll see these numbers in the error log as SQL Server reports what upgrade steps its doing. You can also see by doing the following:

USE master;
GO

SELECT DatabaseProperty ('dbccpagetest', 'version');
GO

Some things to note about database version:

  • SQL Server is not up-level compatible. You cannot attach a database that was created on (or has been upgraded to) SQL Server 2005 to any earlier version of SQL Server (also true for trying to attach a 2000 database to 7.0, and so on).
  • You cannot attach a database that was created on an earlier version without going through the proper upgrade procedures. Forcibly attaching a database using various hacky methods will result in all kinds of weird errors, and possibly crashes.

Database compatibility level

The database compatibility level determines how certain database behaviors work. For instance, in 90 compatibility, you need to use the OUTER JOIN syntax to do an outer join, whereas in earlier compatibility levels, you can use '*=' and '=*'. Contrary to popular myth, all of the behavioral differences ARE documented - in the Books Online section for sp_dbcmptlevel - the SP used to set the compatibility level.

There are 5 supported compatibility levels support by SQL Server 2005:

60 = SQL Server 6.0

65 = SQL Server 6.5

70 = SQL Server 7.0

80 = SQL Server 2000

90 = SQL Server 2005

You can see the compatibility level of all databases by doing:

SELECT name AS 'DB Name', compatibility_level AS 'Compatibility Level'
FROM master.sys.databases;
GO

Some things to note about compatibility levels:

  • A database created on SQL Server 2005 will have a default compatibility level of 90, unless the model database has a different compatibility level, in which case the new database inherits the compatibility level of model.
  • New features may work under older compatibility levels but beware of SET options.
  • An upgraded database retains its compatibility level. For example, a database that was created on SQL Server 2000, and didn't have its compatibility level altered, will stay in 80 compatibility level when its upgraded to SQL Server 2005.

Summary

This was just a quick - and by no means comprehensive - explanation of the difference between the two terms. Basically, there's no relationship between them.

Tuesday, January 29, 2008 3:51:14 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, January 28, 2008

A bit more traffic on the thread (see previous post here) prompted me to give my thoughts on the many sweeping generalizations that plague the computer industry and make it difficult sometimes to give advice in forums and blogs. I'd like to repost here (with a few tweaks for clarity).

Some examples of questions that breed sweeping generalizations:

  • Should you have clustered indexes on all tables? The well-known clustered-index debate as Kimberly likes to call it.
  • Should you rebuild or reorganize indexes to remove fragmentation?
  • Which high-availabilty solution should you use?

The problem - as with most advice - is that it's extremely hard to make generalizations. This is both because:

  1. without lots of evidence many people (quite rightly) don't believe sweeping generalizations as they may have been bitten by one in the past
  2. nearly every situation is different so many generalizations are useless

What I'd love to see, (and I tried to do this when at MS, and like to think I do it here or when teaching classes or conferences) is for people to provide the justification for generalizations, plus some idea of the exceptions and the circumstances under which they arise.

As for this case (whether to create multiple files because there are multiple cores/CPUs), I think we've about done this one to death. The sweeping generalizations here are:

  1. for non-tempdb you usually don't need multiple files, unless you have a very high-end workload of the specific nature I described in my first post (rare)
  2. for tempdb you usually do, as long as your workload merits it on a multi-core/cpu box
  3. IO vendors may recommend it for increased IO throughput *on their specific hardware*
  4. there exist sweeping generalizations from various sources that dispute all of the above

Unfortunately, you're not going to get a definitive, authoritative answer to a design/strategy question such as this and you'll continue to find contradictions to anything anyone says on the forums, and even MS contradicting itself (sigh).

What I would suggest is the following:
1) go with the majority opinion of responses to questions asked, based on the respondents collective experience with many customers, databases, and workloads
2) do your own testing, on your own hardware, with your own workload and see what works for you (but beware that testing in a vacuum can prove or disprove anything you want - which is why you see so many contradictory statements)

One last thing on MS - it's a very big company, with lots of groups. Anyone can sponsor a whitepaper, write a blog post/MSDN article/technet article and publish it, or reply on a forum as a visible MS person and it has the 'official stamp' of coming from MS. When I was in the product group I was continually dismayed by the misinformation, bad advice, contradictions, and baseless assertions that I saw coming from MS employees who were just trying to be helpful.

Once something's published on the internet, it's *incredibly* hard to undo the damage done. There's a fundamental element of mistrust sometimes on forums and newsgroups which can be wearying when you're trying to help people out. It can be very hard to convince people that someone else's advice isn't the best to follow - I remember several times arguing with people about how CHECKDB works or what a corruption error message means and finally having to resort to 'I wrote that code - I'm afraid you *are* wrong' - which I really hate doing.

Anyway - rant over :-)

Monday, January 28, 2008 9:35:24 PM (Pacific Standard Time, UTC-08:00)  #    Comments [5]  | 

There's been a very interesting discussion going on over at SQLServerCentral.com about whether to create multiple files for a user database because the server has multiple CPUs - see the thread here. I wrote a couple of long responses during the thread that I wanted to duplicate here as I think it's of broad interest.

My first response was:

Doesn't make sense to split any database into multiple files for per-cpu performance, except tempdb which can suffer from multiple-CPUs all trying to modify the same allocation bitmaps under high load with small temp tables being created/deleted (see http://www.microsoft.com/technet/prodtechnol/sql/2005/workingwithtempdb.mspx for details).

Now, saying that, there's an exception - and that's when a non-tempdb database has the same allocation bitmap contention issues, but that only happens on the very high-end with thousands of single-row inserts per second happening on each CPU. It's pretty rare. I've never seen it but Kimberly has.

What you should be considering is splitting a database into filegroups to be able to isolate critical tables within separate filegroups, or splitting a partitioned table across multiple filegroups. For SS2005, these aren't for performance but instead for manageability (e.g. being able to restore a single filegroup offline while the rest of the database remains online, or setting several partitions of a table to be read-only). For SS2008, there are improvements to query plan generation over multiple partitions that could allow for better parallelism, but its still not optimal compared to non-partitioned data.

So - complex topic and these are simple guidelines. Hope they help.

This was followed by a discussion pointing out that various hardware vendors imply the opposite, as do several MS sources. So here was my latest response:

Hi folks,

What's really not made clear in the MS literature is that the multiple-files in a database really only applies to tempdb. Even on tempdb, on 2005 you don't need one file per core - more like 1/4 -1/2 the number of files as there are cores.

The tempdb problem is this - common workloads create and drop many worktables per second. The allocations that happen for a table are initially single-pages (as opposed to whole extents). This means that a search of the SGAM page for the 4GB interval needs to be done to find a mixed-extent with a free page to allocate. Multiple CPUs all hammering this page cause contention on it and performance problems. Then, a page needs to be allocated for the first IAM page - same thing happens. Then these pages need to be marked allocated in a PFS page - same thing happens. And then these pages need to inserted into the sysindexes row for the tabel - more contention. On 2000 this was particularly bad - so T1118 plus multiple files was the solution, where SQL Server would round-robin the single page allocations in the files in tempdb, alleviating the contention somewhat.

In SQL Server 2005, we changed the temp table mechanism so that whenever a temp table is dropped, one data page, one IAM page, and the system table entries (no longer sysindexes, but instead is a 'hidden' table called sys.allocation_units) are cached. When a new temp table is allocated, if there's a cached 'template temp table' it is picked up and used without so much contention on the various allocation bitmaps. On a heavily loaded system there can still be contention and so you still need multiple files for an SMP box, but just not so many. And you don't need T1118 any more.

So - this is more prevalent on tempdb, but CAN happen on a user database under extreme load on monster hardware. Testing should show whether this is happening to you - if not, don't create multiple files for performance.

Now, in terms of what works for your particular vendors IO scalability - that's beyond me and you may need to think about that if its something they recommend. However - I'd still take it with a pinch of salt and do your own testing. See the whitepaper at http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/pdpliobp.mspx for some testing/banchmarking info.

Hope this helps clarify a bit more - great discussion!

I'm interested to hear any other opinions on this - especially cases where you've found it necessary to create multiple files for performance.

Thanks!

Monday, January 28, 2008 6:10:21 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 

This kind of follows on from my previous post about making sure you have character column lengths that can handle data from different countries (e.g. city names that may be longer in one country than another). A question on the forums today asked what info there is available to help in designing a global-ready database.

It turns out that there's a wealth of information right there under your nose - type in 'globalization' in the Index of Books Online. It'll get you to a section titled 'International Considerations for SQL Server' that has a link to a sub-section for every component of SQL Server! Very impressive. For instance, the one for the Database Engine has everything you need (I've made these links to the latest online BOL entries on MSDN):

Check it out and save yourself some pain when your database/application suddenly needs to support customers outside your home country.

Monday, January 28, 2008 12:33:25 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Sunday, January 27, 2008

Microsoft has announced that the RTM date of SQL Server 2008 has slipped out to Q3 of this year. See this official blog post here. The official launch will still go ahead as planned on February 27th.

While some people may see this as disappointing, I don't see many large customers going straight into SS2008 when it comes out so I, for one, am glad they're taking the time to ensure a high-quality release.

Sunday, January 27, 2008 5:59:02 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Start of the second week in China. We flew up from Shanghai to Beijing yesterday, and just in time it seems. As we flew out of Shanghai it was being enveloped in a snowstorm. Beijing's much colder than Shanghai though - being further north and inland quite a way compared to Shanghai.

As I said a few days ago, the last time I was here I stayed in the Jiu Hua International Conference & Exhibition Center Hotel, which has to be one of the worst hotel experiences ever. I remember going to the buffet breakfast one day and picking up a pork bun to find some dead moths stuck to the bottom - that was my breakfast appetite gone for the day. Whenever I had laundry done they demanded cash before they'd give it back to me - no such thing as charging to the room. This time we're in the Grand Hyatt Beijing which is right in the middle of the city and is far more pleasant. Our room's not so high as in Shanghai as the height of buildings in the old part of Beijing is limited. Here's the view from our room this afternoon. (Click on it for a bigger version)

It's looking out on the Peking Union Medical College Hospital, formerly the Yuwang Residence (what our guide said was a house/palace for a prince). The hotel's just a few blocks from the Forbidden City and Tiananmen Square, and we can see into the Forbidden City (to the left of the view above). We're going to do a tour of these on Friday, along with the stunning Temple of Heaven, before catching the late flight back to Vancouver and on to Seattle.

Today was a free day for us so we decided to take a tour o