Friday, May 09, 2008

Have you ever written something and then lost it... for whatever reason: your own stupidity (come on we've all accidentally done something at some point where we lost data or a spreadsheet or a document or something...), the software eats it (this might be self-inflicted but I've been in apps that just hang and that's it... there's nothing you can do except power off), or....whatever.

Well, during those times... have you ever thought - I'd do almost anything to get that data (and time) back?

Before I go any further - yes, backups are good. No, let me be clearer... Backups are an absolute requirement of ANY environment!!! 

Even personal/home environments should have something to protect the data. Something to consider is even offsite copies. Burn a DVD or two of your precious family photos and then swap DVDs with a friend... there's your simple/cheap/offsite data recovery. We all focus on critical data when a business depends on it - what about the personal stuff... Paul and I always talk about database backups and this post is not really targeting enterprise systems or even business critical systems... Really, there's NOTHING THAT SHOULD EVER REPLACE A GOOD DISASTER RECOVERY STRATEGY...

Having said that though, I have an interesting story (with a data-saved/happy ending)!

I was on a plane flying back from SQL Connections (I was actually flying from Tampa after having visited family in St. Pete) and well, disaster struck again (yes, this has NOT been a good year for hard drives for me as this was my 3rd and MOST catastrophic disaster so far...).

Anyway, Paul and I were "discussing" what I STILL think is a bug in a SQL function (ok, I'll get to that in a later post) and I had been feverishly completing a multi-page blog post AND some index examples/metadata queries, etc. when my laptop slipped off of my crappy airplane table (I was in an exit row so I had a table that came out of an arm chair and folded - it was very wobbly) and crashed to the floor (and, the irony of being in an exit row... had I been in a regular seat there wouldn't have been room for the laptop to have fallen to the ground :).

To my surprise, I picked it up and still saw the screen I was working on. I typed in another line... and then got a little dialog that said "windows hard error" or something... I don't even remember now. The only thing I could do is power it off. My laptop was dead. Very dead. I [expect] that I had had a major head crash when my laptop hit the floor because it's likely the disks were still spinning. However, I still did not know this at the time.

I rebooted and received "Error 2100 - Hard drive initialization error"... so, when we landed in Denver, I was off to one of the small stores to buy an eyeglasses kit (I needed a screwdriver :) and then I went to find a spot to do some laptop surgery. The good news is that I've had so many disasters over the years that almost none really freak me out anymore (this is probably the 10th drive I've had personally fail) and, I always carry at least one spare laptop... But after the terrible time I had March in India, I actually had 2 spare laptops on this trip (yes, airline security hates me even more now - oh, and Paul travels with 2... so, we go through security with 5 laptops... we get some interesting looks!).

Anyway, I took the take the drive out, loaded it into my secondary drive bay (if you travel a lot - having this second bay that can hold a primary/secondary drive is INVALUABLE as these secondary drive bay usually use the same setup (Serial ATA in my case) as the boot drive). So, if your boot drive doesn't boot, you *might* be able to still read and/or save data by using the secondary drive. So, on a second machine, I gave it a try to see if it would spin. No dice. I even tried my third machine (my primary was XP, my secondary is XP and my backup/backup is Vista... I thought... well, maybe?) Ah, I thought wrong. My drive would never spin again. So, on my flight from Denver to Seattle, I was not overly pleased (this is an understatement to be honest - just ask Paul) because I was at least a couple of days from having done a backup AND I was even more frustrated about having lost the detailed indexing post I was working on - and even losing the code that was on screen just THAT day.

We finally got home (which seemed like a much-longer-than-normal trip ;) and I got online thinking/hoping - is there ANYTHING I can do... And, I stumbled on a reference to a possibly out of date BIOS version and the needed update which also happens to generate this SAME error. And, being hopeful (and opportunistically forgetting the drop/crash/thump which led me to this problem), I *attempted* the BIOS update and well, it didn't recognize that a drive was attached. OK, that was my last hope. Hours lost. Let's move over to my backup laptop and shift everything I do have backed up over........ which I did and I was up and running that morning. Yes, I had lost a few things and yes I was pretty frustrated but, I wasn't totally down. It wasn't as bad as it could have been without any backups...... but, I was still annoyed.

Then, I thought... is there any other option(s)? It's been at least 10 years since I sent a drive into a drive recovery place. So, I thought this needed some research. I wanted to see what it might take to (and/or IF I could) recover data. About 10 years ago, I had a single drive of a RAID 0 array fail and the disk recovery place couldn't recover any of it (well, I think they could get 128kb out it and it was going to be 800 bucks). But, that was 10 years agao. Have things changed??! Hmmm... what could they do?

Anyway, I got in touch with Drive Solutions, Inc. and they gave me the rundown of what it would take to get data back. I wasn't sure if it was really going to be worth it (especially for the costs) but I still wanted to go through the process (for multiple reasons - some of it was for the data but some of it was for this post - and to remind people of what's possible (myself included)) given that we talk about the importance of backups and the UNLIKELY potential for data recovery off of damaged disks...

The long story short (ah, too late :) is that they can do amazing things these days (NO GUARANTEES THOUGH!!!) and they can completely rebuild the drives in a clean room - replacing drive heads, etc. Once complete, IF they get anything back, they'll give you a directory structure of what they've recovered and different options for getting it back to you (DVDs of just critical directories (there's a cost for each DVD after the first one or two) OR you can purchase a new comparable drive and they'll copy it over). The whole process took about a week (and this was for expedited service - which was also an extra charge). However, and amazingly, they recovered everything (well, I've only done a bit of spelunking but so far, so good)...

And, here's the coolest part, I was working on some SQL files at the exact time of the disaster and well, I went to the \Documents and Settings\username\My Documents\SQL Server Management Studio\Backup Files\ directory and found a directory of Solution1. In it were 3 files with similarly ugly names (~AutoRecover.~vsC.sql, ~AutoRecover.~vsC.sql~RF93d469.TMP, etc...) and the AutoRecover file was 2KB (the others were 0KB). Anyway, I opened up the file and viola! I actually recovered the .sql script I was working on at the time I dropped my laptop (well, I still blame this on Paul cause he made me lean over to talk to him and this ultimately pushed my laptop off the crappy airline table ;) ;) ;) :).

Needless to say, I am pretty amazed at what they can do now... but, I'm certainly not going to rely on that AND, it wasn't cheap!!!

So... what did I learn??!

1) First and foremost, data recovery is NEVER GUARANTEED. (yes, ok, we all knew that. However, I think we just need to say it out loud a few times :).
2) Even if they can recover some of the data, data recovery is not lightning fast. Even with the more expensive expedited service it takes time to ship (overnight), get the drive into a clean room, rebuild it from parts of an exactly matching (including BIOS/firmware) drive, test it, copy it to another drive, ship it back (overnight). Maybe you can find a place that's local, that would help but, it's still time...
3) It's expensive... expect about $10-15 per GB. And, I guess that some will think that's a crazy amount... Again, I had multiple motivating factors - one of which was also related to some pictures I had taken over the weekend with family in FL, some was for data, some was even better to understand this overall process... I expect that got about 10 hours back plus some photos and, I'm pretty impressed with the overall process (in general).

More than anything, I'm going to get even better at daily full system backups when I'm on the road (scheduled/overnight to an external drive) and I might even copy critical stuff to something like Windows Live Skydrive (or something like that). And, while on a plane, I might keep a small/simple USB stick handy if I do something that I really don't want to lose while on a long plane ride. I think new technologies like "mesh" and "cloud" are really interesting and definitely the direction to better performance AND *possibly* minimizing data loss but you're always at risk if there's only one copy. All of this might seem crazy but well...... I've been called worse ;-)).

So, just to wrap things up, I'll be doing a bit of final tweaking on my indexing blog post + my indexing demo scripts from Connections + my metadata script that I was working on at the time of the laptop disaster (which is also why it's been a while since my last post) and I'll be leveraging some of my favorite tools (Beyond Compare) to determine all of the differences between my recovered data/directories and the system I rebuilt from a backup (especially now that it has also changed over the course of this week since I moved over to my backup laptop). And, I've now ordered a new harddrive for my primary laptop. Sadly, I'm getting good at laptop rebuilding.

Thanks for reading,
kt

PS - When did you last backup your home/personal/less-critical system... is it really less-critical?

Friday, May 09, 2008 2:11:21 PM (Pacific Standard Time, UTC-08:00)  #    Comments [6]  | 
Sunday, April 27, 2008

OK, we were in Iceland and then Florida for our Accidental DBA workshops and both went really well. People agree that there are quite a few involuntary/accidental DBAs out there and overall, we helped quite a few to see a lot of options for better performance, availability, recovery, and/or just manageability.

So, this is our "resources post". We waited until after the SQL Connections delivery to post these as we figured we might add a few more to the list (as is typical when you deliver content more than once - it's really never the same twice!).

Also, I used a few "interactive" (or build) slides in my presentation - specifically on transaction log backups and the concepts of "clearing the log" which really only clears the inactive portion of the log. To help you visualize this, I've added these slides here: TrippRandal_ClearingTheLog-BuildSlides.zip (647.2 KB).

Finally, we've taken all of the scripts that we demo'ed and placed them on SQLskills on our Past Events page here: http://www.sqlskills.com/pastConferences.asp.

And, if you were there and you think we missed something, feel free to ping me (or Paul!) with an email and we'll make sure to update this resources post (and/or [at least] help you find it what you're looking for!!).

Next stop - Microsoft TechEd ITPro in June (we're back in Orlando again)!
kt

Sunday, April 27, 2008 5:29:12 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 
Wednesday, April 02, 2008

OK - so this has been frustrating me for many months... when you create indexes with included columns (which was a new feature of SQL Server 2005), they're not shown by sp_helpindex or by DBCC SHOW_STATISTICS. I understand this not showing for statistics because included columns are not factored into the histogram (that's only the high order element which is the first column in the index) OR the density vector (which only shows the densities (or averages) for the left-based subsets of the key). So, why doesn't sp_helpindex show it? Well... I guess it just didn't get updated for SQL 2005. So, in SQL 2008, I was hoping I'd not only see included columns but also filtered indexes... well, neither is there and sp_helpindex is still the same old proc. Don't get me wrong, you can use SSMS to see all of the index properties for a single index (pane, by pane for each property) OR you can run queries to find the included columns for a given index:

SELECT
(CASE ic.key_ordinal WHENTHEN CAST(AS tinyint) ELSE ic.key_ordinal END) AS [ID],
clmns.name AS [Name],
CAST(COLUMNPROPERTY(ic.object_id, clmns.name, N'IsComputed') AS bit) AS [IsComputed],
ic.is_descending_key AS [Descending],
ic.is_included_column AS [IsIncluded]
FROM sys.tables AS tbl
   
INNER JOIN sys.indexes AS
      
ON (i.index_id >AND i.is_hypothetical = 0) AND (i.object_id = tbl.object_id)
   INNER
JOIN sys.index_columns AS ic 
      
ON (ic.column_id >AND (ic.key_ordinal >OR ic.partition_ordinal =OR ic.is_included_column != 0)) 
         
AND (ic.index_id = CAST(i.index_id AS int) AND ic.object_id = i.object_id)
   INNER
JOIN sys.columns AS clmns 
   
ON clmns.object_id = ic.object_id AND clmns.column_id = ic.column_id
WHERE (i.name = N'[MyIndex]') AND ((tbl.name = N'[MyTable]' AND SCHEMA_NAME(tbl.schema_id) = N'[MySchema]'))
ORDER BY IsIncluded, [ID] ASC

but, there isn't a nice clean way to show all of the included columns for all indexes for a particular table... until now :)

A couple of weeks ago I sat down and rewrote sp_helpindex. I was actually on a plane from Hyderabad to Frankfurt or from Frankfurt to San Fran or from San Fran to Seattle (it was a long day :) and I was using (and well, forcing myself to learn how to use :) my new Vista laptop. OK, that's a HUGE story in and of itself and it definitely warrants its own post but I'll sum up the story with the fact that I had to purchase a new laptop while in Hyderabad because BOTH my primary laptop (T61p) AND my backup laptop (T60p) BOTH (yes, BOTH!!!) suffered catastrophic disk failures on their boot drives within 24 hours of each other. In the end, I really cannot believe the "coincidence" of two laptops crashing within 24 hours of each other. Yes, I thought MTBF too (at first) but the laptops were two Lenovos - one Lenovo (the T60p) was purchased in Feb 2007 and the second, a Lenovo T61p was purchase in Oct 2007. And, it was the T61p that went first. The only thing I can even begin to speculate about and/or think to attribute it to (as I was in India for 17 days from Mar 3 through Mar 20 and this all started on Mar 17) was an overactive metal detector at the hotel at which I was staying (or something related to St. Patrick but I've since ruled that out - and no, I wasn't drinking green beer either...). OK, I really need to do another post to give you all of the details about this trip BUT, I did get a new laptop... and, having just bought it only shortly before I flew back I felt like I really needed to get my money's worth so I just *had* to work on the flights home (ah, security with *3* laptops was NOT fun and I'm *VERY* glad that none of them asked me to "boot" my laptops to prove they were working... that could have been a VERY bad situation... lol).

OK - so back to the story... I was working on the flights and I was preparing to deliver some content on the Friday after I returned (yes, I taught a full day in India on Wednesday then flew back leaving India at 2:15am Thursday morning so that I could arrive back in Redmond at roughly 7pm Thursday night - about 30 hours later - and then teach Friday morning for an 8:30 start time... ah, I was *really* tired on Friday night :). Anyway, in preparing, I decided that I finally needed to re-write sp_helpindex. When I was first writing it, I was only thinking of SQL Server 2005. So, here's the 2005 version that I wrote: sp_helpindex2_2005.zip (2.71 KB).

So, I had wanted to blog that when I got back to Redmond but in preparing for the trip we're on now AND rebuilding my primary and backup laptops, well, it got tabled. So now, today, Paul and I are in Iceland (working with our great friends at Miracle Iceland) and we're teaching "the Accidental DBA" (this past Monday) and SQL Server 2008 New Features in Database Infrastructure and Scalability (Tue through Thursday)... I was giving a lecture on Filtered Indexes in SQL Server 2008 and I, once again, found myself needing a better sp_helpindex. So, when Paul got up to talk about Compression (which is no short lecture for him :), I had time to rewrite sp_helpindex... again. And, here's what I ended up with...

exec sp_helpindex2 'member'

index_name index_description index_keys included_columns filter_definition
member_corporation_link nonclustered located on PRIMARY corp_no NULL NULL
member_ident clustered, unique, primary key located on PRIMARY member_no NULL NULL
member_region_link nonclustered located on PRIMARY region_no NULL NULL
NCIndexCoveringLnFnMiIncludePhone nonclustered located on PRIMARY lastname, firstname, middleinitial phone_no NULL
NCIndexCoversAll4Cols nonclustered located on PRIMARY lastname, firstname, middleinitial, phone_no NULL NULL
NCIndexLNinKeyInclude3OtherCols nonclustered located on PRIMARY lastname firstname, middleinitial, phone_no NULL
NCIndexLNOnly nonclustered located on PRIMARY lastname NULL NULL
QuickFilterTest nonclustered located on PRIMARY lastname phone ([lastname]>'S' AND [lastname]<'T')

So, in the end, I can quickly see whether or not my index has included_columns and/or a filter_definition. Don't get me wrong, these indexes above are NOT necessarily a good combination of indexes (or recommendation of ANY kind) to have - these were just created to make sure that my code works. And, as my good friend Gunnar would say - "it's not my best code but it's not my worst code either" <G>. And, so, here it is: sp_helpindex2_2008.zip (2.75 KB).

Pretty darn useful for sure! Oh, and I used the undoc'ed sp_MS_marksystemobject so that I could still create the sp_ in master but then execute it in all other databases. It's frustrating that this behavior (with sp_ named objects) no longers works in 2005/2008 but at least the sp_MS_marksystemobject still sets the behavior so that we can create this one proc in master but use it in all other databases.

Have fun!
kt

Wednesday, April 02, 2008 10:22:40 AM (Pacific Standard Time, UTC-08:00)  #    Comments [4]  | 
Wednesday, March 12, 2008

A couple of weeks ago, Paul and I recorded two interviews with TechNet Radio... both are ready for download and in multiple formats! 

Our specific interviews can be downloaded from the following links/formats:
  SQL 2008 Part 1 of 2: Security and Availability WMA | MP3 High | MP3 Low
  SQL 2008 Part 2 of 2: Management, Troubleshooting and Throttling  WMA | MP3 High | MP3 Low
  More TechNet Radio interviews (and *lots* of other shows), can be found on Channel 9.

Enjoy! 
kt

Wednesday, March 12, 2008 8:11:17 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, February 05, 2008

I know that Paul and I recommended that you subscribe to Conor's blog... but have you? He's posted some great details on Partitioning (Part 1 and Part 2) as well as statistics and it always reminds me of how much I can learn from other people's perspectives!

And, just to dove-tail on some of his statistics comments... I, too, have found that as tables get significantly larger AND have non-standard distributions of more than 200 distinct values (and un-even distribution between those values as well), that the optimizer just cannot possibly do a perfect job. The only way an optimizer can be good is when it can "find a good plan fast" (which I first heard from Nigel Ellis (former Development Manager of the Query Processor team) - back when he delivered a Pacific Northwest SQL Server User Group meeting many moons ago). The most important thing to realize is that it's just not possible to waste time to find the absolutely best plan... mathematically analyzing all permutations would be prohibitive - you'd have to take a vacation between query executions (wait, that's not a bad idea... I digress :).

The point:

  1. Make sure that statistics are up-to-date (either through the database option: auto update stats OR by manually updating statistics)
  2. Consider re-evaluating statistics over large tables (and, when poor performance occurs - look at the estimated rows v. the actual rows - if the estimate/actual are off by a fact of 10, then it could be the statistics). I'd try updating stats first and then if that doesn't work, updating with a fullscan. If neither of those work, I'd also re-evaluate other possible indexes (there are some distributions between tables being joined that just can't show a correct correlation between the values when in multiple indexes... sometimes the best index is a multi-column (ie. composite index)). 
  3. Consider breaking very large tables down into smaller chunks (not just table index partitioning but possibly Partitioned Tables AND Partitioned Views) as this can give the optimizer additional details about partiticularly interesting data sets. Even in SQL Server 2008, statistics are still table-level (filtered indexes can provide some, but not complete, relief... I'll give more details in a later post) but I'd often argue that some of the best table designs are not just for a single table. Consider the statistical, locking, and indexing implications for mixed workloads against a single table (and the tremendous amount of blocking that could occur in addition to varying access patterns). And, even while 2008 will offer Partition-level lock escalation, well-designed tables may not need it! I know I've mentioned this before but different perspectives on statistics, optimizers and the fact that a good optimizer has to be efficient in-and-of itself, remind me of some of the most basic things that are also the most common problems contributing to poor query performance.

Returning to the basics and optimizing a system from the ground up always leads to better scalability!

Enjoy!
kt

 

Tuesday, February 05, 2008 12:06:01 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Sunday, January 13, 2008

I had started to write this blog post when we (Paul and I) were on our way back from Zurich on November 21. We had been in Zurich presenting a TechNet DeepDive on Database Maintenance Best Practices...after presenting at ITForum in Barcelona...after presenting at SQLConnections in Las Vegas (well, we did spend 30 hours at home in between those last two conferences :)). Once we returned to the US on Wednesday night, it was just before Thanksgiving and well, a few personal things prevented me from getting as much work done as planned in December... and then the holidays hit... and then I realized I was horribly behind and so I've been playing catchup ever since. Ah, Happy New Year. 

Now, I'm really back in the swing of things and I wanted to let you know what we've done!

First things first - we've posted almost all of our resources from our conferences in November (we still have a few more to tweak/post). We'll catch up with the remaining scripts this week! Check them out on our Past Events page here: http://www.sqlskills.com/pastConferences.asp.

And, now, here are my thoughts as I wrote them on November 21 - with a few updates along the way in this font:

**********

It's been a great 2.5 weeks with 4 full-day workshops, 17 sessions, 3 interviews and 8 flights to conferences/events in 3 countries. OK, Paul has been with me too and that hasn't stopped him from blogging (I know you're all thinking this :) but, he blogs shorter blogs posts and I go for quality rather than quantity (TOTALLY kidding...lol...I am SO in trouble for that one (update: I'm glad I didn't post this until after the holidays!)). And, well I've also been pushing through a bit of a cold (and, Paul doesn't sleep ;=).

So, here's a long post to catch you up on all the travels and even some of the great questions we've been asked while running around from conference to conference. First, it was a great week in Vegas for SQLConnections. We ended that week with a full-day workshop that was all hands-on on SQL Server 2005. Our 80+ attendees downloaded a VPC image and used that for the base environment for labs on Database Mirroring, Database Snapshots, Partitioning, Partial Database Availability and Online Piecemeal Restore... (Update: We received our evaluations from this session (only 37 evals were submitted) but they really seemed to enjoy it! Our reviews were fantastic (literally 4 of 4) and so, we're planning to do this again for the Spring SQL Connections. More details coming.)

And, speaking of the VPC/DVD... We've had a lot of requests for these resources (the DVD, the lab manual, the utilities, etc.) and we're already working on an updated version of this for SQL Server 2008 CTP5 which just came out on Friday (November 16) (update 1/13/2008: we've finished the November CTP update and it will be available on the SQL Server 2008 Readiness Kit). And, that's just a start! (update 1/13/2008: we've also released the DDM, check out Paul's post here). With 9 labs on the current SQL Server 2005 version, a few new exercises planned for the first 2008 version (the labs include exercises on automatic page repair for Database Mirroring AND updates for Peer to Peer including the Peer Topology Viewer, etc.) AND a second DVD on Manageability (already working on labs for SQL Server Policy-based Managment and Performance Studio (update 1/13/2008: these two are done too and they will also be available on the SQL Server 2008 Readiness Kit). I think we'll have a TON of resources to help you get started with SQL Server 2008 by the time it releases. We're also looking (for the first time!) to put together a way for you to access these resources more directly. I *promise* we'll keep you posted on that as soon as we have the final outcome! (update 1/13/2008: and, that's the Readiness Kit! Now, we just need to find out all of the details on how/when you can access it. We'll let you know as soon as it's available! It's likely that you will receive it as an attendee at a launch event - February 27, 2008. Check out the launch portal here and you can see when/where a launch event is coming to you: http://www.microsoft.com/heroeshappenhere/register/default.mspx. I can't promise that they'll all give out the Readiness Kit but, that's the most likely place (of which I'm aware at this point) where you'll receive one. If I find out any additional information, I'll post it on my blog.)

As for some of the favorites from the labs - people seem to love the Database Mirroring SQLCMD master script that sets up the High Availability Configuration for Database Mirroring. So, I thought that this might be an interesting script to post here: GenericDatabaseMirroringSetup.sql (16.73 KB). And, to make it even more flexible, I have modified this script quite a bit and made almost everything parameterized (PrincipalServer, PrincipalDNS, PrincipalPort, MirrorServer, MirrorDNS, MirrorPort, WitnessServer, WitnessDNS, WitnessPort, Database2Mirror, RestoreWithMove, BackupLocation). Also, you can decide to "move" the database being restored on the mirror (to the mirror instance's default data root) OR keep the database backup directories exactly the same (which is generally the recommended configuration). In our lab, we move the database to the instance's default directory when it's restored on the mirror because all three instances are on the same virtual machine. However, in a real-world database mirroring environment, you want to try and avoid moving the database to a different drive location on the mirror because future changes might cause the mirroring partnership to become broken. So, if you want to play with this script, be sure to modify the parameters at the beginning of the script and then make sure you test this in your environment (and, let me know if you run into any snags... I know this works in our VPC but I just whipped up the modified script relatively quickly so it's a *learning* script more than anything!). And, here are a few tips for successfully implementing a database mirroring partnership:

  1. Make sure that the principal and mirror server are either identical to each other (in every way) from disk to memory to CPU, etc. When the principal and mirror are identical, you are more likely to minimize performance problems during a failover AND you're less likely to have additional slowdowns on the principal (in a synchronous mirroring configuration) by slower hardware on the mirror. If you don't have identical hardware then you want to either choose an asynchronous database mirroring configuration OR be sure to thoroughly test your before and after failover configurations to ensure that performance is "good enough" in a failover sitation. Some shops find it acceptable to run slower during a diaster (v. having downtime) and therefore they choose a secondary server (for the mirror) that's not quite as powerful as the principal... However, beware that if you're in a synchronous mirroring configuration then your mirror's performance might affect the performance of the principal.
  2. Test BOTH your OLTP activity load as well as your batch load AND be sure to test over an entire business cycle. You might be surprised to find that month end processing might surprise you AND/OR a batch process that runs maintenance (for example an index rebuild OR an index defrag over a heavily fragmented table). We've seen a few customers that had network problems when under a significant load and if not tested, this could compromise your mirroring partnership by not allowing the mirror to stay in sync OR it might cause "throttling" at the principal. If the database mirroring partnership has over 1MB of of unsent log waiting, then performance on the principal will be slowed to try and help the mirror catch up.
  3. Be sure not to make any assumptions about the way that database mirroring works - even if you have multiple databases being mirrored on the same server. Database mirroring is always between only ONE "principal" database and it's mirror copy. If you want to mirror multiple databases on the same server - to the same secondary server - that's possible. However, this presents additional problems in that a failover is between ONLY a single database and it's mirror. If you have an isolated database failure that causes one of (let's say four) related databases to failover, then three-part naming will suddently start failing as you have a combination of mirrors and principals on the same server and local (three part naming) won't work against a mirror database as the mirror database(s) cannot be accessed directly. As a workaround, you could create an alert on the WMI event: database_mirroring_state_change which then forces a failover for the remaining databases... Effectively all four would then failover. While this will work - and, the alert will fire relatively quickly after the first database fails over, it's important to realize that alerts (and this WMI event) are all *after* the actual failover. The secondary failovers would be asynchronous, response-based events. As a result, there could be a few transactions that fail between the first database failing over and the remaining databases failing over. In a diaster case however, this might be acceptable and if your application is well designed, you might be able to make this relatively seemless...

Also, within the SQL Server 2005 Always On VPC, we found a bug (yes, I know... the shame of it!! ;-) but (and this is our saving grace...) it's fairly minor and only requires a couple of quick changes to database mail configuration options. We decided late in the game (which is always a bad mistake) that we needed wanted to change the Windows Server name (and we all know what a pain that is!). In fact, there have always been problems associated with a server name change (at least problems for SQL Server picking up this change):

  • The server's @@servername setting will not be correct. There are actually a few problems that can occur as a result of this but it's also very easy to fix. The steps are quick and Tibor (an MVP in Sweden) already blogged them here: http://www.karaszi.com/SQLServer/info_change_server_name.asp so I won't repeat.
  • Jobs will have problems when edited. The problem with jobs is that they have a different "originating_server" than the new (when changed) servername and so when you open a job and try to modify it, SQL Server thinks that the server is a target server in a master/target environment. And, when you're just a target server receiving jobs from a master, those jobs cannot be modified, they can only be executed. As a result, the job cannot be changed. If you're going to change your servername, you need to make sure that your jobs change with it. Tibor also blogged the update to msdb.dbo.sysjobs that's necessary.
  • This one is NEW for SQL Server 2005 - The Windows groups that SQL Server creates on non-failover cluster servers (if you're on a failover cluster, you must create the security groups manually - check out this whitepaper for more details on failover cluster setup: http://www.microsoft.com/downloads/details.aspx?FamilyID=818234dc-a17b-4f09-b282-c6830fead499&DisplayLang=en) will no longer correspond to the server name (after the change). So far, I have not seen nor heard of a case where this creates a problem - but I won't be surprised if one of you responds with an issue or two! SQL Server 2005 uses security GROUPs to manage security and service/component-level permissions so that service account changes don't require permissions to be removed from that account name (in 2000 they did this and the side-effect was that permissions that had already been granted for some other reason - and just happened to be duplicated with the required SQL Server permissions - were removed when you changed the service account. As part of the service account name change and cleanup, they removed all of the permissions needed to run an instance of SQL Server. In SQL Server 2005, they changed this model solely to put you into the correct server group (like SQLServer2005MSSQLUser$servername$instancename) and take the former service account out.

And, of course, there are potentially a lot of [additional/other] external dependencies when you make a servername change... and, well, that's where we missed one. Part of it is because we also setup a POP3/SMTP mail server inside the VPC and when we changed the servername, we also changed the mailserver name (and then forgot to change the Database Mail settings, Outlook Express Account and SQL Server Agent Operator settings). So, you need to change the references to it. The servername was SQLHASP1 and in the June edition of the Always On DVD, we upgraded to SP2 and changed the servername to SQLHAVPC (notice the more generic name... duh!). For completeness, I'll put a bit more detail here:

Database Mail - Manage account, View existing account, check correct domain and server name
SQL Agent - Ensure operators have correct domain and NET SEND address
Outlook Express - Ensure correct server name + domain name

And, now, you should be able to sucessfully use this POP3 mail server in the lab exercises on WMI Events with Database Mirroring. For more details and references (even if you don't have the lab content) check out the links to all of the Database Mirroring whitepapers here: http://www.sqlskills.com/whitepapers.asp. (update: Paul blogged about a bunch of new whitepapers here and some of these are very new!)

**********

And, that's all I had in my November 21 version of the blog post (see, this is what happens, I write a huge post and then start researching things and put the post on the side to post later... and, later is WAY later in this case.... Sadly, I have a bunch of half-finished posts like this. More posts than time :)

So, a lot more to post over the coming days/weeks as the SQLskills team is heads down in SQL Server 2008 working to create some great content (for example, the labs mentioned above).

And, HAPPY NEW YEAR! (at least I said this before January was even over :)

Cheers,
kt

Sunday, January 13, 2008 12:50:05 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, January 12, 2008

Over the past couple of years, we've had *tons* of requests for the DDM (the Dual Database Monitor - which is a little utility for testing and checking out a mirroring partnership...) and finally, we've (ha, pretty much only Paul :)) made it more flexible (it was hard-coded with specific insert procs). He's created an ini to allow implicit or explicit client redirection (to see the benefits of Transparent Client Redirect), written a help file, and he'll keep you posted with updates/bug fixes...

If you're interested, Paul's blogged about it here: http://www.sqlskills.com/blogs/paul/2008/01/13/DualDatabaseMonitorNowAvailableToOrder.aspx

Have fun and HAPPY NEW YEAR!
kt 

Saturday, January 12, 2008 11:41:12 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, December 05, 2007

Every time we're at a conference, we get asked whether we're going to write any books on the kinds of things we talk about. Well, at present the answer is no - we just don't have time unfortunately. And, honestly, we like doing smaller chunks of more timely content rather than a complete book which takes a lot longer to "get to press". However, we have a plan that we'd like to gauge interest in to see whether or not it's worthwhile (Paul already blogged about this).

Basically, there are two full-day workshops that we presented this year, Database Maintenance: From Planning to Practice to Post-Mortem and Disaster Recovery: From Planning to Practice to Post-Mortem. Each of these has 100+ slides in them and we add *a lot* of content while we're presenting...just looking at the slides really isn't enough. However, our plan is to take each of these slide decks and then add comprehensive notes for each slide. Once complete, we'll create a full printout of the entire deck - with notes - for US $99.99. In addition to the printout, we will create a section on our website that includes relevant whitepaper links, KB article links, and blog post links - corresponding with specific slides/topics from the deck. Additionally, as we'd be shipping the book to you anyway, we'll throw in a free copy of the AlwaysOn DVD and the Manageability DVD, with around 20 hours of self-paced labs on them. Shipping and handling would be included for domestic US orders, and US $10 extra for non-US orders.

So, if there's enough interest in this, then we'll go forward and try to make these available for ordering by January/February 2008. So - if you're interested in this (no obligation) please drop Paul a mail.

Thanks,

kt (& pr)

Wednesday, December 05, 2007 5:51:51 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Sunday, November 04, 2007

OK, well, the first day is over and we're starting to relax... just had a nice meal in our room and we're off to each do a blog post (or what might turn into a couple :) regarding the things we each discussed in our full day workshop today.

So, it was a great day and a great way to start the Connections event. We had roughly 170 people attend our Sunday workshop that started at 9am which is especially impressive in Vegas!! And, it was a great way to start because it felt like everyone was ready with questions and a few folks (especially those with jet-lag) said that they didn't think they'd make it through the day but then were surprised at how fast it went and that we managed to keep them awake :)...

As for the questions, we answered a lot of them during the session but as it always happens, we each remembered additional references, sites, and/or details that we always want to post after the fact - this post is the result. I'll try to put headers on the sections but I think this will be a long one!

Here's a link to Paul's Q&A blog post from today's session:
http://www.sqlskills.com/blogs/paul/2007/11/05/ConferenceQuestionsPotPourri1IndexesStatsCorruptionAndEnterpriseonlyFeatures.aspx

So, have fun and we both look forward to seeing you tomorrow in the pre-conference workshop on Disaster Recovery - from Planning to Practice to Post-Mortem. The rest of this post contains random but helpful stuff (and in NO particular order at all). And, there's at least one more post coming after this one as I still have more to add!

Resource Reference List

  • KB#909369: SQL Server 2000 Bug where checkpoint is not appropriately clearing the inactive portion of the log
  • KB#329526: File allocation extension in SQL Server 2000 (64-bit) and SQL Server 2005 (this allows 256KB block allocations)
  • KB#328551: PRB: Concurrency enhancements for the tempdb database (this refers to trace flag -T1118 for SQL Server 2000 AND how to create tempdb on multiple files when on multiproc machines). A good and very complementary read is the "Working with tempdb" whitepaper. AND, the primary author of this whitepaper (Sunil) will be here at Connections this week, so there's another session to visit.
  • KB#224071: How to move SQL Server databases to a new location by using Detach and Attach functions in SQL Server
  • Performance Dashboard Reports (only in SQL Server Management Studio for SQL Server 2005 SP2)
  • sysinternals Zoomit tool: this is what I was using to magnify various parts of the screen, etc.
  • SQL Server Customer Advisory Team Blog and specifically the entry on DMVStats. Also, here's an excellent whitepaper called Troubleshooting Performance Problems in SQL Server 2005 as this has numerous DMV queries and helpful details on troubleshooting.
  • SQLskills Whitepapers list

Progress Report: Online Index Operation
As for the details on this - you want to capture these events: EventClassTextDataEventSubClassObjectIDObjectName, and BigIntData1

And, for the EventSubClass there are multiple values that will be returned. What I think is the most interesting is that many of these events return only one of the values for either ObjectId or ObjectName...so, beware of filtering!

  1-Start (both ObjectID/ObjectName)
  2-Stage1 start (null for both ID/Name)
  6-Insert row count... (ObjectID only)
  3-Stage1 end (null for both ID/Name)
  7-Complete (both ObjectID/ObjectName)

EventSubClass 6 is definitely the most helpful. In the BigIntData1 column, they will show the current row number being processed (essentially) and so, you can guage roughly how far you have gone as well as how long you have to go. Here's a screen shot to help you get some insight into what you will see: OnlineProgressReport.pdf (690.78 KB).

Some operations (e.g. DBCC CHECKDB (and related), SHRINKFILE, ALTER INDEX...REORGANIZE) produce a value for the percent_complete column in sys.dm_exec_requests. Both of these are helpful for assessing where you're at...

Index Creation/Rebuild Order
Paul's written about this AND he's linking to some relevant posts in his blog post from today's session but here's a VERY quick highlight of a few things we chatted about:

Index CREATION order DOES matter (you should always create the clustered index first and then create the non-clustered (as non-clustered indexes depend on the clustering key)
Index REBUILD order does NOT matter as the physical location of the data isn't interesting to the non-clustered indexes as they reference the data by the row's clustering key

VLFs - Virtual Log Files... too many (and typically very small) OR even possibly too large of VLFs....
We had some great discussions around VLFs and I know I've posted a bit on this in the past but I'm not sure I've posted this much detail. So, here are a few relevant points.

First, the size and number of VLFs is determined by the size of the "fragment" added to the transaction log. The "fragment" is added at the time the log is created or anytime the log grows (either manually or automatically). The number of VLFs can be predicted from the following quick guide:

  • If the fragment is less than 64MB, then up to 4 VLFs will be added (the reason I say "up to" 4 VLFs are added is because really small fragments - fragments under 1MB - will actually add even fewer VLFs and I can't remember (nor do I care :) what the exact cut-offs are for < 1MB.
  • If the fragment is greater than or equal to 64MB and less than 1GB, then 8 VLFs will be added.
  • If the fragment is greater than or equal to 1GB, then 16 VLFs will be added.

So, if you create a database with a 100MB log (just as a simple example), you will get 8 - 12.5MB logs... if you then increase the log to 150MB then you will add 4 more VLFs. This really isn't all that bad and in general, I recommend that you have less than 100 VLFs. Often I'll find folks that have thousands and this results in VLF fragmentation. I blogged about this originally here: http://www.sqlskills.com/blogs/kimberly/2005/06/25/8StepsToBetterTransactionLogThroughput.aspx and the steps to help minimize it are there as well. Also, as an update to that post - use DBCC SHRINKFILE with the NOTRUNCATE as it appears to have special meaning on the transaction log... but, I still need to investigate this one a bit more.

Finally, if you pre-create too large of a transaction log then you might have new/different problems... the most likely is that the log doesn't clear as often (because you haven't spilled into a new VLF OR you have a transaction that happens to span two VERY large VLFs) and the effect can be that performance is slowed by the less frequent clearing that has to happen on a very large log... As a way to minimize this, the best way to create larger logs (logs in the 10GB+ range), is to create them in multiple chunks (for example 4-8GB chunks so that you end up with VLFs that are 256MB-512MB instead of 4GB because you created a 48GB log to start).

Installation Instructions for the post-conference workshop AND the HOLs DVD
And, for those of you who want to play with the DVD we handed out today... here are the "generic" setup instructions:
Generic HOLs DVD SETUP Instructions.pdf (20.31 KB)

OK, that's it for today and we look forward to another infomation/question packed session tomorrow!

Cheers,
kt

Sunday, November 04, 2007 8:11:20 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  |