Well - the first half of the year is over and it's been a pretty busy time for Kimberly and I. I only managed to get one day at home in June but I'm at home for whole month of July, and this year I'm taking a proper break. I'm going to be essentially dark through July - no blogging, no Twitter, no forums or newsgroups - so don't expect to hear from me until August.

Hope you have a great summer, and Happy July 4th for those in the US.

Cheers

Categories:
Personal

It's very commonly known that you can use the Script Wizard in Management Studio to script out tables or even an entire database. It's NOT commonly known that in SQL Server 2008, the wizard was upgraded and you can now script out the data too. The feature isn't enabled by default, which is why not many people know about it, and I couldn't find it described in Books Online either, further contributing to it's obscurity - but it's definitely there.

If you go into the wizard, and go down to the Table/View Options, there's a setting 'Script Data'. Set that to True and you'll get a bunch of INSERT statements in the resulting script too. Very cool.

Here's a screenshot showing what I mean (click it for a 1024x768 version).

 

Enjoy!

Categories:
SQL Server 2008 | Tools

In this week's survey I'd like to know how often you run consistency checks on your *most critical* production database, regardless of *how* you run them (we did that survery already - see Importance of how you run consistency checks). I'll report on the results around July 4th.

I'd only like you to answer for your *most critical* production database, as the frequency will probably vary wildly by database, server, production vs. dev/QA and so on. If everyone answers for their most critical database then we won't get skewed results.

*Please* no comments on this post - wait for the survey results post to avoid skewing the answers. I'm very interested in your reasoning, but not until everyone else responds.

As always, a big Thanks! for contributing to the blog/community by responding. Please shoot me an email (Contact button, bottom left of the blog), or ping me on Twitter (@PaulRandal) if you have an idea for a good survey.

PS Thanks to Pat Wright for suggesting this week's topic on Twitter.

In last week's survey I asked how you manage the size of your database *data* files - see here for the survey. Here are the results as of 6/24/09.

The 'other' values are as follows:

  1. 5 x 'manual file growths and a custom mom alter to when the datafiles are 98% full. autogrow set to fixed amount in case we miss the mom e-mail'
  2. 1 x 'auto-grow with a procedure to keep the log file size at 20% relative to the total data file size'
  3. 1 x 'Create with very large initial file size, set auto-growth to %'
  4. 1 x 'Lots of white space in files. Auto-grow to a fixed size (in case of emergency).'
  5. 1 x 'set initial size for 1 yr usage, monitor size, manual grow, autogrow percentage - send alert if it does grow'
  6. 1 x 'set initial size for 2 year growth, capture growth stats daily, monitor physical disk space daily'
  7. 1 x 'Set to autogrow by fixed size to cater for emergencies, otherwise maintain 80-90% free space by daily reporting then manual off-peak size increase if necessary.'

As I mentioned in the survey itself, this is just about database *data* files. I covered log file size management in a previous survey - see Importance of proper transaction log size management.

There are really four parts to data file size management, which I'll discuss below.

The first thing I'll say is that if you're able to, enable instant file initialization on 2005+ - as it will vastly reduce the time required to create data files, grow data files, and restore backups (if the restore has to create data files). We're talking minutes/hours down to single-digit seconds. See Misconceptions around instant file initialization. If you're not one of the miniscule fraction of a percent of customers who have volumes shared between secure file servers and SQL Server instances, turn this on ASAP. Most DBAs don't know about this feature, but everyone I teach it to or show it to are amazed and then go turn it on. No brainer.

Initial data file sizing. This can be tricky. Without getting into the how-to-calculate-the-database-size quagmire, I'll simply say that you should provision as much space as you can, based upon your sizing estimates. Don't just provision for the here-and-now - if you're able to, provision space for the next year's worth of anticipated growth - to prevent auto-growth having to kick-in at all. I rewrote all the Books Online pages for 2005 (links are to the 2008 BOL) on Estimating the Size of Heaps, Clustered Indexes, and Nonclustered Indexes and in the blog post Tool for estimating the size of a database I link to a tool someone put together that codified all my formulas. You can also get sizing tools from hardware vendors too.

Data file growth. If you're able to, auto-grow should ALWAYS be turned on, as an emergency measure in case your monitoring fails - you don't want the database to have to grow but it's unable to and then it stops and the application is offline. However, you shouldn't *rely* on auto-grow - it's just for emergencies. The auto-growth default for data files used to be 10% for 2000 and before, but then changed to 1MB from 2005 onwards (log file default auto-growth remained at 10%). Neither of these are good choices. A percentage-based auto-growth means that as your files get bigger, so does the auto-growth, and potentially the time it takes if you don't have instant file initialization enabled. A 1MB autogrowth is just nonsensical. Your auto-growth should be set to a fixed size - but what that should be is a great big 'it depends'. You need to decide whether you want the auto-growth to be a quick stop-gap, or whether the auto-growth will replace manual growth after monitoring. You also need to consider how long the auto-growth will take, especially without instant file initialization. I can't give any guidance here as to what a good number is, but I'd probably settle on something around 10% (fixed), with the percentage steadily falling as the database size grows. It's very important that you have alerts setup to you can tell when auto-growth does occur, so you can then take any necessary action to grow it even more or tweak your settings.

'Other' response #2 is interesting. There's been a 'best-practice' around for a while that the log file should be sized to be a percentage of the data file size. It's totally unfounded and in most cases bears no relation to reality. The vast majority of the time, the size of the log is *NOT* dependent on the data file sizes in any way. Imagine a 10TB database - would you provision a 2TB log? Of course not. Now, I can see special cases where the operations performed on the tables in the database might affect a fixed portion of the largest table in a single batch, and that could generate enough log (with reserved space too) to equal 20% of the data file size - but that's a pretty nonsensical special case, to be honest. You shouldn't use 'set the log as a percentage of the data file' as a guideline.

Data file size/usage monitoring. There's a growing movement towards monitoring the data file usage and manually growing the files as they approach 100% full - avoiding auto-growth altogether, but still having it enabled for emergencies. In my book, this is the best way to go as you have all the control over what happens and more importantly, when it happens - especially without instant file initialization. There are some quirks here though. SCOM, for instance, has logic that disables file size and usage monitoring if you enable auto-grow. It assumes that if you enable auto-grow then you're not interested in monitoring. I happened to have one of the SCOM devs in my last maintenance class I taught on the Redmond MS campus and he's going to try to get that logic fixed.

Data file shrinking. Just this morning I wrote a long blog post about this - see Why you should not shrink your data files. Running data file shrink causes index fragmentation, uses lots of resources, and the vast majority of the time when people use it, is unnecessary and wasteful. It should NEVER be part of a regular maintenance plan - as you get into the shrink-grow-shrink-grow cycle which becomes a zero-sum game with a ton of transaction log being generated. Think about this - if you run a shrink, which is fully logged, then all the log has to be backed up, log-shipped, database mirrored, scanned by the replication log reader agent, and so on. And then the database will probably auto-grow again through normal operation, or some part of the maintenance job that rebuilds indexes. And then you shrink it again. And the cycle continues...

Bottom line - make sure you size the data files with some free space, have auto-growth set appropriately, have instant file initialization enabled if you can, monitor file sizes and usage, alert on auto-grows, and don't use shrink. And don't use shrink. Really.

Next up - this week's survey!

This afternoon we headed down to Lake Storey in Galesburg, IL to chill out and get some sun. While the kid's were splashing about in the lake and Kimberly was soaking up rays, I went on an hour long nature ramble along the lakeside. I saw lots of cool stuff - herons, turtles, frogs, dragonflies, woodpeckers, and more.

I've put a bunch of photos up in a Facebook album - you can get to them here (no login required). Here a three examples for you.

   

Enjoy!

PS Should mention - camera is a Panasonic DMC-TZ5 with a LEICA 10x optical zoom. Which I'd had one our 20Ds with the 300IS lens...

(Been a while longer than usual since blog posts - I've been really busy flying around the country doing stuff onsite with clients. Normal service will be resumed at the start of July :-)

One of my biggest hot-buttons is around shrinking data files. Although I used to own the shrink code while I was at Microsoft, I never had a chance to rewrite it so that data file shrink is a more palatable operation. I really don't like shrink.

Now, don't confuse shrinking the transaction log with shrinking data files. Shrinking the log is necessary if your log has grown out of control, or as part of a process to remove excessive VLF fragmentation (see Kimberly's excellent posts on this here and here). However, shrinking the log should be a rare operation and should not be part of any regular maintenance you perform.

Shrinking of data files should be performed even more rarely, if at all. Here's why - data file shrink causes *massive* index fragmentation. Let me demonstrate with a simple script you can run. The script below will create a data file, create a 10MB 'filler' table at the start of the data file, create a 10MB 'production' clustered index, drop the 'filler' table and then run a shrink to reclaim the space.

USE MASTER;
GO

IF DATABASEPROPERTYEX ('DBMaint2008', 'Version') > 0
 DROP DATABASE DBMaint2008;

CREATE DATABASE DBMaint2008;
GO
USE DBMaint2008;
GO

SET NOCOUNT ON;
GO

-- Create the 10MB filler table at the 'front' of the data file
CREATE TABLE FillerTable (c1 INT IDENTITY,  c2 CHAR (8000) DEFAULT 'filler');
GO

-- Fill up the filler table
INSERT INTO FillerTable DEFAULT VALUES;
GO 1280

-- Create the production table, which will be 'after' the filler table in the data file
CREATE TABLE ProdTable (c1 INT IDENTITY,  c2 CHAR (8000) DEFAULT 'production');
CREATE CLUSTERED INDEX prod_cl ON ProdTable (c1);
GO

INSERT INTO ProdTable DEFAULT VALUES;
GO 1280

-- check the fragmentation of the production table
SELECT [avg_fragmentation_in_percent] FROM sys.dm_db_index_physical_stats (
    DB_ID ('DBMaint2008'), OBJECT_ID ('ProdTable'), 1, NULL, 'LIMITED');
GO

-- drop the filler table, creating 10MB of free space at the 'front' of the data file
DROP TABLE FillerTable;
GO

-- shrink the database
DBCC SHRINKDATABASE (DBMaint2008);
GO

-- check the index fragmentation again
SELECT [avg_fragmentation_in_percent] FROM sys.dm_db_index_physical_stats (
    DB_ID ('DBMaint2008'), OBJECT_ID ('ProdTable'), 1, NULL, 'LIMITED');
GO

avg_fragmentation_in_percent
----------------------------
0.390625

DbId   FileId      CurrentSize MinimumSize UsedPages   EstimatedPages
------ ----------- ----------- ----------- ----------- --------------
6      1           1456        152         1448        1440
6      2           63          63          56          56

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

avg_fragmentation_in_percent
----------------------------
99.296875

Look at the output from the script! The logical fragmentation of the clustered index before the shrink is a near-perfect 0.4%. After the shrink, it's almost 100%. The shrink operation *completely* fragmented the index, removing any chance of efficient range scans on it by ensuring the all range-scan readahead I/Os will be single-page I/Os.

Why does this happen? A data file shrink operation works on a single file at a time, and uses the GAM bitmaps (see Inside The Storage Engine: GAM, SGAM, PFS and other allocation maps) to find the highest page allocated in the file. It then moves it as far towards the front of the file as it can, and so on, and so on. In the case above, it completely reversed the order of the clustered index, taking it from perfectly defragmented to perfectly fragmented.

The same code is used for DBCC SHRINKFILE, DBCC SHRINKDATABASE, and auto-shrink - they're equally as bad. As well as introducing index fragmentation, data file shrink also generates a lot of I/O, uses a lot of CPU, and generates *loads* of transaction log - as everything it does is fully logged.

Data file shrink should never be  part of regular maintenance, and you should NEVER, NEVER have auto-shrink enabled. I tried to have it removed from the product for SQL 2005 and SQL 2008 when I was in a position to do so - the only reason it's still there is for backwards compatibility. Don't fall into the trap of having a maintenance plan that rebuilds all indexes and then tries to reclaim the space required to rebuild the indexes by running a shrink - that's a zero-sum game where all you do is generate a log of transaction log for no actual gain in performance.

So what if you *do* need to run a shrink? For instance, if you've deleted a large proportion of a very large database and the database isn't likely to grow, or you need to empty a file before removing it?

The method I like to recommend is as follows:

  • Create a new filegroup
  • Move all affected tables and indexes into the new filegroup using the CREATE INDEX ... WITH (DROP_EXISTING) ON <filegroup> syntax, to move the tables and remove fragmentation from them at the same time
  • Drop the old filegroup that you were going to shrink anyway (or shrink it way down if its the primary filegroup)

Basically you need to provision some more space before you can shrink the old files, but it's a much cleaner mechanism.

If you absolutely have no choice and have to run a data file shrink operation, be aware that you're going to cause index fragmentation and you should take steps to remove it afterwards if it's going to cause performance problems. The only way to remove index fragmentation without causing data file growth again is to use DBCC INDEXDEFRAG or ALTER INDEX ... REORGANIZE. These commands only require a single 8KB page of extra space, instead of needing to build a whole new index in the case of an index rebuild operation.

Bottom line - try to avoid running data file shrink at all costs!

I came across a question on ServerFault this afternoon that inflamed my desire to be ultra-geeky (it was really already inflamed after teaching backup and restore internals all afternoon). Basically the question boiled down to how to find out who dropped a table if there's no other way except the transaction log (e.g. no tracing is available, even the default trace). So I hacked around and figured out at least how to find out *when* a particular table was dropped plus the UID and SPID of who dropped it.

Everything hinges on using undocumented commands to look into the transaction log. I've played with this before on the blog: fn_dblog.

First off I created a script to create a database, populate a table and then drop it.

USE master;
GO
CREATE DATABASE FnDbLogTest;
GO
USE FnDbLogTest;
GO

CREATE TABLE TestTable (
    c1 INT IDENTITY,
    c2 CHAR (100) DEFAULT 'a');
GO
CREATE CLUSTERED INDEX TT_CL ON TestTable (c1);
GO
INSERT INTO TestTable DEFAULT VALUES;
GO

SELECT OBJECT_ID ('TestTable');
GO

DROP TABLE TestTable;
GO

First we need to find the transactions that drop tables in the log. The following code works for 2000, 2005, and 2008:

SELECT [Transaction Id], [Begin Time], [UID], [SPID]
FROM ::fn_dblog (NULL, NULL)
WHERE [Transaction Name] = 'DROPOBJ'
GO

The (NULL, NULL) is the starting LSN and ending LSN to process - NULL means process everything available. 

Results on SQL Server 2000 will look as follows for me logging in as a Windows login in the SA role:

Transaction Id Begin Time              UID         SPID       
-------------- ----------------------- ----------- -----------
0000:000000e0  2009/06/16 18:23:03:320 1           51

And for the same circumstances on 2005 and 2008, the results look like:

Transaction Id Begin Time               UID         SPID
-------------- ------------------------ ----------- -----------
0000:00000587  2009/06/16 17:49:56:927  -1          51

If the user who dropped the table logged in as a role member, and they're not connected as that SPID any more, you may not be able to tell who it was unless you're also tracking successful logins into your server - but you'll at least know what role it was.

Now, this only shows us that a table was dropped, not which table it was. There's no way to get the name of the table that was dropped, only the object ID - so you'll need to have some other way to determine what the table ID is if there are multiple table drops and only one of them is malignant.

For SQL Server 2000, the code to find which object ID we're talking about is as follows (dropping the Transacation Id into the WHERE clause):

SELECT DISTINCT [Object Name] FROM ::fn_dblog (NULL, NULL)
WHERE [Transaction Id] = '0000:000000e0'
AND [Context] = 'LCX_IAM';
GO

Object Name     
--------------------
(2009058193)

The object ID in parentheses is the ID of the table that was dropped. 

For SQL Server 2005 and 2008, the code is as follows (with the same Transaction Id substitution):

SELECT TOP (1) [Lock Information] FROM ::fn_dblog (NULL, NULL)
WHERE [Transaction Id] = '0000:00000587'
AND [Lock Information] LIKE '%SCH_M OBJECT%';
GO

Lock Information
--------------------------------------------
ACQUIRE_LOCK_SCH_M OBJECT: 8:2073058421:0

The 8:2073058421 is the database ID and object ID of the table that was dropped.

Now you can go find whoever it was and take whatever action you deem appropriate Wink

Hope this helps!

PS If you find the you don't get enough info from ::fn_dblog, try turning on trace flag 2537. It allows the function to look at *all* possible log, not just the active log.

It's lunchtime for my class - time for a blog post! This is an interesting one that crops up every so often (just an hour ago on SQL Server Central) and is not known very widely at all.

There's a misconception that you cannot run DMVs in databases that have compatibility levels of 80 or less. It's not true.

Here's an example:

USE master;
GO
EXEC sp_dbcmptlevel AdventureWorks, 80;

USE AdventureWorks;
GO

SELECT [object_id], [index_id]
FROM sys.dm_db_index_physical_stats (
    DB_ID ('AdventureWorks'), NULL, NULL, NULL, NULL)
WHERE [avg_fragmentation_in_percent] > 30;
GO

You get the incredibly useful and explanatory message:

Msg 102, Level 15, State 1, Line 2
Incorrect syntax near '('.

What the heck does that mean? It really means that the query processor refused to recognize the DMV name because the database is running in 80 or lower compatibility level.

Here's the trick - execute the DMV in the context of database that's in 90 compatibility level or higher, and it will let you do it - even if you're actually targeting a database in a lower compatibility level.

USE master;
GO

SELECT [object_id], [index_id]
FROM sys.dm_db_index_physical_stats (
    DB_ID ('AdventureWorks'), NULL, NULL, NULL, NULL)
WHERE [avg_fragmentation_in_percent] > 30;
GO

object_id   index_id
----------- -----------
18099105    2
30623152    2
30623152    3
30623152    4
66099276    1
66099276    2

...

Note that I executed the DMV from the context of master, but targeted AdventureWorks, which is in the 80 compatibility level. 

Have fun!

Categories:
Misconceptions | Tools

A couple of weeks ago while Kimberly was in Mountain View at the same client I'm at, our good friend Richard Campbell sent me email saying 'Bored? How about doing a .NET Rocks!?'. I haven't been on the show since the infamous 'coming out' episode in 2007 where we hijacked Kimberly and mercilessly taunted her about us being together (Carl: So... when was the last time you made... 'whoopie'? Me: Is that some kind of Jello?). See show #217 for that one.

Anyway, in this show we talk about what developers need to know about how SQL Server behaves and for once we actually focus on the content (GUIDs, LOB storage, SharePoint, performance tuning, backups, and of course Lego) and don't devolve into madness and crudity (well, except maybe for a little bit at the end).

Check it out - show #455. Enjoy!

Categories:
Interviews

In this week's survey I'd like to know how you manage the sizes of your database *data* files (remember we've already done log file size management). I'll report on the results around 6/21/09.

I say this every week in the PS, but I'm moving it up here because I don't like having to delete comments and send email, *please* no comments on this post - wait for the survey results post to avoid skewing the answers. I'm very interested in your reasoning, but not until everyone else responds.

As always, a big Thanks! for contributing to the blog/community by responding. Please shoot me an email (Contact button, bottom left of the blog), or ping me on Twitter (@PaulRandal) if you have an idea for a good survey.

PS Thanks to Steve Jones (of SQL Server Central fame) for suggesting this week's topic on Twitter.

In last week's survey I asked you what you think is the hardest thing when becoming an involuntary DBA - see here for the survey. Here are the results as of 6/15/09.

The 'other' responses were:

  • 3 x 'Learning to tell good advice from bad advice'
  • 1 x 'Learning to know *that* you don't know'
  • 1 x 'Management having unrealistic expectation of my capabilities'

All great answers.

We all know the definition of 'involuntary DBA' - someone who is made to assume a DBA role or take on DBA responsibilities, usually against their will. I used to call them 'accidental DBAs' but too many people thought I meant DBAs who had accidents, so I changed my phrase. Let's go through some of the answers, ranked by number of responses.

Learning to know what you don't know. Actually the first step here is learning to know *that* you don't know - if you think you know everything (really in any field and with any expertise level) then you're most likely wrong, no matter who you are and what the field is. It can be a humbling experience making (and accepting) that realization and then quite daunting when you realize that there is maybe *so much* that you don't know and might have to learn. For an involuntary DBA this can be doubly daunting, because not only do you realize that there's a ton you don't know about being a DBA, but that means there's a ton you don't know about managing this (maybe mission-critical) SQL Server that you're suddenly responsible for and your job security is now suddenly dependent on doing the right thing.

Being the DBA as well as my actual job. Many times an involuntary DBA is actually a bolted-on side-role for someone with a real daytime job too - sometimes not even an especially technical daytime job. I'm teaching a class right now for a large customer of ours and about 1/3 of the 30+ people in the class are involuntary DBAs as well as operations engineers. Some other common examples - SharePoint admins who suddenly find themselves with an Enterprise-class SQL Server instance underpinning their SharePoint farm, developers using SQL Express as part of a client application who have to think about the maintenance implications of having SQL Server installed, the person who sits nearest the SQL Server instance when the real DBA leaves or is fired. Sometimes becoming an involuntary DBA can be very rewarding and take you down an unexpected but ultimately welcome career path, but many times its an unwelcome burden, and very occasionally it can be disastrous. A few weeks ago I heard on the forums from someone who had become an involuntary DBA after the actual DBA has been fired - and his job was on the line if he couldn't get the broken database fixed in 24 hours. Brutal. Being a DBA can be a full-time job in itself, and if you ask many production DBAs, they'll tell you that it can be way more than the standard 40-hour work week (seriously, does that even exist any more in our industry?) - so to try and do that *as well* as a regular 9-5 job might be nigh on impossible.

Finding information on what to do. This can be the real kicker - you've realized that there's lots you don't know - and you want to learn - but where to start? Who to ask? There's a great community of SQL Server folks that are very willing to help out - if you're reading this then you've already stumbled in to the community somehow. There are lots of blogs with information on - some of what I post is deeper information than you *really* need to know to get by, but the editorials in the Weekly Survey series are a good overview of some of the topics to be concerned about, and Kimberly also posts a ton of good info on her blog. In TechNet Magazine, I've been writing many articles aimed specifically at involuntary DBAs - here are some links:

Here's a quote from an interview I did with SQLRockstar recently (hope you don't mind Tom):

SR: What advice would you give to new SQL DBA’s just entering the field?

PR: There’s a huge amount you don’t know - that’s just a fact. SQL Server can appear very easy to setup and get running, but there are lots of ways to shoot yourself in the foot if you’re not careful. I’m not trying to be scary, just engender a healthy amount of trepidation. Get yourself a mentor if possible, get some training, practice, practice, practice, read a bunch of people’s blogs, follow folks on Twitter, and so on. You need to be an information sponge for a couple of years at least, and get as much experience as you can with the varied aspects of SQL Server - perf tuning, disaster recovery, security, design, etc. And the more you tinker, play, and learn - the more you’ll realize there’s lots you still don’t know. Accept that and be willing to take advice, make mistakes, and learn and you’ll go a long way. Don’t be intimidated to ask people questions, most of us are happy to help - we all started with zero SQL Server knowledge, no matter where we are now.

The last sentence is the most critical in that paragraph. No matter how knowledgeable anyone in the SQL community might seem, every single one of us started with *no* SQL Server knowledge at all. I had never even heard of SQL Server until I joined Microsoft in early 1999 - and there's a *huge* amount of SQL Server I know very little about - I can barely even spell BI for instance. But I know what I don't know and I know where to go to search for information, or who to ask.

And for all those of you in the SQL community, when you're answering what might, at first glance, look like a stupid question from someone with zero reputation or forum points, consider the fact that they may well be an involuntary DBA trying to find an answer because they know they need help and don't have anywhere better to go. Be kind. There's no such thing as a stupid question - the only stupid thing is *not* asking a question when you know you don't know the answer. A sure-fire way to put someone off asking any more questions is to slap them down when they ask their first.

Now, apart from asking the questions on forums etc, another problem that involuntary DBAs have is knowing what information/answers to trust. This one's very hard, as unless you get to know people in the community, you've got no clue who these people are and why you should believe them. The onus is also on us in the community to make damn sure that when we answer a question it's the *right* answer. This is especially true in the corruption forums where telling someone the wrong answer can deepen the corruption hole the poor DBA is already in.

Management that don't understand databases. And also don't understand that the new involuntary DBA isn't going to immediately and magically become 'super-DBA' overnight and fix all the problems with the SQL Server instance. It takes quite a while to learn the ins-and-outs of being a good DBA, and having the confidence to be able to argue with management about what is and isn't possible. There's no easy solution to this one I'm afraid - education is the only option.

Dealing with actual DBAs and developers. There's a tendency in human nature to be scornful and dismissive of people in the same field who are not as knowledgeable or talented - this can be especially true in the technical area we inhabit where sometimes the people we work with may be egotistical and possibly even lacking in social skills - the 'typical' geek developer. Apologies to all those out there who *are* nice and *do* have great social skills - but after 9 years as a developer and manager of various technical disciplines at Microsoft, I know of what I speak. As an involuntary DBA you may come up against some of this antipathy. My advice - you're just going to have to deal with it until you become more knowledgeable. It's unfortunate, but it's human nature. One way to gain respect and trust from these people is to learn some stuff and solve some problems - don't ever BS about something you don't know about - guaranteed way to quickly lose whatever respect you've built up. You might try complaining to your management, but if it's the same management that added being an involuntary DBA to your regular job, from what I've heard, they're unlikely to listen to you. I don't mean to sound depressing, I'm just being honest.

Performance tuning, disaster recovery, database maintenance, implementing an HA/DR strategy. It's hard to pick which of these is the most important for a new involuntary DBA to focus on - but I think I'd go for database maintenance. That addresses some perf issues (index fragmentation management, statistics maintenance) and some disaster recovery issues (taking backups, running consistency checks). Checkout the TechNet Magazine article above for (what I think is) a great primer, and for perf tuning, see the editorial from a couple of weeks ago that has lots of links: Important considerations when performance tuning.

So if you're an involuntary DBA, the bottom line is that it's going to be a struggle for you - but there are lots of resources out there and people willing to help out.

Don't be afraid to ask.

PS By all means add comments with more involuntary DBA resources and I'll collect them together.

Categories:
Involuntary DBA | Surveys

There are a couple of issues that I've heard of in the last few weeks (one while onsite at a customer) and I think they might bite some people so I'd like to share them with you.

DBCC CHECKDB in 2005 onwards uses a hidden database snapshot to create the transactionally-consistent point-in-time view of the database that it requires to run the consistency checks. The hidden database snapshot is created as a set of NTFS alternate streams on the existing database data files. The alternative to having DBCC CHECKDB do this automatically is to manually create your own database snapshot and run DBCC CHECKDB against that - it's the same thing really.

More info on DBCC CHECKDB's use of snapshots, and potential problems can be found at:

The two issues that I've heard of both are around an inability of DBCC CHECKDB to create the hidden snapshot. In that case it is forced to use locks to stabilize the database, which usually fails because the exclusive database lock required for running the allocation checks portion cannot be acquired.

The first issue is around the permissions of the SQL Server service account. To be able to create the NTFS alternate streams, the service account must have the privileges to create files in the DATA directory of the SQL Server instance. This is a really difficult problem to track down as the actual NTFS failure message is not surfaced by the snapshot creation code.

The second issue is around the use of HP PolyServe. Upgrading to Matrix Server 3.6.1 disables support for alternate streams in the filesystem, effectively breaking DBCC CHECKDB. Here's the paragraph from the 3.6.1 upgrade guide (available here):

In previous releases, MxDB for SQL Server provided ADS support internally for use with various SQL Server features such as the DBCC CHECKDB command. This internal support has been removed in HP PolyServe Software for Microsoft SQL Server. Instead, after all servers are upgraded to 3.6.1, you will need to enable ADS support on all filesystems previously used with MxDB for SQL Server. During the upgrade to 3.6.1, SQL Server operations requiring ADS will fail, as the new ADS support feature is not yet in place on the nodes running 3.6.1. For continuity of SQL Server operations, it is important to upgrade all nodes to 3.6.1 and upgrade filesystems for ADS as quickly as possible.

Enabling support after the upgrade means running the PolyServe psfscheck command (which I believe just runs the NTFS fsutil command under the covers), which unfortunately means taking the volume momentarily offline.

Hope this helps!

On Friday we had a planned spare day in Houston after spending the week onsite with a client. It was the 3-year anniversary of our meeting for the first time at TechEd US 2006 so we decided to go on a road trip! We headed down to the Gulf Coast and along the Bolivar Pensinsula, then across the ferry to Galveston and back up to Houston, stopping at the Hard Rock Cafe (my 12th) for dinner.

The birdwatching was just stunning. We went to the Smith Oaks Bird Sanctuary (one of seven in the area owned by the Houston Audubon Society). It's home to a huge colony of Roseate Spoonbills, various Egrets, and Anhingas (look like Cormorants). Then we headed along the coast, being amazed by the damage from Hurricane Ike - gas stations, houses, and grocery stores reduced to concrete pads, and then hit Port Bolivar to take in some of the sloughs and marshes around there.

I've posted a bunch of stunning bird photos on my Facebook page - here's an example, a Great Egret with chicks - check them out on Facebook here (no login required).

Enjoy!

We're onsite with a great client down here in (very hot) Houston and Kimberly's lecturing right now so I'm digging around finding some info on problems they're seeing. Saw that Tom (aka. SQLRockstar) had posted the email interview we recently did and wanted to get it out there. If nothing else, checkout the photo link in his first paragraph (I only did that once!) and the about-Paul questions at the end.

Great fun  to do - and the first about-me interview I've done for three years.

Thanks to Tom for doing the interview - check it out at http://www.thomaslarock.com/2009/06/interview-with-paul-randal/.

Enjoy!

Categories:
Interviews

This week's survey is all about being a DBA or involuntary DBA - what do you think is the hardest thing when becoming an involuntary DBA? I'll report on the results sometime over the weekend of 6/13/09.

As always, a big Thanks! for contributing to the blog by responding. Please shoot me an email (Contact button, bottom left of the blog), or ping me on Twitter (@PaulRandal) if you have an idea for a good survey.

PS And again, as always, no comments on this post please (I'll delete them straight away) - please wait for the survey results post to avoid skewing the answers. I'm very interested in your reasoning, but not until everyone else responds.

Categories:
Involuntary DBA | Surveys

In last week's survey I asked what you think is the most important thing when it comes to performance tuning, if you *had* to choose one - see here for the survey. Here are the results as of 6/7/09.

Now of course you're all calling 'foul' because I didn't put an 'it depends' option on there which you all would have chosen, but that would have been boring, and I wanted you to pick what you really think is the #1 thing in the majority of cases. Of course, in general the answer is always 'it depends', but some things turn out to be the #1 problem more often than others. For the record, mine and Kimberly's answers were overall indexing strategy, but I deliberately broke them out into clustered and nonclustered to see what people would pick. I'm not surprised that application issues and nonclustered indexing strategy came out top, but good to see that all answers were represented and there's a variety of opinion out there. Variety is the spice of life as they say, or is it garlic? Depends who's cooking I suppose - but I digress.

Let's walk through each option and see how badly they can affect performance if something's wrong. This is by no means an exhaustive list of what could cause problems, just me rambling on at the end of a Sunday, as usual. Blog posts are so much easier than articles because you can have run-on sentences, short little fragments and all sorts of other verboten grammatical beasts.

IO subsystem design/tuning, including RAID: Quite a few things here could be wrong. Is there a write-intensive workload (either data files or log files) sitting on a RAID-5 array (with RAID-5 there's a performance penalty on writes)? Is the disk partitioning alignment, RAID stripe size, or NTFS allocation unit size incorrectly set? (See Are your disk partition offsets, RAID stripe sizes, and NTFS allocation units set correctly? for details.) Is the I/O subsystem just not up to the load being placed on it - high disk queue lengths, page IO latch wait errors, long-running IO errors? Is tempdb the bottle-neck because there a multiple procs with multiple users all trying to create/drop lots of temp tables? (See Misconceptions around TF 1118 for details.) Are there multiple databases using the same I/O subsystem such that disk-head contention is causing performance issues? Is there a network bottleneck with iSCSI storage? Are there filter drivers getting in the way and slowing things down?

Server hardware (CPUs/memory/other): Ok - I'm going to sound like a stuck-record saying 'quite a few things here could be wrong'. Let's just take it as read that each category has lots that could go wrong. A heavy load with a single CPU isn't going to perform well. However, sometimes excessive parallelization can lead to the dreaded CX Packet waits - multiple threads in the query processor waiting for information from each other because the server is overloaded - or blocking issues as locks start getting in the way - but that's more an application problem - oops, wrong section. Having more CPUs is usually a good thing as parallel plans usually run faster than single-threaded plans. Having too little memory can severely limit the buffer pool size and force buffer pool thrashing (low page life expectancies), and large operations to spool data to tempdb, placing further load on resources. Not sure what I meant now by 'other' - maybe having a wireless keyboard could slow down commands getting to SQL Server? Ok - I remember - CPU architecture. 64-bit vs. 32-bit. With 64-bit there's no real limit on the amount of memory that can be addressed, and no mucking around with AWE. There are some tricky issues with NUMA to make sure you don't get cross-node memory accesses (just like CPU cache invalidation could hit you on SMP architectures) but that's getting a little advanced and geeky. Btw - watch out for this when you're using Extended Events in SQL 2008 on multi-proc boxes - there's a setting where you can tell it what kind of CPU architecture you have.

Virtualized vs. 'real' server: Now, I'm the first to admit that I'm not an expert in virtualization, but now I have 3 8-ways and lots of disks to play with at home, I want to be. That's why I picked up a Hyper-V book today after reading Tom's virtual bookshelf post here. What I do know is what I've heard from talking to people at conferences and during classes - the main thing here is I/O virtualization. If there's software virtualization of I/Os, then performance is going to suck under load and you better not do it in production. Microsoft's Hyper-V gets around this, as do things like VMware ESX, by allowing you to assign actual real physical resources to virtual machines. And that's as much as I know about it until the book arrives and I get back from Houston to read it. And no doubt I'll be posting more about it.

Database physical layout (files/filegroups): Tempdb is the obvious case, and I've covered that above. User databases are a bit trickier, and it really depends on the underlying I/O subsystem. A single monolithic data file on a single physical disk isn't going to perform well under heavy concurrent load. But it may do fine on a RAID-10 array sliced-and-diced in a clever way by an expensive SAN. In today's world, more and more databases are on SANs so user database performance is taken care of in that respect, and all you need to worry about is tempdb. I touched on this a little bit in the editorial of a previous survey Physical database layout vs. database size.

Table design: Both ends of the spectrum work well here - from the plastic Costco tables with the fold-down metal legs to antique French-oak farmhouse tables. Both will give many years of service with no degradation of performance. Just like regular tables, SQL Server tables come in all shapes and sizes and you can easily pick the wrong one for what you need, leading to poor performance. This is a massive can of worms - in fact Kimberly spends 2 brain-busting days on this in one class she teaches and she could easily spend a week of 10-hour days on it (oh yes, it's called the SQL Masters program :-) From picking the right column types (see Michelle's post at Performance Considerations of Data Types) to deciding how to store you LOB values (see my post at Importance of choosing the right LOB storage technique) to deciding on a good primary key. Just don't pick a random GUID, or if you have one, replace it. Clustered index keys should be unique, narrow, static and ever increasing. Table design encompasses so many things I can't do it justice in a late evening paragraph. But you get the idea.

Heaps vs. clustered indexes: Kimberly likes to wrap this up with nonclustered indexes too, but I wanted to break them out so I could reference a whitepaper that strongly suggests you should use clustered indexes: Comparing Tables Organized with Clustered Indexes versus Heaps. Read it yourself and draw your own conclusions. Basically, with the possibility of forwarding records in heaps, and the majority of a clustered index's upper levels being in memory, the extra random physical I/Os to access a record in a heap outweigh the aggregated cost of the in-memory binary-searches at each level of clustered index tree navigation. I'm sure some people will argue about this - bring it on. There are very special cases where heaps are better, but not in general.

Nonclustered index strategy: Biggie #1. Where to start here and what to say? For the majority of scenarios, if you don't have a good nonclustered indexing strategy you're not going to have good performance. There's a simple reason for this - without the right nonclustered indexes to allow the query optimizer to choose them, you're going to get table scans. Of your 124 billion row table. How fast is that going to be? Again, Kimberly has this class that she teaches on this where she goes into details on how to index for ORs, for ANDs, for JOINs, and all sorts of other kinky stuff. I need to sit and listen to her one day so I can appear knowledgeable about this. I just know that you need to have indexes with the right selectivity. Ok - I just called Kimberly downstairs in the office to make sure I'm right before I blog something and look like an idiot and she starts with 'erm, well it depends'. I give up. I do *corruption* and *HA* and *maintenance*, not performance tuning. It's bloody hard. Luckily I'm married to one of the best people on the planet for SQL index tuning - wow, what hope do you have? (Of index tuning, not of marrying Kimberly :-) Seriously, index tuning isn't that bad - I just had a moment of weakness. You need to make sure you have the right indexes and no completely unused indexes otherwise you're wasting resources maintaining them. You could do worse than listen to her on RunAs Radio Interview Posted - "Kim Tripp Indexes Everything".

Statistics: If the statistics are out-of-date, the optimizer will not be able to pick a good plan and your performance will suck. It's that simple. Turn on AUTO_UPDATE_STATISTICS and make statistics updating is part of your regular maintenance. Don't forget to update non-index statistics too.

Application (design and code): Biggie #2. Sometimes no amount of cleverness can wring good performance from the twisted logic of a deranged application programmer. An application that is written with absolutely no concern for how SQL Server works is likely to not perform well. An application that is written with too much knowledge of how SQL Server works is likely to fall foul of relying on 'accidental' behaviors, or behaviors that are limited to certain data volumes or workloads.  An application that is going to make heavy use of SQL server has to take into account how SQL Server is going to behave under a variety of workloads, on a variety of servers, and on a variety of I/O subsystems. The common application test framework? A ten-row customer table with a single connection. 'Excellent - my 16-table join with CLR-based aggregations runs in less than 2 seconds'. For 10 rows. Six months later: 'Now we've got 3 million customers, why does performance suck so bad?' Go figure.

Database maintenance: Ok - starting to get tired now and I still need to do this week's survey. Maybe I could just say 'do maintenance' and be done with it? No? Ok - how about go checkout the article I wrote last August for TechNet Magazine: Effective Database Maintenance article. If you don't maintain your indexes, they'll likely get fragmented and affect performance in one way or another and you'll need to play with FILLFACTOR (see Kimberly's Database Maintenance Best Practices Part II – Setting FILLFACTOR). If you don't take care of transaction log VLF fragmentation, it will affect log-based operations (Kimberly again at Transaction Log VLFs - too many or too few?). Seriously - she says I blog so much - how many links are in this post to her stuff? She blogs a lot too!. If you have corruption, it could manifest itself as long-running I/Os. If I don't put my food on to cook, then I'll be eating way too late again. That was a 'Paul maintenance' one that slipped in - and can have disastrous effects on blogging performance. Seriously, you can't just put a database into production and walk away. It will slowly degrade over time. Like red jello melting on a hot summer's day, but probably not as fast, unless you use random GUID primary keys, or it's really hot. What?!?

Performance tuning is an art and a science. But there's a huge amount of science behind it before you have to get into the art side of things. I've just scratched the surface here in a blog post that took me more than an hour to write, banging away non-stop as I do. And I don't do the art side of things. I leave that to Kimberly - she just got one of our client's batch jobs from a 72-hour run-time down to a 6.5 hour run-time. They were pretty happy. We can't all be the Goddess of Performance Tuning Smile

Next post - this week's survey!

Categories:
Performance | Surveys

The July 2009 edition of TechNet Magazine is available on the web and include a feature article I wrote explaining backups. It's the first in a 3-part series, with parts 2 and 3 being on recovering from disasters using backups and recovering from disasters without backups.

The backup article covers:

  • Full backups
  • Differential backups
  • Log backups
  • Putting together a backup strategy
  • Bakcup integrity

It debunks a lot of myths about backups, explains in detail how each backup works, and explains the pros and cons of some different backup strategies.

Check it out at Understanding SQL Server Backups - enjoy!

PS I'll also be presenting a session on backup strategies at SQL Connections in November and covering them in my post-con workshop at PASS in November the week before.

PPS I've just signed a contract with TechNet Magazine so that there will be something on SQL Server in every issue - either my SQL Q&A column or a feature article. Very cool!

There's another DBA 'chain-blog' going around, this time started by Tim Ford, and I was tagged by Tom LaRock. The idea is that you're stuck on a desert island for six months with WiFi and you have to spend it doing something related to work. What would you spend the six months doing? If I break the chain then I won't get enough birthday cards to get in the Guinness Book of Records, or something else depressing, so I'd better join in.

At the moment, I'd probably say continue losing my life to Twitter, SQL forums, and blogging - but then I'd be blogging about blogging and there's the risk I'd get sucked into an infinitely-recursive blog post and I wouldn't get anything done. And Kimberly already thinks I blog too much, so I'd steer clear of that. I'd really prefer to spend the six months diving in the reefs around the island and checking out the stars in a sky with no light pollution, but I think it would be hard to make either scuba gear or a telescope from just coconut shells and sand, and I'm pretty certain that even if I could, the coconut-shell based air compressor for filling the scuba tanks wouldn't hold together when I powered it on either. But hey, where would the power come from? But if there's no power, how would the WiFi work? Hmm - the scenario's falling apart quickly - best cut to the chase.

I spent 15 minutes looking for a clean desert island joke to include here, then gave up, sat watching a heron fly past the deck and then figured out what I'd do. Now, you all know that I'm not a DBA, so I wouldn't have anything to do on a set of systems, but I do have a lot of things I'd like to do around SQL Server but have never had the time. Here are my top four:

Tool to explain corruption messages. The expert system of how to analyze all the corrution error messages that CHECKDB produces to find out what exactly it's telling you, and what may be deleted by repair, basically resides in my head. There's another copy that resides in Ryan Stonecipher's head (the guy that took over DBCC from me), but he's even less likely to have time to do this since he still works at MS. I'd love to program that expert system and make it available. Just like I'd like to put together the 'what exactly does each error message mean' PDF. I did it for 2000 and 2005 while at MS, and it took me 3 weeks. I have all that in my head too. I thought that writing the 2008 Internals book with Kalen would allow by in-head lazywriter to clean out some buffers, but it's all still there.

Write the book on maintenance and DR for involuntary DBAs. I keep getting asked what the good info is out there, and there's no one good place to go. The closest I've come to doing this is the first article I wrote for TechNet Magazine back in August 2008 - Top Tips for Effective Database Maintenance. There's a big need for a book aimed at non-SQL professionals as I see the same mistakes being made over and over in forums. This is another thing where I've got all the details in my head and spread through my blog and Kimberlys blog, but just need time to pull it together into a coherent story.

Write a script that will figure out the size of the next log backup. This is on my list of cool tools to write and give away on my blog. I've already done the how big will the next differential backup be, and the how much data will the next log backup include, but this one is a *lot* harder. I have a good idea how to do it, but I need a bunch of time to sit down and figure it all out in a solid script. And then test it lots.

Write a script that will produce a page checksum on every page. This has been on my list even longer. I know several ways to do this, but they all have undesirable side-effects, except one, which I haven't tried yet. This will be useful because once you enable page checksums after upgrading - nothing happens. A page doesn't get a page checksum until it's read into the buffer pool, altered, and then written out again. And there's no 'touch every page' tool.

Get hold of the SQL source code and add the following: online index rebuild of a partition, diff-based mirroring of FILESTREAM data, proper page-split monitoring, a new shrink algorithm that doesn't cause fragmentation, CHECKDB of a database within a backup without restoring the backup (my patent).

Ok - I guess I should tag some people too otherwise the chain will break and our house will be devoured by giant military squirrels... or something equally unlikely but tangibly scary. I choose my good friends Ward Pond, Greg Linwood, and Adam Machanic.

PS The title is a play on the BBC Radio 4 show 'Desert Island Discs'. It's supposed to be a geeky play on words. Well, I though it was funny and Tom had already used the 'Lovely Bunch of Coconuts' line. Thanks Tom.

Categories:
General

Just found this excellent blog post (continually updated) that lists a ton of free SQL Server tools for all sorts of purposes. Check it out at http://weblogs.sqlteam.com/mladenp/archive/2007/11/20/Free-SQL-Server-tools-that-might-make-your-life-a.aspx.

Categories:
Tools

Wow. This is a driving stunts video by Ken Block - it's just jaw-dropping what this guy can do flying around a dockyard in a car.

Click the image to get to the video. Skip 1 minute in to get past the intro and then it's awe-time for 5 minutes (then they show you pictures of shoes and stuff you can skip). Best watched with the sound turned up.

Enjoy!

PS Thanks to Brent Ozar for showing me this.

Categories:
TGIF

Theme design by Nukeation based on Jelle Druyts