This month's T-SQL Tuesday is being run by newly-minted fellow-MVP Jen McCown (blog|twitter) and the topic is about resolutions. And specifically about resolutions around technology you've been pondering.

One of the things I try very hard to do is make it a bit fun when teaching people about SQL Server. Let's face it, it's a very, very dry topic and it would be really easy to get bored. If you know me at all, you'll know I have a pretty good (quirky) sense of humor.

So, my resolution this year is to continue to have fun!

This involves (in no particular order of priority):

  • Singing
    • I'm the first to admit that I have no singing voice whatsoever. That did not stop us recording a Christmas medley of songs and publishing them for you all to laugh at.
    • Here are the words to my Silent Night rendition:
      • Silent night, workload is light
      • Pager's quiet, all is right
      • Then a disk failure shakes off your delight
      • 824 errors plague you through the nightAll of your backups are baaaad
      • But CHECKDB puts it all ri-ight, CHECKDB puts it all riiiiight.
  • Sheep
    • And talking of baaaad.... I like sheep. In a totally family-friendly way. Everyone needs to go watch this video about a massive pong game on the side of a hill involving cleverly herding hundreds of sheep in the dark, all of the wearing jackets made from hundreds of LEDs.
    • Check it out: http://www.youtube.com/watch?v=D2FX9rviEhw 
  • Electronics

What does all this have to do with SQL Server? Nothing. But it's what makes life fun! And that's my resolution - to keep having fun!

Categories:
Personal | T-SQL Tuesday

Last Tuesday I hosted T-SQL Tuesday #012 (see here for the call for participation) and posed the question "why are DBA skills necessary?" This month broke the participation record with 46 people contributing posts - fabulous! Lots of people contributed for the first time too. And what I think is also really cool is that 45 out of these 46 are on Twitter too.

The downside of so many people participating is that it's taken me quite a while to read through all the posts to compile this summary - maybe I should have picked something obscure and boring instead! One of the most interesting things about doing this wrap-up was experiencing a huge variety of blog themes and fonts - a few of which actually made my eyes hurt!

A HUGE thank you to everyone that participated, T-SQL Tuesday doesn't work without you. I really enjoyed reading through everyone's posts and seeing the passion out there for the DBA role. As I've summarized the posts I've added my own comments too.

Here are the posts from all the participants in the order in which their links appeared in my blog post comments (or I dug out additional posts using Google).

Matt Whitfield (blog | @atlantis_uk): T-SQL Tuesday #12 - Why are DBA skills necessary?

Matt tells the story of himself as a database designer making a plethora of hard-to-remedy mistakes 9 years ago before he had DBA skills that would have helped him not make those mistakes. His take is that it's essential to have someone with DBA skills involved in system design as newbie DBAs or developers don't know the impact of design choices on SQL Server. I love this line from his post: "There is a whole world of epic-database-fail out there, and that world needs us."

Noel McKinney (blog | @NoelMcKinney): T-SQL Tuesday #012 – DBA Skill or Nil?

Noel takes a look at different RDBMS platforms and how the need for DBA skills could change - from Oracle to MySQL to the cloud to NoSQL. Even with automated solutions, he still thinks you wouldn't want to just apply all the default policies and walk away. I'd have to agree. 

Pinal Dave (blog | @PinalDave): SQL SERVER – Are you a Database Administrator or a Database Developer?

Pinal's short post focuses on how the line is often blurred between being a database administrator and a database developer. He has a survey in the post which is currently showing 60% of respondents believe they're more a dev than a DBA.

Audrey Hammonds (blog | @Datachix2): T-SQL Tuesday: Why are DBA Skills Necessary? – A Datachix Perspective

Audrey's post is about how some of the knowledge a DBA has is essential for a database developer to have too - but does that make you a DBA? No. I love her medical analogy: "See, asking me, a non-DBA, to attempt to be a DBA is like asking a psychiatrist to perform heart surgery". DBAs are essential to look after SQL Server.

Erik Bitemo (blog): T-SQL Tuesday #12 – Why are DBA skills necessary?

Erik's post explains some of the knowledge DBAs should have as a way of illustrating the problems that can occur if they don't. He also reminds us that one of the major causes of SQL Server being regarded as not requiring a DBA was the marketing around the 7.0/2000 time-frame that proclaimed how easy SQL Server was to use. IMHO that really hurt the image of SQL Server and SQL Server DBAs.

Rob Farley (blog | @rob_farley): Why bother with database professionals?

Rob's argument is that you don't necessarily need DBA skills - you need people who can apply the right paradigm to their work. For database work that paradigm is set based.

Aaron Bertrand (blog | @AaronBertrand): T-SQL Tuesday : Are hotshot DBA skills necessary?

Aaron (one of my favorite MVP friends) makes a great point that many small business *don't* need a full-time rockstar DBA and that many of the defaults will allow them to continue along happily. But at some point as the business grows and the data/transaction volume grows with it, you cross a line and then a DBA really *is* necessary. I totally agree. When I'm teaching about index fragmentation I always say that at the low end, rebuilding all your indexes every night is probably fine, but as things get larger and downtime becomes important, DBA skills are needed to take a more analytical approach.

Robert Hartskeerl (blog | @rhartskeerl): T-SQL Tuesday #12–Why are DBA skills necessary?

Robert explains how even though there are tools to assist with managing SQL Server, someone has to interpret what the tool is saying and no how to fix the problem. And the tools can be wrong - like the RESTORE bug in SSMS in SQL 20008. Robert also used my favorite phrase... "it depends!"

Erin Stellato (blog | @erinstellato): TSQL Tuesday #012: Why are DBA Skills Necessary?

Erin (my favorite stalker :) recounts some of the problems she's helped with on customer systems that affected business continuity. Her take is that if the data is really important, you need a DBA to make sure it doeesn't get irretrievably damaged or destroyed.

Grant Fritchey (blog | @GFritchey): TSQL Tuesday: Why Are DBA Skills Necessary

Grant (who scares me sometimes) tells some good stories and preaches the same mantra: at some point you need a DBA who knows how to protect and recover data, perform tuning etc. Even with NoSQL - you're still storing data and you still need to be able to restore it, get at it quickly, etc. Hear, hear.

Steve Jones (blog | @way0utwest): T-SQL Tuesday #12 – Why Do You Need a DBA?

Steve (who scares me most of the time with his pink cowboy hat) joins the chorus that DBAs aren't always required, but when they are they can save a company a lot of time and money and hence can be a really worthwhile investment.

Jes Borland (blog | @grrl_geek): T-SQL Tuesday #012: Why Are DBA Skills Necessary?

Jes explains why DBAs should market their skills as a problem-solvers and money-savers, and that a DBA isn't an IT function, it's a business need. Oh yes, and she says that we should all wear super-hero capes!

Bob Pusateri (blog | @SQLBob): T-SQL Tuesday #12: DBA Skills

Bob tells some stories from his DBA past but says that one of the most important DBA skills is to know when to ask others for help. He goes on to say that "presence of DBA skills can make a big difference between just getting something done, and doing something exceptionally well."

Mike Reigler (blog | @RMikeReigler): TSQL Tuesday #12 – I’m not a DBA, But I Got DBA Skills

Mike lists a bunch of SQL knowledge that most developers wouldn't know or care about, but makes the point that *someone* has to know and care about it, or people are going to get annoyed when performance tanks. He also explains how knowing some DBA skills sets him apart from other developers.

Mark Blakey (blog | @Blakmk): T-SQL Tuesday #12 – Why are DBA skills necessary?

Mark's the first poster to bring up the fact that developers and DBAs often have different priorities - with devs churning out code quickly while DBAs are trying to preserve resources.

Oscar Zamora (blog | @ZamoraO): Who needs DBA Skills? [#TSQL2sDay]

Oscar (who I just met at SQL PASS on Monday) lists 14 reasons why DBA skills are necessary and throws down the gauntlet by stating he thinks "most developers have no business in the DB world".

Tamera Clark (blog | @tameraclark): TSQL Tuesday – Why are DBA skills necessary?

Tamera explains how some DBA skills can help write better T-SQL as you'll know more about how SQL Server is handling the T-SQL you casually throw at it.

Jeremy Carter (blog): TSQL Tuesday – Why are dba skills necessary?

Jeremy explains that even with small databases, sometimes a DBA needs to be involved from the get-go, before the hardware is even provisioned.

Kerry Tyler (blog | @AirborneGeek): T-SQL Tuesday #12: Why are DBA Skills Necessary?

Kerry explains that databases are hard, and that it takes many different roles to look after a corporations data properly - the DBA being just one of them.

Andy Lohn (blog | @SQLQuill): T-SQL Tuesday #012 – Smart, Technologically Up-to-date, Well-Meaning Application Developers

Andy explains how he picked up DBA skills when he became a database developer and how useful they've been for his company.

Ricardo Leka (blog | @BigLeka): Por que habilidades de DBA são necessárias?

Portuguese? Google Translate to the rescue! Hmm - Google Translate didn't do such a good job, but I think the gist of Ricardo's post is that too many companies that think that SQL Server doesn't need a DBA.

Andrew Vogel (blog | @sqlreader): T-SQL Tuesday: Why Are DBA Skills Necessary?

Andrew makes a very interesting point that consultants know what's best about a product, but only you know what's best for your environment. Totally true - so many clients expect us (consultants) to provide the best answer for them straight away.

Wendy Pastrick (blog | @wendy_dance): Who Needs DBA Skillz?

Wendy (who I just met at SQL PASS on Monday) says anyone who works with a database needs some DBA skills. And if you don't have anyone with DBA skills, check out the #sqlhelp tag on Twitter. Hear hear!

Pat Wright (blog | @SqlAsylum): T-SQL Tuesday, Why are DBA skills necessary?

Pat (who I just met at SQL PASS on Monday) introduces me to an expansion of the DBA TLA I've never heard before: Default Blame Acceptor. While I've never heard it, I totally get it. DBAs are the unsung heroes of countless nasty situations caused by others.

Cade Roux (blog | @caderoux): T-SQL Tuesday: Why are DBA skills necessary? I pick the most important one...

Cade focuses on the skills a DBA has and explains why he thinks troubleshooting is the number-one skill he looks for in a DBA.

Tim Ford (blog | @sqlagentman): T-SQL Tuesday – Sometimes DBAs Are Required – Even for the Amish

Tim (who looks great in a Utilikilt) manages to unite Amish people and brothels in the same post. That takes a different kind of skill.

Geoff Hiten (blog | @SQLCraftsman): Why DBA skills are important (T-SQL Tuesday)

Geoff's short post tells the tale of someone attempting to use replication as a DR solution without understanding what replication does and does not provide. And that person was a 'DBA', but without DBA skills.

Jason Strate (blog | @StrateSQL): Why are DBA skills necessary? #TSQL2sDay

Jason gives some examples of why "what you don't know can hurt you" and then explains three critical things you should know if you're a DBA.

Jason Brimhall (blog | @sqlrnnr): T-SQL Tuesday #012 – Skills

Jason departs from the norm and discusses the NON-technical skills that a successful DBA needs to have. I particularly like the emphasis on a sense of community. I'm a huge community guy and I really like to see those that benefit from the SQL community eventually starting to give back in the form of things like blog posts, tweets, and user group presentations. People really ARE interested in what you've experienced as that will help them when they have the same problem.

Gill Rowley (blog | @SQLGill): T-SQL Tuesday #012: Why Are DBA Skills Necessary?

Gill tells a story of trying to get an IT department out of the dark ages and under control using his DBA skills.

Brent Ozar (blog | @BrentO): T-SQL Tuesday: Why Do You Need DBA Skills?

Brent (our very own Mr Wonderful) talks about when saying 'whoops' isn't code for "oh no, I dropped the database" and how real DBAs need to know enough to fix the junior DBA's "fixes"...

Matt Velic (blog | @mvelic): November’s T-SQL Tuesday: Importance of DBA Skills

Matt went as far as recording his T-SQL Tuesday message, describing how valuable DBA skills are.

Gethyn Ellis (blog | @SQLGRE): TSQL2SDAY #12 - Why are DBA Skills Necessary?

Gethyn explains some of the problems that can cause business continuity to suffer, all of which could be solved with backups, being taked by someone with DBA skills.

Jack Corbett (blog | @unclebiguns): T-SQL Tuesday–Why are DBA Skills Necessary?

Jack (whose Twitter alias I've been afraid to ask about...) tells a story if disaster and data loss that would have been prevented had someone with DBA skills been involved.

Chris Shaw (blog | @SQLShaw): T-SQL Tuesday

Chris gives some examples of why companies need databases today to stay competitive, and hence need DBAs to keep those databases healthy.

Sal Young (blog | @EmpiricalDataMa): We don’t need no stinking DBAs! (T-SQL Tuesday #12)

Sal describes some of the different DBA specialties there are, and how you need to be prepared to be dealt a hard knock by clueless executives bent on saving company money by laying off employees.

Robert L Davis (blog | @SQLSoldier): T-SQL Tuesday #012 – the DBA as a Modern-day Specialist

Robert (DBM book writer extraordinaire) tells a *really* interesting story of a high-school injury that nearly left him paralyzed because it was mis-diagnosed by GPs rather than a specialist - and draws the analogy to under-qualified people trying to diagnose problems affecting an important database.

Kimberly L Tripp (blog | @KimberlyLTripp): TSQL Tuesday - Why DBA skills are important

Kimberly (lovely wife) highlights a couple of areas where you can get bitten without DBA skills, and explains why *someone* in the organization has to have DBA skills because SQL Server is a *general purpose* RDBMS, and so it needs to be tweaked to work best for your environment.

Kendra Little (blog | @Kendra_Little): TSQL Tuesday #12: Why Are DBA Skills Necessary? Fido, Please Turn Your Head And Cough.

Kendra draws a very clever analogy between DBAs and veterinarians, saying that both look after birth, death, growth, prevention of illness, and so on.

Jeremiah Peschka (blog | @PeschkaJ): T-SQL Tuesday – Why Are DBA Skills Necessary?

Jeremiah (who has quite amazing tattoos) says that DBA skills aren't necessary, but instead the ability to learn is what sets the successful apart from the unsuccessful.

Josh Feierman (blog | @awanderingmind): Who needs a DBA? – TSqlTuesday #12

Josh recalls how he started out as an involuntary DBA, and thinks about all the mistakes he made that a real DBA would have picked up on. One thing I think we sometimes lose sight of is that *everyone* started with zero SQL Server knowledge...

Josef Richberg (blog | @sqlrunner): T-SQL Tuesday 12: Why are DBA skills necessary?

Josef laments that developers are seen as THE all-knowing entities, which can lead unsuitable designs making it into production and causing problems - which then leads to angst about the database platform which fuels the NoSQL movement.

Jonathan Kehayias (blog | @SQLSarg): T-SQL Tuesday #12 - Why are DBA skills necessary?

Jonathan (extended events expert) uses some examples from his work to illustrate some of the problems facing companies without DBAs, and explains how it can take time for a new DBA to earn the respect of the incumbent IT staff.

Cameron Mergel (blog | @CameronMergel): T-SQL Tuesday – Why Are DBA Skills Necessary?

Cameron reiterates that as the operations you want to do on your database become more advanced, you need someone with higher skills - just like there's a point at which you stop doing maintenance on your car yourself and trust it to a mechanic.

Kelly Martinez (blog | @GreeleyGeek): Why are DBA skills necessary? #TSQL2sDay

Kelly talks about some of the things a DBA should know, but the most interesting point in his post is about how some vendors seem to have no clue about the database they're running on.

Sean Gallardy (blog | @SeanGallardy): T-SQL Tuesday – Why are DBA skills necessary?

Last but not least, Sean makes the very good point that DBA skills are really necessary when searching for problem solutions on the Internet. Without DBA skills, how do you tell if the advice/solution you're reading is correct and appropriate?

In November we're doing T-SQL Tuesday a week early (Tuesday November 2nd) so that it doesn't clash with PASS and I'll be your host!

What is T-SQL Tuesday?

If you haven't seen it yet, T-SQL Tuesday is the brain-child of fellow-MVP Adam Machanic (blog|twitter). It happens once a month on the 2nd Tuesday (different this month) and is hosted by a different person in the SQL community each time. The idea is that it's kind of a rallying point to get a bunch of people to blog about a particular topic on the same day, giving a wide and varied set of opinions and interpretations of that month's topic. The hoster then posts a wrap-up commenting on all the contributions. I think it's a great idea and I contribute whenever I have time.

What is the topic for November?

This month I'd like to step back from the deep technical stuff and ask "why are DBA skills necessary?"

I don't want to color people's opinions by giving my own yet, but some things to consider are:

  • What problems have you seen in your career that could have been avoided with some DBA skills?
  • At what point does a SQL Server installation need a real DBA to look after it?
  • How could DBA input help prevent design problems in data applications?
  • Should there be cross-over been developer skills and DBA skills? What about architects? Storage admins?
  • How can business continuity be affected by lack of DBA skills?
  • How much can we rely on auto-tuning to ensure performant work loads?
  • Is Microsoft doing enough to foster DBA skills as a point of excellence?
  • What about on other RDBMS platforms? What about no-SQL?

I could go on for hours... I'm really looking forward to seeing where you take this topic and I'm expecting some great posts.

What are the rules?

As with any organized event, there are inevitably some rules that help make things run smoothly.

  • You most post your contribution between Tuesday November 2nd 00:00:00 GMT and Wednesday November 3rd 00:00:00 GMT (i.e. Monday 8pm-Tuesday 8pm EST, Monday 5pm- Tuesday 5pm PST, etc)
  • Note: there was a daylight-savings change on October 31st in Europe. I will accept posts from 4pm PST Monday to 5pm PST Tuesday as even I made that mistake :-)
  • Your post has to link back to the hosting blog, and the link must be anchored from the logo (found above) which must also appear at the top of the post.
  • Leave a comment here (below) or I won’t be able to find your post. I expect trackbacks work properly, but if they don’t check back here just in case and leave a comment if necessary.

Sample HTML to include the logo and link-back is:

<a href="http://www.sqlskills.com/BLOGS/PAUL/post/Invitation-to-participate-in-T-SQL-Tuesday-12-e28093-Why-are-DBA-skills-necessary.aspx"><img src="http://www.SQLskills.com/BLOGS/PAUL/image.axd?picture=2010%2f7%2fTSQL2sDay150x150.jpg" border="0" alt="" width="150" height="150" /></a>

NOTE: THIS DOES NOT DO A 'TRACKBACK' - YOU WILL NEED TO ADD A LINK IN A COMMENT ON THIS POST.

It would also be really nice if you...

  • Include a reference to T-SQL Tuesday in the title of your post. (The more we bloggers advertise T-SQL Tuesday, the more we bloggers get T-SQL tuesday readers)
  • Tweet using the hash tag #TSQL2sDay to follow links and other relevant conversations.
  • Consider hosting T-SQL Tuesday yourself. If you’re interested let Adam Machanic Know. If you’ve participated in two T-SQL Tuesdays previously and you don’t let your blog go stale (blog once a month for the last six months) then he’ll put you in the rotation.

Summary

Don't fret about the rules, it's nothing too complex. If you're having trouble linking properly on the day, drop me an email (paul@SQLskills.com) or tweet (@PaulRandal) and I'll help you out - I'll be watching things all day.

As I said, I'm really looking forward to seeing a really diverse set of posts and I'll very quickly post a long summary with lots of editorial commentary and opinions too.

Have fun!

Categories:
General | T-SQL Tuesday

This month's T-SQL Tuesday is being run by fellow-MVP Sankar Reddy (blog|twitter) and the topic is about Misconceptions in SQL Server.

(Check out the 60-page PDF with all the myths and misconceptions blog posts collected together: CommonSQLServerMyths.pdf (732.96 kb))

Back in April I spent an entire month doing a DBA-Myth-A-Day series once a day which was all about misconceptions. So, rather than repeating some of what I said then, or spilling the beans on some more myths I'll be discussing at PASS and SQL Connections in November, this post will be an uber-list of all the misconceptions I debunked during April - and, boy, there are a lot of them! I keep hearing these myths over and over, from all kinds of people, including SQL MVPs, vendors, and consultants.

These posts explain in detail why the misconception is a misconception and in many cases use a script to prove it as well.

And there you have it - over 100 myths and misconceptions debunked, explained and, in many cases, proven!

Every month there's a flurry of blog posts around the same topic - it's called T-SQL Tuesday and is a neat concept. This month it's being driven by Robert Davis (blog|twitter), who I had the pleasure of teaching in the March rotation of the Microsoft Certified Master - SQL class (he passed first time) and previously in internal Microsoft classes. The topic is about how to teach and learn, so I want to twist it around a little and talk about things from an instructor's perspective.

Kimberly and I teach a *huge* amount - classes, workshops, conferences, private clients - and we've seen the whole gamut of student types and classroom antics. In this post I'd like to lay out what I consider to be the things most likely to annoy your fellow students, annoy the instructor, and/or prevent you from getting the most from your class. Think of it as a not-too-subtle rant at a few bad apples out there. Either read along and feel guilty that you've done some of these, or read along and tut-tut that you've seen someone do this and it sucked.

These are in loose order, with #1 being the worst mistake to make. Some of these are controversial, but I'm an honest kind of guy, and people like how I run my classroom, so I want to get them out there. Here you go:

10 Take a phone call during class

If you take a phone call during class, I'll ask you to leave the room. At the start of the class I always ask for phones to be on vibrate and to step out if you have a call. I don't mind people walking in and out a few times to take calls - it's very hard to put me off when I'm teaching. But talking on a phone in class (apart from saying 'hold on a sec while I go outside') is just antisocial and inconsiderate. Don't do it.

If a phone rings during class, I'll start to dance. Everyone laughs. I'm letting you know that we all realize you totally ignored the instructions about phones that everyone else adhered to.

And if you *make* a call during class, expect no mercy. I've had this happen once. After he came back in from making the call, and at the next break, I went over and explained how incredibly rude that was and he could choose to stay in the class without his phone or leave. He stayed.

9 Sit at the back and do email/surf and then ask questions

There's one person in every class like this - who surfaces every so often and asks questions about stuff we just covered. My response is usually something like "we just covered that ten minutes ago, read the slides and let me know if you have questions". If you can't get it together to pay attention, at least check where we are in class before asking questions that tell everyone else you've been doing something else and are now wasting their time.

8 Persist with a tangential rat-hole

While laying out the ground-rules of the class at the start, I talk about how questions are excellent, the whole point is that you're here to learn, but that long discussions about your particular situation will have to go to the break, lunch, or after class. And I mean it. Classes are carefully planned to have a certain percentage of question and discussion time (some more than others) and so if you're going on and on about something that's not relevant for the rest of the class, you'll need to wait to monopolize the instructor's time when it's not everyone else's time too. I've actually had to say "ok - stop talking about that now, we have to move on with the class". Most often these people are really trying to do #1 below.

7 Bring your smelly lunch into the classroom

Everyone will hate you.

6 Come to a class where you don't understand the language it's being taught in

I struggled over whether to include this one, but it has to be said. Don't come to a class where you can't understand the language it's being taught in. I speak English, reasonably well :-), and I make a point of speaking clearly and explain things in a concise, unambiguous way. If I'm teaching a class in the US, the UK, or any other English-is-the-first-language country, I expect that students in a deep technical class about an engineering topic, with lots of arcane terms and the need for precision in explanations, are able to understand the language. I know there are a lot of ESL (English-as-a-Second-Language) folks in these countries, but if you come to a class with a bunch of other people and ask me at lunch on the first day to speak a lot slower and with smaller words because you don't understand English very well, the answer has to be no. I'm not being inconsiderate, you are. On the other hand, if I'm teaching in China, for instance, I'll seriously go out of my way to speak slowly and avoid language complexities and colloquialisms as that's the totally different audience.

The MCM has a prerequisite that you have to understand English really well before being accepted on the course, as it's fast-paced and deeply technical. A couple of ESL folks have fudged that requirement, come on the course, and failed because they couldn't keep up. It's really not fair to everyone else to have to slow right down for one person in a face-to-face class.

That's the most controversial of the mistakes I wanted to list, but I stand by what I've said. I'm not against ESL students in any way - many of the people I teach inside Microsoft are ESL - but you have to have a certain level of proficiency in the language the class is being taught in to be able to keep up. I've had people in classes that knew so little English they couldn't even ask a question I could understand - and I'm very patient and usually able to understand most people.

5 Come to a class without the required experience and knowledge

Most classes list the detailed agenda and the prerequisite knowledge, if applicable. This is so that you can gauge whether you're qualified to take the class. Don't come to an advanced class on disaster recovery and ask how to take backups using SSMS, or come to a workshop on performance tuning using wait stats and ask what an index is. You wouldn't send someone who can't swim to a class on cave diving, or send a freshman medical student to a symposium on endovascular aneurysm repair techniques, would you? So don't take a SQL class that you're not qualified to understand. You will end up a) not being able to follow the class and getting frustrated b) asking really basic questions that annoy the rest of the class and the instructor.

Oh, and by the way, reading a book about SQL Server doesn't remotely equal having experience as a DBA - so if you simply read a book to pass a qualification, you're doing yourself and whoever employs you a disservice.

4 Don't take notes

If you really want to learn, take notes about what gets drawn on the whiteboard and salient points of what gets discussed. That's why we give you a printout of the slides - so you can take notes on them. This may be more necessary with some instructors than with others - our slides are pretty dense so you can follow the story when reading them later (but that's a whole other discussion...) If you don't take notes, you'll forget things. And if you ask the same thing several times because you didn't note down the answer the first time, you'll really piss off the instructor. I had a class earlier this year where someone asked me the same thing 4 times over the course of 3 days. I was not happy, and I made sure it showed the last time by starting with "you've already asked me that three times..." as it was beyond ridiculous.

3 Ask questions to try to make it look like you know more than the instructor

You don't look cool. You look like a fool. Everyone is rolling their eyes at you, but you just can't see it. Yes, really.

Every so often I'll have someone in a class who wants to prove to everyone that they're very clever and know more than everyone else, and really doesn't need to be in the class because they're so smart. 100% of the time it's a man. There's nothing to be gained from trying to one-up the instructor. If you succeed, you may sit back all smug, but everyone else is thinking 'jerk' (or worse). These kinds of questions are usually about really narrow scenarios, or deep internals, that are beyond the scope of the class and most often the tactic fails, which makes the questioner more frustrated and ask more questions...

Invariably this leads to #2...

2 Argue that the instructor is wrong

Cardinal sin. If you think the instructor is wrong there are two correct ways to express that opinion: 1) say something along the lines of seeing different behavior in some circumstances, which leads to a nice discussion where everyone can agree and the instructor can explain he can't remember everything with a smile 2) come up to the instructor at the break to discuss it. Never accuse the instructor of being downright wrong in front of everyone. If you do, you'd better be 100%-absolutely-sure-beyond-a-shadow-of-a-doubt because one of two things is going to happen: 1) you'll be proved right and everyone will think 'jerk' (or worse). Or, and this is much, much, much more likely, 2) you'll be proved wrong, become embarrassed, frustrated, and angry and everyone will think 'jerk' (or worse).

Arguing obnoxiously is not the way to win friends and influence people, or to endear you to the class and the instructor. Most often the instructor is there because he or she knows way more than anyone in the class about the topic at hand - which is the whole point, so it's unlikely that they're wrong. It does happen, people are not infallible, but point it out nicely. And be really, really sure you know who you're arguing with before you start - pay attention to the two minute bio at the start of the class, because that's the explanation of why the instructor is qualified to teach the class, and what their expertise is. Every few classes I find myself arguing with someone about how DBCC works, or what allows the log to clear, or this or that and very occasionally I have to resort to one of the trump cards, which I hate doing, by saying "I'm sorry, you are wrong - I wrote that code", or "I'm sorry, you are wrong, I designed that feature". That sucks because I feel like I'm being arrogant. Sigh.

1 Come to class looking for "the answer"

There's one of these people in every class, who simply wants to know "how to index for *this* query" or "the *best* backup strategy". I like to joke that the answer to every question about SQL Server is "it depends!", with one exception: "should auto-shrink be enabled?". That's because there are no hard and fast answers - the answer really does depend on the circumstances. A good instructor does not teach answers, but instead teaches methodologies, theory, and background information, along with real-life examples of applying all of those so that you can find the answer for yourself, and even pass along the knowledge to your team/company. There's no point just teaching the answer, because what happens next week when you have another question? If you don't understand how the first answer was derived, you'll be stuck again and no better off for attending the class.

I see this over and over and it's depressing.

Summary

Ah - that's better. If you avoid doing all these things then you'll have a great learning experience and the atmosphere in the classroom will be conducive to being a sponge to the fire-hose of information. If not, then now you know why the instructor is looking at you disdainfully...

This turned out to be a lot longer than I expected. Now, don't take this the wrong way - I *really* love teaching, which is why I do it so much, so I'm not being a jerk saying all of this - I expect that when you come into a class, you come to learn. I don't expect you to disrupt things for the other students, and disrespect me as the instructor. I guarantee you that everyone reading this who's ever been an instructor has agreed with everything I've written above.

Don't be that person.

Categories:
General | T-SQL Tuesday

Way back in the mists of time, at the end of the last century, I wrote DBCC SHOWCONTIG for SQL Server 2000, to complement my new invention DBCC INDEXDEFRAG.

I also used to wear shorts all the time, with luminous orange, yellow, or green socks.

Many things change - I now have (some) dress sense, for one. One other thing that changed was that DMVs came onto the scene with SQL Server 2005. DBCC SHOWCONTIG was replaced by sys.dm_db_index_physical_stats. Under the covers though, they both use the same code - and the I/O characteristics haven't changed.

This is a blog post I've been meaning to do for a while now, and I finally had the impetus to do it when I heard about today's T-SQL Tuesday on I/O in general being run by Mike Walsh (Twitter|blog). It's a neat idea so I decided to join in this time. In retrospect, reading this over before hitting 'publish', I got a bit carried away (spending two hours on this) - but it's one of my babies, so I'm entitled to! :-)

This isn't a post about how to use DMVs in general, how to use this DMV in particular, or anything about index fragmentation. This is a blog post about how the DMV works.

DMV is a catch-all phrase that most people (myself included) use to describe all the various utility views in SQL Server 2005 and 2008. DMV = Dynamic Management View. There's a catch with the catch-all though - some of the DMVs aren't views at all, they're functions. A pure DMV gets info from SQL Server's memory (or system tables) and displays it in some form. A DMF, on the other hand, has to go and so some work before it can give you some results. The sys.dm_db_index_physical_stats DMV (which I'm going to call 'the DMV' from now on) is by far the most expensive of these - but only in terms of I/O.

The idea of the DMV is to display physical attributes of indexes (and the special case of a heap) - to do this it has to scan the pages comprising the index, calculating statistics as it goes. Many DMVs support what's called predicate pushdown, which means if you specify a WHERE clause, the DMV takes that into account as it prepares the information. This DMV doesn't. If you ask it for only the indexes in the database that have logical fragmentation > 30%, it will scan all the indexes, and then just tell you about those meeting your criteria. It has to do this because it has no way of knowing which ones meet your criteria until it analyzes them - so can't support predicate pushdown.

This is where understanding what it's doing under the covers comes in - the meat of this post.

LIMITED

The default operating mode of the DMV is called LIMITED. Kimberly always makes fun of the equivalent option for DBCC SHOWCONTIG, which I named as a young and foolish developer - calling it WITH FAST. Hey - it's descriptive!

The LIMITED mode can only return the logical fragmentation of the leaf level plus the page count. It doesn't actually read the leaf level. It makes use of the fact that the next level up in the index contains a key-ordered list of page IDs of the pages at the leaf level - so it's trivial to examine the key-ordered list and see if the page IDs are also in allocation order or not, thus calculating logical fragmentation.

The idea behind this option is to allow you to find the fragmentation of an index by reading the minimum number of pages, i.e. in the smallest amount of time. This option can be magnitudes faster than using the DETAILED mode scan, and it depends on how big the index's fanout is. Without getting too much into the guts of indexes, the fanout is based on the index key size, and determines the number of child-page pointers an index page can hold (e.g. the number of leaf-level pages that a page in the next level up has information about).

Consider an index with a char(800) key. Each entry in a page in the level above the leaf has to include a key value (the lowest key that can possibly appear on the page being referred to), plus a page ID, plus record overhead, plus slot array entry - so 812 bytes. So a page can only hold 8096/812 = 9 such entries. The fanout is at most 9.

Consider an index with a bigint key. Each entry is 13 bytes, so a page can hold 8096/13 = 622 entries. The fanout is at most 622, but will likely be smaller, depending on operations on the index causing fragmentation at the non-leaf levels.

For a table with 1 million pages at the leaf level, the first index will have 1 million/9 = 111112 pages at least at the level above the leaf. The second index will have at least 1608 pages. The savings in I/O from using the LIMITED mode scan will clearly differ based on the fanout.

I've created a 100GB clustered index (on the same hardware as I'm using for the benchmarking series) with 13421760 leaf-level pages and a maximum fanout of 540. In reality, I populated the index using 16 concurrent threads, so there's some fragmentation. The level above the leaf has 63012 pages, an effective fanout of 213. Still, the LIMITED mode scan will read 213x less than a DETAILED scan, but will it be 213x faster?

Here's a perfmon capture of the LIMITED mode scan on my index:

 

There's nothing special going on under the covers in a LIMITED mode scan - the chain of pages at the level above the leaf is read in page-linkage order, with no readahead. The perfmon capture shows:

  • Avg. Disk Read Queue Length (light blue) is a steady 1.
  • Avg. disk sec/Read (pink) is a steady 4ms.
  • Disk Read Bytes/sec (green) is roughly 14.5million.
  • Page reads/sec (dark blue) is roughly 1800.

DETAILED 

The DETAILED mode does two things:

  • Calculate fragmentation by doing a LIMITED mode scan
  • Calculate all other statistics by reading all pages at every level of the index

And so it's obviously the slowest. It has to do the LIMITED mode scan first to be able to calculate the logical fragmentation, because it reads the leaf level pages in the fastest possible way - in allocation order. DBCC has a customized read-ahead mechanism for allocation order scans that it uses for this DMV and for DBCC CHECK* commands. It's *incredibly* aggressive and will hit the disks as hard as it possibly can, especially with DBCC CHECK* running in parallel.

Here's a perfmon capture of the DETAILED mode scan on my index:

 

Not quite as pretty as the LIMITED mode scan, but I like it :-) Here's what it's showing:

  • Avg. Disk Read Queue Length (black) is in the multiple hundreds. Clearly it's appetite for data is outstripping what my RAID array can do. It basically tries to saturate the I/O subsystem to get as much data as possible flowing into SQL Server.
  • Avg. disk sec/Read (pink line at the bottom) is actually measuring in whole seconds, rather than ms. Given the disk queue length, I'd expect that.
  • DBCC Logical Scan Bytes/sec (red) varies substantially as the readahead mechanism throttles up and down, but it's driving anywhere up to 80MB/sec. You can see around 9:49:20 AM when it drops to zero for a few seconds.
  • Readahead pages/sec (green) is tracking the DBCC scan. This is a buffer pool counter, the DBCC one is an Access Methods counter (the dev team I used to run during 2005 development). If I had Disk Read Bytes/sec and Pages reads/sec showing, they'd track the other two perfectly - I turned them off for clarity.

So the DETAILED mode not only reads more data, but it does it a heck of a lot more aggressively so has a much more detrimental effect on the overall I/O capabilities of the system while it's running.

SAMPLED

There is a third mode that was introduced just for the DMV. The idea is that if you have a very large table and you want an idea of some of the leaf level statistics, but you don't want to take the perf hit of running a DETAILED scan, you can use this mode. It does:

  • LIMITED mode scan
  • If the number of leaf level pages is < 10000, read all the pages, otherwise read every 100th pages (i.e. a 1% sample)

Summary

There's no progress reporting from the DMV (or DBCC SHOWCONTIG) but if you look at the reads column in sys.dm_exec_sessions you can see how far through the operation it is. This method works best for DETAILED scans, where can compare that number against the in_row_data_page_count for the index in sys.dm_db_partition_stats (yes, you'll need to mess around a bit if the index is actually partitioned).

In terms of timing, I ran all three scan modes to completion. The results:

  • LIMITED mode: 282 seconds
  • SAMPLED mode: 414 seconds
  • DETAILED mode: 3700 seconds

Although the LIMITED mode scan read roughly 200x less than the DETAILED scan, it was only 13 times faster, because the readahead mechanism for the DETAILED scan is way more efficient than the (necessary) follow-the-page-linkages scan of the LIMITED mode.

Just for kicks, I ran a SELECT COUNT(*) on the index to see how the regular Access Methods readahead mechanism would fare - it completed in 3870 seconds - 5% slower, and it had less processing to do than the DMV. Clearly DBCC rules! :-)

Although the DETAILED mode gives the most comprehensive output, it has to do the most work. For very large indexes, this could mean that your buffer pool is thrashed by the lazy writer making space available for the DMV to read and process the pages (it won't flush out the buffer pool though, as the pages read in for the DMV are the first ones the lazywriter will kick out again). One of the reasons I advise people to only run the DMV on indexes they know they're interested in - and better yet, run it on a restored backup of the database.

Hope this is helpful!

PS Oh, also beware of using the SSMS fragmentation wizard. It uses a SAMPLED mode scan, but I found it impossible to cancel!

Theme design by Nukeation based on Jelle Druyts