We decided to escape the wintery weather in Seattle this year by coming to Indonesia for January (our third trip here), including a long dive trip in the remote Raja Ampat region near Papua. Before heading out there we spent a few days in Bali and yesterday we hooked up with our good friend Ketut to drive around some of the places we haven't seen (we've been here a few times before). We decided to head out to the western portion of the island where Ketut's village is, and after visiting his family we headed up into the mountains to check out the rice fields and the incredibly lush and verdant foliage. Btw, the monkeys are Macaques and hang out by the side of the road on the way down the mountain.

All the photos were taken with a Canon EOS5D on aperture priority with an EF 24-105mm f/4.0L IS lens (except the one obvious fisheye shot which was using an 8-15mm f/4.0L fisheye).

Click each photo for a 1024x683 enlargement.

Enjoy!

 

 

 

 

 

 

 

Back in November I kicked off a survey that had you run some code to get some details about your cluster keys, nonclustered indexes, and table size. I got results from more than 500 systems across the world, resulting in 97565 lines of data - thanks!

The purpose of the survey is to highlight one of the side-effects of not adhering to the general guidelines (i.e. there are exceptions to these) for choosing a clustered index key. It should be, if possible:

  1. Narrow
  2. Static
  3. Ever-increasing
  4. Unique

The survey and this post are intended to show how not adhering to Rule #1 can lead to performance problems.

Both Kimberly and I have explained in the past the architecture of nonclustered indexes - where every nonclustered index row has to have a link back to the matching heap or clustered index record. The link must be a unique value as it must definitively match a single record in the heap or clustered index. For nonclustered indexes on a table with a clustered index, this link is the cluster key (or keys) as these are guaranteed to be unique. Ah, you say, but what about when the clustered index is NOT defined as unique? That's where Rule #4 comes in. For a non-unique clustered index, there will be a hidden 4-byte column (called the uniquifier) added when necessary as a tie-breaker when multiple clustered index records have the same key values. This increases the clustered index key size by 4 bytes (the uniquifier is an integer) when needed.

But I digress. The crux of the matter is that every nonclustered index record will include the cluster keys. The wider the cluster key size is (e.g. a few natural keys), the more overhead there is in each nonclustered index record, compared to using, for instance, an integer (4-byte) or bigint (8-byte) surrogate cluster key. This can mean you've got the potential for saving huge amounts of space by moving to smaller clustered index keys - as we'll see from the data I collected.

The survey code I got you to run returned:

  1. Number of nonclustered indexes
  2. Number of cluster keys
  3. Total cluster key size
  4. Number of table rows
  5. Calculation of bytes used in all the nonclustered indexes to store the cluster keys in each row

I did not take into account filtered indexes in 2008, or variable length cluster key columns, as to be honest although these will make a difference, for the purposes of my discussion here (making you aware of the problem), they're irrelevant. It also would have made the survey code much more complex for me to figure out :-)

Now let's look at some of the results I received. To make things a little simpler, I discarded results from tables with less then ten thousand rows, and with clustered index key sizes less than 9. This dropped the number of data points from 97565 down to 22425.

The graphs below show the estimated amount off savings that could be had in GB from moving to an 8-byte bigint, plotted against the first four factors in the list above.

 

 

 

 

And here are the top 20 in terms of potential savings so you can see how the rough table schema:

NCIndexes  ClusterKeys  KeyWidth  TableRows      KeySpaceGB  SavingsGB
---------  -----------  --------  -------------  ----------  ---------
6          4            72        891,751,171    358.8       352.1
6          3            16        3,189,075,035  285.1       261.4
1          5            45        4,479,327,954  187.7       154.4
6          4            72        453,251,463    182.4       179.0
4          3            16        2,766,814,206  164.9       144.3
4          2            89        371,745,035    123.3       120.5
2          4            774       76,337,053     110.1       109.5
2          4            774       76,331,676     110.0       109.5
2          4            774       75,924,837     109.5       108.9
2          4            774       75,533,539     108.9       108.3
5          4            72        318,217,628    106.7       104.3
7          1            60        269,590,810    105.5       103.4
22         3            13        389,203,725    103.7       100.8
22         3            13        329,772,049    87.8        85.4
2          2            509       90,311,271     85.6        85.0
17         1            510       9,334,362      75.4        75.3
22         3            13        267,380,864    71.2        69.2
2          7            172       219,929,560    70.5        68.8
22         3            13        261,967,851    69.8        67.8
6          5            31        395,800,250    68.6        65.6

Wow - that's some pretty amazing stuff - and that doesn't even account for the space taken up by page headers etc.

You might be thinking - why do I care? There are plenty of reasons:

  • If you can save tens or hundreds of GBs by changing the cluster key to something much smaller, that translates directly into a reduction in size of your backups and data file disk space requirements.
  • Smaller databases mean faster backups and restores.
  • Making the nonclustered indexes smaller means that index maintenance (from inserts/updates/deletes) and index fragmentation removal will be much faster and generate less transaction log.
  • Making the nonclustered indexes smaller means that consistency checking will be much faster - nonclustered index checking takes 30% of the CPU usage of DBCC CHECKDB.
  • Reducing the width of nonclustered index records means the density of records (number of records per nonclustered index page) increases dramatically, leading to faster index processing, more efficient buffer pool (i.e. memory) usage, and fewer I/Os as more of the indexes can fit in memory.
  • Anything you can do to reduce the amount of transaction log directly affects the performance of log backups, replication, database mirroring, and log shipping.

As you can see, there are many reasons to keep the cluster key as small as possible - all directly translating into performance improvements. For those of you that think that moving to a bigint may cause you to run out of possible keys, see this blog post where I debunk that - unless you've got 3 million years and 150 thousand petabytes to spare...

One thing I'm not doing in this post is advocating any particular key over any other (although bigint identity does fit all the criteria from the top of the post) - except to try to keep it as small as possible. Choosing a good cluster key entails understanding the data and workload as well as the performance considerations of key size that I've presented here. And in some very narrow cases, not having a cluster key at all is acceptable - which means there's 8 bytes in each nonclustered index record (just to forestall those who may want to post a comment arguing against clustered indexes in general :-)

Changing the cluster key can be tricky - Kimberly blogged a set of steps to follow plus some code to help you on our SQL Server Magazine blog back in April 2010.

Later this week I'll blog some code that will run through your databases and spit out table names that could have significant space savings from changing the cluster key.

I won't be blogging or tweeting much in January as we'll be in Indonesia diving, but I will be posting photos later in the month.

Enjoy!

The last post of the year! It's been an excellent year all round - I thought that instead of doing a long post about goals etc, I'd again count down the numbers that have been my life this year.

  • 7374413 (roughly): the number of page views on my blog as tracked by Google, from 141 countries. 5 times more than last year! WOW!
  • 103659: the number of miles I flew on United
  • 21799: my current tweet total
  • 11985: the number of emails I sent
  • 6500 (roughly): the number of people who've joined our Insider mailing list since February.
  • 4160: the number of people who follow my twitter ramblings
  • 520: the number of books I've bought but not yet read
  • 160: roughly the number of nights away from home
  • 150.4: miles-per-hour we reached driving down the German autobahns in June this year in a rented 7-Series BMW. Zoooooooooooooom!
  • 140: number of feet deep I dove in the Blue Hole in Belize (and I didn't get narc'd at all!)
  • 117: the number of SQLskills blog posts I wrote, including this one
  • 103: the number of photos I posted in travel blog posts (see here)
  • 100: percentage of the 4 technical employees at SQLskills who have taught MCM classes for Microsoft
  • 92: the number of days I was teaching classes or presenting at conferences
  • 53: the number of books I read this year (see my wrap-up post)
  • 45: roughly the percentage of time we were away from our 'vacation home'
  • 44: the number of flights we took
  • 30: the number of posts I did on our joint SQL Magazine blog
  • 26: the number of dives we did (taking my total to 122)
  • 24: the number of different hotels we stayed in
  • 19: the number of multi-day classes I taught
  • 16: the number of large tins of real haggis I ate this year
  • 11: the number of new bird species I saw (taking my total to 407)
  • 10: the number of countries we visited this year
  • 6: the number of new countries I visited (Luxembourg, Germany, Liechtenstein, Belgium, Netherlands, Belize - taking my total to 29)
  • 5: the number of hours I estimate we saved by using Global Entry to skip customs and immigration lines this year
  • 4: number of SQL Server MCMs (and honorary - as Kimberly and I can't take the exams as we wrote them) on staff at SQLskills
  • 4: the number of user group presentations I did
  • 4: the number of squirrels we rescued (long story...)
  • 3: the number of big Lego models I made
  • 3: the number of Nixie-tube clocks I made (see here)
  • 3: the top-10 PASS 2010 position of my Mythbusters 2 session
  • 3: the number of new US states I visited (taking my total to 22)
  • 2: the number of incredible Principal Consultants we hired this year: Jonathan Kehayias and Joe Sack - just the best employees and friends!
  • 2: the number of excellent young girls - my daughters - 10 and 12 - who just passed their Scuba Diver certifications yesterday! Woohoo!
  • 1: the number of kegs of Mac'n'Jacks African Amber we drank on our roof-top deck this year - my favorite!
  • 1: the number of utterly indispensable and fabulous long-time assistants and great friends, Libby Hagen - without whom our lives would be a constant apocalyptic mess - who's battling breast cancer right now. Read her blog about her struggle: Stop the War in My Raq!
  • 1: number of wives who beat me at PASS this year (who else can say that? :-)
  • Finally, the one and only best person in my life: Kimberly, without whom I would be lost...

Thank you to everyone who reads my blog, follows me on Twitter, sends me questions, watches our videos, comes to our classes, and generally makes being deeply involved in the SQL community a joy.

I sincerely wish you all a happy, healthy, and properous New Year!

Cheers!

(About 40 feet underwater while diving off the coast of Belize in November.)

 

Categories:

Back in 2009 I started posting a summary at the end of the year of what I read during the year (see 2009, 2010) and people have been enjoying it, so I present the 2011 end-of-year post. I set a moderate goal of 50 books this year and I managed 53. Next year we have a lot more travel coming up and I'm going to aim for 60 books read. For the record, I read 'real' books - i.e. not in electronic form - I don't like reading off a screen. Yes, I've seen electronic readers - we both have iPads - and I'm not interested in ever reading electronically.

Choosing my favorite book of the year was again easy - there's alway one book that sticks in your mind as being the memorable strongest highlight of the year, among many highlights. This year it's The Cellist of Sarajevo by Steven Galloway. This was easily the most powerful book I read this year - about the lives of several people during the Seige of Sarajevo. Go buy it and read it - you won't be disappointed.

Now the details. I enjoy putting this together as it will also serve as a record for me many years from now. I hope you get inspired to try some of these books - push yourself with new authors and very often you'll be surprisingly pleased.

Once again I leave you with a quote that describes a big part of my psychological make-up:

In omnibus requiem quaesivi, et nusquam inveni nisi in angulo cum libro!


Analysis of what I read

I read 21526 pages, or 59.0 pages a day, and a book every 7 days or so. I'm a fast reader but I don't speed read or skim ever.

   

The average book length was 406 pages, a good 85 pages shorter than last year - which also contributed to the overall total being lower. I picked my books differently this year too - reading less science fiction and a lot more contemporary fiction. At least I think that's interesting :-)

The Top-10 Top-15

Well I really struggled with this as I read a lot of truly *superb* books in 2011. I didn't want to leave any out of my short list and deny you the chance of having them suggested so I present you with my top-15! If you don't read much, at least consider looking at some of these in 2011. It's impossible to put them into a priority order so I've listed them in the order I read them, along with the short Facebook review I wrote at the time. One thing to note is that all three books I read by Gore Vidal are in my top-15 - I have high hopes for the remaining 6 in his Narratives of Empire series.
  #3 Wolf Hall; Hilary Mantel; 592pp; Historical Fiction; April 2; (Fabulous historical fiction recounting Thomas Cromwell's time at Henry VIII's court during downfall of Wolsey and rising of the Boleyns. Very detailed and extraordinary character development. Very strongly recommended.)
  #5 Creation; Gore Vidal; 592pp; Historical Fiction; April 17; (Very complex book based on reminiscences of a Persian ambassador to Greece, China, and India around 500BC. Conversations with Confucious, Buddha, and others make for some deep reading. Also paints a wonderful view of the Persian world around that time. Strongly recommended for history fans who enjoy pithy works.)
  #10 Surface Detail; Iain M. Banks; 640pp; Science Fiction; May 17; (I take back everything I've said - nothing beats Banks' Culture novels when at their finest for sheer, unbridled rollicking sci-fi. A total page turner with some excellent Ship and Mind goings-on and a mind-bending twist at the end reaching back about ten books and 20 years. If you like sci-fi go buy it and read it without delay. Fabulous. Just fabulous.)
  #14 The Club Dumas; Arturo Perez-Reverte; 368pp; Contemporary Fiction; June 10; (The Club Dumas is the basis for one of my favorite movies - The Ninth Gate, although the movie places emphasis on a different part of the plot. Excellent book centered on ancient books - any book about books is an instant hit with me. Follows a book 'mercenary' investigating differences between the 3 final copies of a 1667 book about raising the devil. Excellent - recommended.)
  #23 The Cellist of Sarajevo; Steven Galloway; 256pp; Contemporary Fiction; July 11; (Wow - what a powerful book! Sarajevo must have been a nightmare when it was under siege. The book centers around the (true) story of a cellist who plays for 22 days in the same spot to honor 22 people killed by a shell while waiting to buy bread. Imagine having to cross street junctions with random snipers killing people as they do so? Wow.)
  #29 Les Miserables; Victor Hugo; 640pp; Fiction; July 29; (Hugo's classic tale is absolutely wonderful - a real tour de force. It's a long and complex book dealing with France from 1820s-1830s and following the life of the convict Jean Valjean and those around him, especially the police inspector Javert. It was also made into an excellent movie with Liam Neeson and Geoffrey Rush playing the two roles, respectively. Very strongly recommended.)
  #38 City of Thieves; David Benioff; 272pp; Fiction; August 28; (Excellent story of a young man during the seige of Leningrad being forced to venture into the German occupied hinterland in search of eggs. Really well done and a page turner - read the whole thing this afternoon. Great twist at the end too. Recommended.)
  #39 Julian; Gore Vidal; 528pp; Historical Fiction; September 6; (Fabulous book! A biography of Julian Augustus, told as a novel, following his rise to power, apostasy and intellectual persecution of Christianity as a made-up religion, and military endeavors. I found it a page-turner with an excellent feel for the times that Julian lived in and the fragility of the Roman principate. Strongly recommended!)
  #41 A Passage to India; E.M. Forster; 416pp; Contemporary Fiction; September 18; (A masterful portrayal of the British Raj in the early C20th - their haughty arrogance as colonialists and their misunderstanding of the Indian society and culture they're in. The characters are involved in a scandal which brings out the heated worst behavior in both sides of the community. Strongly recommended.)
  #43 Cleopatra; Stacy Schiff; 400pp; History; October 9; (Fabulous depiction of her life, drawn from the limited sources that survive. Tells the stories of how her life and reign entangles with Julius Caesar's and Mark Anthony's, to invariably dire consequences. I've never read anything in depth about her life before and I'm glad this is what I read first. Strongly recommended.)
  #44 Parrot and Olivier in America; Peter Carey; 400pp; Historical Fiction; October 24; (Excellent account of the complicated life of two unlikely companions in the early 1800s in America, removed from France after the 100 Days when Napoleon briefly regainedd power. Peter Carey has a great way with words and character portrayals. Strongly recommended!)
  #45 The Girl With The Dragon Tattoo; Stieg Larsson; 600pp; Contemporary Fiction; November 4; (Excellent! Turned into a complete page turner for me. Hard to say much about it without giving away major plot elements, but characters and settingmare very well done. If you've been avoiding this like I had, give in and read it - you won't be disappointed. Already got 2nd and 3rd in trilogy to read. Strongly recommended!)
  #46 The Outlaws Inc.; Matt Potter; 332pp; Non-Fiction; November 13; (Excellent non-fiction account of the ex-Soviet air crews and their giant Anatovs and Ilyushins that fly around the world into some of the dodgiest places on the planet. It also shows how many governments and NGOs make use of them and turn a blind eye to the smuggling of all kinds of things that go on beside their legit cargo loads. Well worth reading.)
  #52 Olive Kitteridge; Elizabeth Strout; 304pp; Contemporary Fiction; December 26; (Excellent book dealing with the life of a cranky old woman in a small town in Maine. It's presented as a series of short story microcosms of people's lives in the town that are linked in some way to Olive. Especially interesting to see how her life changes when major events happen and how she works through them. Kind of reminds me of Annie Proulx's work but not so gritty and heavily written. One of my favorites this year.)
  #53 Burr; Gore Vidal; 448pp; Historical Fiction; December 30; (Excellent start to Vidal's Narratives of Empire series. Goes through Burr's reminiscences of his life during the Revolution, Vice-Presidency and treasonous activities in the West. Really looking forward to reading the other 6 in the series.Strongly recommended for history fans.)

The Complete List

And the complete list, with links to Amazon so you can explore further.

  1. Multireal; David Louis Edelman; 522pp; Science Fiction; February 14
  2. Geosynchron; David Louis Edelman; 500pp; Science Fiction; March 1
  3. Wolf Hall; Hilary Mantel; 592pp; Historical Fiction; April 2
  4. Zero Day; Mark Russinovich; 336pp; Contemporary Fiction; April 8
  5. Creation; Gore Vidal; 592pp; Historical Fiction; April 17
  6. Transition; Iain M. Banks; 448pp; Science Fiction; April 23
  7. Unaccustomed Earth; Jhumpa Larihi; 352pp; Contemporary Fiction; April 24
  8. The God of Small Things; Arundhati Roy; 352pp; Contemporary Fiction; May 7
  9. Fahrenheit 451; Ray Bradbury; 208pp; Contemporary Fiction; May 14
  10. Surface Detail; Iain M. Banks; 640pp; Science Fiction; May 17
  11. English Passengers; Matthew Kneale; 464pp; Historical Fiction; May 22
  12. Persian Mirrors; Elaine Sciolino; 432pp; Non-Fiction; May 27
  13. The Coral Thief; Rebecca Stott; 312pp; Historical Fiction; June 3
  14. The Club Dumas; Arturo Perez-Reverte; 368pp; Contemporary Fiction; June 10
  15. The Frigates; James Henderson; 192pp; Maritime History; June 15
  16. The Namesake; Jhumpa Lahiri; 304pp; Contemporary Fiction; June 25
  17. The Sewing Circles of Heart; Christina Lamb; 384pp; Non-Fiction; June 27
  18. Interpreter of Maladies; Jhumpa Lahiri; 208pp; Contemporary Fiction; July 1
  19. A Place So Foreign; Cory Doctorow; 243pp; Science Fiction; July 2
  20. Pulse; Julian Barnes; 229pp; Contemporary Fiction; July 4
  21. Burtynsky - China; Edward Burtynsky; 180pp; Non-Fiction; July 5
  22. Ghost Train to the Eastern Star; Paul Theroux; 512pp; Travel; July 9
  23. The Cellist of Sarajevo; Steven Galloway; 256pp; Contemporary Fiction; July 11
  24. Bird; Andrew Zuckerman; 512pp; Non-Fiction; July 14
  25. A God Who Hates; Wafa Sultan; 256pp; Non-Fiction; July 16
  26. The Shadow of the Wind; Carlos Ruiz Zafon; 486pp; Fiction; July 22
  27. The Hobbit; J.R.R. Tolkien; 330pp; Fiction; July 24
  28. The Blackwater Lightship; Colm Toibin; 288pp; Contemporary Fiction; July 26
  29. Les Miserables; Victor Hugo; 640pp; Fiction; July 29
  30. In Patagonia; Bruce Chatwin; 240pp; Travel; July 31
  31. Arabian Sands; Wilfred Thesiger; 400pp; Travel; August 7
  32. World Without End; Ken Follett; 1024pp; Historical Fiction; August 14
  33. William Rufus; Frank Barlow; 512pp; History; August 21
  34. Out Stealing Horses; Per Petterson; 256pp; Fiction; August 22
  35. To Siberia; Per Petterson; 256pp; Fiction; August 24
  36. The Catcher In The Rye; J.D. Salinger; 288pp; Contemporary Fiction; August 26
  37. Reheated Cabbage; Irvine Welsh; 288pp; Contemporary Fiction; August 28
  38. City of Thieves; David Benioff; 272pp; Fiction; August 28
  39. Julian; Gore Vidal; 528pp; Historical Fiction; September 6
  40. The Fourth Crusade and the Sack of Constantinople; Jonathan Phillips; 400pp; History; September 16
  41. A Passage to India; E.M. Forster; 416pp; Contemporary Fiction; September 18
  42. The Captive Queen; Alison Weir; 512pp; Historical Fiction; September 25
  43. Cleopatra; Stacy Schiff; 400; Historypp; October 9
  44. Parrot and Olivier in America; Peter Carey; 400pp; Historical Fiction; October 24
  45. The Girl With The Dragon Tattoo; Stieg Larsson; 600pp; Contemporary Fiction; November 4
  46. The Outlaws Inc.; Matt Potter; 332pp; Non-Fiction; November 13
  47. My Name Is Red; Orhan Pamuk; 432pp; Historical Fiction; November 18
  48. Endymion; Dan Simmons; 576pp; Science Fiction; November 28
  49. Circle of Reason; Amitav Ghosh; 432pp; Contemporary Fiction; November 29
  50. Beneath Blossom Rain; Kevin Grange; 352pp; Non-Fiction; December;  11
  51. Rise of Endymion; Dan Simmons; 720pp; Science Fiction; December 19
  52. Olive Kitteridge; Elizabeth Strout; 304pp; Contemporary Fiction; December 26
  53. Burr; Gore Vidal; 448pp; Historical Fiction; December 30

Categories:
Books | Personal

Last week I presented for the PASS Virtual Chapter on Professional Development about Communication Skills and the whole thing was recorded via LiveMeeting. It's 70 minutes of distilled experience with some stories thrown in.

Check it out at https://www323.livemeeting.com/cc/usergroups/view?id=7KH9ZW.

Enjoy!

PS Many thanks to Mark Caldwell for hosting the meeting!

Categories:
Career | Consulting

(2-foot wide Octopus during a night dive off Turneffe Caye, Belize)

 

Kimberly and I headed to Belize in November for our first dive trip in well over a year (we're not letting that happen again!) and spent a week on the Belize Aggressor III liveaboard dive boat. Doing liveaboards is the only way to dive - eat, sleep, and dive for 7 days! I did a bunch of courses during the week, including my very-long-overdue Nitrox and Advanced Open Water certifications, and 26 dives to make it up to 122 total dives. I managed to tick something off my bucket-list - diving the famous Blue Hole (down to 140 feet!). I also took a lot of photos!

Here I present my favorites from the week. Click on a photo for a 1024x768 enlargement.

Left to right below: Lizardfish; 3-ft Grouper who liked to be stroked.

 

Left to right below: Kimberly with her new camera rig; 5-ft barrel sponges.

 

Left to right below: mated pair of Banded Coral Shrimp; very small (1-inch long) Secretary Blennie that Kimberly found.

 

Left to right below: juvenile File Fish (about 2 inches) during a night dive; Brittle Star during a night dive.

 

Left to right below: Basket Star during a night dive - these guys are fun to feed by holding a light just by them and all the little worms and critters in the water swim to the light and and the Basket Star catches them; Brittle Stars in a barrel sponge during a night dive.

 

Left to right below: Large (9-inch shell) hermit crab; White-Speckled Nudibranch (about two inches long) I was lucky enough to spot in a sea-grass and soft coral forest.

 

Left to right below: Swimming with 8-ft Black-Tipped Reef Sharks - we really were that close!; looking up at a big shark.

 

Left to right below: the same shark again; swimming with a school of Jacks.

 

Left to right below: Speckled Moray warning me off; large (8-ft) Green Moray.

 

Left to right below: Now the Green Moray has seen me - Morays have terrible eyesight; a Jawfish - these guys live in little burrows in the sand and are very shy - I spent 1/2 hour lying on the sand waiting for this shot.

 

Left to right below: Lobster during a night dive - with mood lighting; solitary Banded Coral Shrimp during a night dive.

 

Left to right below: 4-inch Large-Eye Shrimp burrowing into the sand during a night dive; Speckled Toadfish during a night dive - these are really weird looking fish - very flat and wide - and extremely rare!

 

Left to right below: Octopus (same one as the picture at the start of the post) during a night dive.

 

Every so often I do non-SQL blog posts about things and one of the topics that comes up again and again is communication skills.

The PASS Professional Development Virtual Chapter invited me to present a lecture for them and so at 10am Pacific Time on Thursday, December 15th I'll be presenting on communication skills. I'll touch on written, spoken, blog, presenting and other forms of communication and tell some stories too.

It should be a good lecture - for once with very little technical information!

The LiveMeeting link is https://www.livemeeting.com/cc/usergroups/join?id=7KH9ZW&role=attend

Hope to (virtually) see you there!

Categories:
Career

It's the end of the financial year for most companies and we've had two companies in the last week approach us about pre-paying for 2012 classes to use up 2011 budget.

We've put together an offer that will run from now through the end of January (to entice you to spend 2012 budget early too!) where you can pre-pay six class seats for the price of five at the early-bird price (i.e. $14,975 - saving $2,995).

Here's how it works:

  • You go to our registration page and select 'Pre-paid 6 seats for the price of 5 for 2012 classes'. You can pay using any of our methods - and we can also invoice a PO if you want.
  • The six seats MUST be used for classes in 2012 in the US (we can discuss the UK classes over email - it's more complicated).
  • The seats can be used in any combination of US classes and do not have to be in the same class or the same person.
  • It covers the class fees but does NOT register anyone automatically. Class attendees need to register for the desired class ASAP to secure their seats, as seats are only available in a class until the class is sold out.
  • The process to register for a class using a pre-paid seat is to select the 'Check/Wire Transfer/Prepaid Block' payment option for the desired class, use the discount code 'prepaid', and complete registration. We will take care of matching registrations to the pre-paid block.
  • This does NOT cover travel and hotel expenses.

Here are the US classes to choose from in 2012:

  • IE1: Feb 27 – Mar 2, 2012: Internals and Performance in Tampa, FL
  • IE2: Mar 5-9, 2012: Performance Tuning in Tampa, FL
  • IE3: Mar 12-16, 2012: High Availability/Disaster Recovery in Tampa, FL
  • IE1: Apr 16-20, 2012: Internals and Performance in Chicago, IL
  • IEBI: Apr 16-20, 2012: Business Intelligence in Chicago, IL
  • IE2: Apr 23-27, 2012: Performance Tuning in Chicago, IL
  • IED: Apr 23-27, 2012: Developer Immersion in Chicago, IL
  • IE1: Aug 6-10, 2012*: Internals and Performance in Bellevue, WA
  • IE2: Aug 13-17, 2012*: Performance Tuning in Bellevue, WA
  • IED: Aug 13-17, 2012*: Developer Immersion in Bellevue, WA
  • IE3: Aug 20-24, 2012*: High Availability/Disaster Recovery in Bellevue, WA
  • IEBI: Aug 20-24, 2012*: Business Intelligence in Bellevue, WA
  • IE4: Aug 27-31, 2012*: Development Support (Security/Powershell/Opt Proc) in Bellevue, WA
  • IE1: Oct/Nov** – In Newark, NJ and/or possibly Atlanta, GA
  • IE2: Oct/Nov** – In Newark, NJ and/or possibly Atlanta, GA
  • * - these classes are confirmed with hotel contracts signed and will open for registration at the start of February 2012
  • ** - these classes are not confirmed and the dates/locations may change.

Hopefully this will work for some of you and allow you not to lose that precious budget!

Categories:
Classes | Training

We're continuing to add to our 2012 class roster (see the Tampa classes in February/March) and we're coming back to Chicago by popular demand in Spring 2012! We're teaching our first two 5-day Immersion Events back-to-back in Chicago plus we're co-locating our Immersion Event on Business Intelligence along with week 1, and our Immersion Event for Developers along with week 2.

We don't allow anyone else to license and teach our material so you're guaranteed the best instruction possible - unbeatable ROI for your training budget!

All classes are 5 days from 8.30 to 5.30 and include catered breakfast and lunch every day. The early bird price for each class is US$2,995.

We hope to see you there!

Here are the details:

Chicago, IL, April 16-20, 2012

IE1: Immersion Event on Internals and Performance

  • Data Storage Internals, Designing for Performance, and Indexing for Performance 
  • Instructors: Paul S. Randal, Kimberly L. Tripp
  • Full details and registration link HERE

IEBI: Immersion Event on Business Intelligence 

  • BI Overview, Integration Services, Analysis Services, Reporting Services and more
  • Instructors: Stacia Misner 
  • Full details and registration link HERE

Chicago, IL, April 23-27, 2012

IE2: Immersion Event on Performance Tuning

  • IO Subsystems, Workload Analysis, and Performance Tuning Methodologies 
  • Instructors: Paul S. Randal, Kimberly L. Tripp, Jonathan Kehayias, Joe Sack
  • Full details and registration link HERE

IE1: Immersion Event on Internals and Performance

  • Using T-SQL, CLR, Transactions, Optimizing Procedural Code, XML and more
  • Instructors: Bob Beauchemin 
  • Full details and registration link HERE

Categories:
Classes | Training

Don't expect any activity here for at least a couple of weeks as we're leaving today for a long-overdue dive trip (on the Belize Aggressor) - our first diving for almost 18 months. I'll post lots of photos in December...

In the meantime, please go run the index survey code from my last post - the more data the better!

Cheers

Jonathan Kehayias is such a nice guy. After our recent perf tuning Immersion Event in Chicago last month he had an idea for each of us to pick someone who's attended a SQLskills class and offer general mentoring (career, technical, professional development - whatever) to them for a few hours a month for six months. After six months we each pick someone else. His thinking was that this would be a cool way for SQLskills to give back to the community outside of the purely company-related things we do like blogging and twitter.

I think this is a really great idea. Back when I was at Microsoft I did a lot of mentoring when I managed teams there, plus in the company-wide mentoring program between senior and junior people in different groups. It's very rewarding to provide completely altruistic advice to someone and watch them grow.

Furthermore, we all have had people help us with our careers so it's important to pay that help forward.

So we're doing it, starting today with me, Joe, and Jonathan (Kimberly will join in before the end of the year as well).

Joe is going to mentor Luke Jian, who's attended our IE1 and IE2 classes in Chicago this year:

Luke Jian is a well-versed IT professional with over 15 years of experience in the development and implementation of pharmaceutical and healthcare systems, demand driven supply chain management, ERP systems and technical infrastructure design.

Luke's current role is Sr. Solutions Architect with Physicians Interactive a leading resource for healthcare information, medication samples and medical decision support tools. Luke's experience include managing the IT Operations of over 50 clinics,  Oracle DBA for a supply chain software company, Teaching Assistant and Trainer.

Luke holds a Masters degree in software engineering from “Politehnica” University  of Timisoara, Romania with concentration on relational databases and  is fluent in English, Romanian, French and German. He became a US citizen in June 2011, the same week as his first public speaking engagement at SQL Saturday #82 in Indianapolis.

Luke writes at http://blog.sqlpositive.com  and he can be reached at sensware@gmail.com or on Twitter as @sensware.

Jonathan is going to mentor Steven Ormrod, who's attended our IE1, IE2, IE3, and IE4 classes in Dallas, Chicago, and Bellevue this year:

Steven Ormrod has been working as a Database Administrator for an international purveyor of natural and organic foods for the past several years.  Recently, Steven accepted a position with a global provider of orthotics and prosthetics.  His environment contains hundreds of servers spread across three different countries.  Clustering and consolidation projects have been his primary focus.

He has an MCITP for SQL Server 2008 in Database Administration and Development.

Prior to working as a DBA he has been a software developer, system administrator, and a teacher.  He also spent a summer bartending and hitchhiking across Europe.

When he is not tinkering with technology, he enjoys cooking, traveling, and snorkeling. He blogs at http://sqlavenger.wordpress.com/ and is on Twitter as @sqlavenger.

I'm going to mentor Brad Hoff, who's attended our IE1 and IE2 classes in Chicago this year:

Brad is originally from WA and is working in TX as a Lead SQL and Oracle DBA for an international power development company and energy marketer.

Brad has a degree is in Electronic Engineering and his background includes Development, Network/Systems/SAN admin, and WAN/T-Comm.

When Brad's not working with SQL server, he enjoys philosophy, debate, brain-teasers, learning, anything that challenges his mind. He has a beautiful wife and an awesome 17-month-old son. He blogs at http://www.sqlphilosopher.com/wp/ and is on Twitter as @sqlphilosopher.

I'd like to congratulate these three guys and look forward to us helping them out over the next six months!

Categories:
Career | General

Continuing with my "index health" series, I've got another piece of code for you to run.

This time I'm interested in the number of columns in your clustered indexes and the consequent amount of nonclustered index space used by the clustered index keys.

Again, you're going to be really interested to see the results on your servers. When I editorialize the results I'll provide another query for you to run which will make the data actionable on your server.

Here are some results from a random customer server (yes, we already knew about these - long story :-):

NCIndexes ClusterKeys KeyWidth TableRows            KeySpaceInBytes
--------- ----------- -------- -------------------- --------------------
7         3           16       129902437            14549072944
1         3           12       29199817             350397804
10        2           12       1612919              193550280
5         2           5        4266671              106666775
2         2           8        5887697              94203152
5         4           20       827975               82797500
3         3           16       1215800              58358400
7         2           5        1497746              52421110
1         3           12       2667765              32013180
1         4           25       1033063              25826575
1         3           12       989320               11871840
2         2           8        278989               4463824
1         3           12       293736               3524832
4         2           5        160696               3213920

Feel free to send the results in any format you want - Excel spreadsheet works best though. Try not to add any columns to the result set - complicates the aggregation process.

The more results the better - thanks!

Here's the code:

IF EXISTS (SELECT * FROM tempdb.sys.objects WHERE [name] = 'SQLskillsIKSpace')
    DROP TABLE tempdb.dbo.SQLskillsIKSpace;
GO
CREATE TABLE tempdb.dbo.SQLskillsIKSpace (
    DatabaseID SMALLINT,
    ObjectID INT,
    IndexCount SMALLINT,
    TableRows  BIGINT,
    KeyCount   SMALLINT,
    KeyWidth   SMALLINT);
GO

EXEC sp_MSforeachdb 
    N'IF EXISTS (SELECT 1 FROM (SELECT DISTINCT [name]
    FROM sys.databases WHERE [state_desc] = ''ONLINE''
        AND [database_id] > 4
        AND [name] != ''pubs''
        AND [name] != ''Northwind''
        AND [name] != ''distribution''
        AND [name] NOT LIKE ''ReportServer%''
        AND [name] NOT LIKE ''Adventure%'') AS names WHERE [name] = ''?'')
BEGIN
USE [?]

INSERT INTO tempdb.dbo.SQLskillsIKSpace
SELECT DB_ID (''?''), o.[object_id], 0, 0, 0, 0
FROM sys.objects o
WHERE o.[type_desc] IN (''USER_TABLE'', ''VIEW'')
    AND o.[is_ms_shipped] = 0
    AND EXISTS (
        SELECT *
        FROM sys.indexes
        WHERE [index_id] = 1
            AND [object_id] = o.[object_id]);

UPDATE tempdb.dbo.SQLskillsIKSpace
SET [TableRows] = (
    SELECT SUM ([rows])
    FROM sys.partitions p
    WHERE p.[object_id] = [ObjectID]
    AND p.[index_id] = 1)
WHERE [DatabaseID] = DB_ID (''?'');
 
UPDATE tempdb.dbo.SQLskillsIKSpace
SET [IndexCount] = (
    SELECT COUNT (*)
    FROM sys.indexes i
    WHERE i.[object_id] = [ObjectID]
    AND i.[is_hypothetical] = 0
    AND i.[is_disabled] = 0
    AND i.[index_id] != 1)
WHERE [DatabaseID] = DB_ID (''?'');

UPDATE tempdb.dbo.SQLskillsIKSpace
SET [KeyCount] = (
    SELECT COUNT (*)
    FROM sys.index_columns ic
    WHERE ic.[object_id] = [ObjectID]
    AND ic.[index_id] = 1)
WHERE [DatabaseID] = DB_ID (''?'');

UPDATE tempdb.dbo.SQLskillsIKSpace
SET [KeyWidth] = (
    SELECT SUM (c.[max_length])
    FROM sys.columns c
    JOIN sys.index_columns ic
    ON c.[object_id] = ic.[object_id]
    AND c.[object_id] = [ObjectID]
    AND ic.[column_id] = c.[column_id]
    AND ic.[index_id] = 1)
WHERE [DatabaseID] = DB_ID (''?'');

DELETE tempdb.dbo.SQLskillsIKSpace
WHERE
    ([KeyCount] = 1 AND [KeyWidth] < 9)
    OR [IndexCount] = 0 OR [TableRows] = 0;

END';
GO

SELECT
    [IndexCount] AS [NCIndexes],
    [KeyCount] AS [ClusterKeys],
    [KeyWidth],
    [TableRows],
    [IndexCount] * [TableRows] * [KeyWidth] AS [KeySpaceInBytes]
FROM tempdb.dbo.SQLskillsIKSpace
ORDER BY [KeySpaceInBytes] DESC;

DROP TABLE tempdb.dbo.SQLskillsIKSpace;
GO

Yesterday I blogged about how having too few or too many nonclustered indexes can be a big problem for performance (see here). Today I'm posting some code you can run which will print out the number of indexes for each table in each database on an instance.

I made it print a result set per database so the table in msdb doesn't get too big having to store schema and object names. You can get the code below and here: NCIndexCounts.zip (870.00 bytes).

I also created a version that *does* store names and prints a single result set. You can get that one here: NCIndexCountsSingleResultSet.zip (1.41 kb).

There's probably a niftier way to do this but I'm not a T-SQL expert :-)

Enjoy!

/*============================================================================
  File:     NCIndexCounts.sql

  Summary:  Nonclustered index counts (multiple result sets)

  SQL Server Versions: 2005 onwards
------------------------------------------------------------------------------
  Written by Paul S. Randal, SQLskills.com

  (c) 2011, SQLskills.com. All rights reserved.

  For more scripts and sample code, check out
   
http://www.SQLskills.com

  You may alter this code for your own *non-commercial* purposes. You may
  republish altered code as long as you include this copyright and give due
  credit, but you must obtain prior permission before blogging this code.
 
  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================*/

IF EXISTS (SELECT * FROM msdb.sys.objects WHERE [name] = 'SQLskillsPaulsIndexCounts')
    DROP TABLE msdb.dbo.SQLskillsPaulsIndexCounts;
GO
CREATE TABLE msdb.dbo.SQLskillsPaulsIndexCounts (
    SchemaID INT,
    ObjectID INT,
    BaseType CHAR (10),
    IndexCount SMALLINT);
GO

EXEC sp_MSforeachdb
    N'IF EXISTS (SELECT 1 FROM (SELECT DISTINCT [name]
    FROM sys.databases WHERE [state_desc] = ''ONLINE''
        AND [database_id] > 4
        AND [name] != ''pubs''
        AND [name] != ''Northwind''
        AND [name] != ''distribution''
        AND [name] NOT LIKE ''ReportServer%''
        AND [name] NOT LIKE ''Adventure%'') AS names WHERE [name] = ''?'')
BEGIN
USE [?]
INSERT INTO msdb.dbo.SQLskillsPaulsIndexCounts
SELECT o.[schema_id], o.[object_id], ''Heap'', 0
FROM sys.objects o
WHERE o.[type_desc] IN (''USER_TABLE'', ''VIEW'')
    AND o.[is_ms_shipped] = 0
    AND EXISTS (
        SELECT *
        FROM sys.indexes
        WHERE [index_id] = 0
            AND [object_id] = o.[object_id]);

INSERT INTO msdb.dbo.SQLskillsPaulsIndexCounts
SELECT o.[schema_id], o.[object_id], ''Clustered'', 0
FROM sys.objects o
WHERE o.[type_desc] IN (''USER_TABLE'', ''VIEW'')
    AND o.[is_ms_shipped] = 0
    AND EXISTS (
        SELECT *
        FROM sys.indexes
        WHERE [index_id] = 1
            AND [object_id] = o.[object_id]);
           
UPDATE msdb.dbo.SQLskillsPaulsIndexCounts
SET [IndexCount] = (
    SELECT COUNT (*)
    FROM sys.indexes i
    WHERE i.object_id = [ObjectID]
    AND i.[is_hypothetical] = 0)

IF EXISTS (SELECT * FROM msdb.dbo.SQLskillsPaulsIndexCounts)
SELECT
    ''?'' AS [Database],
    SCHEMA_NAME ([SchemaID]) AS [Schema],
    OBJECT_NAME ([ObjectID]) AS [Table],
    [BaseType],
    (CASE
       WHEN [IndexCount] = 0 THEN 0
       ELSE [IndexCount]-1 END)
    AS [NCIndexes]
FROM msdb.dbo.SQLskillsPaulsIndexCounts
ORDER BY [BaseType] DESC, IndexCount;

TRUNCATE TABLE msdb.dbo.SQLskillsPaulsIndexCounts;
END';
GO

DROP TABLE msdb.dbo.SQLskillsPaulsIndexCounts;
GO

Back at the start of August I kicked off a survey (see here) that gave you some code to run to produce an aggregate list of the number of tables on your server with different numbers of nonclustered indexes. I got back results from more than 1000 servers across the world - a big thank you to everyone who sent me data!

It's taken me a while to get to this post because a) I needed a few hours to set aside to aggregate the comments, txt files, and spreadsheets that people sent and load them into a database; and b) I've been really busy with teaching etc. Finally I've had time this week while at SQL Connections in Las Vegas to put together the results and this post.

The winners:

  • Highest number of nonclustered indexes on a single clustered index: 1032
  • Highest number of nonclustered indexes on a single heap: 148
  • Highest number of clustered indexes with zero nonclustered indexes on a single server: 185237
  • Highest number of heaps with zero nonclustered indexes on a single server: 88042

Wow!

Now to some of the details...

Tables with Zero Nonclustered Indexes

The two graphs below show the number of servers that have a certain number of tables with zero nonclustered indexes.

 

For the clustered indexes, there is one case I can think of where having zero nonclustered indexes is acceptable: if all queries return all columns of the table and the query search predicate for all queries is the cluster key (or a left-based subset of the cluster key).

All queries that have a search predicate that does not match the cluster key (or a left-based subset thereof) will be table scans, which can put pressure on the buffer pool (see my post on Page Life Expectancy) and lead to contention on the ACCESS_METHODS_DATASET_PARENT latch (all manifesting as a high percentage of LATCH_EX or PAGEIOLATCH_SH and maybe CXPACKET waits).

For the heaps, all queries are inefficient table scans. Well, efficient if you're returning all the rows in the table every time, I suppose :-)

Bottom line: tables usually need nonclustered indexes to provide efficient access paths to the data requested by the various queries that your workload performs.

There are two things you can do to help find queries that need nonclustered indexes:

  1. Use the missing index DMVs (cautiously!) to determine which nonclustered indexes to create. I use the script posted by Microsoftie Bart Duncan in his blog post. However don't just go create all the indexes there. I generally look for Bart's "improvement_measure" column to be above 100k before I'll consider recommending the index to a client (and on systems that already have nonclustered indexes on the table, I'll look for index consolidation possibilities). Note also that the missing index DMVs will sometimes add the cluster key as an INCLUDEd column. This is unnecessary but harmless.
  2. Look directly in the plan cache to find query plans that perform scans. I use a variant of a query published by fellow MVP Glenn Berry in this blog post. Using the graphical query plan I can see what columns are being searched for and returned from the scans and then create the correct nonclustered indexes for these.

You can also use the Database Tuning Advisor, but I don't use that, personally.

You'll be amazed at the performance difference by having a good set of nonclustered indexes.

But don't go overboard otherwise you could detrimentally affect performance by having too many nonclustered indexes...

Tables with Nonclustered Indexes

The two graphs below show the number of tables that have a certain number of nonclustered indexes.

 

The data shows that it is most common to have 10 nonclustered indexes or less, but even that may be too many.

Every nonclustered index incurs overheard when a table row is inserted or deleted, or when any of the nonclustered index key columns (or INCLUDEd columns) are updated. Filtered indexes in SQL 2008+ are a special case, obviously. The overheard takes a few forms:

  • Buffer pool (i.e. memory and I/O) overhead of having to search the nonclustered index for the record to update.
  • I/O overhead of having to flush the updated index page to disk during the next checkpoint
  • Log space for the log records generated by the operation on the nonclustered index
  • Resource overheard for those log records in terms of:
    • Time to be read by the replication/CDC log reader Agent job
    • Time to be read by log backups (and data backups, if applicable)
    • Time and bandwidth to send the log records to a database mirroring mirror
    • Disk space to store the log records in a log backup
    • Time to restore the log records on a log shipping secondary or during a disaster recovery
  • Locking overhead
  • Page split overhead
  • Time to consistency check
  • Time to examine for fragmentation
  • Time to update statistics
  • Disk and backup space overhead

As you can see, nonclustered indexes can be a big burden on a system - you have to be careful when creating them so that you don't have too many.

There are three things you can do to reduce the number of nonclustered indexes on your system:

  1. Use the sys.dm_db_index_usage_stats DMV to find indexes that are only being updated. I've blogged about this here. Again, be careful though. Just because an index hasn't been used doesn't mean it should be dropped. It may be used only infrequently, but it's critical when it is used. Ideally you need to look at the output from the DMV after an entire business cycle has passed. Even then, be careful about dropping indexes that are enforcing uniqueness constraints as these can be used by the query optimizer without reflecting any user seeks or scans.
  2. Remove duplicate nonclustered indexes. Kimberly blogged code to find duplicate indexes here. There is no downside to doing this.
  3. Look for consolidation possibilities. Kimberly has code to show you all the key and INCLUDEd columns here. This is harder and is more of an art than a science. You're looking for indexes where you can combine two or more indexes into one without affecting the ability of the optimizer to use them for the various queries that the non-consolidated indexes used to help.
    • For example, an index on c1, c2, c3 INCLUDE c4, c5 can be combined with an index on c1, c2, c3 INCLUDE c4, c6. But only as long as c6 isn't a really wide column that would affect the performance of the queries using the first index.
    • A harder example: would you consolidate an index c1, c2 INCLUDE c3 with an index on c1, c3 INCLUDE c2? Possibly. It would depend on what the indexes are being used for in queries. 

Summary

Nonclustered indexes are essential for the performance of most workloads, but how many should you have? I often get someone in a class that Kimberly's teaching on indexes to ask her "what's the optimum number of indexes a table should have?" because I know it's a nonsensical question. (And she reciprocates by getting someone to ask me "how long will CHECKDB take?" :-)

The answer is a big, fat "it depends" - and hopefully I've given you some pointers to figure it out for yourself.

I'll continue this series of posts with more surveys and code that you can use on your systems to gauge the health of your indexes.

Hope this helps!

There's a lot of controversy about the Buffer Manager performance object counter Page Life Expectancy - mostly around people continuing to quote 300 as a threshold for starting to worry about there being a problem (which is just utter nonsense these days). That's far too *low* to be the point at which to start worrying if your PLE dips and stays there. Jonathan came up with a better number to use - based on the size of your buffer pool - see the bottom of his post here.

But that's not why I'm writing today: I want to explain why most of the time now Page Life Expectancy is really not giving you useful information.

Most new systems today use NUMA, and so the buffer pool is split up and managed per NUMA node, with each NUMA node getting it's own lazy writer thread, managing it's own buffer free list, and dealing with node-local memory allocations. Think of each of these as a mini buffer pool.

The Buffer Manager:Page Life Expectancy counter is calculated by adding the PLE of each mini buffer pool and then getting the average.

What does this mean? It means that the overall PLE is not giving you a true sense of what is happening on your machine as one NUMA node could be under memory pressure but the *overall* PLE would only dip slightly. One of my friends who's a Premier Field Engineer and MCM just had this situation today, which prompted this blog post. The conundrum was how can there be 100+ lazy writes/sec occuring when overall PLE is relatively static - and this was the issue.

For instance, for a machine with 8 NUMA nodes, with PLE of each being 4000, the overall PLE is 4000. If one of them drops to 1000, the overall PLE only drops to 3625, which likely wouldn't trigger your alerting as it hasn't even dropped 10%.

On NUMA machines, you need to be looking at the Buffer Node:Page Life Expectancy counters for all NUMA nodes otherwise you're not getting an accurate view of buffer pool memory pressure and so could be missing performance issues. And adjust Jonathan's threshold value according to the number of NUMA nodes you have.

Unfortunately there's no easy way to monitor per-NUMA node lazy writes - maybe in future.

[Edit: thanks to an email thread with Trayce Jordan from Microsoft and Jonathan, you can see the lazywriter activity for each NUMA node by looking for the lazywriter threads in sys.dm_exec_requests.]

Hope this helps!

Yes, it's that time again. We're just about to have the Fall show so its time to start planning for the Spring 2012 show. Once again Kimberly and I will be Co-Chairs of the Conference and we'd like to invite you to submit abstracts for the Spring 2012 SQL Connections conference, to be held in Las Vegas, March 26th - 29th, 2012. Pre-cons will be March 25th, post-cons on March 30th.

The conference will focus heavily on SQL Server 2012. Abstracts are still welcome on best practices for SQL Server 2005 - 2008 R2 and how to upgrade and migrate applications from SQL Server 200x to SQL Server 2008/2008R2. However, we will consider all topics - from design to performance to troubleshooting to architectures to programming. You should aim for the intermediate to advanced audience for any SQL Server 2005-2008 R2 materials and for SQL Server 2012 we'll consider any level - especially depending on the newness of the release.

The themes to consider for your abstracts are:

  • Design & Architecture
  • Performance
  • Troubleshooting
  • Monitoring
  • Best Practices
  • Myths and Misconceptions
  • Understanding/Introduction To...

For the session level, use the following examples as a guideline:

  • 100 - Beginner (e.g. what does 'corruption' mean?
  • 200 - Intermediate (e.g. what do I do when corruption is detected?)
  • 300 - Advanced (e.g. how do I do take advantage of partial database availability and online piecemeal restore?)
  • 400 - Master (e.g. how can I fix broken system tables using the DAC and server single-user mode?)
  • 500 - SQL Server Internals (e.g. how does the read-ahead in DBCC CHECKDB differ from regular adaptive range-scan read-ahead?)

For submitting session abstracts, please use this URL: http://www.deeptraining.com/devconnections/abstracts

The tool will be open from now until to November 1st, after which we won't accept any abstracts - no exceptions. If you have used this site before and have forgotten your password, you can have your password emailed to you. It's better to do this then to create a new account.

Please keep the abstracts under 200 words each and in one paragraph. Please do not use bulleted items or line breaks, and please use a spell-checker. If you have an issue with the site, please follow up with an email to paul@sqlskills.com.

Please create a new speaker record (if you haven't submitted before) and submit at least 3 abstracts; however, it will help your chances of being selected if you submit 5 or 6 abstracts. Please do not submit abstracts for sessions that you have previously presented at the Spring 2011 show. We must accept a minimum of three of your abstracts for you to be considered as one of the speakers - there are 7 open speaker slots, not including Kimberly and myself.

What you will get if selected:

  • $500 per conference talk. (Additional compensation for pre/post conference workshops.)
  • Coach airfare and hotel stay paid by the conference
  • Free admission to all of the co-located conferences
  • Speaker party
  • The adoration of attendees
  • etc. 

Thanks for your continued support for our SQL Connections conference and we look forward to seeing your bright and shiny *new* abstracts!

Thanks,
Kimberly L. Tripp & Paul S. Randal
SQL Connections Conference Chairs

(Comments are disabled for this post - send email with questions.)

Categories:

We've started to confirm our 2012 classes, and we're coming to the south-east of the US by popular demand in Spring 2012! We're teaching three 5-day classes back-to-back in Tampa, FL so not only can you get your learning on, you can also escape the winter weather and get your shorts and sunglasses on too!

We don't allow anyone else to license and teach our material so you're guaranteed the best instruction possible - unbeatable ROI for your training budget!

All classes are 5 days from 8.30 to 5.30 and include catered breakfast and lunch every day. The early bird price for each class is US$2,995.

We hope to see you there!

Here are the details:

Tampa, FL, February 27-March 2, 2012

IE1: Immersion Event on Internals and Performance

  • Data Storage Internals, Designing for Performance, and Indexing for Performance 
  • Instructors: Paul S. Randal, Kimberly L. Tripp
  • Full details and registration link HERE

Tampa, FL, March 5-9, 2012

IE2: Immersion Event on Performance Tuning

  • IO Subsystems, Workload Analysis, and Performance Tuning Methodologies 
  • Instructors: Paul S. Randal, Kimberly L. Tripp, Jonathan Kehayias, Joe Sack
  • Full details and registration link HERE

Tampa, FL, March 12-16, 2012

IE3: Immersion Event on High Availability and Disaster Recovery

  • HA/DR strategy, replication, clustering, mirroring, virtualization, consolidation, corruption 
  • Instructors: Paul S. Randal, Kimberly L. Tripp, Jonathan Kehayias, Joe Sack
  • Full details and registration link HERE

Categories:
Classes | Training

Days like this don’t come around very often for us here at SQLskills.com – we’re expanding again! It’s been seven months since Jonathan Kehayias came on board and we’ve had a huge amount of fun together. Now business has expanded to the point where we need to grow again so it’s time to hire the next member of our close-knit, expert team.

Specifically, we’ve asked Joseph Sack to join us and we’re extremely pleased that he accepted our offer. He’ll become employee #4 when he starts with us on Monday, October 3rd.

 

Joe has worked in the SQL Server space since 1997, and we first met him back in 2006 when he was a Premier Field Engineer at Microsoft and he was in one of the early SQL Server 2005 Microsoft Certified Master (MCM) rotations – which he passed! He then went on to get the SQL 2008 MCM certification and finally took over responsibility for the whole SQL Server MCM program within Microsoft from 2009 to 2011.

It was while Joe was running the SQL MCM program that we got to know Joe really well and gain an appreciation for his extensive SQL Server knowledge and enterprise consulting expertise, as well as his passion for developing content and helping people learn – whether as a consultant or as a teacher.

In fact we respect Joe so much that recently we asked if he’d like to get back into the wild and varied consulting life by leaving Microsoft to join us. We feel honored that he agreed. He brings a wealth of performance troubleshooting, development, scalability and architecture expertise to the team and maintains the high bar we have of all team members either having MCM certification or being MCM instructors (or both!).

With our increased full-time consulting team we’re now able to meet the burgeoning demand for our services, including our new Remote DBA Service that we announced earlier this week. Joe will be blogging on SQLskills.com and hanging out on Twitter (@JosephSack) and in the community with the rest of us.

As you can tell, we’re very excited to have Joe join our team!

Thanks as always,

Paul and Kimberly

Categories:
Consulting | General

I bet you'd love the answer to be "SQLskills.com"? Well now it can be.

Over the last few months some of our clients for whom we've performed SQL Server health checks on their critical servers have asked us if we'd consider a regular service where we perform mini-health checks on those same servers and also permanently monitor the servers for anything untoward happening.

Now that we're expanding our team of world-class SQL experts again (more details on Friday!) we have the capacity to do this, so today I'm announcing a new "remote DBA" service.

Once we've completed an initial health check of the SQL Server instances you'd like us to look after, we'll install a monitoring package that will alert you and us by email if anything out-of-the-ordinary happens (in response to which we'll log in and see what's up) and on a regular interval (weekly, bi-weekly, monthly, or quarterly - whatever works for you and your budget) we'll log in and perform a mini health check to ensure that things are ticking along nicely. We can log in with or without you being present - completely up to you - and of course we'll be under complete NDA. And we can do it anywhere in the world.

We're the company that shows people how to be great DBAs and we solve the nasty performance/SAN/design problems that others give up on.

Why not take the pressure off, get some peace of mind, and let us be *your* DBAs?

Exorbitantly expensive? Not at all - you're paying for hours we work, not a massive retainer.

If you're interested in finding out more details, send me an email.

It may just be the best investment you make all year.

This is a question that came up today on Twitter, and is actually something I've been meaning to blog about.

One of the biggest space hogs in tempdb can be DBCC CHECKDB. It generates all kinds of information about what it's seeing in the database (called facts) and stores them in a giant worktable. A fact may be something like 'we read page F' or 'record X on page Y points to an off-row LOB column in record A of page B' or it could be something like an entire IAM page bitmap. It is usually the case that the amount of memory required for the worktable is more than is available to store it in memory and so the worktable spills out to tempdb.

CHECKDB needs to use this fact generation method because it doesn't read the data file pages in any kind of logical or depth-first order - it reads them in allocation order, which is the fastest way. In fact it spawns multiple threads and each thread reads a set of pages, which is why I/O performance is sucked down while CHECKDB is running - its special readahead drives the I/O subsystem as hard as it can. As each thread is generating all the facts, it hands them to the query processor which sorts them by a key CHECKDB defines (page ID, object ID, index ID etc) and inserts them into the worktable.

Once fact generation has finished, the query processor then gives the facts back to CHECKDB again so that it can match them up (e.g. page X points to page Y, so we better have seen page Y) - called the aggregation phase. If any mismatched or extra facts are found, that indicates a corruption and the relevant error is generated.

Here's a picture of the process:

Now - because CHECKDB can use up so much tempdb space, we put in a way to ask CHECKDB to estimate how much tempdb space will be required - it's called WITH ESTIMATEONLY. The output looks something like:

Estimated TEMPDB space needed for CHECKALLOC (KB)
-------------------------------------------------
56
(1 row(s) affected)

Estimated TEMPDB space needed for CHECKTABLES (KB)
--------------------------------------------------
3345
(1 row(s) affected)

DBCC execution completed. If DBCC printed error messages, contact your system administrator.

This isn't the total amount of space required to check the entire database, because CHECKDB never checks the entire database in one go. To try to limit the amount of tempdb space required, it breaks the database down into batches. Batches are built by adding in more and more tables until one of the following limits is reached:

  • There are 512 or more indexes in the batch
  • The total estimate for facts for this batch is more than 32MB

The smallest possible batch is one table and all its indexes - so a very large table may easily blow the 32MB fact limit (or theoretically the 512 index limit).

The fact size estimation is calculated by looping through all partitions of all indexes3 for a table (remember a non-partitioned table or index is a special case of a single partition as far as SQL Server is concerned) and adding up:

  • Twice the sum of all pages allocated to the partition (HoBt, LOB, and SLOB)
  • Three times the number of HoBt pages in the clustered index
  • Two times the number of LOB columns in the table
  • Two times the number of tables rows, if a heap
  • Maximum row size times the number of HoBt pages

And these totals are multiplied by the sizes of the relevant facts.

The WITH ESTIMATEONLY option runs through all the batches and spits out the largest total fact estimate from all the batches. It's supposed to be a very conservative estimate, but certain pathological cases can trip it up as it's can't account for all possible schemas.

One more piece of knowledge I can safely page-out of my head now! :-)

PS Beware also that I've heard of several cases of a bug in SQL Server 2008 R2 where the output is incorrectly very low. The dev team is aware of this issue and are working on it.

Theme design by Nukeation based on Jelle Druyts