Bert Wagner (b/t) is hosting T-SQL Tuesday #104. The invitation is to write about code you’ve written that you would hate to live without. For me, this is almost a no-brainer!
My DMV Diagnostic Queries represent a lot of code that I would hate to live without. I use them on a daily basis to gather information about SQL Server instances and databases and to help more quickly understand what configuration and performance issues they have. I’ve been publicly posting these queries since 2009, but I actually started developing them for my own personal use back in 2006. The story about how they came about is kind of interesting…
Back in about August of 2006, I was the sole DBA for NewsGator, which was (at that time) an RSS aggregation company. Our main product/service was the ability to let people “subscribe” to RSS feeds for web sites and blogs, and then have us download the RSS feeds of those sites. We would also manage the “read state” of the RSS feeds that you subscribed to, so that as you read through your subscribed content and marked posts as read, we would synchronize your progress across different devices.
I had only been at the company about three months, and we had recently migrated from 32-bit SQL Server 2000 to 64-bit SQL Server 2005 SP1 on a two-node FCI running on new hardware. Performance had been pretty good since the migration, and it was about 4:30PM on a Friday afternoon, when I started making some final quick checks of the health of my instance before getting ready to leave for the weekend.
I noticed that my CPU utilization was running about 90-95%, which was much higher than normal. I tried a few of my standard DBA tricks (at that time) to correct the issue, such as running sp_updatestats, running DBCC FREEPROCCACHE, etc. with no real improvement. I even took the emergency step of “shutting down” the content servers (which were application servers that downloaded the RSS feeds, that typically generated about 90% of my database load). This had no appreciable effect on my CPU utilization.
By now, I was getting worried, since we had a problem that I did not immediately know how to diagnose and correct. By this time, our support team and many of the senior executives in the company were aware that we had a problem because our applications were starting to time out and throw errors. I had a literal parade of different people coming to my desk asking some variation/combination of “What’s wrong with the database?” or “What can we do to help?”.
This got so bad that the CTO/Founder of NewsGator (Greg Reinacker) grabbed a large rolling whiteboard, and wrote something like “Glenn knows there is a problem. He is working on it. Please leave him alone”, which was actually pretty helpful.
So after some time, it ended up being just me and the best developer on the Platform Team (Jeff Tingley) staying late into the night and next morning, on a call with Microsoft Premier Support trying to diagnose and troubleshoot the issue. Eventually, we figured out that our problem was mainly caused by parameter sniffing in one stored procedure where we were getting one very inefficient plan stuck in the plan cache.
The short-term fix was to use a local variable to store an input parameter for that stored procedure to disable parameter sniffing for that stored procedure, and to periodically recompile a few other stored procedures that were also part of the problem. Jeff and I finally left around 3AM, with the system being relatively stable. I was exhausted from the time and the stress of feeling like the fate of the company rested on my shoulders. I was convinced that I was in big trouble and was possibly going to be fired since it had taken us so long to figure out the problem. Little did I know…
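As a rough sketch of what that kind of short-term fix can look like (this is not the actual NewsGator code; the procedure, table, and column names below are hypothetical), the input parameter is copied into a local variable and the query filters on the local variable instead. Because the optimizer cannot sniff the value of a local variable when it compiles the plan, it falls back to an average-density estimate rather than building a plan around one specific parameter value.

```sql
-- Hypothetical procedure, table, and column names, for illustration only.
CREATE PROCEDURE dbo.GetPostsForFeed
    @FeedId INT
AS
BEGIN
    SET NOCOUNT ON;

    -- Copy the input parameter into a local variable.
    -- The optimizer cannot see the local variable's runtime value at compile
    -- time, so the cached plan is not tied to one sniffed parameter value.
    DECLARE @LocalFeedId INT;
    SET @LocalFeedId = @FeedId;

    SELECT PostId, Title, PublishedDate
    FROM dbo.Posts
    WHERE FeedId = @LocalFeedId;
END;
GO

-- Mark another (hypothetical) procedure so that it gets a fresh plan
-- the next time it is executed.
EXEC sp_recompile N'dbo.GetUnreadCounts';
GO
```

The trade-off is that the "average" plan may be worse for some parameter values than a sniffed plan would be, which is why this is usually treated as a stopgap rather than a permanent fix.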
As it turned out, my boss’s boss decided to give both Jeff and me a $500.00 bonus, plus we got a big round of applause at a company meeting the next Monday (which I appreciated much more).
This incident was the genesis of my DMV Diagnostic queries. I never wanted to be in that situation again! Anytime there was any application slowdown, people always assumed that it was a database problem (which was not always the case). Having a set of queries that I could run to figure out what was going on with the database and database server was the key to being able to answer the “What’s wrong with the database?” question.
Many thousands of people around the world use my queries on a regular basis, and seem to find them very useful, at least based on the feedback I have gotten over the years. Now you know the story of how they came into being.
3 thoughts on “T-SQL Tuesday #104: Code I Have Written That I Would Hate to Live Without”
May I ask which query out of your DMV Diagnostic Queries would uncover or diagnose the parameter sniffing problem that you mention in the story? Cause I’m not aware of any. Thank you.
A number of the queries in the set would have been useful to diagnose that problem. For example, there are queries that show the most expensive queries and stored procedures for total worker time. There is one that shows elapsed times, with execution time variability. With SQL Server 2016 and newer, we can use a couple of queries that leverage Query Store to find something like that.
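As a rough illustration of the kind of query described above (a simplified sketch, not the exact text of any query in the DMV Diagnostic Queries set), the following lists cached stored procedures in the current database by total worker time and includes the minimum and maximum elapsed times. It assumes SQL Server 2008 or newer, where sys.dm_exec_procedure_stats is available. A procedure with a large gap between its minimum and maximum elapsed time is one possible sign of a parameter-sensitive plan.

```sql
-- Simplified sketch: top cached stored procedures by total worker (CPU) time
-- in the current database. Time columns are reported in microseconds.
SELECT TOP (25)
    p.name AS [SP Name],
    ps.execution_count,
    ps.total_worker_time,
    ps.total_worker_time / NULLIF(ps.execution_count, 0) AS avg_worker_time,
    ps.min_elapsed_time,
    ps.max_elapsed_time,
    ps.cached_time
FROM sys.procedures AS p
INNER JOIN sys.dm_exec_procedure_stats AS ps
    ON p.object_id = ps.object_id
WHERE ps.database_id = DB_ID()
ORDER BY ps.total_worker_time DESC;
```

The numbers only cover plans that are still in the plan cache, so they reset when a plan is evicted or the instance restarts.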
Loved reading that story, Glenn. Thank you for your dedication to your DMV Diagnostic Queries and for the SQL Family at large. They have helped me many times.