Last week I visited the .Net User Group in NY where .NetRocks was recording as part of their Visual Studio Road Trip…
What a great time and a great group! Always fun visiting NY but even more fun when I present to a group that really gets into the topic. I guess I had something to do with it having chosen procedures and recompilation to kick off my part of the discussion… But, still, a fun group for sure! And, why did I choose stored procedures and recompilation?
Every developer that works with SQL Server has to access their data in SOME way… how? Adhoc SQL, prepared statements (like sp_executesql) or stored procedures. To be honest, none are perfect. You shouldn’t always use adhoc. You shouldn’t always use prepared statements… And, dare I say – you shouldn’t always use stored procedures? In fact, I kicked off the evening with the starting statements that SQL Server is a general purpose RDBMS. You can do anything. But, does that mean that each feature is perfect for every use, all the time? Basically, what I said is that you should never say always and never say never. ;-)
Having said that, I do – strongly – believe that you can be the most successful using stored procedures. But, that’s predicated on the fact that you understand how they work. It’s predicated on the fact that you understand that recompiling a plan is NOT always a bad thing. Why? Because SQL Server “sniffs” the parameters passed and chooses the execution plan based on those parameters. It’s that plan (defined by those parameters) that gets saved (in cache) and reused for subsequent executions. If that plan is not good for ALL executions then you start to have parameter sniffing problems.
The end result – reusing a plan is not always good and recompilation is not always bad.
So, what can you do?
To be honest, this is a HUGE discussion and there are LOTS of tangents. In IE2 (our Immersion Event on Performance Tuning), I spent an entire day on the plan cache and optimizing for procedural code. But, for my pre-session (prior to recording .NetRocks), I chose to discuss ONE set of options that can be VERY helpful to reduce parameter sniffing problems. This discussion was around statement-level recompilation and SQL Server offers 3 special things that you can add to a statement to define how its plan should be handled. There are still other things that could change the behavior but simply put, I’ll go through 5 different behaviors here:
- Default behavior
- OPTION (RECOMPILE)
- OPTION (OPTIMIZE FOR (@param = value))
- OPTION (OPTIMIZE FOR UNKNOWN)
- Using variables to obfuscate parameters
And, I have a script that will help you to go through these different scenarios. Most importantly, do not end up using ONE of these ALL the time. Remember, ALWAYS is NOT the right way to deal with performance problems.
Having said that, I know all of you have A LOT to deal with. So, where do you start? How do you begin?
First, and foremost, do your analysis on the top 10 stored procedures that essentially meet these criteria:
- The performance of the stored procedure wildly varies (from only a second to minutes – or at least fast to not fast as all). And, maybe it’s more like this: MOST of the time the procedure runs well but occasionally the performance really tanks. As a result, you UPDATE STATISTICS and that seems to solve the problem. Hmmm… in actuality, it might not have been the statistics that were the problem. A side-effect (most of the time) of updating statistics, is that the plans associated with them are invalidated. On next execution a new plan will be generated (after sniffing the parameters). And, if the next execution uses a more typical parameter then a more typical plan will be generated. This might be why MOST of the time it seems to be fine. Next time, instead of updating stats, consider doing sp_recompile procname. This will invalidate the proc’s plan. If this works, then you know that you need to look more closely at how that plan gets generated and whether or not it’s even good to save that plan.
- The stored procedure returns wildly varying results sets (sometimes it returns only a few rows, other times it returns thousands [or tens of thousands] of rows)
- The stored procedure is used frequently and the stored procedure has at least one parameter (in many cases the worst performing procs are those that have many parameters)
Once you know that you have a problem investigate what should be recompiled. In general, you want to recompile the smallest amount possible to solve the problem. But, how do you know what should be recompiled? Testing!
Usually, I’ll test a procedure running multiple executions, each with different parameter values that generate wildly different result sets, and I’ll execute each of these using WITH RECOMPILE. Specifically, it will look like this:
EXEC procedure @param1 = value, @param2 = value, @paramn = value WITH RECOMPILE
When a bunch of these are executed, I’ll review their graphical plans. What I’m looking for is the most expensive statement and whether or not it has the SAME plan across the different parameters used. To be honest, you don’t even care what the plan is but you do care if it varies. To OPTIMALLY find the data for a query that returns 1 row the plan might be very different from a query that returns 10,000 rows. And, if the OPTIMAL plans vary then maybe it’s not a good idea to save the plan (which is the default behavior). And, this is what leads to parameter sniffing problems (PSP).
Instead of letting the default behavior just happen, you have a few options.
Using OPTION (RECOMPILE) [available in SQL Server 2005 and higher]
This can be a great way of telling SQL Server NOT to save a plan for a particular statement. However, this causes EVERY execution to go through recompilation (which has a negative cost as well, mostly in CPU but this can also translate to time). So, you don’t really want to do this for everything. Do NOT let this become a crutch (or a “go to” option) that gets used anytime there’s a problem. Use this sparingly. But, this can be a FANTASTIC way to deal with PSP.
Using OPTION (OPTIMIZE FOR (@param = value)) [available in SQL Server 2005 and higher]
Because the cost of recompilation can become a problem, you might want to choose an alternative. In this case, you can tell SQL Server NOT to sniff the parameter(s) passed in and instead, use parameters that you supply (and define INSIDE the stored procedure). This can reduce the cost of recompilation but be careful, you have to make sure you choose parameters that are really good for ALL executions (or, at least the executions that are either the most likely or the most time critical). This is incredibly powerful but could become problematic down the road as data changes. But, I like this option MORE than I like hard-coded plans because this option WILL change as the data/statistics change. So, this might be a great option to consider.
OPTION (OPTIMIZE FOR UNKNOWN) [available in SQL Server 2008 and higher]
I remember when this option first came out. I remember thinking – what do they mean “unknown.” Do they “guess?” That just didn’t seem right… Nope, it’s not a guess. But, it’s not going to be overly obvious to most because it requires a deeper understanding statistics in SQL Server. See, parameter sniffing really translates into – they use the histogram to estimate rows which in turn is used to pick the plan. What optimize for unknown does is it doesn’t use the histogram. They do NOT look at the parameters passed in and instead they rely on something else called the density vector. Simply put, the DV is the AVERAGE. So, instead of looking up how many rows they specifically think the parameters you’ve passed in will return, they look at the average number of rows returned for that column. Then, they create a plan with those numbers. The idea is that this will give you an average plan rather than a plan tied to specific parameters that might be anomalies. But, this can work phenomenally when your data is either evenly distributed OR the average really does work well for the normal parameters used. If you have heavily skewed data then this might not give ideal results.
Using variables to obfuscate parameters [available in any version really … ]
This is not an official way to deal with the problem but some folks have found that using this “works” and/or solves their problems. What is this doing? Well… it’s actually doing EXACTLY the same thing as OPTIMIZE FOR UNKNOWN. During compilation the value of a variable (as opposed to a parameter) is unknown. If they don’t know the value, what can they use? They really don’t have much of a choice except to use the average.
OK… so, now – how do you see all of this in action?
(1) Check out my 30 minute presentation from the user group. Here’s the presentation itself: dnrRoadTripp_StoredProcs.pdf (518.36 kb)
(2) Then, consider checking out the video (this isn’t super high quality as it was just from one of the attendees that was at the presentation but it turned out pretty well actually): http://neuronspark.com/optimizing-procedural-code/
(3) Next play with the demo script:
(3a) First, download the sample credit database for 2000 and higher here: https://www.sqlskills.com/PastConferences.asp. Use the 2000 version for 2000, 2005 or 2008. Use the 2008 version for 2008, 2008R2 and 2012.
(3b) Then, use this script to walk through these options:
RecompilationParameterSniffing&Unknown.sql (4.58 kb)
And… that should be it for this one!
If you want to hear the offical .NetRocks show that was recorded AFTER this lecture/discussion, check out .NetRocks.com. And, I’ll post a link here once it’s been published (which should be later this week). And, if you want to hear more of my fun times with DNR, check out some of our past shows. I blogged a list of our past shows here: Getting ready for DotNetRocks tonight (in NYC).
Finally, if any of you are in Vegas next week – I’m delivering an ENTIRE day (Monday, October 30) on Optimizing Procedural Code as a preconference workshop at SQLConnections.
Thanks for reading!
9 thoughts on “Stored procedures, recompilation and .NetRocks”
How do the subjects covered here and in the .net rocks show apply to SQL CE 3.1. We have SQL server 2000-2008 databases that sync data using RDA to SQL CE on ARM based devices. Will things such as not using random or non sequential data as the clustering key help when it gets to CE.
Currently looking at a similar issue along these lines. I went to use your demo scripts you have listed and needed a PW to unzip your scripts. Also noticed the RecompilationParameterSniffing&Unknown.sql (4.58 kb) link does not offer anything.
Was this by design? :-)
Link is fixed – thanks
Stay out of the sun, I’ve had my share of displastic lesions removed from my body.
Another fair skinned DBA, but I despise the sun.
Yeah, I hear you. But, I will NEVER stay “out” of the sun. I absolutely love the sun. I love the way it feels on my face (sadly). But, now, I’m layering on the SPF 110. ;-) Slather on the sunscreen and bring it on! I love the ocean too much to avoid the sun. I just need to be smarter. ;-)
But, living in Seattle does keep me out of the sun for a fair portion of the year…….. ;-)
I have noticed an issue in production where a procedure is being recompiled at a very high frequency. Although it has not posed any problems directly to the system stability, but it has the potential to cause disruption on higher load. in that procedure I used temporary tables where I also used dynamic sql to add the new column in temporary table. Please provide me some solution.
To be honest, there really isn’t enough information here to understand the issue or provide a better solution. Why are you using dynamic SQL to add a column to a temp table? Why not just create the temp table with the column in it? Regardless, temp tables have issues with statistics and compilations but the issues were far greater in 2000 than 2005 and higher. So, I’d also need to know the version of SQL.
Just to add – recompilation is not a terrible / always-bad thing. In some cases it’s required (like in the case of temp tables). But, if you are going to execute this a tremendous number of times then there might be alternatives using a permanent table and something that’s sessions specific (like @@spid) to identify the rows for that session. However, this can also become expensive to manage correctly (more so than the recompilation).
So… I’m kind of back to the start. I don’t really know enough of your scenario to be able to offer you a better solution. Sorry about that!
(oh, and sorry for the delay… I’ve been hammered with SPAM and so I lost track of a few comments that I’m just now getting around to… hopefully you’ll see this and we can move forward!)