If you don’t already know it, I love query the plan cache in SQL Server to identify problems that might exist in a server. I’ve blogged a number of ways to use plan cache data to identify and diagnose problems the last few years:
Finding what queries in the plan cache use a specific index
Tuning ‘cost threshold for parallelism’ from the Plan Cache
Finding Implicit Column Conversions in the Plan Cache
Digging into the SQL Plan Cache: Finding Missing Indexes
Last week on the forums a question was asked about how to use Extended Events to identify queries that took excessive amounts of time to compile. What intrigued me was that the person asking the question claimed to have done this type of analysis with SQL Trace, so in 2012, that should mean that we can do it with Extended Events, right? As Borat would say, not so much!
After seeing the events being used I took a look at the Event map, and while there is a loose mapping between the trace events that were being used to Extended Events, not all the same data is generated. The primary event in trace that was being used for this is the Performance Statistics event, which is has a trace event id of 165. If we lookup this event in the sys.trace_xe_event_map table in SQL Server 2012, we’ll find that it maps to three different events in Extended Events; query_cache_removal_statistics, query_pre_execution_showplan, and uncached_sql_batch_statistics.
Immediately, I said to myself, “No, this isn’t possible in Extended Events.” simply because the overhead of turning on query_pre_execution_showplan, will kill performance of any production server (see http://connect.microsoft.com/SQLServer/feedback/details/732870/sqlserver-query-post-execution-showplan-performance-impact which is for the query_post_execution_showplan event, but all of the showplan events have the same impact). Looking at the columns for the other two events, left me basically in the same mindset, you can’t do this with Extended Events. The uncached_sql_batch_statistics event only captures the statement that wasn’t cached, and the query_cache_removal_statistics only gives you the execution statistics, not the parse and compile times. Even if you were to turn on the query_pre_execution_showplan event, it doesn’t provide you with the CPU and duration columns for output, so it is not going to be useful here.
To be honest, I didn’t think that this was going to be possible, but then I thought about some of the fun we had in our Immersion Event week two in London looking at query compile issues, and I remembered that I wrote a parser between modules to pull compile time information from the showplans in the plan cache. After searching through some scratch code files in my user profile I found a relevant one and started playing around a little with the XML parsing. The result is the below query, which identifies the TOP 10 most expensive to compile statements in the cache currently, and includes the query stats for the statement, based on the query_hash, though it could be easily modified to use query_plan_hash to match on instead.
-- Find high compile resource plans in the plan cache SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') SELECT TOP 10 CompileTime_ms, CompileCPU_ms, CompileMemory_KB, qs.execution_count, qs.total_elapsed_time/1000 AS duration_ms, qs.total_worker_time/1000 as cputime_ms, (qs.total_elapsed_time/qs.execution_count)/1000 AS avg_duration_ms, (qs.total_worker_time/qs.execution_count)/1000 AS avg_cputime_ms, qs.max_elapsed_time/1000 AS max_duration_ms, qs.max_worker_time/1000 AS max_cputime_ms, SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1, (CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text) ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2 + 1) AS StmtText, query_hash, query_plan_hash FROM ( SELECT c.value('xs:hexBinary(substring((@QueryHash),3))', 'varbinary(max)') AS QueryHash, c.value('xs:hexBinary(substring((@QueryPlanHash),3))', 'varbinary(max)') AS QueryPlanHash, c.value('(QueryPlan/@CompileTime)', 'int') AS CompileTime_ms, c.value('(QueryPlan/@CompileCPU)', 'int') AS CompileCPU_ms, c.value('(QueryPlan/@CompileMemory)', 'int') AS CompileMemory_KB, qp.query_plan FROM sys.dm_exec_cached_plans AS cp CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp CROSS APPLY qp.query_plan.nodes('ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS n(c) ) AS tab JOIN sys.dm_exec_query_stats AS qs ON tab.QueryHash = qs.query_hash CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st ORDER BY CompileTime_ms DESC OPTION(RECOMPILE, MAXDOP 1);
Running this against a couple of test servers, I found a couple of queries that had 12 second compile times with 3 second execution durations. Now the challenge becomes testing if the query can be simplified to reduce the time it takes the optimizer to compile it, without reducing execution performance. The above code should work on any version of SQL Server from 2005 Service Pack 2 onwards, since the CompileTime, Compile CPU, and CompileMemory attributes were added in Service Pack 2. I don’t have a SQL Server 2005 instance to test it on, but I did test this on SQL Server 2008 and 2012.
This only works on 2008 and higher because the query fingerprints in dm_exec_query_stats weren't in SQL Server 2005. Using plan_handle will get you close, but that only tells you the plan had a high compile cost statement in it. You will have to dig into the plan to see which specific statement, which I am trying to avoid with the code above. I'll have to figure out another way to match these up based on the XML and blog the 2005 solution.
16 thoughts on “Identifying High Compile Time Statements from the Plan Cache”
SQL2005EE-CU4 (9.0.5266) doesn’t like the query. It stumbles over the lack of query_hash and query_plan_hash in the sys.dm_exec_query_stats DMV.
I’ll gladly be you guinea pig for 2005…
I tested the above query on some SQL Server 2005 (SP2 and SP3) instances.
As the query_hash and query_plan_hash fields were only added in SQL Server 2008 the query needs some minor adjustments. The references to query_hash and query_plan_hash need to be removed and the join to sys.dm_exec_query_stats needs to change to:
ON tab.plan_handle = qs.plan_handle
The query will then produce similar results
I knew there would be a catch in 2005 that would make that not work, I totally forgot that query fingerprints was introduced in SQL Server 2008. I intentionally avoided using plan_handle because it won’t give good results for multi-statement batches or procedures since the plan_handle will repeat for each statement even though only one of them could be problematic. To work around this, since you guys are willing to test it out, we can grab the StatementId from the XML and the use ROW_NUMBER() to assign a StatementId to the dm_exec_query_stats entries ordered by the statement_start_offset to match just the problem statement:
— Find high compile resource plans in the plan cache
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT TOP 10
qs.total_elapsed_time/1000 AS duration_ms,
qs.total_worker_time/1000 as cputime_ms,
(qs.total_elapsed_time/qs.execution_count)/1000 AS avg_duration_ms,
(qs.total_worker_time/qs.execution_count)/1000 AS avg_cputime_ms,
qs.max_elapsed_time/1000 AS max_duration_ms,
qs.max_worker_time/1000 AS max_cputime_ms,
SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
WHEN -1 THEN DATALENGTH(st.text)
END – qs.statement_start_offset) / 2 + 1) AS StmtText,
c.value(‘(@StatementId)’, ‘int’) AS StatementID,
c.value(‘(QueryPlan/@CompileTime)’, ‘int’) AS CompileTime_ms,
c.value(‘(QueryPlan/@CompileCPU)’, ‘int’) AS CompileCPU_ms,
c.value(‘(QueryPlan/@CompileMemory)’, ‘int’) AS CompileMemory_KB,
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY qp.query_plan.nodes(‘http://ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple’) AS n(c)
) AS tab
ROW_NUMBER() OVER (PARTITION BY plan_handle ORDER BY statement_start_offset) AS StatementID,
FROM sys.dm_exec_query_stats) AS qs
ON tab.plan_handle = qs.plan_handle
AND tab.StatementID = qs.StatementID
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY CompileTime_ms DESC
OPTION(RECOMPILE, MAXDOP 1);
Let me know if that doesn’t work for some reason and thanks for testing it on 2005.
Other than a couple of minor errors (missing the http:// in the XML Namespaces declaration and stil including the query_hash and query_plan_hash fields in the select list the revision runs fine against SQL 2005 SP2 and SP3.
Thanks for update.
It is fixed above.
Wish I’d had this a couple of months ago. I wrote a query to help me convince my boss that optimize for adhoc workloads was a good idea. And of course one the pieces of information I needed was total compile time for adhoc queries and total compile time for queries with more than 1 use.
Hi Jonathan. I’m quite new on SQL Server. I’m a tuning enthusiast.
I’d like to know (I’m sorry if it is a stupid question) if it is possible to compare a query duration and a compileTime. What I mean is if it possible to say that a query that spends Nms executing, Y% of the Nms was spent as CompileTime.
Using your query in the post, it would be like: CompileTime_ms / (CompileTime_ms + duration_ms)
That is true?
Can you try using the sys.dm_exec_text_query_plan for get the results?
With this functions you can pass statement_start_offset and statement_End_offset and get direct plan for a given statement.
when I use “With recompile” this script not work, because script not in plan cache, is there way ?
I don’t understand the question you are asking unfortunately. Using WITH(RECOMPILE) for this script just keeps this scripts plan from being stored in the cache, it doesn’t affect other plans already in the cache.
This is about the 20th time or so that I’ve cited this fine post and have used the method myself to solve some of the damned problems like why a screen that uses a 100ms stored procedure takes 2 to 22 seconds (average of 18 seconds) to return and thought it was about time for me to say…
Dear Mr. Jonathan Kehayias… thank you for the incredible script. It’s made me look the part of a hero on more than one occasion and I’ve referred many others to it, Very well done and thank you, good Sir!
Hi Jonathan, has your version of this query evolved at all since you posted? This view of the query cache data is both very useful and original, and if you’ve made improvements, I’d love to use the latest & greatest.
Nope, it’s still the same.
If only there was a performance counter (or something in sys.dm_os_wait_stats/sys.dm_exec_session_wait_stats that showed aggregate data for this.
‘set statistics time on’ looks promising. I’ll have a play in our data access layer and see if I can at least do client side aggregation of this data.
Do you know if there is a substantial performance impact in this approach regarding collection on the server of compile time?
If you use Query Store, the sys.query_store_query DMV has the avg_compile_duration, avg_bind_duration, avg_bind_cpu_time, avg_optimize_duration, avg_optimize_cpu_time for the statements.