If you don’t already know it, I love query the plan cache in SQL Server to identify problems that might exist in a server. I’ve blogged a number of ways to use plan cache data to identify and diagnose problems the last few years:
Finding what queries in the plan cache use a specific index
Tuning ‘cost threshold for parallelism’ from the Plan Cache
Finding Implicit Column Conversions in the Plan Cache
Digging into the SQL Plan Cache: Finding Missing Indexes
Last week on the forums a question was asked about how to use Extended Events to identify queries that took excessive amounts of time to compile. What intrigued me was that the person asking the question claimed to have done this type of analysis with SQL Trace, so in 2012, that should mean that we can do it with Extended Events, right? As Borat would say, not so much!
After seeing the events being used I took a look at the Event map, and while there is a loose mapping between the trace events that were being used to Extended Events, not all the same data is generated. The primary event in trace that was being used for this is the Performance Statistics event, which is has a trace event id of 165. If we lookup this event in the sys.trace_xe_event_map table in SQL Server 2012, we’ll find that it maps to three different events in Extended Events; query_cache_removal_statistics, query_pre_execution_showplan, and uncached_sql_batch_statistics.
Immediately, I said to myself, “No, this isn’t possible in Extended Events.” simply because the overhead of turning on query_pre_execution_showplan, will kill performance of any production server (see http://connect.microsoft.com/SQLServer/feedback/details/732870/sqlserver-query-post-execution-showplan-performance-impact which is for the query_post_execution_showplan event, but all of the showplan events have the same impact). Looking at the columns for the other two events, left me basically in the same mindset, you can’t do this with Extended Events. The uncached_sql_batch_statistics event only captures the statement that wasn’t cached, and the query_cache_removal_statistics only gives you the execution statistics, not the parse and compile times. Even if you were to turn on the query_pre_execution_showplan event, it doesn’t provide you with the CPU and duration columns for output, so it is not going to be useful here.
To be honest, I didn’t think that this was going to be possible, but then I thought about some of the fun we had in our Immersion Event week two in London looking at query compile issues, and I remembered that I wrote a parser between modules to pull compile time information from the showplans in the plan cache. After searching through some scratch code files in my user profile I found a relevant one and started playing around a little with the XML parsing. The result is the below query, which identifies the TOP 10 most expensive to compile statements in the cache currently, and includes the query stats for the statement, based on the query_hash, though it could be easily modified to use query_plan_hash to match on instead.
-- Find high compile resource plans in the plan cache SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') SELECT TOP 10 CompileTime_ms, CompileCPU_ms, CompileMemory_KB, qs.execution_count, qs.total_elapsed_time/1000 AS duration_ms, qs.total_worker_time/1000 as cputime_ms, (qs.total_elapsed_time/qs.execution_count)/1000 AS avg_duration_ms, (qs.total_worker_time/qs.execution_count)/1000 AS avg_cputime_ms, qs.max_elapsed_time/1000 AS max_duration_ms, qs.max_worker_time/1000 AS max_cputime_ms, SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1, (CASE qs.statement_end_offset WHEN -1 THEN DATALENGTH(st.text) ELSE qs.statement_end_offset END - qs.statement_start_offset) / 2 + 1) AS StmtText, query_hash, query_plan_hash FROM ( SELECT c.value('xs:hexBinary(substring((@QueryHash),3))', 'varbinary(max)') AS QueryHash, c.value('xs:hexBinary(substring((@QueryPlanHash),3))', 'varbinary(max)') AS QueryPlanHash, c.value('(QueryPlan/@CompileTime)', 'int') AS CompileTime_ms, c.value('(QueryPlan/@CompileCPU)', 'int') AS CompileCPU_ms, c.value('(QueryPlan/@CompileMemory)', 'int') AS CompileMemory_KB, qp.query_plan FROM sys.dm_exec_cached_plans AS cp CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp CROSS APPLY qp.query_plan.nodes('ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS n(c) ) AS tab JOIN sys.dm_exec_query_stats AS qs ON tab.QueryHash = qs.query_hash CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st ORDER BY CompileTime_ms DESC OPTION(RECOMPILE, MAXDOP 1);
Running this against a couple of test servers, I found a couple of queries that had 12 second compile times with 3 second execution durations. Now the challenge becomes testing if the query can be simplified to reduce the time it takes the optimizer to compile it, without reducing execution performance. The above code should work on any version of SQL Server from 2005 Service Pack 2 onwards, since the CompileTime, Compile CPU, and CompileMemory attributes were added in Service Pack 2. I don’t have a SQL Server 2005 instance to test it on, but I did test this on SQL Server 2008 and 2012.
This only works on 2008 and higher because the query fingerprints in dm_exec_query_stats weren't in SQL Server 2005. Using plan_handle will get you close, but that only tells you the plan had a high compile cost statement in it. You will have to dig into the plan to see which specific statement, which I am trying to avoid with the code above. I'll have to figure out another way to match these up based on the XML and blog the 2005 solution.