Why You Need Query Store, Part III: Proactively analyze your workload

 

The amount of data collected by Query Store pales in comparison to the amount of data available in all of SQL Server that could be captured.  But the quality of the data overshadows the quantity, or lack thereof.  While most people target the execution statistics when viewing historical data in Query Store, I would argue that the query text and query plan information are equally as valuable when it comes to taking the time to analyze your workload.

The query text and plan are stored in two separate tables, sys.query_store_query_text and sys.query_store_plan, respectively.  The text is stored as a nvarchar(max) data type, and the plan is stored as varbinary(max).  This means that the data can be mined, looking for patterns and explicit use of objects.  Want to know what queries use an index?  Look for it in the plans.  Want to know what queries have a RECOMPILE hint on them?  Look for it in the query text.  Anything you want to find with regard to query text or plans is there, you just have to know how to query the data.

Mining Query Store: Queries

But wait, can’t we just search the text?  Sure, with the query_text field you can just wildcard your way through it looking for things.  For example, the query below will help you find every statement in Query Store that has RECOMPILE in the text.

SELECT
     [qsq].[query_id],
     [qsp].[plan_id],
     [rs].[last_execution_time],
     [rs].[avg_duration],
     [rs].[avg_logical_io_reads],
     [qst].[query_sql_text],
     TRY_CONVERT(XML, [qsp].[query_plan]) AS [QueryPlan_XML]
FROM [sys].[query_store_query] [qsq]
JOIN [sys].[query_store_query_text] [qst]
     ON [qsq].[query_text_id] = [qst].[query_text_id]
JOIN [sys].[query_store_plan] [qsp]
     ON [qsq].[query_id] = [qsp].[query_id]
JOIN [sys].[query_store_runtime_stats] [rs]
     ON [qsp].[plan_id] = [rs].[plan_id]
WHERE [qst].[query_sql_text] LIKE '%RECOMPILE%';
GO

Mining Query Store: Plans

Query plans have a specific format, defined by Microsoft, which can be searched using wildcards, but the performance is abysmal and you’re better served using XQuery.  That’s right, I said XQuery.  If you don’t know, XQuery and I are frenemies.  I love the information I get when using it, but writing the T-SQL always makes me want to compulsively eat M&Ms and animal crackers until I feel sick.

Fortunately for me, Jonathan writes XQuery and creates the foundation of the queries I manage to cobble together for Query Store.  In his post, Finding what queries in the plan cache use a specific index, Jonathan has the code to look through the plan cache and find what queries use a specific index.  The challenge is that the plan cache can be volatile, particularly in an ad hoc workload.  Even in a parameterized/procedural workload, there is churn and the plans that are cache change as different business processes run throughout the day, week, and month.

Consider the scenario where you use sys.dm_db_index_usage_stats to determine if an index is being used, and when you find one that it isn’t….how do you really know?  As we know, the plan cache is transitory, so a query that uses it might not be in cache at that moment.  But it would be in Query Store, depending on your retention settings.  If we modify the query from Jonathan’s post, we can interrogate the Query Store data:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
GO

DECLARE @IndexName AS NVARCHAR(128) = '[FK_Sales_InvoiceLines_InvoiceID]';
-- Make sure the name passed is appropriately quoted

IF (LEFT(@IndexName, 1) <> '[' AND RIGHT(@IndexName, 1) <> ']') SET @IndexName = QUOTENAME(@IndexName);
--Handle the case where the left or right was quoted manually but not the opposite side
IF LEFT(@IndexName, 1) <> '[' SET @IndexName = '['+@IndexName;
IF RIGHT(@IndexName, 1) <> ']' SET @IndexName = @IndexName + ']';
;WITH XMLNAMESPACES
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
     stmt.value('(@StatementText)[1]', 'varchar(max)') AS SQL_Text,
     obj.value('(@Database)[1]', 'varchar(128)') AS DatabaseName,
     obj.value('(@Schema)[1]', 'varchar(128)') AS SchemaName,
     obj.value('(@Table)[1]', 'varchar(128)') AS TableName,
     obj.value('(@Index)[1]', 'varchar(128)') AS IndexName,
     obj.value('(@IndexKind)[1]', 'varchar(128)') AS IndexKind,
     query_plan
FROM
(
     SELECT query_plan
     FROM
     (
          SELECT TRY_CONVERT(XML, [qsp].[query_plan]) AS [query_plan]
          FROM sys.query_store_plan [qsp]) tp
          ) AS tab (query_plan)
     CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS batch(stmt)
     CROSS APPLY stmt.nodes('.//IndexScan/Object[@Index=sql:variable("@IndexName")]') AS idx(obj)
OPTION(MAXDOP 1, RECOMPILE);

Summary

Beyond looking at performance metrics over time for a query, take advantage of the plethora of data in the Query Store tables.  Use both the text and plan information to proactively look for anti-patterns in your workload: use of index hints, RECOMPILE, NO LOCK…all the things that may cause problems, and understand both object access and expensive operations in the plans.  There is no limit to the patterns, and anti-patterns, you can find.  You just have to write the code.

SQL Server Plan Cache Limits

Last week on SQLPerformance.com I blogged about ad hoc workloads and how they impact performance, noting plan cache bloat, and I had a comment asking about SQL Server plan cache limits.  Kimberly mentions them in her classic post Plan cache and optimizing for adhoc workloads, but I thought I’d put the information into a blog post because I always have trouble finding the original references: Plan Caching in SQL Server 2008 (which is still accurate for SQL Server 2019, as far as I know) and Plan Cache Internals.

There are two limitations related to the plan cache:

  • Number of entries
  • Total size in MB (based on server memory)

By default, the plan cache is limited to 160,036 total entries (40,009 entries per bucket), and size based on max server memory (for SQL Server 2008+ and SQL Server 2005 SP2):

75% of visible target memory from 0 to 4GB                                                                                                             + 10% of visible target memory from 4GB to 64GB                                                                                                   + 25% of visible target memory > 64GB

Here’s the math:

 

SQL Server default plan cache size limits

SQL Server default plan cache size limits

 

If you have an ad hoc workload, you’ll often hit the max number of entries before you hit the space limit, particularly if you have the optimize for ad hoc workloads server option enabled, which stores the plan stub for an ad hoc query on initial execution, rather than the entire plan (the plan stub consumes less space, so this is an attempt to reduce bloat).

For pure ad hoc workloads, if you’re seeing SOS_CACHESTORE spinlock contention, you can use trace flag 174 to increase the number of entries to 160,001 per bucket (640,004 total), which is applicable for:

  • SQL Server 2012 SP1 CU14+
  • SQL Server 2012 SP2 CU5+
  • SQL Server 2014 CU2+
  • SQL Server 2014 SP1 CU1+
  • SQL Server 2016+

Very rarely, I have seen instances where customers have used trace flag 8032 to increase the total size of the plan cache.  This trace flag uses the plan cache limits for SQL Server 2005 RTM:

75% of visible target memory from 0 to 4GB                                                                                                             + 50% of visible target memory from 4GB to 64GB                                                                                                   + 25% of visible target memory > 64GB

The math is below, but note that this can significantly affect the memory available for the buffer pool and this is something I don’t typically recommend…use with caution.

SQL Server plan cache limits with TF 8032

SQL Server plan cache limits with TF 8032

Summary

If you’re seeing plan cache bloat due to an ad hoc workload, take note of the space consumed by your plan cache and consider enabling the optimize for ad hoc setting, and possibly enabling TF 174.  Both of those are short term fixes – ideally you look to parameterize stable queries that are executed most frequently to reduce the number of entries in the plan cache.