How was a plan forced?

If you use Automatic Plan Correction, and thus also Query Store, you may wonder how was a plan forced: manually or automatically with APC?  The type of forced plan is tracked in sys.query_store_plan, and you can use this simple query to determine how a plan was forced:

SELECT
   p.is_forced_plan,
   p.plan_forcing_type,
   p.plan_forcing_type_desc
FROM sys.query_store_plan p
WHERE p.is_forced_plan = 1;
GO
Manual vs. Automatically Forced

Manual vs. Automatically Forced

 

Now, this information is a good start, but we can do better if we join to sys.dm_db_tuning_recommendations and the other Query Store view.  The query below is not pretty I know 🙂 but it does the job:

;WITH ForcedPlanInfo
AS
   (SELECT
      [FP].[QueryId] [QueryID],
      [Reason],
      JSON_VALUE(state, '$.currentValue') [CurrentValue],
      [FP].[RegressedPlanId] [RegressedPlanID],
      [FP].[ForcedPlanID] [ForcedPlanID],
      [execute_action_start_time],
      JSON_VALUE(details, '$.implementationDetails.script') [Script]
   FROM [sys].[dm_db_tuning_recommendations]
   CROSS APPLY OPENJSON (Details, '$.planForceDetails')
   WITH (
      [QueryId] INT '$.queryId',
      [RegressedPlanId] INT '$.regressedPlanId',
      [ForcedPlanID] INT '$.recommendedPlanId'
   ) AS FP
)
SELECT
   qt.query_text_id,
   q.query_id,
   p.plan_id,
   p.is_forced_plan,
   p.plan_forcing_type,
   p.plan_forcing_type_desc,
   qt.query_sql_text,
   TRY_CAST(p.query_plan AS XML),
   tr.QueryID,
   tr.Reason,
   tr.CurrentValue,
   tr.RegressedPlanID,
   tr.ForcedPlanID,
   tr.execute_action_start_time,
   tr.Script
FROM sys.query_store_plan p
LEFT OUTER JOIN ForcedPlanInfo tr
   ON p.plan_id = tr.ForcedPlanID
JOIN sys.query_store_query q
   ON p.query_id = q.query_id
JOIN sys.query_store_query_text qt
   ON q.query_text_id = qt.query_text_id
WHERE p.is_forced_plan = 1;
GO
Plan Forcing Output Part 1

Plan Forcing Output Part 1

Plan Forcing Output Part 2

Plan Forcing Output Part 2

 

As you can see, I broke the query output into two parts because it’s fairly wide, and for the plan that was forced manually, there are columns that are not relevant.

The only additional item I would want to display in this output is the user that forced the plan manually.  Unfortunately, this information is not currently captured by SQL Server.  You will not find it in an Extended Event or in the SQL Server ERRORLOG; hopefully that will be addressed soon.

Hope this helps those of you that are using Query Store and Automatic Plan Correction!

T-SQL Tuesday #124: Using Query Store (and why you might not be…yet)

T-SQL Tuesday

T-SQL Tuesday

 

This month’s T-SQL Tuesday topic is courtesy of Tracy Boggiano and it’s related to using Query Store!  Tracy asked people to write about their experience adopting Query Store (or not, if you haven’t tried it yet or have had issues). I have so much to share… 😊

I can spend a full day talking about this favorite feature of mine, but will do my best to succinctly summarize the usage patterns I’ve seen with customers.

Successful Adoption

We have several customers that turned Query Store on the moment they upgraded to SQL Server 2016 or higher and were off and running. We have one that uses Enterprise Edition and also uses Automatic Plan Correction, and there are some plans that we force manually as well.  Those customers have parameterized queries and procedure-based workloads.

Delayed Adoption

There are a few customers that upgraded to SQL Server 2016 or higher and had concerns about Query Store. In one case they already had a third-party monitoring tool in place and didn’t think they needed it. Other customers were very hesitant to make changes to the environment, and wanted to do testing with it – after the upgrade – to make sure there were no adverse effects. In all cases, the customers finally implemented Query Store, with success.

Successful Adoption…but not without some struggles

Last fall we had a previous customer reach out for help after they had to wait 45 minutes for a database to come online after a server reboot. The database queries were blocked by QDS_LOADDB waits. There were three things in play here – the first was that they had CAPTURE_MODE set to ALL, and it should be AUTO. Second, they didn’t have trace flag 7752 enabled (the behavior of which is now the default in SQL Server 2019). And the third was that their Query Store was 100GB in size. The workload was fairly ad hoc, so these three things together caused the problem initially described. They implemented the TF, made multiple changes to the settings (set CAPTURE_MODE to AUTO, changed MAX_STORAGE_SIZE_MB to 10GB, decreased CLEANUP_POLICY to 3 days), and then Query Store was usable for them.

Failed Adoption

Lastly, there are a few customers that have been unable to implement Query Store because of their extremely ad hoc workload. For one customer, it was very frustrating because they benefited from plan forcing. At the time, they were running the latest version of SQL Server 2017 but ran into the bug where the transaction log would fill up (fixed in 2017 CU16 and 2016 SP2 CU8). Even after applying this CU and having optimal settings, they were still unable to use Query Store because of their workload.

Query Store is *still* a possibility

For anyone with an ad hoc workload that is at the point of dismissing Query Store, I recommend that you verify you are following best practices, and it might be necessary to look at SQL Server 2019 or Azure SQL Database. There is new option, CUSTOM, for QUERY_CAPTURE_MODE that allow you to control exactly what queries are captured based on execution frequency, compile CPU, or execution CPU. Further, the SQL Server team continues to actively evolve the Query Store feature and address issues as they are found. If this is a feature you want to use, and you are running into a problem, you need to open a User Voice item.  The SQL Server team can only fix the problems and limitations they know about.

Are Changes to Query Store Logged?

The Query Store feature is a bit unique in that its status can change without user interference, which means it is important to understand how changes to Query Store are logged.  A couple weeks ago John Deardurff posted a question on Twitter asking specifically whether the SQL Server ERRORLOG is written to when the OPERATION_MODE changes to READ_ONLY because  MAX_STORAGE_SIZE_MB is exceeded.  I had never tested to confirm, but I know there is an event in Extended Events that will fire when the limit is reached.  I also know that when a user makes a change to a Query Store setting, it is logged in the ERRORLOG.

User Changes

With a copy of WideWorldImporters restored in my SQL Server 2019 VM, I will create an Extended Events with a few events I think will be relevant:

CREATE EVENT SESSION [XEStatusChange]
ON SERVER
ADD EVENT qds.query_store_db_settings_changed,
ADD EVENT qds.query_store_disk_size_info,
ADD EVENT qds.query_store_disk_size_over_limit
ADD TARGET package0.event_file(SET filename=N'C:\temp\XEStatus',max_file_size=(256))
WITH (MAX_MEMORY=4096 KB, EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY=30 SECONDS, MAX_EVENT_SIZE=0 KB, MEMORY_PARTITION_MODE=NONE,  TRACK_CAUSALITY=OFF, STARTUP_STATE=OFF);
GO

 

Next, we will change the MAX_STORAGE_SIZE_MB to just 5MB.  Note that this is an extremely low value that I do not recommend.

USE [master];
GO
ALTER DATABASE [WideWorldImporters]
SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
MAX_STORAGE_SIZE_MB = 5
);
GO

 

If we look at the output from Extended Events, we can see that my change to Query Store is logged:

Extended Events output after Query Store MAX_STORAGE_SIZE changed to 5MB via TSQL

Extended Events output after Query Store MAX_STORAGE_SIZE changed to 5MB via TSQL

 

If we check the ERRORLOG, information about the change exists there as well:

ERRORLOG entry showing Query Store setting change

ERRORLOG entry showing Query Store setting change

Finally, a quick peek of the Query Store pane in SSMS confirms the settings:

Query Store Settings

Query Store Settings

Another quick note: these are not the values I would recommend for a production system.  For more information on settings, check out my Query Store Settings post.

System Changes

Now we will explore what gets logged when the OPERATION_MODE setting changes from READ_WRITE to READ_ONLY.  We will introduce this by running a very ad hoc workload that generates thousands of queries with different literal values.  The code exists in the post, Examining the Performance Impact of an Ad Hoc Workload, and we will run usp_RandomSelects from multiple threads with the Extended Events session still running.

Within a few minutes, the following events show up in the Live Data View:

Extended Events output after MAX_STORAGE_SIZE is exceeded

Extended Events output after MAX_STORAGE_SIZE is exceeded

 

If we check the ERRORLOG, there are no new entries related to Query Store settings.

Summary

As you can see, changes to Query Store are logged differently, depending on the type of change.  Any Query Store option that is changed by a user using the ALTER DATABASE command is captured in the ERRORLOG.  The query_store_db_settings_changed event also fires. When the OPERATION_MODE is changed by the system, because the storage allocated to Query Store is exceeded, the change is not captured in the ERRORLOG, but it is captured with the query_store_disk_size_over_limit event. In addition, the query_store_disk_size_info event, which fires on a regular basis, will also report the max and current size values. If you want to know when Query Store changes to a READ_ONLY state, you will need to set up an Extended Events session to capture it. To be more proactive, I recommend setting up an Agent job that runs on a regular basis to monitor the space used, and sends an email if it exceeds a certain percentage.