Workload Tuning Training

I recently booked my flight to the U.K. for SQLBits, and I realized I should probably write a post about my full-day training session on workload tuning! It’s on Wednesday, February 27, 2019, and you can read the abstract here: Stop Tuning Your Queries and Start Tuning Your Workload.

I included a fair bit of information about what I’ll cover, so I thought it might be interesting to anyone considering this session to hear the backstory about how it came to be.

First, this is a new full-day session; I haven’t presented it anywhere else previously. It does have pieces of content I’ve discussed elsewhere, but it really represents a different approach to tuning and troubleshooting. Namely: let’s not focus on one or two or five specific queries that are causing problems; let’s focus on the workload as a whole to identify patterns, and then address those patterns in specific queries.

What I have noticed over the past couple of years is that I see similar patterns when I look at a given system. Very often, within a solution there are patterns in how queries are written, in the typical performance problems that show up, and in the typical ways they are addressed or fixed internally.

For example, we engage with a customer and discover they use table variables in a lot of their code, which causes poor cardinality estimates and then a host of subsequent issues. We optimize a couple of queries using temporary tables instead; they then replace all table variables with temporary tables, and start seeing issues related to temp table contention. Both options have a time and a place; the key is to understand when and why.
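To make the table variable example concrete, here is a minimal, hypothetical sketch of that kind of rewrite (I’m assuming the Sales.Orders table in WideWorldImporters; the pattern is what matters, not the specific query):

-- Original pattern: a table variable, which historically gets a fixed
-- estimate of one row no matter how many rows it actually holds
DECLARE @Orders TABLE ([OrderID] INT PRIMARY KEY, [CustomerID] INT);

INSERT INTO @Orders ([OrderID], [CustomerID])
SELECT [OrderID], [CustomerID]
FROM [Sales].[Orders]
WHERE [OrderDate] >= '2019-01-01';

-- Rewrite: a temporary table, which has statistics and can give the
-- optimizer much better cardinality estimates for subsequent joins
CREATE TABLE #Orders ([OrderID] INT PRIMARY KEY, [CustomerID] INT);

INSERT INTO #Orders ([OrderID], [CustomerID])
SELECT [OrderID], [CustomerID]
FROM [Sales].[Orders]
WHERE [OrderDate] >= '2019-01-01';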

The ultimate goal of my session is to share my experience working with a wide range of customers and solutions over the years, the patterns I have seen, and options for addressing them. I’ll tell you now that there is no “easy button” in terms of fixes. But you already knew that 😊 A big key is figuring out how to convince your team of the changes that need to be made, and understanding how and why is critical. That’s what we’ll discuss, because I want you to walk away with applicable information that you can use immediately in your environment.

Still have questions? Add a comment or send me an email. I’d love to hear from you. I hope to see you in Manchester in February!

p.s. I should also mention that next week (Jan 15 – Jan 17) I’m presenting my Query Store course online, and it’s the only delivery of it scheduled for the first part of the year. There is still space if you’re interested! Three afternoons next week, each day with two 1.5-hour sessions. I really like this format; it gives people good breaks to absorb information and to think of and ask questions.

Different Query Store Settings for a Database in an Availability Group

Last week there was a question on #sqlhelp on Twitter about the status of Query Store for a database in an Availability Group. I’ve written about Query Store and Availability Groups before, so if you’re not familiar with QS behavior in an AG, check out that post first. But this question was, I think, specific to the values shown on a read-only replica, and why the Query Store settings differed between a primary and a secondary. Let’s set it up and take a look.

Environment
I have a multi-node cluster set up, and an AG (Avengers) running for the WideWorldImporters database across two nodes (CAP\ROGERS and BUCKY\BARNES). Right now, CAP is primary and BUCKY is secondary:

Avengers AG Status

Query Store
On the primary, if you check the status of Query Store in the UI, you see the following:

Query Store configuration for WideWorldImporters on CAP

If you check the status of Query Store from the secondary, the settings are the same:

Query Store configuration for WideWorldImporters on BUCKY

Now we’ll make a change to the Query Store on the primary using T-SQL:

USE [master];
GO
ALTER DATABASE [WideWorldImporters]
SET QUERY_STORE (MAX_STORAGE_SIZE_MB = 1024);
GO
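
If you’d rather check the configuration with T-SQL than with the UI, you can query sys.database_query_store_options in the context of the database; a minimal sketch (on the secondary, this assumes the replica is readable):

USE [WideWorldImporters];
GO
SELECT
   [actual_state_desc],
   [desired_state_desc],
   [current_storage_size_mb],
   [max_storage_size_mb],
   [interval_length_minutes]
FROM [sys].[database_query_store_options];
GO

Running that on each replica lets you compare what every copy of the database currently reports.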

And if we check the settings again, we see they’re updated on the primary:

Query Store configuration for WideWorldImporters after change on CAP

But on the secondary, you’ll see that the settings show the same values as before:

Query Store configuration for WideWorldImporters after change on BUCKY

This is expected.  When the instance starts up and the database loads on a replica, the Query Store settings are pulled from sys.database_query_store_options, and these are cached in memory. Any change that is subsequently made to the configuration of Query Store on the primary is persisted to disk on the primary and then propagated over to the secondary and written to disk, just like anything else. However, on-disk changes to Query Store settings are only propagated to the memory cache when the secondary replica is restarted, or if there is a failover and the secondary becomes primary.
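
If you ever needed a secondary to pick up the new values without waiting for a restart, a manual failover does it. A sketch, using the Avengers AG from this post (run it on the secondary you are failing over to, and it assumes synchronous-commit replicas):

-- Executed on the secondary replica that should become primary
ALTER AVAILABILITY GROUP [Avengers] FAILOVER;
GO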

Conclusion

If you’re seeing a disparity between the values for Query Store settings on a primary and any or all of your replicas, it is expected behavior, and you can be assured that the changes have been written to disk and committed on the secondary copies.

Handling Dates in Query Store

Query Store retains query performance data at the plan level.  This data is then broken out into intervals of time, determined by the INTERVAL_LENGTH_MINUTES setting.  The time intervals are found in the sys.query_store_runtime_stats_interval system view, and the start_time and end_time columns are of the DATETIMEOFFSET data type.  This means that the date is time-zone aware, and in Query Store the data is stored as UTC.  Now why does all of this matter?  Because handling dates in Query Store is important if you’re going to query the data directly.
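
As a refresher, the interval length is set like any other Query Store option. A quick sketch, using a 10-minute interval (valid values are 1, 5, 10, 15, 30, 60, and 1440 minutes):

USE [master];
GO
ALTER DATABASE [WideWorldImporters]
SET QUERY_STORE (INTERVAL_LENGTH_MINUTES = 10);
GO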

Here’s what we get with a simple query against the runtime stats interval view:

SELECT *
FROM sys.query_store_runtime_stats_interval
ORDER BY runtime_stats_interval_id;
GO
sys.query_store_runtime_stats_interval output

These are 10-minute intervals, and notice that they all have +00:00 for the offset, meaning zero offset from UTC.  If you want to display data based on the intervals and convert it to your local time zone, you need to use AT TIME ZONE.  We can use a variation of a query from my last post, Finding the Slowest Query in a Stored Procedure, to see this in action.

SELECT
   [qsq].[query_id],
   [qsp].[plan_id],
   [qsq].[object_id],
   [rs].[runtime_stats_interval_id],
   [rsi].[start_time] AT TIME ZONE 'Eastern Standard Time' [EST StartTime],
   [rsi].[end_time] AT TIME ZONE 'Eastern Standard Time' [EST EndTime],
   [rs].[count_executions],
   [rs].[avg_duration],
   [rs].[avg_cpu_time],
   [rs].[avg_logical_io_reads]
FROM [sys].[query_store_query] [qsq]
JOIN [sys].[query_store_query_text] [qst]
   ON [qsq].[query_text_id] = [qst].[query_text_id]
JOIN [sys].[query_store_plan] [qsp]
   ON [qsq].[query_id] = [qsp].[query_id]
JOIN [sys].[query_store_runtime_stats] [rs]
   ON [qsp].[plan_id] = [rs].[plan_id]
JOIN [sys].[query_store_runtime_stats_interval] [rsi]
   ON [rs].[runtime_stats_interval_id] = [rsi].[runtime_stats_interval_id]
WHERE [qsq].[query_id] = 1421
   AND [qsp].[plan_id] = 1426
   AND [rsi].[end_time] > DATEADD(HOUR, -1, GETUTCDATE())
ORDER BY [rs].[runtime_stats_interval_id];
GO
StartTime and EndTime converted

Since I’m in chilly Cleveland, OH, which is in the Eastern Time Zone, I’ve converted the data as such.  If you’re not familiar with the exact time zone text you can use, query sys.time_zone_info.  Note that the DATETIMEOFFSET data type itself does not account for Daylight Saving Time; AT TIME ZONE does apply the DST rules for the zone you name, but if you’re doing offset math by hand you’ll have to handle DST in your code (yay for writing this post in January).
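
For example, to find the exact time zone name to use, and to see whether DST is currently in effect for each zone, you can run:

SELECT [name], [current_utc_offset], [is_currently_dst]
FROM [sys].[time_zone_info]
ORDER BY [name];
GO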

Separate from displaying the date, you may want to query for a particular range, as I’ve done in the above query with AND [rsi].[end_time] > DATEADD(HOUR, -1, GETUTCDATE()).  Notice that I’m using GETUTCDATE(), which is due to an interesting behavior with DATEADD that Aaron Bertrand has detailed in his post, Performance Surprises and Assumptions: DATEADD.  There are other date columns you could use here, such as last_execution_time, and it, like every other date column in the Query Store system views, is DATETIMEOFFSET.  Of note: last_execution_time exists in sys.query_store_query, sys.query_store_plan, and sys.query_store_runtime_stats, but the values can and will be different between a query, its plan(s), and the runtime stats intervals.
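
And if you want to filter on a range expressed in local time, rather than relative to right now, you can convert the boundary to UTC first. A sketch, with a made-up window:

-- Hypothetical window: intervals ending after 8:00 AM Eastern on January 10
DECLARE @LocalStart DATETIME2 = '2019-01-10 08:00:00';
DECLARE @UtcStart DATETIMEOFFSET =
   @LocalStart AT TIME ZONE 'Eastern Standard Time' AT TIME ZONE 'UTC';

SELECT
   [runtime_stats_interval_id],
   [start_time],
   [end_time]
FROM [sys].[query_store_runtime_stats_interval]
WHERE [end_time] > @UtcStart
ORDER BY [runtime_stats_interval_id];
GO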

Hopefully this helps the next time you’re writing queries against the Query Store system views!