Important Query Store Fixes – January 2019

The latest CUs for SQL Server 2016 and 2017 contain some important Query Store fixes that I thought worth mentioning for those of you on either version or those of you looking to upgrade.  As of this writing, the current CU for SQL Server 2016 SP2 is CU5, and for SQL Server 2017 it is CU13.  Many times we see fixes that make it into a SQL Server 2017 CU ported back to a SQL Server 2016 build.  Interestingly enough, there are some Query Store fixes in 2016 CUs that are not in 2017 CUs.  I don’t know if that’s because the issues do not exist in 2017, or if it’s just that they have not been fixed yet in 2017.  I’m planning to update this post if the fixes are added down the road.  So here we go, in descending CU order…

SQL Server 2017 CU13

FIX: A dump file may be generated when you run the DML internal plan on Query Store enabled database in SQL Server 2017

This affects anyone using Automatic Plan Correction (which means you are on Enterprise Edition), that is, anyone with FORCE_LAST_GOOD_PLAN enabled.  You can read more about Automatic Plan Correction here (I’m a fan; it’s pretty cool and very helpful for those with small DBA teams and lots of databases, or those DBAs who just have too much on their plate and are constantly putting out fires).
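If you’re not sure whether a database has Automatic Plan Correction turned on, a check along these lines should tell you. This is a minimal sketch: it assumes SQL Server 2017 or later, and the database name is a placeholder rather than anything from the KB article.

```sql
-- Hypothetical database name; substitute your own.
USE [YourDatabase];
GO
-- If actual_state_desc is ON for FORCE_LAST_GOOD_PLAN, Automatic Plan
-- Correction is active for this database.
SELECT [name],
       desired_state_desc,
       actual_state_desc,
       reason_desc
FROM sys.database_automatic_tuning_options
WHERE [name] = N'FORCE_LAST_GOOD_PLAN';
GO
```

If desired_state_desc and actual_state_desc disagree, reason_desc usually explains why (for example, Query Store not being enabled or not being in read-write mode).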

SQL Server 2017 CU11 and SQL Server 2016 SP2 CU5

FIX: Transactions and log truncation may be blocked when you use Query Store in SQL Server 2016 and 2017

I’m aware of multiple companies that have run into this issue and I’m glad to see that the fix was ported back to 2016.  I highly recommend getting up to this CU if you can, because once you hit the problem, resolving it requires a restart (I’ve heard that killing the session_id also works but I haven’t seen that).  Also note that the command referenced in the KB article is ALTER DATABASE <databasename> SET QUERY_STORE CLEAR, but I have also seen this when trying to change one of the Query Store options (e.g. changing size or retention days).  I am pretty sure it can happen with any ALTER DATABASE statement that changes Query Store configuration.
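For reference, the kind of option changes I mean look like the statements below. Treat this as an illustration only: the database name and the values are made up, and I would not run these on an unpatched build just to see whether the problem reproduces.

```sql
-- Hypothetical database name and values, for illustration only.
-- Changing the maximum size of the Query Store:
ALTER DATABASE [YourDatabase]
SET QUERY_STORE (MAX_STORAGE_SIZE_MB = 2048);
GO
-- Changing how many days of data are retained:
ALTER DATABASE [YourDatabase]
SET QUERY_STORE (CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30));
GO
```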

SQL Server 2016 SP2 CU5

FIX: Query Store enabled database takes long time on startup after you apply cumulative update for a SQL Server version

Note that this is applicable to systems with Availability Groups, and this is a fix that is not listed for any 2017 CU…I’ll keep watching to see if it shows up.

SQL Server 2016 SP2 CU4

FIX: Access violation when SQL Server 2016 tries to start Query Store Manager during startup

This is also a fix that is not listed for any 2017 CU…

SQL Server 2017 CU5

FIX: Access violation occurs when Query Store collects runtime statistics in SQL Server 2017

I don’t see this fix in any SQL Server 2016 CU, but I’ll keep watching.

SQL Server 2016 SP2 CU2

FIX: Slow performance of SQL Server 2016 when Query Store is enabled

There were many performance-related improvements for Query Store in SQL Server 2017, and huge props to the SQL Server team for getting these back-ported to 2016.  At a bare minimum, this is the build you should be running, but I would rather see you on CU5 at this point.
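If you’re not sure which build an instance is on, a quick way to check is SERVERPROPERTY. This is a small sketch; the column aliases are just my own naming.

```sql
-- Report the build, service pack level, and cumulative update level
-- of the instance you are connected to.
SELECT SERVERPROPERTY('ProductVersion')     AS product_version,       -- full build number
       SERVERPROPERTY('ProductLevel')       AS product_level,         -- e.g. RTM or SP2
       SERVERPROPERTY('ProductUpdateLevel') AS product_update_level;  -- e.g. CU5, NULL if no CU is applied
```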

Note: If you installed CU2 for SQL Server 2017 at any point, please read: Query Store Fix in SQL Server 2017 for an explanation of what you need to do when you apply a higher CU and why.

You may look at this list of fixes and be concerned.  Don’t be.  These fixes are a good thing!  They tell me that more and more people are using Query Store (which is great) and while they have run into issues, I believe it’s because these are bigger systems (see the access violation fix which is for systems with 256 logical cores) or they are interesting/edge-case workloads, both of which probably don’t exist in Azure, and may be why these issues haven’t been seen previously.  I remain, as always, a big proponent of Query Store.  If you’re seeing something interesting feel free to comment here or drop me an email.  I’ve had some folks do that and I was able to confirm they were seeing a bug and let them know that it would be fixed soon.  I’ve had one or two other cases where it’s an issue I haven’t seen and I recommended contacting Product Support.  Either way, if you’re unsure about Query Store I have lots of resources to get you started, to get it configured properly, and I answer the question “what about performance?!”.  I hope this helps!

 

Workload Tuning Training

I recently booked my flight to the U.K. for SQLBits, and I realized I probably should write a post about my full day training session on workload tuning! It’s on Wednesday, February 27, 2019, and you can read the abstract here: Stop Tuning Your Queries and Start Tuning Your Workload.

I included a fair bit of information about what I’ll cover, so I thought it might be interesting to anyone considering this session to hear the backstory about how it came to be.

First, this is a new full-day session; I haven’t presented it anywhere else. It does have pieces of content I’ve discussed elsewhere, but this really represents a different approach for tuning and troubleshooting. Namely: let’s not focus on one or two or five specific queries that are causing problems, let’s focus on the workload as a whole to identify patterns and then address them in specific queries.

What I have noticed over the past couple years is that I see similar patterns when I’m looking at a given system. Very often, within a solution there are patterns in terms of how queries are written, typical performance problems, and then typical ways they are addressed or fixed internally.

For example, we engage with a customer and discover they use table variables in a lot of their code, and this causes poor cardinality estimates and then a host of subsequent issues. We optimize a couple of queries using temporary tables instead, and they then replace all table variables with temporary tables, and then start seeing issues related to temp table contention. Both options have a time and a place; the key is to understand when and why.

The ultimate goal of my session is to share my experience working with a wide range of customers and solutions over the years, the patterns I have seen, and options for addressing them. I’ll tell you now that there is no “easy button” in terms of fixes. But you already knew that 😊 A big key is figuring out how to convince your team of the changes that need to be made, and understanding how and why is critical. That’s what we’ll discuss, because I want you to walk away with applicable information that you can use immediately in your environment.

Still have questions? Add a comment or send me an email. I’d love to hear from you. I hope to see you in Manchester in February!

p.s. I should also mention that next week (Jan 15 – Jan 17) I’m presenting my Query Store course online, and it’s the only delivery of it scheduled for the first part of the year. There is still space if you’re interested! Three afternoons next week, each day with two 1.5 hour sessions. I really like this format, it gives people good breaks to absorb information and think of/ask questions.

Different Query Store Settings for a Database in an Availability Group

Last week there was a question on #sqlhelp on Twitter about the status of Query Store for a database in an Availability Group. I’ve written about Query Store and Availability Groups before so if you’re not familiar with QS behavior in an AG, check out that post first. But this question was, I think, specific to the values that show on a read-only replica and why the Query Store settings differed between a primary and a secondary. Let’s set it up and take a look.

Environment
I have a multi-node cluster set up, and an AG (Avengers) running for the WideWorldImporters database across two nodes (CAP\ROGERS and BUCKY\BARNES). Right now, CAP is primary and BUCKY is secondary:

Avengers AG Status

Query Store
On the primary, if you check the status of Query Store in the UI, we see the following:

Query Store configuration for WideWorldImporters on CAP

If you check the status of Query Store from the secondary, the settings are the same:

Query Store configuration for WideWorldImporters on BUCKY

Now we’ll make a change to the Query Store on the primary using T-SQL:

USE [master];
GO
ALTER DATABASE [WideWorldImporters]
SET QUERY_STORE (MAX_STORAGE_SIZE_MB = 1024);
GO

And if we check the settings again, we see they’re updated on the primary:

Query Store configuration for WideWorldImporters after change on CAP

But on the secondary, you’ll see that the settings show the same values as before:

Query Store configuration for WideWorldImporters after change on BUCKY

This is expected.  When the instance starts up and the database loads on a replica, the Query Store settings are pulled from sys.database_query_store_options, and these are cached in memory. Any change that is subsequently made to the configuration of Query Store on the primary is persisted to disk on the primary and then propagated over to the secondary and written to disk, just like anything else. However, on-disk changes to Query Store settings are only propagated to the memory cache when the secondary replica is restarted, or if there is a failover and the secondary becomes primary.
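If you prefer T-SQL to the UI, you can run something like the query below on each replica and compare the output. It’s a minimal sketch that assumes the secondary allows read connections. I would expect it to show the same values the UI shows, and I would expect actual_state_desc to report READ_ONLY on the secondary, but I have not confirmed either here.

```sql
-- Run against the primary, then against the readable secondary, and compare.
USE [WideWorldImporters];
GO
SELECT actual_state_desc,
       desired_state_desc,
       max_storage_size_mb,
       current_storage_size_mb,
       stale_query_threshold_days,
       interval_length_minutes
FROM sys.database_query_store_options;
GO
```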

Conclusion

If you’re seeing a disparity between values for Query Store settings on a primary and any or all of your replicas, it is expected and you can be assured that the changes have been written to the secondary copies and committed.