Query Store Training – Portugal

I am so excited to announce that I am presenting a full day of Query Store training, in person, this September in Lisbon, Portugal! The SQLskills team will be in London for two weeks in September for a set of Immersion Events (IEPTO1, IEAzure, IECAG, and IEPTO2). After I’ve finished my teaching for IEPTO2, I’m heading over to Lisbon for a full day on Query Store (Friday, September 21, 2018) in advance of SQLSaturday Portugal.

This workshop has continued to evolve since its debut at the PASS Summit last fall – specifically, I’ve added more content around performance and workload analysis, and every time I get asked a new question that I think is relevant or interesting, it gets added to a slide or demo. You can read the full abstract below and purchase your ticket here, and if you have any questions about the workshop please email me or post a comment!

*Note: If you’re interested in the Immersion Events listed above, please know that we probably won’t offer them in Europe again until 2020, so talk to your manager and get signed up. We would love to see you!

Using Query Store to Easily Troubleshoot and Stabilize Your Workload

– Have you upgraded to SQL Server 2016 or higher, but still have databases using the old Cardinality Estimator?
– Do you know that you have queries with inconsistent performance, but you’re just not sure how to find them, or fix them, quickly?
– Are you tired of flailing around in SQL Server, querying DMV after DMV to figure out the *real* problem with performance?

Query Store can help.

We’ll cover Query Store end-to-end in this full day workshop built using real-world examples based on customer issues resolved over the last two years. You’ll understand how to configure it, what data it captures, and how to use it to analyze performance, find regressions, and force plans. The demos will teach you how to find common patterns in query performance using T-SQL, and how to understand your workload.

This class is applicable for those running SQL Server 2016 or higher (or planning to upgrade), or Azure SQL Database, and will provide practical and applicable information you can use whether you’re a new or veteran DBA, a developer that has to troubleshoot query performance, or an application administrator just trying to keep the system afloat. You’ll learn how to find and leverage important information in Query Store to make solving common performance problems easier the moment you walk back into the office.

Monitoring Space Used by Query Store

Last week I presented a session on Query Store, and when talking about the settings I mentioned that monitoring space used by Query Store is extremely important when you first enable it for a database.  Someone asked me how I would do that, and as I provided an explanation I realized that I should document my method…because I give the same example every time, and it would be nice to have the code.

For those of you not familiar with the Query Store settings, please check out my post which lists each one, the defaults, and what I would recommend for values and why.  When discussing MAX_STORAGE_SIZE_MB, I mention monitoring via sys.database_query_store_options or Extended Events.  As much as I love Extended Events, there isn’t an event that fires when a threshold is exceeded.  The event related to size is query_store_disk_size_over_limit, and it fires when the space used exceeds the value for MAX_STORAGE_SIZE_MB, which is too late.  I want to take action before the maximum storage size is hit.

Therefore, the best option I’ve found is to create an Agent job that runs on a regular basis (maybe every four or six hours initially), checks current_storage_size_mb in sys.database_query_store_options, calculates the space used by Query Store as a percentage of the total allocated, and then, if that percentage exceeds the threshold you set, sends an email.  The code that you can put into an Agent job is below.  Please note that you want to make sure the job runs in the context of the user database with Query Store enabled (as sys.database_query_store_options is a database-scoped view), and configure the threshold to a value that makes sense for your MAX_STORAGE_SIZE_MB.  In my experience, 80% has been a good starting point, but feel free to adjust as you see fit!

Once your Query Store size has been tweaked and stabilized, I would leave this job in place as a safety net, to alert you should anything change (e.g. someone changes a Query Store setting that indirectly affects the storage used).

/* Change DBNameHere as appropriate */
USE [DBNameHere];

/* Change Threshold as appropriate */
DECLARE @Threshold DECIMAL(5,2) = 80.00;
DECLARE @CurrentStorage INT;
DECLARE @MaxStorage INT;

/* sys.database_query_store_options is database-scoped, so this must
   run in the context of the database with Query Store enabled */
SELECT @CurrentStorage = [current_storage_size_mb],
     @MaxStorage = [max_storage_size_mb]
FROM [sys].[database_query_store_options];

/* Compare space used as a percentage of the maximum allocated;
   DECIMAL(5,2) allows for values of 100.00 or higher */
IF CAST(CAST(@CurrentStorage AS DECIMAL(21,2))/CAST(@MaxStorage AS DECIMAL(21,2))*100 AS DECIMAL(5,2)) >= @Threshold
BEGIN

     DECLARE @EmailText NVARCHAR(MAX) = N'The Query Store current space used is ' + CAST(@CurrentStorage AS NVARCHAR(19)) + 'MB
     and the max space configured is ' + CAST(@MaxStorage AS NVARCHAR(19)) + 'MB,
     which exceeds the threshold of ' + CAST(@Threshold AS NVARCHAR(19)) + '%.
     Please allocate more space to Query Store or decrease the amount of data retained (stale_query_threshold_days).';

     /* Edit profile_name and recipients as appropriate */
     EXEC msdb.dbo.sp_send_dbmail
     @profile_name = 'SQL DBAs',
     @recipients = 'DBAs@yourcompany.com',
     @body = @EmailText,
     @subject = 'Storage Threshold for Query Store Exceeded';
END

Updating Statistics with Ola Hallengren’s Script

I am a HUGE fan of updating statistics as part of regular maintenance.  In fact, if you don’t know whether you have a step or job that updates out-of-date statistics on a regular basis, go check now!  This post will still be here when you get back 😊

At any rate, for a long time the default options for updating statistics were pretty much a sledgehammer.  Within the maintenance plan options, the Update Statistics Task only provides the option to update Index statistics, Column statistics, or both.  You can also specify whether it is a full scan or a sample for the update, but that’s about it:

Update Statistics Task (Maintenance Plan)

I don’t like this option because it means that statistics that have had little or no change will be updated.  I could have a 10 million row table where only 1000 rows change, and yet the statistics for that table will update.  This is a waste of resources.  For a small database, or system that’s not 24×7, that isn’t such a big deal.  But in a database with multiple 10 million row tables, it is a big deal.

The sp_updatestats command isn’t a favorite of mine either.  I’ve written about that here, so I won’t re-hash it.

If you have used Ola Hallengren’s scripts for maintenance, you hopefully know that IndexOptimize can also update statistics, via the @UpdateStatistics parameter.  The default value for this is NULL, which means do not update statistics.  To be clear: if you drop in Ola’s scripts, have them create the jobs for you, and then start running the “IndexOptimize – USER_DATABASES” job, by default you’re not updating statistics.  The code that the IndexOptimize – USER_DATABASES job runs, by default, is:

EXECUTE [dbo].[IndexOptimize]
@Databases = 'USER_DATABASES',
@LogToTable = 'Y'

If you want to have the job also update statistics, you need:

EXECUTE [dbo].[IndexOptimize]
@Databases = 'USER_DATABASES',
@UpdateStatistics = 'ALL',
@LogToTable = 'Y'

With this variation, we are updating index and column statistics, which is great.  But…we are updating them regardless of whether it’s needed.  Statistic with no rows modified? Update it.  Statistic with 10 rows modified? Update it.

There has always been an option to update only statistics that have changed: the @OnlyModifiedStatistics option, which gets us behavior just like sp_updatestats.

EXECUTE [dbo].[IndexOptimize]
@Databases = 'USER_DATABASES',
@UpdateStatistics = 'ALL',
@OnlyModifiedStatistics = 'Y',
@LogToTable = 'Y'

With this option, if no rows have changed, the statistic will not be updated.  If one or more rows have changed, the statistic will be updated.

Since the release of SP1 for SQL Server 2012, this has been my only challenge with Ola’s scripts.  SQL Server 2008R2 SP2 and SQL Server 2012 SP1 introduced sys.dm_db_stats_properties, which tracks the modification counter for each statistic.  I have written custom scripts to use this information to determine if stats should be updated, which I’ve talked about here.  Jonathan has also modified Ola’s script for a few of our customers to look at sys.dm_db_stats_properties to determine if enough data had changed to update stats, and a long time ago we emailed Ola to ask if he could include an option to set a threshold.  Good news: that option now exists!
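
If you’re curious how much your statistics have actually changed, a minimal sketch like the one below (my own illustration, not Ola’s code) uses sys.dm_db_stats_properties to show the modification counter for each statistic as a percentage of the rows in the table:

/* Run in the user database; requires SQL Server 2008R2 SP2 / 2012 SP1 or higher */
SELECT
     OBJECT_NAME([s].[object_id]) AS [TableName],
     [s].[name] AS [StatisticName],
     [sp].[last_updated],
     [sp].[rows],
     [sp].[modification_counter],
     CAST([sp].[modification_counter] * 100.0 / NULLIF([sp].[rows], 0) AS DECIMAL(18,2)) AS [PercentModified]
FROM [sys].[stats] [s]
CROSS APPLY [sys].[dm_db_stats_properties]([s].[object_id], [s].[stats_id]) [sp]
WHERE OBJECTPROPERTY([s].[object_id], 'IsUserTable') = 1
ORDER BY [PercentModified] DESC;
GO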

Using Ola’s script to update statistics based on a threshold of change

Ola’s IndexOptimize stored procedure now includes the @StatisticsModificationLevel parameter.  You can use this to set a threshold for modifications, so that only statistics with a specific volume of change are updated.  For example, if I want statistics updated when 5% of the data has changed, I use:

EXECUTE [dbo].[IndexOptimize]
@Databases = 'USER_DATABASES',
@UpdateStatistics = 'ALL',
@StatisticsModificationLevel = '5',
@LogToTable = 'Y'

Take note: the @OnlyModifiedStatistics option is not included here…you cannot use both options; it has to be one or the other.

This is great!  I can further customize this for different tables.  Consider a database that has a very volatile table, maybe dbo.OrderStatus, where auto-update may or may not kick in during the day, so I want to make sure stats are updated nightly:

EXECUTE [dbo].[IndexOptimize]
@Databases = 'USER_DATABASES',
@Indexes = 'ALL_INDEXES, -SalesDB.dbo.OrderStatus',
@UpdateStatistics = 'ALL',
@StatisticsModificationLevel = '10',
@LogToTable = 'Y'

This will address fragmentation and update statistics for all tables in the SalesDB database except dbo.OrderStatus, and it will update statistics if 10% or more of the rows have changed.

I would then have a second job to address fragmentation and stats for OrderStatus:

EXECUTE [dbo].[IndexOptimize]
@Databases = 'USER_DATABASES',
@Indexes = 'SalesDB.dbo.OrderStatus',
@UpdateStatistics = 'ALL',
@StatisticsModificationLevel = '1',
@LogToTable = 'Y'

For the dbo.OrderStatus table, statistics would be updated when only 1% of the data had changed.

I love the flexibility this provides!

You might be wondering why I chose 1%…take a close look at this important note which is included in Ola’s documentation:

Statistics will also be updated when the number of modified rows has reached a decreasing, dynamic threshold, SQRT(number of rows * 1000)

This is critical to understand because if the threshold I have set for @StatisticsModificationLevel ends up having a number of rows HIGHER than the formula above, statistics will update sooner than I expect.

For example, if I have 1 million rows in a table and I have @StatisticsModificationLevel = 10, then 10% of the rows, or 100,000, have to change in order to update statistics.  HOWEVER, if you plug 1 million into SQRT(1,000,000 * 1000), you get 31,623, which means Ola’s script will update statistics after 31,623 rows have changed…well before 100,000.

This may be important for some of you to understand in terms of these thresholds, so I dropped the information into a table to make it easier to comprehend (at least, it’s easier for me!).

Thresholds for Statistics Updates (percentage and SQRT algorithm)
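
You can generate the same comparison yourself; this is a minimal sketch that computes a 5% threshold and Ola’s SQRT threshold side by side for a few table sizes (whichever value is lower is the one that triggers the update):

/* Compare a 5% modification threshold to the SQRT(rows * 1000) threshold */
SELECT
     [t].[rows],
     CAST([t].[rows] * 0.05 AS BIGINT) AS [FivePercentThreshold],
     CAST(SQRT([t].[rows] * 1000.0) AS BIGINT) AS [SQRTThreshold]
FROM (VALUES (10000), (100000), (1000000), (10000000), (100000000)) AS [t]([rows]);
GO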

Using my original example, if dbo.OrderStatus has about one million rows, then with 1% as the threshold, only 10,000 rows need to change before stats are updated.  If the SQRT algorithm were used, over 30,000 rows would need to change before stats were updated, and depending on the data skew, that might be too high.

Understand that as tables get larger, statistics will likely be updated before the set percentage value is reached because the SQRT algorithm has a lower threshold.  (Yes, I’m driving this point home.)  Consider a table with 10 million rows.  If I set the threshold to 5%, I would expect statistics to update after 500,000 modifications, but in fact they will update after 100,000.

If you’re wondering where the SQRT algorithm comes from, please review Microsoft’s Statistics documentation.  This threshold was originally introduced with trace flag 2371 to lower the threshold for automatic updates.  It is applied by default starting in SQL Server 2016 when using compatibility level 130.  My assumption is that Ola determined this was a good threshold to use as a fail-safe/catch-all for his script, and I think it was a smart move on his part.  In general, I’d rather have statistics update too often, rather than not often enough.  However, using the new @StatisticsModificationLevel option gives us better control than we’ve had previously, unless we write a custom script (which is still an option…do what works best for you!).

Can you force a plan for a different query with Query Store?

This is a question I’ve gotten a few times in class…can you force a plan for a different query with Query Store?

tl;dr

No.

Assume you have two similar queries, but they have different query_id values in Query Store.  One of the queries has a plan that’s stable, and you want to force that plan for the other query.  Query Store provides no ability to do this in the UI, but you can try it with the stored procedure.  Let’s take a look…

Testing

Within WideWorldImporters we’ll execute two versions of an ad hoc query; the text differs just enough (note the column order in the SELECT list) that each version gets its own query_id:

USE [master];
GO
ALTER DATABASE [WideWorldImporters] SET QUERY_STORE = ON;
GO
ALTER DATABASE [WideWorldImporters] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

USE [WideWorldImporters];
GO

DECLARE @CustomerID INT;
SET @CustomerID = 972;

SELECT o.OrderDate, o.ContactPersonID, ol.StockItemID, ol.Quantity
FROM Sales.Orders o
JOIN Sales.OrderLines ol
ON o.OrderID = ol.OrderID
WHERE o.CustomerID = @CustomerID;
GO

DECLARE @CustomerID2 INT;
SET @CustomerID2 = 972;

SELECT o.ContactPersonID, o.OrderDate, ol.StockItemID, ol.Quantity
FROM Sales.Orders o
JOIN Sales.OrderLines ol
ON o.OrderID = ol.OrderID
WHERE o.CustomerID = @CustomerID2;
GO

Let’s see what’s in Query Store:


SELECT qt.query_text_id, q.query_id, qt.query_sql_text, p.plan_id, TRY_CAST(p.query_plan AS XML)
FROM sys.query_store_query_text qt
JOIN sys.query_store_query q
ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan p
ON q.query_id = p.query_id
WHERE qt.query_sql_text LIKE '%Sales.Orders%';
GO

Query information from Query Store

We see that we have two different queries and one plan for each. We can force the plan for the first query:

EXEC sp_query_store_force_plan @query_id = 3, @plan_id = 3;
GO

This works.  If we try to force that same plan for the other query:

EXEC sp_query_store_force_plan @query_id = 4, @plan_id = 3;
GO
Error when trying to force a different plan for a query

Trying to force plan_id 3 for query_id 4 throws this error:

Msg 12406, Level 11, State 1, Procedure sp_query_store_force_plan, Line 1 [Batch Start Line 34]
Query plan with provided plan_id (3) is not found in the Query Store for query (4). Check the plan_id value and rerun the command.

Summary
Within Query Store, the relationship between query_id and plan_id is managed internally (i.e. there are no foreign key constraints on the underlying tables), and there is validation that any plan_id you want to force for a query_id must have been generated for that specific query.
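
If you want to verify this yourself before calling sp_query_store_force_plan, a minimal sketch like this lists the plans that actually exist for the target query_id; only a plan_id returned here can be forced for that query:

SELECT [p].[plan_id], [p].[query_id], [p].[is_forced_plan]
FROM [sys].[query_store_plan] [p]
WHERE [p].[query_id] = 4;
GO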

In this type of scenario, you have to get the plan shape you want for the query itself, which may require trying different input parameters.  The example I’ve provided is very simple, but when in doubt, check the input parameters for the plan that you want, then try those with the other query (the one that doesn’t yet have that plan).  Of course, if you have to use a query or index hint to get the plan you want, then it’s going to be a little trickier for the original query.  Good luck!

Query Store and Plan Cache Flushing

I’ve had two comments recently on my blog about Query Store causing the plan cache to be flushed. There was a known issue related to the plan cache flushing after Query Store was enabled, but this was fixed in CU2 for SQL Server 2016 SP1. So I did some testing and here is what I think is causing the confusion:

When you enable Query Store, which is done with an ALTER DATABASE SET statement, the plan cache for the database is flushed.

Now, before anyone writes up a UserVoice item, understand that there are several ALTER DATABASE SET commands that cause the plan cache for a database to be flushed. For example, taking a database OFFLINE causes the database plan cache to be flushed. That one seems intuitive, right?  So why is the plan cache cleared when you enable Query Store, or change one of its settings?  To ensure that new Query Store data is not lost.  This relates to the internals of how Query Store works, which aren’t essential to dig into; the point is that this behavior is known by Microsoft and expected.

If you review the ALTER DATABASE SET documentation, specifically the Query Store options, you won’t find any mention of the database plan cache being cleared.  But you can test it to see that it occurs…

Testing

First, disable Query Store for the WideWorldImporters database, and then free the plan cache:

USE [master];
GO
ALTER DATABASE [WideWorldImporters] SET QUERY_STORE = OFF;
GO

DBCC FREEPROCCACHE;
GO

USE [WideWorldImporters];
GO

SELECT o.OrderDate, o.ContactPersonID, ol.StockItemID, ol.Quantity
FROM Sales.Orders o
JOIN Sales.OrderLines ol
	ON o.OrderID = ol.OrderID
WHERE o.CustomerID = 972;
GO
SELECT o.OrderDate, o.ContactPersonID, ol.StockItemID, ol.Quantity
FROM Sales.Orders o
JOIN Sales.OrderLines ol
	ON o.OrderID = ol.OrderID
WHERE o.CustomerID = 123;
GO

Now query the plan cache to confirm those plans are in cache:


SELECT t.dbid, t.text, s.creation_time, s.execution_count, p.query_plan
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) p
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) t
WHERE t.text LIKE '%Sales%';
GO
SQL Server’s plan cache after initial query execution

Great, they’re there. Now enable Query Store, then check the plan cache again.


USE [master];
GO
ALTER DATABASE [WideWorldImporters] SET QUERY_STORE = ON;
GO
ALTER DATABASE [WideWorldImporters] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

SELECT t.dbid, t.text, s.creation_time, s.execution_count, p.query_plan
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) p
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) t
WHERE t.text LIKE '%Sales%';
GO
SQL Server’s plan cache after enabling Query Store (plans have been cleared)

The plan cache for the database has been cleared. Note that this only clears the plan cache for that database – plans for other databases still remain in cache. Run a few more queries to add some plans back to the plan cache, and confirm they’re there.


USE [WideWorldImporters];
GO

SELECT o.OrderDate, o.ContactPersonID, ol.StockItemID, ol.Quantity
FROM Sales.Orders o
JOIN Sales.OrderLines ol
	ON o.OrderID = ol.OrderID
WHERE o.CustomerID = 972;
GO
SELECT o.OrderDate, o.ContactPersonID, ol.StockItemID, ol.Quantity
FROM Sales.Orders o
JOIN Sales.OrderLines ol
	ON o.OrderID = ol.OrderID
WHERE o.CustomerID = 123;
GO

SELECT t.dbid, t.text, s.creation_time, s.execution_count, p.query_plan
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) p
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) t
WHERE t.text LIKE '%Sales%';
GO
SQL Server’s plan cache after enabling Query Store AND running some queries

This time, change one of the settings for Query Store. It’s already enabled, but perhaps we want to change the INTERVAL_LENGTH_MINUTES setting from the default of 60 minutes to 30 minutes.


USE [master];
GO
ALTER DATABASE [WideWorldImporters] SET QUERY_STORE (INTERVAL_LENGTH_MINUTES = 30);
GO

SELECT t.dbid, t.text, s.creation_time, s.execution_count, p.query_plan
FROM sys.dm_exec_query_stats s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) p
CROSS APPLY sys.dm_exec_sql_text(s.sql_handle) t
WHERE t.text LIKE '%Sales%';
GO
SQL Server’s plan cache after changing a Query Store setting (plans have been cleared)

In checking the plan cache again, the ALTER DATABASE SET statement cleared the database’s cache.
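
If you want to confirm the per-database scope directly, here’s a minimal sketch (my own, not from the documentation) that counts cached plans by database; only the count for WideWorldImporters should drop after the ALTER:

SELECT [t].[dbid], DB_NAME([t].[dbid]) AS [database_name], COUNT(*) AS [cached_plans]
FROM sys.dm_exec_cached_plans [cp]
CROSS APPLY sys.dm_exec_sql_text([cp].[plan_handle]) [t]
GROUP BY [t].[dbid]
ORDER BY [cached_plans] DESC;
GO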

Summary

As you can see, the database plan cache is cleared after you enable Query Store, or change any settings related to Query Store. This is the same behavior we see with other ALTER DATABASE SET commands (e.g. changing the recovery model).  Unfortunately, this is not documented, nor is anything written to the ERRORLOG.

Of note: I don’t expect that you are changing settings often (if you are, I’d like to understand that thought process, as once you find the right values for space and interval, I expect those settings to be static…and if you’re not sure where to start, feel free to check out my post discussing the different options). I also don’t expect that you are turning Query Store on and off throughout the day; that completely defeats the purpose of the feature. It should be enabled, and left enabled, all the time. You don’t know when a problem might occur, right?

Why aren’t you using Query Store?

Way back in 2016 I wrote a post trying to find out why people were not using Extended Events.  See, I really like Extended Events, and I’d been presenting on the topic for several years, but I kept running into people who wouldn’t even try XE.  There were also a lot of individuals who declared they were “Profiler for Life”.  Ok, but I wanted to understand why.

I find myself asking people the same question now about Query Store. I’ve presented a lot of sessions on the topic, with the primary goal of teaching people how to use Query Store.  As such, most of the people in my sessions are not using Query Store.  As an example, if I have 30 people in a session, typically only 25-30% are running SQL Server 2016 or higher.  And of those 7 to 10 people, maybe 2 to 5 are using Query Store.  Does that mean that only 5%-15% of SQL Server users are using Query Store?  No.  I have no idea how many people who are running SQL Server 2016 or higher are using Query Store.  I’m pretty sure Microsoft doesn’t have an accurate number either, except for Azure SQL Database.  Query Store is turned on for nearly all 2 million user databases on Azure SQL Database.

So the primary purpose of this post is to provide a place (the comments) for people to share why they are not using Query Store.  I’m very interested in hearing this information, so please share!  And please comment even if it’s the same as what someone else has written.

In addition, if you don’t know much about Query Store but want to get started because you’re running SQL Server 2016 or higher (or are looking to upgrade), I’ve included some links below to help you get started.  Some links are for documentation or blog posts, but if you’re not a believer of RTFM :) , there are also links to my Pluralsight courses (requires a subscription) and an introductory session I’ve given (free).  If you have questions not answered in a post or video, let me know this in the comments as well!

Lastly, I’ve had multiple questions related to the performance impact of enabling Query Store.  This is a great question – look for a post next week that provides more detail.  But until then, please let me know why you’re not using Query Store, and thanks for reading!

Do you need to update statistics after an upgrade?

This post originally went live on May 11, 2018, but modifications were made on May 14, 2018 after some additional internal discussions with Microsoft.  Changes made on May 14, 2018 are in blue.  

There are a variety of methods we use for helping customers upgrade to a new SQL Server version, and one question we often get asked is whether or not statistics need to be updated as part of the upgrade process.

tl;dr

Yes.  Update statistics after an upgrade.  Further, if you’re upgrading to 2012 or higher from an earlier version, you should rebuild your indexes (which will update index statistics, so then you just need to update column statistics).

History

Some of you may remember that the stats blob changed between SQL Server 2000 and SQL Server 2005, and Microsoft specifically recommended updating statistics after upgrading from SQL Server 2000.  Official Microsoft documentation about the stats blob change in SQL Server 2005 is difficult to find, but this article includes the following paragraph:

After you upgrade from SQL Server 2000, update statistics on all databases. Use the sp_updatestats stored procedure to update statistics in user-defined tables in SQL Server databases. This step is not necessary for upgrading from SQL Server 2005.

Current Microsoft documentation related to upgrading does not state anything specific about updating statistics, but people continue to ask and if you peruse forums, blog posts, and other social media options, you’ll see recommendations to update statistics.  Further, the documentation that Microsoft provides about when to update statistics does not mention anything about upgrades.

Side bar: I don’t recommend using sp_updatestats, and here’s why: Understanding What sp_updatestats Really Updates.

Today

To my knowledge, the statistics blob has not changed since SQL Server 2005, but I thought I would ask someone from Microsoft for an official recommendation to share publicly.  Here you go:

Microsoft suggests that customers test the need for a full update of statistics after a major version change and/or a database compatibility level change.

Further items to note:

  1. If Microsoft updates the format of statistics (e.g. the stats blob), customers will be expected to update statistics after an upgrade.
    1. Microsoft does not always upgrade the statistics format as part of a major version upgrade.
  2. There are occasions where Microsoft does not change the format of statistics, but they do change the algorithm for creating statistics as part of a major version upgrade or database compatibility level change.

In addition, there was a change in the nonclustered leaf level internals in SQL Server 2012, so if you are upgrading to 2012 or higher from an earlier version (e.g. 2008, 2008R2), rebuild your nonclustered indexes.  And remember, rebuilding indexes updates the statistics for those indexes with a fullscan, so you do not need to update them again.
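
As a minimal sketch of that post-upgrade sequence for a single, hypothetical table (dbo.MyTable): the rebuild refreshes the index statistics with the equivalent of a fullscan, so afterwards only the column statistics need a separate update.

/* dbo.MyTable is a hypothetical example table */
ALTER INDEX ALL ON [dbo].[MyTable] REBUILD;
GO

/* The rebuild just updated the index statistics,
   so update only the column statistics */
UPDATE STATISTICS [dbo].[MyTable] WITH FULLSCAN, COLUMNS;
GO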

Conclusion

As part of your upgrade methodology, it is recommended (by me, based on experience with a lot of customer upgrades) to build in time to update statistics.  I’ve gotten some pushback from customers who don’t want to update statistics after upgrade because it takes too long.  Some kind reminders:

  • Updating statistics is an online operation; therefore, the database and related applications remain accessible and usable. A statistics update does take a schema modification lock, so you’re not allowed to make any changes to a table while its stats are updating.  Therefore, if you decide to change your schema after upgrading your SQL Server version (not something I would typically recommend), do that before you update stats.
  • You need to update statistics regularly to provide the optimizer with current information about your data, so at some point it needs to be done. Immediately after an upgrade is a really good time, considering the aforementioned items.

If you’re not comfortable upgrading to a newer version of SQL Server, we can help!  I’m in the process of helping a customer migrate from SQL Server 2012 to SQL Server 2017, and I’m so excited to get them up to the latest version so they can start using some new features…like Query Store 😉

Query Store Examples: Stories from customers

In the past week I’ve used Query Store to troubleshoot performance issues for two different customers running SQL Server 2016 and higher.  I’ve presented a lot of sessions about what Query Store is, how to configure it, and basic use cases, but we really take it to the next level when we can provide Query Store examples from the real world.  If you are running SQL Server 2016 or SQL Server 2017, I hope you’re using this feature!

Scenario 1

A long-time customer reached out for some help after adding 50% more memory to their VM and seeing no improvement in performance.  Sometimes you can’t throw hardware at an issue, and they wanted to look at using In-Memory OLTP to help improve performance.  After a 10-minute discussion on In-Memory OLTP, where I explained why it wouldn’t magically solve their performance issues (and also explained the amount of testing that would need to be done prior to implementation), I took a look at the system.

I started with a health audit and a review of wait statistics.  There were a couple settings they could change, but nothing was horribly misconfigured.  Wait stats showed nothing remarkable.  The majority of waits were due to CXPACKET, but the average duration was extremely low.  They had some WRITELOG and PAGEIOLATCH waits, but these were also low in average duration.  I asked if they had Query Store enabled…they did not.  I requested that they enable it to capture query information while I started looking at plan cache data.  With Query Store collecting information, I used Glenn’s DMV queries to dig into the plan cache to look at queries that were executing most frequently, taking the longest, and consuming the most resources.

Now, while the plan cache has some fantastic information, it’s transitory.  Plans can fall out of cache because they aren’t used, or because they get recompiled, so truly tracking execution frequency over time can be a bit tricky.  Further, there are some plans that never make it into the plan cache (e.g. trivial plans or those with the RECOMPILE hint).  This is where Query Store shines.  EVERY query that executes can be captured in Query Store, even if it’s trivial or has the RECOMPILE hint.  Further, execution frequency and resource use are aggregated over defined intervals (e.g. 15 minutes, 30 minutes), as determined by the INTERVAL_LENGTH_MINUTES setting.  You can look back over a small or large period of time to see not just what query performance looked like, but also how many times queries executed.

Several hours later, after a typical workload where users said performance was “slow”, we looked at the Query Store data.  In terms of high resource queries, there were a few (e.g. lots of reads, or lots of CPU), but the execution frequency was low for a lot of those heavy hitters.  Then we looked at queries executing most frequently – the death-by-a-thousand-cuts scenario – and here we could see that there were thousands of queries executing in an hour; while an individual query might not take long to execute or use a lot of resources, cumulatively it all added up.

Queries with high execution count in Query Store
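
You can get at the same information with T-SQL against the Query Store catalog views; this is a minimal sketch (not the exact query from the engagement) that totals execution counts across all intervals:

SELECT TOP (25)
     [q].[query_id],
     [qt].[query_sql_text],
     [e].[total_executions]
FROM
(
     SELECT [p].[query_id], SUM([rs].[count_executions]) AS [total_executions]
     FROM [sys].[query_store_plan] [p]
     JOIN [sys].[query_store_runtime_stats] [rs]
          ON [p].[plan_id] = [rs].[plan_id]
     GROUP BY [p].[query_id]
) AS [e]
JOIN [sys].[query_store_query] [q]
     ON [e].[query_id] = [q].[query_id]
JOIN [sys].[query_store_query_text] [qt]
     ON [q].[query_text_id] = [qt].[query_text_id]
ORDER BY [e].[total_executions] DESC;
GO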

In digging into some of those queries I quickly noticed that almost every plan had a missing index recommendation.  I queried sys.indexes and found that almost 95% of the tables in the database had 2 indexes or less.  It’s a rare case where a database is under-indexed.  Their problem?  Their workload volume and their data are slowly, but steadily increasing.  They have some big tables that are missing indexes and causing scans, and even though all that data is sitting in memory (because they have enough server memory to hold the entire database), they are using a lot of CPU and time to roll through that data.  I could have determined this with the information in the plan cache, but I would have had to set up a job to capture it on a regular basis and then write some queries to do analysis against it.  With Query Store, I just had to enable it, let it capture the data, then use the UI to look at performance.

Scenario 2

A new customer engaged with us after upgrading to SQL Server 2017.  They had some extreme cases of variability in query performance: the system would be running fine, and then all of a sudden performance would tank.  They would knee-jerk and free the plan cache, then suddenly things would be great again.  A couple hours later, the same problem.  Rinse and repeat, rinse and repeat.  We had them enable Query Store, and within a couple hours we took a look at the data (letting them continue with the practice of freeing the procedure cache when there was a problem).  Within the Query Store UI I looked at CPU use for that window of time, and used the data grid to sort the output to view queries with multiple plans first.  There it was…they had a select number of queries with multiple plans that varied hugely depending on the input parameters – queries that were parameter sensitive.

Viewing multiple plans for the same query in Query Store
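
A quick way to surface those candidates in T-SQL, as a minimal sketch, is to look for query_ids with more than one plan:

SELECT [q].[query_id], COUNT([p].[plan_id]) AS [plan_count]
FROM [sys].[query_store_query] [q]
JOIN [sys].[query_store_plan] [p]
     ON [q].[query_id] = [p].[query_id]
GROUP BY [q].[query_id]
HAVING COUNT([p].[plan_id]) > 1
ORDER BY [plan_count] DESC;
GO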

We used Query Store to force the optimal plan and thus stabilize performance, then we looked at the query itself and the different plans being generated.  Within an hour we determined one code change and an index that would generate a consistent plan.  After testing these changes and implementing them, we unforced the plan, confirmed that the new plan we wanted was consistently used, and moved on to the next challenge.

Summary

If you’re running SQL Server 2016 or higher, I highly recommend enabling Query Store…even if you have a third party monitoring tool.  Don’t get me wrong, those tools are great and track a ton of information that Query Store doesn’t.  But those tools don’t capture EVERY query, nor do they capture query metrics all the time.  Lastly, they don’t provide the ability to force plans.  You get all that with Query Store, and more :)

If you’re interested in learning more about Query Store from the ground up, you’re in luck!  In two weeks I’m hosting a live, online Immersion Event for Query Store that you can attend from the comfort of your home (or your desk at work :) ).  You can find a course description and registration details here.  I hope you’re able to attend!

Query Store Requests

In early January Microsoft announced that Connect, the method for filing SQL Server bugs and feature requests, was being retired.  It was replaced by UserVoice, and any bugs/requests were ported over.  Sadly, the votes from Connect did not come across to UserVoice, so I went through and found all the Query Store requests, which are listed below.  If you could please take the time to up-vote them, that would be fantastic.  If you could also take the time to write about why this would help your business, help you upgrade, or help you purchase more SQL Server licenses, that is even better.  It helps the product team immensely to understand how this feature/fix/functionality helps you and your company, so taking 5 minutes to write about that is important.

Two things: first, I’ve listed these in order of priority to me, but of course your priorities may be different!  :)  Second, if I have missed a bug/request, please feel free to email me or comment so I can add it to this list.  Thanks!

  1. Enable Query Store for collection on a read-only replica in an Availability Group
  2. Indicate “morally equivalent” forced plan where use_plan = true but is_forced = 0
    1. Kendra Little wrote a post explaining this behavior: Forced Plan Confusion: Is_Forced vs. Use Plan = True
  3. Query store create database
    1. Check out Andy Mallon’s blog post about discovering this issue: Deadlock when creating a database – A Query Store bug
  4. Option to store query store data in a filegroup other than PRIMARY
  5. SQL Server Management Studio 17.3 (14.0.17199.0) returns incorrect syntax on WAIT_STATS_CAPTURE_MODE
  6. QDS_LOADDB Wait type blocks all queries
    1. You can use trace flag 7752 to get around this, see Query Store Trace Flags
  7. Query Store SELECT Performance
  8. “Edit Query Text” query text doesn’t match the actual query in 2017 RTM
  9. Add an Extended Events action for query_id
  10. Max Plan per query missing from Query Store properties in SSMS 2016
    1. Still an issue in SSMS 17.5
  11. [SQL Server 2016 CTP 2.4 – Query Store] Functionality to collect the information of queries which are not finished
  12. Query Store – Add Actual memory Grant Used From sys.dm_exec_query_stats
    1. In SQL Server 2017, the data captured in sys.query_store_runtime_stats includes:
      1. avg_query_max_used_memory
      2. last_query_max_used_memory
      3. min_query_max_used_memory
      4. max_query_max_used_memory
      5. stdev_query_max_used_memory

      This data represents memory used for the query, not the memory grant, but if I were interested in using Resource Governor to create pools with memory limits for queries, this is pretty good information from which to work.

  13. Bugbash: Enable Query Store for tempdb
  14. Add Query Store to Model for Auto Configuration on New Databases
    1. As far as I know, this works.  If you enable Query Store for model, it will be enabled by default for new databases you create (unless you otherwise specify), see SQL Server Query Store: Default Settings.  A minimal example follows this list.
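
Here’s that minimal sketch; enable Query Store on model, and any database created afterwards inherits the setting:

USE [master];
GO
ALTER DATABASE [model] SET QUERY_STORE = ON;
GO
ALTER DATABASE [model] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO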

Query Store Fix in SQL Server 2017

There is a Query Store fix in the latest Cumulative Updates of SQL Server 2017, and if you are keeping current with the latest cumulative updates, there’s a script you need to run if you’ve installed CU2 at any point.

First, if you are still on CU2, you might want to consider upgrading to CU4 (released February 17, 2018).

Second, if you previously had CU2 installed (and are now on CU3 or CU4), you want to make sure that you’ve run the script included with CU3 and CU4 release notes.

The script removes plans from Query Store that were captured when running CU2.  If you want to know for certain whether your database is affected before you execute the script, you can run the following query against the databases with Query Store enabled:

/* execute against EACH database that has Query Store enabled */
SELECT COUNT([plan_id])
FROM [sys].[query_store_plan]
WHERE [engine_version] = '14.0.3008.27';
GO

As an aside, if you aren’t using Query Store, notice that one of the things it captures for the plan is SQL Server version…how cool is that in terms of testing, upgrading, and troubleshooting?

If you have plans from CU2, they need to be removed from Query Store, and this is done in the script using the sys.sp_query_store_remove_plan stored procedure.  Understand that in addition to removing the query plan, it will also remove the runtime stats tied to that plan from Query Store.  But, more important than that, if that plan was forced for a query, it will be un-forced before it is removed.

Therefore, before you run the script from Microsoft, I strongly recommend you not just check to see if you have plans from CU2, but you also look to see if any of those are forced:

/* execute against EACH database that has Query Store enabled */
SELECT
   [p].[query_id],
   [p].[plan_id],
   CASE
      WHEN [q].[object_id] = 0 THEN 'Ad-hoc'
      ELSE OBJECT_NAME([q].[object_id])
   END AS [Object],
   [qt].[query_sql_text],
   [q].*, TRY_CONVERT(XML, [p].[query_plan]) AS [QueryPlan_XML]
FROM [sys].[query_store_plan]  [p]
JOIN [sys].[query_store_query] [q]
   ON [p].[query_id] = [q].[query_id]
JOIN [sys].[query_store_query_text] [qt]
   ON [q].[query_text_id] = [qt].[query_text_id]
WHERE  [engine_version] = '14.0.3008.27'
   AND [p].[is_forced_plan] = 1;
GO

This script will list any queries that have forced plans, and if they are part of an object (e.g. stored procedure) it will also list the object name.  If no rows return from this query, then you don’t have any forced plans which are affected and you can run the script from Microsoft.

If you do have any queries with forced plans, I recommend that you save a copy of the forced plan as a .sqlplan file, so that you have documentation of what plan was forced.  There are two ways to do this:

  1. Within the Query Store page for the user database in SSMS, run the Queries with Forced Plans report. In the grid, sort the list by query_id.  For each query_id identified by the query above, select the forced plan, then right-click on it and save.
  2. Within the Query Store page for the user database in SSMS, open the Tracked Queries report. Individually enter each query_id identified by the query above, select the forced plan, then right-click on it and save.

Once you have saved off a copy of every forced plan, then you can execute the script from Microsoft.

Note: If you are using Automatic Plan Correction, this will also un-force those plans (which would have been forced automatically).

At this point, queries that previously had a forced plan may have performance issues.  It’s possible that changes in your data distribution, changes in statistics, or perhaps even changes in the optimizer have caused a different plan to be generated that might be acceptable performance-wise.  If that is not the case, there is no ability to import plans into Query Store.  Each query that had a forced plan, and doesn’t get a new, “good” plan, may need to be run again to get the same plan into Query Store.  One of the easiest things to do is to find the compiled values within the plan using the ParameterCompiledValue attribute, then re-run the query using those values.  You can then use data from Query Store, and compare against the saved plan, to verify it’s a consistent/stable plan.

Finding input parameters from the query plan
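
If you’d rather pull those compiled values out with T-SQL instead of opening each plan, here’s a minimal sketch that uses XQuery against the saved Query Store plan (query_id 3 is a hypothetical value; substitute your own):

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
     [p].[plan_id],
     [n].value('@Column', 'NVARCHAR(128)') AS [ParameterName],
     [n].value('@ParameterCompiledValue', 'NVARCHAR(MAX)') AS [CompiledValue]
FROM [sys].[query_store_plan] [p]
CROSS APPLY (SELECT TRY_CONVERT(XML, [p].[query_plan]) AS [query_plan_xml]) AS [x]
CROSS APPLY [x].[query_plan_xml].nodes('//ParameterList/ColumnReference') AS [t]([n])
WHERE [p].[query_id] = 3;
GO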

I definitely recommend updating to the current Cumulative Update (and this is a general recommendation, not just because of the issue I’ve discussed here), and part of preparing for that upgrade means checking to see if you’re affected by this issue, and addressing it as part of the upgrade process rather than down the road.