Monitoring Space Used by Query Store

Last week I presented a session on Query Store, and when talking about the settings I mentioned that monitoring space used by Query Store is extremely important when you first enable it for a database.  Someone asked me how I would do that, and as I provided an explanation I realized that I should document my method…because I give the same example every time and it would be nice to have the code.

For those of you not familiar with the Query Store settings, please check out my post which lists each one, the defaults, and what I would recommend for values and why.  When discussing MAX_STORAGE_SIZE_MB, I mention monitoring via sys.database_query_store_options or Extended Events.  As much as I love Extended Events, there isn’t an event that fires when space used crosses a threshold you choose.  The event related to size is query_store_disk_size_over_limit, and it fires when the space used exceeds the value for MAX_STORAGE_SIZE_MB, which is too late.  I want to take action before the maximum storage size is hit.
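For a quick ad hoc look before any job exists, the view can be queried directly.  This is a minimal sketch rather than a finished monitoring query; the percent_used alias is my own name for the calculated column, and it has to run in the context of the database that has Query Store enabled.

```sql
/* Run in the user database that has Query Store enabled */
SELECT actual_state_desc,
       current_storage_size_mb,
       max_storage_size_mb,
       /* percent_used is an illustrative alias, not a column in the view */
       CAST(current_storage_size_mb * 100.0
            / NULLIF(max_storage_size_mb, 0) AS DECIMAL(9,2)) AS percent_used
FROM sys.database_query_store_options;
```

The NULLIF is only there so the query returns NULL instead of raising a divide-by-zero error if the maximum size is ever reported as 0.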

Therefore, the best option I’ve found is to create an Agent job which runs on a regular basis (maybe every four or six hours initially).  The job checks current_storage_size_mb in sys.database_query_store_options, calculates the space used by Query Store as a percentage of the total allocated, and sends an email if that percentage exceeds the threshold you set.  The code that you can put into an Agent job is below.  Please note you want to make sure the job runs in the context of the user database with Query Store enabled (as sys.database_query_store_options is a database view), and configure the threshold to a value that makes sense for your MAX_STORAGE_SIZE_MB.  In my experience, 80% has been a good starting point, but feel free to adjust as you see fit!

Once your Query Store size has been tweaked and stabilized, I would leave this job in place as a safety net to alert you should anything change (e.g. someone else changes a Query Store setting which indirectly affects the storage used).

/* Change DBNameHere as appropriate */
USE [DBNameHere];

/* Change Threshold as appropriate; DECIMAL(5,2) allows values up to 100.00 */
DECLARE @Threshold DECIMAL(5,2) = 80.00;
/* Both size columns in the view are bigint */
DECLARE @CurrentStorage BIGINT;
DECLARE @MaxStorage BIGINT;
DECLARE @PercentUsed DECIMAL(9,2);
DECLARE @EmailText NVARCHAR(MAX);

SELECT @CurrentStorage = current_storage_size_mb, @MaxStorage = max_storage_size_mb
FROM sys.database_query_store_options;

/* Space used as a percentage of the maximum configured */
SET @PercentUsed = CAST(@CurrentStorage AS DECIMAL(21,2)) / CAST(@MaxStorage AS DECIMAL(21,2)) * 100;

/* BEGIN...END keeps the email inside the IF; without it the email is sent on every run */
IF @PercentUsed >= @Threshold
BEGIN
     SET @EmailText = N'The Query Store current space used is ' + CAST(@CurrentStorage AS NVARCHAR(19)) + N'MB
     and the max space configured is ' + CAST(@MaxStorage AS NVARCHAR(19)) + N'MB,
     which exceeds the threshold of ' + CAST(@Threshold AS NVARCHAR(19)) + N'%.
     Please allocate more space to Query Store or decrease the amount of data retained (stale_query_threshold_days).';

     /* Edit profile_name and recipients as appropriate */
     EXEC msdb.dbo.sp_send_dbmail
     @profile_name = 'SQL DBAs',
     @recipients = '',
     @body = @EmailText,
     @subject = 'Storage Threshold for Query Store Exceeded';
END;
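
When the alert fires, the two responses named in the email text map to Query Store options you can change with ALTER DATABASE.  The statements below are a sketch only: DBNameHere is the same placeholder used in the job, and the values 2048 and 14 are made-up examples, not recommendations.

```sql
/* Option 1: allocate more space to Query Store (2048 is an example value) */
ALTER DATABASE [DBNameHere]
SET QUERY_STORE (MAX_STORAGE_SIZE_MB = 2048);

/* Option 2: retain less data (14 days is an example value) */
ALTER DATABASE [DBNameHere]
SET QUERY_STORE (CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 14));
```

Which one you pick depends on how much history you actually need and how much space the database can spare.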


Today I opened up a SQL Server ERRORLOG and saw these two messages repeated every 20 seconds or so:

Starting up database ‘AdventureWorks2014’.

CHECKDB for database ‘AdventureWorks2014’ finished without errors on 2015-08-23 02:15:08.070 (local time).  This is an information message only; no user action required.

When you initially see these two messages repeated over and over, it might seem like SQL Server is caught in some issue with recovery.  Or you might think it’s running CHECKDB over and over.  Neither is true.  The database has AUTO_CLOSE enabled.  (And you see the CHECKDB message because it’s reading the boot page and noting the last time CHECKDB ran successfully…to see what updates that entry, check out my post What DBCC Checks Update dbccLastKnownGood?)

When AUTO_CLOSE is enabled, after the last user exits the database, the database shuts down and its resources are freed.  When someone tries to access the database again, the database reopens.  You might be thinking that for databases that are not accessed that often, this might be a good thing.  After all, freeing resources and giving them back to SQL Server for use elsewhere sounds useful.  Not so much.  There’s a cost associated with that shutdown, and a cost to open the database back up when a user connects.  For example, shutting down a database removes all plans for that database from cache.  The next time a user runs a query, it will have to be compiled.  If the user disconnects, the plan is freed from cache.  If someone connects one minute later and runs the same query, it has to be compiled again.  You get the point: this is inefficient.  And really, how many databases in your production environment do you really not access?  If you’re not accessing the database, why is it in a production instance?  If you want a few more details on AUTO_CLOSE, check out the entry for ALTER DATABASE in Books Online.
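
If you want to know which databases on an instance have the option turned on, sys.databases exposes it as is_auto_close_on.  A minimal check might look like this:

```sql
/* List every database on the instance with AUTO_CLOSE enabled */
SELECT name
FROM sys.databases
WHERE is_auto_close_on = 1;
```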

I am sure (maybe?) that there are valid cases for having AUTO_CLOSE enabled.  But I haven’t found one yet 🙂

On top of the resource use, realize that every time the database starts up, you’re going to get the above two messages in the ERRORLOG.  In the log I was looking at, there were multiple databases with this option enabled, so the log was flooded with these messages.  In general, I’m a huge fan of cycling the ERRORLOG on a regular basis (just set up an Agent job that runs sp_cycle_errorlog every week), and I try to reduce “clutter” in the log as much as possible.  This means don’t enable a setting like AUTO_CLOSE which can put in all those messages, and use trace flag 3226 to stop logging successful backup messages (they still go to msdb).
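
The two housekeeping items mentioned above can be sketched as follows.  This assumes the first statement is the body of a weekly Agent job step and the second is run once by a sysadmin; how you schedule and deploy them is up to you.

```sql
/* Agent job step: close the current ERRORLOG and start a new one */
EXEC sp_cycle_errorlog;

/* Stop logging successful backup messages, instance-wide (-1 = global) */
DBCC TRACEON (3226, -1);
```

A trace flag enabled with DBCC TRACEON does not survive a restart of the instance, so to make it permanent add -T3226 as a startup parameter instead.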

Oh yes, to disable AUTO_CLOSE: