Last week I presented a session, Demystifying Statistics in SQL Server, at the PASS Community Summit, and I had a lot of great questions; so many that I’m creating multiple posts to answer them. This second post is dedicated to questions specific to updating statistics in SQL Server. Of note: I have a couple of previous posts that also include helpful information:
- SQLskills SQL101: Updating SQL Server Statistics Part I – Automatic Updates
- SQLskills SQL101: Updating SQL Server Statistics Part II – Scheduled Updates
- Q: If we have both AUTO_UPDATE_STATISTICS and AUTO_UPDATE_STATISTICS_ASYNC set to true, what happens?
- They are both in effect; it is not one or the other. When the amount of data modified exceeds the threshold, statistics will update automatically, and the update will be asynchronous if that option is enabled.
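- For reference, both are database-level options and can be on at the same time; a minimal sketch, using a hypothetical database named SalesDB:

```sql
-- ASYNC changes HOW the automatic update happens (in the background, with the
-- triggering query compiling against the existing statistics), not WHETHER it happens
ALTER DATABASE [SalesDB] SET AUTO_UPDATE_STATISTICS ON;
ALTER DATABASE [SalesDB] SET AUTO_UPDATE_STATISTICS_ASYNC ON;
```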
- Q: When updating stats, what is the best approach for highly transactional environments? We do a small update stats every 10 minutes. Then Ola script updates once a day. We still have stat issues.
- This is tricky to answer; I’d really need to know more about what “have stat issues” means before I would make any kind of suggestion.
- In general, I recommend having a scheduled job that runs regularly to update statistics. For larger databases, or those that are very volatile, you may have multiple jobs that update statistics: one job may run once a week (or every other week, or once a month) to update large tables, another may run weekly to update outdated statistics on any table that is not large, and a third may run every hour to update volatile tables. A sketch for finding outdated statistics follows.
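- To see which statistics are outdated, here is a sketch using sys.dm_db_stats_properties; the filter and ordering are illustrative, not prescriptive:

```sql
-- Statistics ordered by how many modifications have occurred since the last update
SELECT OBJECT_NAME(s.[object_id]) AS table_name,
       s.[name] AS stat_name,
       sp.last_updated,
       sp.[rows],
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.[object_id], s.stats_id) AS sp
WHERE OBJECTPROPERTY(s.[object_id], 'IsUserTable') = 1
  AND sp.modification_counter > 0
ORDER BY sp.modification_counter DESC;
```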
- Q: Would it be more likely to have auto update stats async enabled for larger databases / data sets? Is it less important to smaller databases?
- You would likely see more of a benefit with AUTO_UPDATE_STATISTICS_ASYNC enabled for a larger database, but that doesn’t preclude it from being enabled for a smaller database.
- Q: When do you recommend using update stats with FULLSCAN rather than the default sample?
- When you know there is skew in the column.
- How do you know when there is skew in the column?
- You know the data in the tables and you know that something like a CustomerID has a skew (where OrderID does not).
- You discover it when troubleshooting query performance.
- You use Kimberly’s script to find columns that have skew.
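- To inspect a histogram yourself, here is a sketch using sys.dm_db_stats_histogram (available in SQL Server 2016 SP1 CU2 and later; DBCC SHOW_STATISTICS works on older versions). The table and statistic names are hypothetical:

```sql
-- Large differences in equal_rows across steps suggest skew in the column
SELECT h.step_number, h.range_high_key, h.equal_rows, h.range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.[object_id], s.stats_id) AS h
WHERE s.[object_id] = OBJECT_ID(N'Sales.Orders')   -- hypothetical table
  AND s.[name] = N'IX_Orders_CustomerID'           -- hypothetical statistic
ORDER BY h.step_number;
```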
- Q: What is your take on updating statistics with fullscan compared to other sampling frequencies?
- For smaller databases, or those environments with large maintenance windows, I recommend the easy button:
- update all statistics in the database with fullscan
- When that type of task no longer fits in your maintenance window, or you don’t want to spend resources on that, then I find that the default sample is good enough, unless there is skew.
- I always recommend FULLSCAN for columns that have data skew.
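- For illustration, the variants side by side (dbo.Orders is a hypothetical table):

```sql
UPDATE STATISTICS dbo.Orders;                         -- default sample, chosen by SQL Server
UPDATE STATISTICS dbo.Orders WITH SAMPLE 25 PERCENT;  -- explicit sample rate
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;           -- reads every row; most accurate, most expensive
```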
- Q: Is it useful to periodically manually update statistics with sp_updatestats?
- I have a strong aversion to sp_updatestats, and this is why: it updates any statistic that has had even a single row modified, and it uses the default sample, so it can spend resources on statistics that don’t need updating and replace a FULLSCAN statistic with a sampled one.
- I much prefer using Ola’s script to update statistics:
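- As one example, here is a sketch of a statistics-only call to Ola’s IndexOptimize (the NULL fragmentation parameters skip index maintenance; check the documentation for the full parameter list):

```sql
-- Update only statistics that have been modified since their last update
EXECUTE dbo.IndexOptimize
    @Databases = 'USER_DATABASES',
    @FragmentationLow = NULL,
    @FragmentationMedium = NULL,
    @FragmentationHigh = NULL,
    @UpdateStatistics = 'ALL',
    @OnlyModifiedStatistics = 'Y';
```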
- Q: In the actual execution plan, can I see somewhere in the properties the time spent updating statistics? Or can I see this somewhere else (plan cache, Query Store, DMVs)?
- This does not appear in the plan (neither the actual plan nor the one in the plan cache) as an UPDATE STATISTICS is its own statement. You can see that statement in the DMVs, or with Extended Events, or in Query Store.
- Correlating increased query duration to an automatic UPDATE STATISTICS takes a bit of extra work…that would be a good blog post to write.
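- As a starting point for that work, here is a sketch of an Extended Events session on the auto_stats event, which fires for automatic statistics operations and includes a duration column (note that it also fires for statistics loads, so expect some noise; session and file names are examples):

```sql
CREATE EVENT SESSION [TrackAutoStats] ON SERVER
ADD EVENT sqlserver.auto_stats(
    ACTION (sqlserver.database_name, sqlserver.sql_text)
)
ADD TARGET package0.event_file (SET filename = N'TrackAutoStats');
GO
ALTER EVENT SESSION [TrackAutoStats] ON SERVER STATE = START;
```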
- Q: What about update statistics and plan cache? Do we invalidate a plan when we update stats?
- Yes, when you update a statistic, any plan that exists in cache that uses that statistic is invalidated, and will be recompiled the next time it is used.
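- If you want to observe this, here is a sketch of a session on the sql_statement_recompile event; its recompile_cause column reports “Statistics changed” when a statistics update invalidated the plan:

```sql
CREATE EVENT SESSION [TrackRecompiles] ON SERVER
ADD EVENT sqlserver.sql_statement_recompile(
    ACTION (sqlserver.sql_text)
)
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION [TrackRecompiles] ON SERVER STATE = START;
```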
- Q: In an OLTP environment where data is changing, is it advisable to run update statistics on a regular basis, or to recompile the stored procedure when the plan goes bad?
- I much prefer to proactively manage statistics, so if you have a volatile table (or set of tables) you may have an Agent Job that runs regularly (every hour, every 15 minutes? It really depends on the size of the table and the churn) to keep statistics fresh.
- I do not recommend just running a recompile on the SP if the plan goes bad, as that’s just a quick “fix” that doesn’t identify the root cause and solve the problem long term.
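- A sketch of what that Agent job step might run (dbo.OrderQueue is a hypothetical volatile table):

```sql
-- Keep a volatile table's statistics fresh proactively, instead of
-- recompiling the procedure after the plan has gone bad
UPDATE STATISTICS dbo.OrderQueue;  -- add WITH FULLSCAN if a column has skew
```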
- Q: How do you monitor who ran a manual update statistics? I’ve used the XE object_altered event and it doesn’t catch it.
- Use the sql_statement_completed event in Extended Events (filter appropriately).
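- A sketch of such a session; the predicate filters on the statement text, and the session and file names are just examples:

```sql
CREATE EVENT SESSION [TrackManualStatsUpdates] ON SERVER
ADD EVENT sqlserver.sql_statement_completed(
    ACTION (sqlserver.username, sqlserver.client_hostname, sqlserver.sql_text)
    WHERE sqlserver.like_i_sql_unicode_string(sqlserver.sql_text, N'%UPDATE STATISTICS%')
)
ADD TARGET package0.event_file (SET filename = N'TrackManualStatsUpdates');
GO
ALTER EVENT SESSION [TrackManualStatsUpdates] ON SERVER STATE = START;
```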
- Q: In some cases, for READ_ONLY databases, is there an advantage to turning OFF update statistics?
- I’m not sure if this was a question or a comment. You could do this, and if so I would recommend that you update all statistics with FULLSCAN and then turn AUTO_UPDATE_STATISTICS off. But if the database is READ_ONLY, you cannot change any data, so auto update will never kick in.
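- A sketch of that sequence (ReportingDB is hypothetical, and sp_MSforeachtable is undocumented but widely used):

```sql
-- Run in the context of the database you are about to make read-only:
-- refresh everything with FULLSCAN while it is still writable...
EXEC sp_MSforeachtable N'UPDATE STATISTICS ? WITH FULLSCAN;';
-- ...then turn off auto-update and set the database read-only
ALTER DATABASE [ReportingDB] SET AUTO_UPDATE_STATISTICS OFF;
ALTER DATABASE [ReportingDB] SET READ_ONLY;
```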
- Q: How are stats management/maint strategies impacted with Azure SQL (latest versions) and upcoming SQL 2022?
- For Azure SQL, there is no Agent, so you need to use another method to proactively manage statistics. You still have AUTO_UPDATE_STATISTICS, so stats will be updated automatically, but I still like having something in place to manage them. You can use Azure Automation and runbooks for this: