At some point, whether you’re a DBA, developer, or application administrator, you’re going to find yourself troubleshooting performance in Azure. Within the Azure Portal you have the capability to look at performance for an Azure SQL via the Performance overview and Query Performance insight pages.
If you haven’t worked with SQL Server previously, this is a great place to start as it provides a high level view of system performance, broken out by DTUs, CPU, and data and log I/O. You can dig into specific queries that might be causing issues through Query Performance Insight, including query text and resource use over time.
But if you’re familiar with troubleshooting in SQL Server, this method feels different, and slower. Not to worry, many of the tools you’re familiar with still work in Azure SQL, you just need a few tweaks.
Whenever I’m dropping into a server – whether it’s one I’m familiar with or not – to look at a performance issue, I want to understand wait statistics. Specifically, I want to know what wait statistics while the problem is occurring. For this reason, I don’t just query sys.dm_os_wait_stats, I snapshot it, using Paul’s script from his post Capturing wait statistics for a period of time.
You can use this as-is in Azure SQL, but, since it queries sys.dm_os_wait_ stats, you’re not looking at wait statistics specific to your database, which is really what you care about. So you have to change sys.dm_os_wait_ stats to sys.dm_db_wait_stats in the script. Once you make that switch, you’re good to go – although depending on the issue you’re dealing with you may change the delay between snapshots to something lower than 30 minutes, which is what Paul has in the script. I sample for either 5 and 10 minutes, depending on what else I want to check. This also assumes the problem is occurring WHILE I am capturing wait statistics.
I think I’ve bought Adam Machanic drinks a few times, but at this point I feel like I owe him several dinners at high end restaurants. While I’m waiting for the wait statistics snapshot to complete, I like to run WhoIsActive. You want to grab that from his blog, rather than GitHub. If you read the post, there was an issue with the original version of the script, fixed here in v2. Note that when you create the stored procedure you want to do that in your user database. It will fail if you try to create it in master. Once it’s created, you can use all your favorite parameters for execution:
EXEC sp_WhoIsActive; /* default, what’s running right now */
EXEC sp_WhoIsActive @find_block_leaders = 1; /* list output with lead blocking sessions at the top */
EXEC sp_WhoIsActive @get_plans = 1; /* pull execution plans */
If you haven’t worked with WhoIsActive much, I’ll refer to you the documentation. Between wait statistics and WhoIsActive, I usually get an idea of where to look next. And this is where the next script comes in handy…
SQL Server Diagnostic Queries
You can get the current version of Glenn’s Diagnostic Queries for each release on his Resources page. Note that Azure SQL and Managed Instances have separate scripts. With the Diagnostics Queries in hand, I can dig deeper into what I think might be the issue, or I can just start rolling through the queries in order to check configuration and performance data. I may end up on some tangents, depending on what I find, but these three scripts together provide a great starting point when troubleshooting performance in Azure, whether I have a complaint that the system is slow overall, or I’m told that a particular set of queries is slow. Happy troubleshooting!