Troubleshooting Performance in Azure SQL

At some point, whether you’re a DBA, developer, or application administrator, you’re going to find yourself troubleshooting performance in Azure.  Within the Azure Portal you have the capability to look at performance for an Azure SQL via the Performance overview and Query Performance insight pages.

Checking Performance in the Azure Portal

Checking Performance in the Azure Portal

If you haven’t worked with SQL Server previously, this is a great place to start as it provides a high level view of system performance, broken out by DTUs, CPU, and data and log I/O.  You can dig into specific queries that might be causing issues through Query Performance Insight, including query text and resource use over time.

But if you’re familiar with troubleshooting in SQL Server, this method feels different, and slower.  Not to worry, many of the tools you’re familiar with still work in Azure SQL, you just need a few tweaks.

Wait Statistics

Whenever I’m dropping into a server – whether it’s one I’m familiar with or not – to look at a performance issue, I want to understand wait statistics.  Specifically, I want to know what wait statistics while the problem is occurring.  For this reason, I don’t just query sys.dm_os_wait_stats, I snapshot it, using Paul’s script from his post Capturing wait statistics for a period of time.

You can use this as-is in Azure SQL, but, since it queries sys.dm_os_wait_ stats, you’re not looking at wait statistics specific to your database, which is really what you care about.  So you have to change sys.dm_os_wait_ stats to sys.dm_db_wait_stats in the script.  Once you make that switch, you’re good to go – although depending on the issue you’re dealing with you may change the delay between snapshots to something lower than 30 minutes, which is what Paul has in the script.  I sample for either 5 and 10 minutes, depending on what else I want to check.  This also assumes the problem is occurring WHILE I am capturing wait statistics.

WhoIsActive

I think I’ve bought Adam Machanic drinks a few times, but at this point I feel like I owe him several dinners at high end restaurants.  While I’m waiting for the wait statistics snapshot to complete, I like to run WhoIsActive.  You want to grab that from his blog, rather than GitHub.  If you read the post, there was an issue with the original version of the script, fixed here in v2.  Note that when you create the stored procedure you want to do that in your user database.  It will fail if you try to create it in master.  Once it’s created, you can use all your favorite parameters for execution:

EXEC sp_WhoIsActive;   /* default, what's running right now */
GO

EXEC sp_WhoIsActive @find_block_leaders = 1;   /* list output with lead blocking sessions at the top */
GO

EXEC sp_WhoIsActive @get_plans = 1;   /* pull execution plans */

If you haven’t worked with WhoIsActive much, I’ll refer to you the documentation.  Between wait statistics and WhoIsActive, I usually get an idea of where to look next.  And this is where the next script comes in handy…

SQL Server Diagnostic Queries

You can get the current version of Glenn’s Diagnostic Queries for each release on his Resources page.  Note that Azure SQL and Managed Instances have separate scripts.  With the Diagnostics Queries in hand, I can dig deeper into what I think might be the issue, or I can just start rolling through the queries in order to check configuration and performance data.  I may end up on some tangents, depending on what I find, but these three scripts together provide a great starting point when troubleshooting performance in Azure, whether I have a complaint that the system is slow overall, or I’m told that a particular set of queries is slow. Happy troubleshooting!

Using relog: Creating a smaller file

 

Today I had to remember the steps for using relog. If you’ve never heard of relog, stay with me. It’s a tool for merging PerfMon files, and also for extracting smaller data sets from Perfmon. I still love PerfMon friends, I admit it. It’s been a staple for understanding performance on a Windows Server since I’ve started in IT. But I haven’t used relog in forever so I completely forgot the commands I needed.

The problem

I provided a PerfMon template to a customer a few weeks ago and asked them to start capturing data.  I requested that they roll over the file daily, but that request got lost in the shuffle.  They manually stopped and restarted the data collector intermittently, so when they sent me the file to review data from an issue they had over the weekend, it was almost 5GB in size.  Oof.  I loathe opening large PerfMon files in the UI because it takes a long to load.  This is where using relog came in because I only needed to look at about an hour’s worth of data.

Finding the command I needed

I did a quick Google search looking for the syntax, and landed on a post from my friend Andy Galbraith, Handling Perfmon Logs with Relog.  It’s a useful post if you want to merge multiple files, and I recommend it if you need that.  I needed to separate out my data.

My very first SQL Server presentation was on the topic of baselines, and I used to demo relog.  I had to have the code somewhere right?  Nope.  I did find my old presentation and wow…that needs updating.  Then I searched my old blog and found a post where I had the syntax.  Yay past me!

However, no one else would ever find that post so I thought I’d share the syntax here.  Also, if you go to the official relog documentation the syntax is there, but it’s not quite correct.

Using relog to extract data based on a date range

The parameters are -b for begin time and -e for end time.  And for both parameters the documentation states:

Date syntax for -b and -e

Date syntax for -b and -e

 

Do you see a space between the year and the hour?  Me neither.  This syntax fails:

relog SQLskills_DataCollector.blg -b 3/21/202019:00:00 -e 3/21/202020:00:00 -o SQLskills_March21.blg

relog error with no space between date and time

relog error with no space between date and time

 

But if I add a space between the date and the time:

relog SQLskills_DataCollector.blg -b 3/21/2020 19:00:00 -e 3/21/2020 20:00:00 -o SQLskills_March21.blg

It works:

correct relog command

correct relog command

 

And with that, I’m off to analyze some data!

 

What columns are in that index?

I’ve been helping several customers with tuning lately, which typically means I’m also looking at indexes and asking questions. A pattern I’ve noticed is columns from the clustering key being added to the key definition for the nonclustered index.  This brings up some interesting considerations when it comes to creating your nonclustered indexes.

Review

We’re going to start with a copy of WideWorldImporters, and create a table and indexes from it.
Notice that I’m adding a column to this table named OrderID_and_OrderLineID, which is the OrderID and the OrderLineID values concatenated. In Sales.OrderLines, OrderLine is unique, but OrderID is not. The two of them concatenated are unique, and I made this my primary key, as well as my clustering key (remember, you can have a primary key that’s different than your clustering key, which is a separate topic for another day).

USE [WideWorldImporters]
GO

DROP TABLE IF EXISTS [Sales].[SQLskills_OrderLines];
GO

CREATE TABLE [Sales].[SQLskills_OrderLines](
     [OrderLineID] [int] NOT NULL,
     [OrderID] [int] NOT NULL,
     [OrderID_and_OrderLineID] [bigint] NOT NULL,
     [StockItemID] [int] NOT NULL,
     [Description] [nvarchar](100) NOT NULL,
     [PackageTypeID] [int] NOT NULL,
     [Quantity] [int] NOT NULL,
     [UnitPrice] [decimal](18, 2) NULL,
     [TaxRate] [decimal](18, 3) NOT NULL,
     [PickedQuantity] [int] NOT NULL,
     [PickingCompletedWhen] [datetime2](7) NULL,
     [LastEditedBy] [int] NOT NULL,
     [LastEditedWhen] [datetime2](7) NOT NULL,
     CONSTRAINT [PK_Sales_SQLskills_OrderLines] PRIMARY KEY CLUSTERED
     (
          [OrderID_and_OrderLineID] ASC
     )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
     ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [USERDATA]
     ) ON [USERDATA];
GO

INSERT INTO [Sales].[SQLskills_OrderLines](
     [OrderLineID]
     ,[OrderID]
     ,[OrderID_and_OrderLineID]
     ,[StockItemID]
     ,[Description]
     ,[PackageTypeID]
     ,[Quantity]
     ,[UnitPrice]
     ,[TaxRate]
     ,[PickedQuantity]
     ,[PickingCompletedWhen]
     ,[LastEditedBy]
     ,[LastEditedWhen]
     )
SELECT
     [OrderLineID]
     ,[OrderID]
     ,CAST(CAST([OrderID] AS VARCHAR(20)) + CAST([OrderLineID] AS VARCHAR(20)) AS BIGINT)
     ,[StockItemID]
     ,[Description]
     ,[PackageTypeID]
     ,[Quantity]
     ,[UnitPrice]
     ,[TaxRate]
     ,[PickedQuantity]
     ,[PickingCompletedWhen]
     ,[LastEditedBy]
     ,[LastEditedWhen]
FROM [Sales].[OrderLines];
GO

We’ll use Kimberly’s version of sp_helpindex, sp_SQLskills_helpindex, to see what’s truly in the index, and where. I much prefer this version to the built-in sp_helpindex, as it includes information about included columns, filters, and tells you what columns are in the tree and leaf levels.

EXEC sp_SQLskills_helpindex 'Sales.SQLskills_OrderLines';
GO
Clustered index for SQLskills_OrderLines

Clustered index for SQLskills_OrderLines

This output is wide, which doesn’t lend itself well to a post, so you may need to zoom in on the image. I also trimmed out the included and filtered columns as neither are relevant in this discussion.

Notice that the only column in columns_in_tree is the concatenated OrderID and OrderLineID, and then all columns are in columns_in_leaf (because the leaf level of the clustered index is all the data).  Now we’ll add a nonclustered index on OrderID:

CREATE NONCLUSTERED INDEX [NCI_OrderID]
     ON [Sales].[SQLskills_OrderLines] ([OrderID])
     ON [USERDATA];
GO

EXEC sp_SQLskills_helpindex 'Sales.SQLskills_OrderLines';
GO
Nonclustered index for SQLskills_OrderLines

Nonclustered index for SQLskills_OrderLines

In columns_in_tree we have our index key, OrderID, but we also have our clustering key, OrderID_and_OrderLineID, and both are in columns_in_leaf. What happens if we create a nonclustered index on OrderLineID, and make it unique?

CREATE UNIQUE NONCLUSTERED INDEX [NCI_OrderLineID]
     ON [Sales].[SQLskills_OrderLines] ([OrderLineID])
     ON [USERDATA];
GO

EXEC sp_SQLskills_helpindex 'Sales.SQLskills_OrderLines';
GO
Unique nonclustered index for SQLskills_OrderLines

Unique nonclustered index for SQLskills_OrderLines

In the sp_SQLskills_helpindex output you’ll notice that OrderLineID is the only column in columns_in_tree, and both OrderLineID and OrderID_and_OrderLineID are in columns_in_leaf.

Remember that when you identify a nonclustered index as unique, it will not add the clustering key to the tree level, it just adds it to the leaf level. But if the nonclustered index is not declared as unique, then SQL Server adds the clustering key to the tree level to uniquely identify the row.

If we had created a clustered index that we didn’t also identify as unique, then SQL Server adds a UNIQUIFIER to the clustered index key, which then also gets added to the nonclustered index.

Defining the clustering key as part of the non-clustered key

What I’ve seen lately when looking at indexes is this:

CREATE NONCLUSTERED INDEX [NCI_OrderID_2]     
     ON [Sales].[SQLskills_OrderLines] ([OrderID], [OrderID_and_OrderLineID])
     ON [USERDATA];
GO

This is the same as the NCI_OrderID index I created, but someone has specifically placed the clustering key in the nonclustered index key. I always ask why they added the clustering key. Typically the answer is: I know it’s going to be needed in the query. That’s fine, but understand that it doesn’t matter whether you add the clustering key to the key for a non-unique nonclustered index or not; if you don’t, SQL Server will add it.

We’ve kept it simple so far with a single column for our clustering key. What if we have multiple columns? We’ll change up our table a bit:

USE [WideWorldImporters]
GO

DROP TABLE IF EXISTS [Sales].[SQLskills_OrderLines];
GO

CREATE TABLE [Sales].[SQLskills_OrderLines](
     [OrderLineID] [int] NOT NULL,
     [OrderID] [int] NOT NULL,
     [OrderID_and_OrderLineID] [bigint] NOT NULL,
     [StockItemID] [int] NOT NULL,
     [Description] [nvarchar](100) NOT NULL,
     [PackageTypeID] [int] NOT NULL,
     [Quantity] [int] NOT NULL,
     [UnitPrice] [decimal](18, 2) NULL,
     [TaxRate] [decimal](18, 3) NOT NULL,
     [PickedQuantity] [int] NOT NULL,
     [PickingCompletedWhen] [datetime2](7) NULL,
     [LastEditedBy] [int] NOT NULL,
     [LastEditedWhen] [datetime2](7) NOT NULL,
     CONSTRAINT [PK_Sales_SQLskills_OrderLines] PRIMARY KEY CLUSTERED
     (
     [OrderID] ASC, [OrderLineID] ASC, [StockItemID] ASC, [PackageTypeID] ASC
     )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
     ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [USERDATA]
     ) ON [USERDATA];
GO

INSERT INTO [Sales].[SQLskills_OrderLines](
     [OrderLineID]
     ,[OrderID]
     ,[OrderID_and_OrderLineID]
     ,[StockItemID]
     ,[Description]
     ,[PackageTypeID]
     ,[Quantity]
     ,[UnitPrice]
     ,[TaxRate]
     ,[PickedQuantity]
     ,[PickingCompletedWhen]
     ,[LastEditedBy]
     ,[LastEditedWhen])
SELECT
     [OrderLineID]
     ,[OrderID]
     ,CAST(CAST([OrderID] AS VARCHAR(20)) + CAST([OrderLineID] AS VARCHAR(20)) AS BIGINT)     
     ,[StockItemID]
     ,[Description]
     ,[PackageTypeID]
     ,[Quantity]
     ,[UnitPrice]
     ,[TaxRate]
     ,[PickedQuantity]
     ,[PickingCompletedWhen]
     ,[LastEditedBy]
     ,[LastEditedWhen]
FROM [Sales].[OrderLines];
GO

EXEC sp_SQLskills_helpindex 'Sales.SQLskills_OrderLines';
GO
Four column clustered index for SQLskills_OrderLines

Four column clustered index for SQLskills_OrderLines

I don’t generally advocate a wide clustered index, but there are certain scenarios where it’s needed/appropriate. If we take this table and add a nonclustered index, not unique, on one column, then you’ll see all four columns from the clustering key in the tree level:

CREATE NONCLUSTERED INDEX [NCI_Quantity]
     ON [Sales].[SQLskills_OrderLines] ([Quantity])
     ON [USERDATA];
GO

EXEC sp_SQLskills_helpindex 'Sales.SQLskills_OrderLines';
GO
Nonclustered index (with wide CI) for SQLskills_OrderLines

Nonclustered index (with wide CI) for SQLskills_OrderLines

For a large table, the size of this nonclustered index has the potential to be much, much larger than a table where the clustering key is only a single column.

If we add a unique nonclustered index, then that clustering key is only in the leaf level:

CREATE UNIQUE NONCLUSTERED INDEX [NCI_OrderID_and_OrderLineID]
     ON [Sales].[SQLskills_OrderLines] ([OrderID_and_OrderLineID])
     ON [USERDATA];
GO

EXEC sp_SQLskills_helpindex 'Sales.SQLskills_OrderLines';
GO
Unique nonclustered index (with wide CI) for SQLskills_OrderLines

Unique nonclustered index (with wide CI) for SQLskills_OrderLines

But then…what happens if we add one or two columns from the clustering key into the nonclustered index key?

CREATE NONCLUSTERED INDEX [NCI_PickingCompletedWhen]
     ON [Sales].[SQLskills_OrderLines] ([PickingCompletedWhen], [OrderID], [OrderLineID])
     ON [USERDATA];
GO

EXEC sp_SQLskills_helpindex 'Sales.SQLskills_OrderLines';
GO
Nonclustered index with column from clustering key for SQLskills_OrderLines

Nonclustered index with column from clustering key for SQLskills_OrderLines

It doesn’t matter! All the columns from the clustering key are added to the tree level because I did not identify it as unique! If I had defined the nonclustered index as unique, then only those two columns that I specified in the key would be in the tree, and the other two columns from the clustering key would be part of the leaf level.

Conclusion

There are a couple points to remember when designing your nonclustered indexes. First, whenever you know that the column, or combination or columns, makes a unique key for a nonclustered index, I always recommend that you specify UNIQUE when you create the index. That information is great for the optimizer to have when generating query plans. Second, think carefully about the impact of your clustering key on your nonclustered indexes. Any non-unique nonclustered index will be larger (more pages, eventually more levels) because the entire clustering key will be in the tree level. Further, if you happen to have a clustering key with a column (or multiple columns) that are frequently modified, then those modifications have to be propagated to the tree levels of the nonclustered indexes as well.

Happy indexing!