Migrating Legacy LOB Data Types to Current Ones – A Big Gotcha!

Recently, Erin (Blog|Twitter) and I encountered a bug in SQL Server that can affect some scenarios in a very negative manner.  When a database that was designed in SQL Server 2000 used the legacy LOB data types, text, ntext, and image, with the ‘text in row’ table option configured, and the columns were later migrated to the newer varchar(max), nvarchar(max), or varbinary(max) data types, it is possible to encounter a problem.  If you have a table that fits this specific scenario and you try to analyze the table for data compression using the sp_estimate_data_compression_savings stored procedure, the end result is a hung session holding a Schema Modification lock in tempdb that requires a restart of the instance to clear.  NOT GOOD!!!!!

We were able to reproduce this problem in SQL Server and we’ve subsequently filed a bug internally with Microsoft as well.  The good news is that there is a workaround, which I’ll cover later in this post, if you happen to run into this specific scenario and bug.

To start off with, if you create a table using varchar(max) and try to use sp_tableoption to set ‘text in row’ for the table it will error out:

DROP TABLE dbo.Table_1

CREATE TABLE dbo.Table_1
(
RowID int NOT NULL,
TestColumn varchar(max) NULL,
OtherColumn varchar(50) NULL,
ChangeDate datetime NOT NULL
)

exec sp_tableoption 'Table_1', 'text in row', '7000';

 

Msg 2599, Level 16, State 1, Procedure sp_tableoption, Line 102
Cannot switch to in row text in table "Table_1".

This is expected behavior by the Engine since the ‘text in row’ option only applies to the legacy data types and cannot be used with the newer LOB data types in SQL Server 2005+. For these, you instead must use the ‘large value types out of row’ option.
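If you want to control off-row storage for the newer LOB types, the option the Engine expects is ‘large value types out of row’. Here is a minimal sketch against the same demo table; the option value of 1 turns it on, and the setting is visible in sys.tables:

EXECUTE sp_tableoption 'Table_1', 'large value types out of row', 1;

SELECT name, large_value_types_out_of_row
FROM sys.tables
WHERE name = 'Table_1';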

However, if you have a table that was created in SQL Server 2000 or before that used text, image, or ntext columns, and the ‘text in row’ option was set:

DROP TABLE dbo.Table_1

CREATE TABLE dbo.Table_1
(
RowID int NOT NULL,
TestColumn text NULL,
OtherColumn varchar(50) NULL,
ChangeDate datetime NOT NULL
)

EXECUTE sp_tableoption 'Table_1', 'text in row', '7000';

and then later you upgraded to 2005+ and changed the column data type to varchar(max) or one of the other LOB data types:

ALTER TABLE Table_1
ALTER COLUMN TestColumn varchar(max);

The ‘text in row’ value remains configured for the table, which can be confirmed by querying sys.tables:

SELECT text_in_row_limit
FROM sys.tables
WHERE name = 'Table_1';

If you attempt to estimate the compression for this table it will generate an error because the compression code creates a copy of the table in tempdb and tries to call sp_tableoption to set the option on the temp table used for sampling.

WARNING: Running the below code will result in a locked session that requires restarting the SQL instance to clear! DO NOT DO THIS ON A PRODUCTION SQL SERVER!

EXECUTE sp_estimate_data_compression_savings 'dbo','Table_1', 0, 1, 'row'

This command will result in error 2599 occurring; however, the stored procedure doesn’t handle the error correctly for some reason, and the session ends up waiting on a SCH-M lock in tempdb that is blocked by itself and never goes away.  The procedure will actually complete and return error 2599 on SQL Server 2008 R2 SP1, but it results in a stuck session on SQL Server 2008 R2 RTM, SQL Server 2008 R2 SP2, and all versions of SQL Server 2012 that we’ve tested so far (we are waiting for further information from Microsoft at this point about why this might occur, so this may be updated in the near future, but I wouldn’t rely on this behavior currently).
In addition to this, you can’t clear the ‘text in row’ value from the table because executing sp_tableoption will return the above error.  The only workaround that I’ve found so far for this issue is to add a dummy column to the table with one of the legacy types, turn off the option, and then drop the column.

ALTER TABLE Table_1 ADD [dummy] text;
EXECUTE sp_tableoption 'Table_1', 'text in row', 'OFF';
ALTER TABLE Table_1 DROP COLUMN [dummy];

Once you do this, sp_estimate_data_compression_savings can estimate the compression savings. If you have migrated data types from legacy LOB types to the current LOB types, you should check for this scenario before attempting to run sp_estimate_data_compression_savings against the tables that changed.
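As a starting point, here is a minimal sketch of the kind of check I would run: it lists tables that still have a ‘text in row’ limit set even though they no longer contain any text, ntext, or image columns, which is exactly the state that triggers the bug.

SELECT t.name AS table_name, t.text_in_row_limit
FROM sys.tables AS t
WHERE t.text_in_row_limit > 0
AND NOT EXISTS
(
    SELECT 1
    FROM sys.columns AS c
    WHERE c.object_id = t.object_id
    AND TYPE_NAME(c.system_type_id) IN ('text', 'ntext', 'image')
);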

I’ve done a bit of testing and I don’t think that this problem is prevalent in SQL Server instances in the field, for a couple of reasons.  First, it requires that you attempt to estimate the compression savings for a table affected by the problem, and in my experience data compression is not widely used.  Second, it requires that you migrated from one of the legacy LOB data types to a current one explicitly through ALTER TABLE DDL (the SSMS UI instead generates a script that creates a new table, copies all the data into it, drops the existing table’s indexes and constraints, drops the old table, renames the new table to the old table name, and then rebuilds all the necessary constraints and indexes, so the table option won’t be in effect after a designer-based change).

You really need the perfect storm of conditions to hit this specific problem inside of SQL Server, but it is still something to be wary of.

Hopefully you won’t hit it, but I’m sure someone else will at some point.

CPU Ready Time in VMware and How to Interpret its Real Meaning

(If you’re having persistent problems with your virtual machine configurations and SQL Server performance, Jonathan can help you fix it – fast! Click here for details.)

In the last month I have had to explain how to interpret CPU Ready Time information for SQL Server VMs running on VMware to a number of people. The first time was on Twitter, and the topic is really too big for a 140-character discussion, but I went ahead and gave it a shot. In my experience it is rare for SQL professionals to know much about virtualization under VMware or Hyper-V, which is why we spend half a day or more on virtualization during IE3: High Availability & Disaster Recovery. Based on that experience, I was surprised to see the question being asked on #sqlhelp at all. It turns out that Idera Diagnostics Manager has been updated to include information about VM performance that is collected from Virtual Center if you have it installed in your data center.

Overall this is a good thing, because DBAs now have additional information that they previously didn’t have access to unless they had negotiated access to Virtual Center with their VM administrator. My experience consulting is that few DBAs have access to, or even know about, Virtual Center, even though it is one of the critical tools for troubleshooting performance problems. Another thing most DBAs don’t know is that their VM administrator can configure roles and read-only security access to allow anyone to see performance and configuration information inside of Virtual Center without being able to make changes, so there really is no good reason for a DBA to not have access to the performance data available in Virtual Center.

Now that DBAs have access to the data, it is important to know what is being displayed and what it really means. Unfortunately, the way the information is being presented makes it easy to misinterpret what is being shown, so the end result is confusion for DBAs that don’t know a lot about VMware, or what the information they are looking at actually means. An example of the VM CPU Ready information from Diagnostics Manager is shown in the chart in Figure 1.

Figure 1 – VM CPU Ready graph

Looking at this chart on the report, what would your first interpretation of it be? So far, 3 out of 3 people, including the one on Twitter, have thought that there was a problem with this VM. The reality of the situation is that there is absolutely nothing wrong with this VM; it is doing just fine. It just so happens that the scale of this graph makes it easy to jump to the interpretation that it is showing a percentage value, and we all know that anything performance related that is over 80% can’t be good. It doesn’t say that, but it doesn’t provide any context for the information’s meaning either. To make matters even more confusing, CPU Ready in VMware is available as a summation value, which happens to be what is shown here in milliseconds, and as a percentage (RDY% in esxtop) of time spent waiting, with work to do, to be scheduled by the hypervisor. We can confirm that the information presented is the summation value and not the percentage by looking at the real-time information available in Virtual Center for the same server, as shown in the chart in Figure 2.

Figure 2 – CPU Ready real-time summation from Virtual Center

So now that we know it’s a summation and represented in milliseconds, what exactly does that tell us? Unfortunately, it actually doesn’t tell us anything on its own.

What is CPU Ready Time and why do we even care?

CPU Ready Time is the time that the VM waits in a ready-to-run state (meaning it has work to do) to be scheduled on one or more of the physical CPUs by the hypervisor. It is normal for VMs to accumulate small amounts of CPU Ready Time even if the hypervisor is not oversubscribed or under heavy activity; it’s just the nature of shared scheduling in virtualization. For SMP VMs with multiple vCPUs, the amount of ready time will generally be higher than for VMs with fewer vCPUs, since it requires more resources to schedule/co-schedule the VM when necessary and each of the vCPUs accumulates the time separately.

At what point does CPU Ready Time start to affect performance?

To be honest, it always has some minimal effect, but how much really depends on a lot of different factors, for example which CPU Ready value you are looking at and where you are getting the information. If you are looking at raw RDY% values from esxtop, the value has a completely different meaning than the summation values that are available from Virtual Center. Inside of Virtual Center, the level of summation you are looking at also affects the meaning of the value, and you have to perform calculations to convert the summation into a percentage to know the effect, as documented in the VMware Knowledge Base. In this case, for a real-time summation, the data point is actually a 20 second summation of ready time accumulation.
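For reference, the conversion that the KB article documents boils down to this general form, which is what the worked example further down uses:

CPU Ready % = (CPU Ready summation value in ms / (chart update interval in seconds * 1000)) * 100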

At one point VMware had a recommendation that anything over 5% ready time per vCPU was something to monitor. In my experience, for an SMP SQL Server VM, anything over 5% per vCPU is typically a warning level and anything over 10% per vCPU is critical. The reason this specifically says per vCPU is that each vCPU contributes 100% to the VM’s scheduling total, so a 4 vCPU VM would have a scheduling total of 400%. A 10% CPU Ready value on a 4 vCPU VM therefore only equates to 2.5% per vCPU. If this isn’t already complex enough, it gets worse.

This makes providing a general recommendation impossible for this counter, because it depends on a lot of different factors. For example, if the VM has had a CPU Limit placed on it, whenever the VM exceeds its allocated limit it will accumulate CPU Ready time while it waits to be allowed to execute again. If the CPU Limit is being enforced under business SLAs or a chargeback system, the VM could easily have high CPU Ready values that fit what is required for the configuration. Using the formula from the KB article to convert a summation value to percent, if we round the average of 81.767 down to 80 for simple math, this results in:

(80 / (20s * 1000)) * 100 = 0.4% CPU ready

That is four tenths of a percent of CPU Ready time, which is not going to negatively impact the performance of the VM. The example VM shown above also has 8 vCPUs allocated to it, so after taking this into account it really only has 0.05% per vCPU, well below the older recommended value.

What scenarios cause high CPU Ready times?

While there are a number of scenarios where high CPU Ready times can occur, there are two common scenarios that I generally see when I am consulting. The most common reason tends to be host oversubscription, where too many vCPUs have been allocated relative to the number of physical CPUs. While ESX 5 supports a maximum of 25 vCPUs per physical CPU, this is definitely a case where just because you can doesn’t mean you should. As always, your mileage may vary based on your specific VM workloads, but typically I start to see problems when a host is in the range of 2-2.5X oversubscribed for server workloads.

The second common scenario where I see high CPU Ready times is when a larger SMP VM for SQL Server, for example one with 4-8 vCPUs, is running on a host that has a lot of smaller VMs with 1-2 vCPUs for application servers. Depending on the number of physical processors and the total number of vCPUs allocated on the host, the larger resource allocation for the SQL Server VM results in it having to wait longer for the hypervisor to preempt the necessary physical CPUs to schedule/co-schedule the workload. Often in cases where this occurs, after asking some questions I find that the number of vCPUs for the SQL Server was increased from four to eight due to performance problems for the VM. Unfortunately, if CPU Ready time was the original problem, increasing the vCPUs doesn’t improve performance; it generally makes things worse.

What do I do if this is actually a problem?

If you have gone through the information and you can see that CPU Ready is really a problem for your VMs, there are a couple of different things that can be done, and the correct one depends on your virtual infrastructure. If the problem is purely host oversubscription of vCPUs to pCPUs, start off by evaluating whether the VMs actually need the number of vCPUs they have configured, to determine if any of them can be reduced to lower the ratio. If this can’t be done, the only real answer is to add additional hosts to allow the load to be balanced better and reduce the oversubscription rates. If the problem is specific to the larger SMP VMs for applications like SQL Server, evaluate whether you can consolidate the larger VMs onto one or more hosts and move the smaller VMs to the other hosts, to separate the VMs based on their sizes. This has worked well for a number of clients that I have worked with where they truly needed eight or more vCPUs for their workload.

Summary

Understanding the data that you are looking at and what it actually means is critical to making the right decisions about what is happening in a virtualized environment. CPU Ready time specifically requires a good understanding of what the value is actually showing and how it relates to the configuration of the VM, the other VMs on the host, and the physical host resources. If you are looking at summation data for CPU Ready time, converting it to a CPU Ready percent value is what gives the data the proper meaning for understanding whether or not it is actually a problem. However, keep in mind that other configuration options like CPU Limits can affect the accumulated CPU Ready time and must be checked as well.

Whenever I am performing a health check of a SQL Server VM on VMware, I make sure that I get screenshots of the CPU Ready information from Virtual Center for each of the summation levels available so that I can determine whether or not it is affecting the performance of the VM, but I am always careful to use the correct formula to calculate what the percentage value actually works out to, and to review the rest of the VM configuration, before drawing any conclusions. In the worst case I’ve seen, for one client the CPU Ready time was roughly 63% per vCPU, and you could visibly see the VM freeze while moving the mouse in an RDP session. Reviewing the configuration showed that the VM had 8 vCPUs on a host with 8 physical CPUs that was also running 10 other VMs with a total of 14 additional vCPUs. Moving that VM back down to 2 vCPUs was instant relief for their biggest bottleneck, and then we started talking about hardware changes to fit their increased virtualization usage.

If you’d like expert assistance with implementing, configuring, troubleshooting, or understanding SQL Server on VMware, we have a number of different services to fit your needs.

Clustering SQL Server on Virtual Machines (Round 2)

Recently there was a lengthy discussion on the #sqlhelp hash tag on Twitter about clustering SQL Server on VMs and whether or not that is a good idea. Two years ago I first blogged about this same topic in my post, Some Thoughts on Clustering SQL Server Virtual Machines. If you haven’t read that post, I recommend reading it before continuing with this one, because it gives a lot of background that I won’t be rehashing as a part of this post. However, a lot has changed in VMware and Hyper-V since I wrote that original post, and those changes really affect the recommendations that I would make today.

As I stated in the Twitter discussion, we have a number of clients at SQLskills that run WSFC clusters across VMs for SQL Server HA, and few of them have problems with the clusters that are related to them being VMs. There are some additional considerations that have to be made when working with VMs for the cluster nodes; a big one is that you should plan for those nodes to be on different hosts, so that a hardware failure of one host doesn’t take out both of your cluster nodes, which would defeat the purpose of having a SQL Server cluster entirely. There is also the additional layer of the hypervisor and resource management that plays into having a cluster on VMs, but with proper planning and management of the VM infrastructure this won’t be a problem; it’s just another layer you have to consider if you do happen to have a problem.

In response to the discussion, Chuck Boyce Jr (Blog|Twitter) wrote a blog post giving his opinion, which was not to do it, and that started a separate discussion later on Twitter. The biggest problem Chuck points out is rooted in problems with inter-team communications within an IT shop. To be honest, Chuck’s point is not an incorrect one; I see this issue all the time, but it’s not specific to VMs. If you work in a shop that has problems with communication between DBAs, Windows administrators, VM administrators, the networking team, and any other IT resource in the business, the simple fact is those problems can be just as bad for a physical implementation of a SQL Server cluster as they might be for a VM implementation. The solution is to fix the communication problem and find ways to make the “team” cooperate better when problems arise, not to avoid merging technologies in the hope of preventing problems that will still occur in a physical implementation as well.

Am I saying that clustering VMs for SQL Server is for every shop? No, certainly not. There are plenty of places where clustering isn’t the best solution overall. However, with virtualization, depending on the infrastructure, the other SQL Server HA options might not provide the advantages that they do in a physical world either. One of the biggest things to think about is where the VMs are ultimately going to be stored. If the answer is a shared SAN, then options like Database Mirroring and Log Shipping don’t really provide you with the same advantages that they do in a physical implementation, the big one being that you generally have a second copy of the database on completely different storage. Yes, I know that you could have two SQL Servers connected to the same SAN physically that use Database Mirroring, and my response to that would be that a cluster probably makes more sense, because the SAN is your single point of failure in either configuration.

If you are new to clustering SQL Server, I wouldn’t recommend that you start out with VMs for your failover cluster. The odds are that you also don’t have a lot of VM experience, and if there is a problem you aren’t going to be able to troubleshoot it as effectively, because you have two new technologies to dig through. If you are comfortable with clustering SQL Server, adding virtualization as a new technology to the mix is really not that big of a deal. You just need to read the configuration guides and whitepapers for how to set up the VMs for a supported WSFC implementation; usually your VM administrator is going to have to do this, so it’s a good area to break the ice with them, work together, and open the lines of communication. Then you can install SQL Server and manage it like you would any other SQL Server failover cluster.

Where else would I recommend not implementing a cluster on VMs? iSCSI SANs that only offer 1Gb/s connectivity over Ethernet, simply because you are likely to run into I/O limitations quickly, and to build the cluster you have to use the software initiator for iSCSI, so there is a CPU overhead associated with the configuration. Generally the host has limited ports, so you end up sharing the networking between normal traffic and the iSCSI traffic, which can be problematic as well. Does that mean it’s not possible? No – I have a number of clients that have these setups and they work fine for their workloads, but it’s not a configuration I would recommend if we were planning a new setup from the ground up.

The big thing that I work through with clients when they are considering whether to cluster VMs for SQL Server is the business requirements for availability, and whether or not those can be met without having to leverage one of the SQL Server HA options. With the changes in VMware ESX 5 and Hyper-V 2012, you can scale VMs considerably, and both platforms allow for virtualized NUMA configurations inside of the guest VM for scalability, so the performance and sizing considerations I had two years ago are no longer primary concerns for the implementation to me. If we need 16 vCPUs and 64GB RAM for the nodes, with the correct host configuration we can easily do that, and we can do it without performance problems while using Standard Edition licensing if we plan the infrastructure correctly.

In my previous post on this topic I linked to a number of VMware papers, and in the post prior to that one I linked to even more papers that include best practice considerations for configuration and sizing of the VMs, how to configure the VMs for clustering, and many other topics. Newer versions of these documents exist for ESX 5 and a number also exist for Hyper-V as well. I recommend that anyone looking at running SQL in a VM, whether as a clustered instance or not, spend some time reading through the papers about the hypervisor you want to run the VM on so you understand how it works, the best practices for running SQL Server on that hypervisor, and what to look for while troubleshooting problems should they occur.

In the end, Microsoft supports SQL Server failover clustering on SVVP-certified hypervisors, so there isn’t a hard reason not to consider VMs objectively and evaluate whether they might be an appropriate fit for your business requirements.  When I teach about virtualization in our IE3: High Availability & Disaster Recovery class, most of the perceptions at the start of the virtualization module are negative towards SQL Server on VMs, often from past experiences of failed implementations.  By the end of the demos for the module, most of the opinions have changed, and in a lot of cases attendees have been able to communicate correctly with their VM administrator and get a problem fixed while I am performing demos of specific problems and their causes.  In the last year I’ve set up a number of SQL Server clusters on VMs for clients where it was the best fit for their needs.  If you would like assistance with reviewing the infrastructure and business requirements, and determining the best configuration for your needs, I’d be happy to work with you as well.

New Article: Performance Problems with SQL Server 2012 Enterprise Edition Under CAL Licensing

My latest article on SQLPerformance.com was published yesterday and highlights the potential for performance problems with the default configuration of SQL Server 2012 when using the Enterprise Server+CAL upgrade licensing under Software Assurance.

http://www.sqlperformance.com/2012/11/system-configuration/2012-cal-problems

Faking Hardware NUMA on a SMP Machine for Learning and Demos

I’ve blogged about NUMA a couple of times in the past and we cover it extensively in our Immersion Events, specifically when discussing SQLOS memory and scheduling management.  One of the challenges with this is that laptops are SMP machines, so unless you have a server with hardware NUMA it is impossible to show the differences in how SQLOS boots up with or without NUMA, at least that’s what I used to think.  It turns out that in Windows 7 and Windows Server 2008 R2, the support for more than 64 processors and processor groups allows you to fake a hardware NUMA configuration at the OS level for testing purposes.

To show how this works I am going to use one of my test VMs running in VMware Workstation 8 on my Dell M6700 laptop that I use regularly for teaching and presentation demos.  For the purposes of this post I have reconfigured the VM to have four CPUs with two cores each and 16GB RAM.

SMP Configuration (Default)

Under the default configuration, the VM has a single memory node presented by Windows to SQLOS, so it starts up with a single NUMA node and all of the logical processors assigned to that node, as shown in the Server Properties dialog below.

image

We can also see this information from the DMVs with the following query:

SELECT
    mn.processor_group
    , mn.memory_node_id
    , n.node_id AS cpu_node_id
    , n.online_scheduler_count
    , n.cpu_affinity_mask
    , n.online_scheduler_mask
FROM sys.dm_os_memory_nodes AS mn
INNER JOIN sys.dm_os_nodes AS n
    ON mn.memory_node_id = n.memory_node_id;

image

NUMA Configuration (bcdedit)

A fake NUMA configuration can be created by using BCDEdit to set the groupsize value in the Boot Configuration Data (BCD) to the number of logical processors that should be grouped into each processor group. The valid values are 1 to 64, in powers of 2. This isn’t covered in the standard BCDEdit documentation, but it is documented in the Windows Drivers development documentation.

BCDEdit /set (Windows Drivers)

To use BCDEdit you have to be running from an elevated command prompt as a local administrator.  For the purposes of this post I am going to use a groupsize of 2.
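The command itself is a single BCD change (shown here as a sketch of what I ran); deleting the value later returns the machine to its default behavior:

bcdedit /set groupsize 2

and to undo the change later:

bcdedit /deletevalue groupsize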

image

After changing the BCD, it is necessary to restart the machine for the configuration change to take effect.

Disclaimer:  This is not something you should do on a production SQL Server; it should only be used for testing and/or learning about SQLOS on a machine without hardware NUMA.

After rebooting the server, if we look at the Server Properties dialog again, we will see four NUMA nodes, each with two logical processors assigned to it.

image

Looking at the DMVs in SQL Server will also show the changes to how SQLOS started up and configured the instance.

image

Additionally, using the Buffer Node performance counters in PerfMon, we can see memory usage information per node:

image
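If you would rather pull the same per-node information from T-SQL instead of PerfMon, a quick sketch against sys.dm_os_performance_counters, filtered on the Buffer Node counter object, returns the same counters:

SELECT [object_name], counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Node%'
ORDER BY instance_name, counter_name;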

Keep in mind that this is all faked information, but you can really learn a lot about SQL Server and test its behaviors under NUMA using this method.

New Article: Tracking Database File AutoGrowth with Event Notifications

My latest article on SQL Server Central was published today.  This article shows how to use Event Notifications to monitor and alert on database file auto growth events inside of SQL Server, and is another addition to the series I committed to writing on using Event Notifications in SQL Server to automate common DBA tasks.

http://www.sqlservercentral.com/articles/autogrowth/93229/

Look for another article in the series in the future.  Enjoy!

New Article on SQLPerformance.com comparing “Observer Overhead” of Trace vs Extended Events

I have been so busy this week that I didn’t get a chance to blog about this yesterday when it happened, but I had a new article published on SQLPerformance.com that compares the performance impact or "observer overhead" of using SQL Trace and Extended Events.  I had a lot of fun running different benchmarks for this article and the results are very revealing about the overhead associated with diagnostic data collection against SQL Server under load. 

http://www.sqlperformance.com/2012/10/sql-trace/observer-overhead-trace-extended-events

I’ll be writing additional performance related articles on SQLPerformance.com in the next few months along with other members of SQLskills so make sure you add the RSS Feed to your favorite feed reader.

How useful are query_hash and query_plan_hash for troubleshooting?

After my last post, I was contacted by email about the usefulness of query_hash in actual troubleshooting scenarios, since the query_hash value is based on the statement and not the database in a system.  It just happened that the example template from SQL Server 2012 that I showed had the query_hash included in its definition.  The premise for the question was that Software as a Service (SaaS) solutions have multiple databases, sometimes hundreds or even thousands on the same server, that have the exact same code base, so using query_hash to aggregate data is not as useful in these scenarios since the data is not guaranteed to be evenly distributed.  My first feedback on this was that my blog post didn’t actually relate any information to the query_hash action in Extended Events directly, but there happens to be more to this situation.

The query_hash and query_plan_hash values provide query fingerprints in SQL Server 2008+, and simplify the effort required for analyzing ad hoc/prepared workloads, as well as workloads that use stored procedures in multiple databases.  To look at how to leverage this information in SaaS scenarios to identify the database that caused the most executions of the query_hash being reviewed, we’ll make use of the following workload to show the accumulated effects of a specific query_hash per database.

CREATE DATABASE Test;
GO
USE Test;
GO
CREATE PROCEDURE dbo.TestProc
AS
BEGIN
    SELECT TOP 10 *
    FROM master..spt_values
    WHERE type = N'P';
END
GO
CREATE DATABASE Test1
GO
USE Test1
GO
CREATE PROCEDURE dbo.TestProc
AS
BEGIN
    SELECT TOP 10 *
    FROM master..spt_values
    WHERE type = N'P';
END
GO
CREATE DATABASE Test2
GO
USE Test2
GO
CREATE PROCEDURE dbo.TestProc
AS
BEGIN
    SELECT TOP 10 *
    FROM master..spt_values
    WHERE type = N'P';
END
GO
DBCC FREEPROCCACHE;
GO

After creating the databases and objects, we’ll set up an Extended Events session to capture the statement-level completed events for the session_id running the test queries, and capture the query_hash action to show how to use it to look at execution information per database from the query statistics in SQL Server. Since I am using SQL Server 2012, I am not using a target with the event session, but will instead capture the information using the Live Data Viewer to make it easier to show in this post.

CREATE EVENT SESSION [SQLskills_Query_Hash] ON SERVER
ADD EVENT sqlserver.sp_statement_completed(
    ACTION(sqlserver.query_hash)
    WHERE ([sqlserver].[session_id]=(66))),
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(sqlserver.query_hash)
    WHERE ([sqlserver].[session_id]=(66)));
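Starting the session is just an ALTER EVENT SESSION statement (shown here for completeness, since I’m watching it with the Live Data Viewer rather than a target):

ALTER EVENT SESSION [SQLskills_Query_Hash] ON SERVER
STATE = START;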

Once the session is created and started, we can run the following tests and review the output:

USE Test
GO

EXECUTE dbo.TestProc
GO 2

SELECT TOP 10 *
FROM master..spt_values
WHERE type = N'P';
GO 5

USE Test1
GO

EXECUTE dbo.TestProc
GO 4

SELECT TOP 10 *
FROM master..spt_values
WHERE type = N'P';
GO 7

USE Test2
GO

EXECUTE dbo.TestProc
GO 8

SELECT TOP 10 *
FROM master..spt_values
WHERE type = N'P';
GO 3

After collecting the data, I disconnected the UI from the live stream and then grouped the events by the query_hash and summed the values for logical_reads, duration, cpu_time, and writes.

image

Here we can see that all of the statements that executed had the same query_hash value, even though they were a mix of ad hoc executions and procedures from different databases.  If we want to break this down to determine each database’s executions, we can do that by taking the query_hash value, converting it into a binary(8) data type from the bigint value produced by Extended Events, and then querying sys.dm_exec_query_stats.

SELECT
    sql_handle,
    plan_handle,
    execution_count,
    total_logical_reads,
    total_elapsed_time,
    dbid,
    objectid
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE query_hash = CONVERT(BINARY(8), CONVERT(BIGINT, 1640387627010439277));

image

Using this method we can further break down the information to see where our biggest impacts are.  Using the query_hash at the server level in Extended Events makes it possible to identify statements in a workload that might not have long executions, but execute frequently, leading to a "death by a thousand cuts".  If you need to dig into the database-level information, you could add the sqlserver.database_id action to the statement-level events, as shown below, or just query the information out of the query statistics maintained inside of SQL Server.
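As a sketch of that first option, the session definition from earlier only needs the extra action added to each event; I’ve given it a different name here so it can exist alongside the original:

CREATE EVENT SESSION [SQLskills_Query_Hash_DB] ON SERVER
ADD EVENT sqlserver.sp_statement_completed(
    ACTION(sqlserver.query_hash, sqlserver.database_id)
    WHERE ([sqlserver].[session_id]=(66))),
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(sqlserver.query_hash, sqlserver.database_id)
    WHERE ([sqlserver].[session_id]=(66)));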

Customizing Extended Events Templates in SQL Server 2012

One of the features of the Extended Events UI in SQL Server 2012 is the ability to export an event session definition out to a template for future reuse.  This can be accomplished through the right-click menu for an event session and then by selecting the Export Session menu item as shown in Figure 1.

image
Figure 1: Exporting a session

Event sessions that have been exported are saved in the %USERPROFILE%\Documents\SQL Server Management Studio\Templates\XEventTemplates path on the machine that is running SQL Server Management Studio.  When you open the New Session Wizard or the New Session dialog to create a new event session, the template drop-down will include the user-defined templates, as shown in Figure 2.

image
Figure 2: Template selection

The basic event session templates created by Management Studio can be customized to display in the UI similar to how the built-in templates display.  The built-in templates are stored in the C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Templates\sql\xevent path on the client machine.  By editing one of these templates, you can grab the additional XML elements that provide the metadata about the event session template for display in the UI.  The XML from the Count Query Locks template is shown below to highlight the necessary elements and their location in the XML document.

<event_sessions xmlns="http://schemas.microsoft.com/sqlserver/2008/07/extendedeventsconfig">
  <event_session name="lock_counter_final" maxMemory="4" eventRetentionMode="allowSingleEventLoss" trackCausality="true" dispatchLatency="30" maxEventSize="0" memoryPartitionMode="none">
    <templateCategory>
      <!-- _locID_text="templateCategory" _locComment = "" -->Locks and Blocks</templateCategory>
    <templateName>
      <!-- _locID_text = "templateName" _locComment = "" -->Count Query Locks</templateName>
    <templateDescription>
      <!-- _locID_text = "templateDescription" _locComment = "" -->This template counts the number of locks acquired by each query based on the query_hash value. You can use this template to identify the most lock intensive queries for investigation and tuning.</templateDescription>
    <event package="sqlserver" name="lock_acquired">

The templateCategory provides the categorization in the template drop down, and the templateName provides the name that is displayed.  The templateDescription is what shows in the UI once the template is actually selected.  The comments within the XML are not required for the UI to display the information and the XML can be simplified down to:

    <templateCategory>Template Category</templateCategory>
    <templateName>Template Name</templateName>
    <templateDescription>Template Description</templateDescription>

By adding these elements manually into the session template XML, between the event_session element and the first event element, you allow the UI to interpret the information for display about the template.  However, if the template resides in the user templates path, only the templateDescription element is rendered in the UI.  The templates in the user templates path all display in the UI as shown in Figure 2 above, but once a template is selected, the template description will populate as shown in Figure 3.

image

If you move the template file to the default template path, the UI will appropriately categorize the template based on the templateCategory element and will display the name in the templateName element as shown in Figure 4.

image

This shows two customized templates from the SQLskills collection of event sessions that will be released as a set of templates – sessions that we’ve made use of in client work, and that I have written about in different blog posts and articles over the last three years while working with Extended Events.  Watch this blog for the final release of these templates in the near future.

Announcing the SQLskills Availability Group Demonstrator

This spring at SQL Connections, I presented a pre-conference seminar with Kimberly on the new features in SQL Server 2012, including Availability Groups.  While on my flight to Las Vegas, I was working on an application in C# that motivated me to write a new demonstrator application for use with Availability Groups.  I spent a little time putting together a small application that I have used for every Availability Group demonstration since, and I’ve been asked a few times if I would make the application publicly available.  Paul and I discussed this and agreed that we’d offer the application as a free download.

The application is really simple in its design and usage, so I am going to use this post to show how to configure the application as well as how to use it for demonstrations of Availability Groups, or just to play with all the features provided by Availability Groups during your own testing.

Application Configuration Options

The application uses a basic application configuration file that most .NET applications use.  The configuration file is really simple and provides the ability to customize the defaults that are available in the New Connection window to simplify demonstrations based on your own environment setup.

image

The settings in the configuration file provide the default values for the following options in the New Connection window.

image

Using the application

I usually start out with a simple Availability Group (hereafter referred to as an AG) configuration where a SQL Server login is used and the AG was created through the UI.  I use the application to connect directly to the primary server using a SQL Server login, without using the Availability Group Listener, to show how you can connect directly to the primary server by its name.  Then I perform a failover to the synchronous replica that is configured for high availability, and open a new connection to show that connectivity will fail since the SQL Server login wasn’t created on the secondary.  Then I use DDL to create the login and show that, even though I’ve created the login on the secondary, it still fails because the SID for the new login isn’t the same as the SID that existed on the original primary server.  At this point I explain that you could use sp_help_revlogin to transfer the login with the SID intact, but then show that Partially Contained Databases with database authentication can prevent having to manually migrate every login.  So I fail over to the original primary, set up partial containment on all the nodes, and migrate the server-level login to a contained database login.  Then I close all of the connection windows and create a new connection that uses the Availability Group Listener, so that I can fail over between the primary replica and the synchronous high availability replica successfully.  A single New Connection in the application can support all of this.
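For reference, the partial containment piece of that demo boils down to a few statements per replica; the database and login names below are just placeholders for whatever you use in your own environment, and the sp_configure option has to be enabled on every instance hosting a replica:

EXECUTE sp_configure 'contained database authentication', 1;
RECONFIGURE;
GO
ALTER DATABASE [YourAGDatabase] SET CONTAINMENT = PARTIAL;
GO
USE [YourAGDatabase];
GO
EXECUTE sp_migrate_user_to_contained
    @username = N'YourSQLLogin',
    @rename = N'keep_name',
    @disablelogin = N'do_not_disable_login';
GO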

Then I configure ReadOnly routing for the AG and show two connections, one configured for the Read/Write workload, and another configured as a ReadOnly connection to the listener using a ReadOnly workload.

image

After setting up ReadOnly routing, and then specifying in the application that the connection is for ReadOnly operation, the application will automatically redirect the connections to the first readable secondary in the routing configuration, as expected.  Performing a failover will allow the application to show the expected failover error messages and then redirect the application’s connections to the desired hosts in the AG, based on the configuration of the ApplicationConnection windows.

image

An example of the failover messages returned is below:

image 

Once the failover completes, the application will reconnect to the appropriate nodes based on the ReadOnly routing configuration for the AG.

Requirements

The application requires that .NET 4.0 and the SQL Server 2012 Native Client be installed to make use of the new ApplicationIntent connection string option.
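For reference, a connection string that uses the new option would look something like the following; the listener and database names are placeholders, and MultiSubnetFailover is a separate option that is commonly paired with listeners:

Server=tcp:AGListenerName,1433;Database=YourAGDatabase;Integrated Security=SSPI;ApplicationIntent=ReadOnly;MultiSubnetFailover=True;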

UPDATE:

You also need to create the following table in whatever database you are using for the demonstration:

USE [DBNAME]
GO
CREATE TABLE InsertCount (RowID int identity primary key)
GO
INSERT INTO InsertCount DEFAULT VALUES;
GO

The application can be downloaded from:

http://www.sqlskills.com/free-tools/availability-group-demonstrator/ 

If you have any questions, feel free to email me at jonathan@sqlskills.com or through the contact form on this blog.