Mapping Events from Trace to Extended Events

Every time I talk about migrating from Profiler and Trace to Extended Events (XE), I seem to add something to my presentation.  It's not always intentional, and I don't know if it's because I just can't stop tweaking my demos and content, or something else, but here in London at IEPTO2 this week, I added a query that helps you see which event in Extended Events is comparable to the event you're used to using in Trace/Profiler.  While most of the events in XE have a name similar to the event in Trace (e.g. sp_statement_completed for SP:StmtCompleted), some mappings aren't so intuitive.  For example, SP:Starting in Trace is module_start in XE, and SP:Completed in Trace is module_end in XE.  That's not so bad, but if you're ever monitoring file growths, note that the single database_file_size_change event in XE covers the following four events in Trace: Data File Auto Grow, Data File Auto Shrink, Log File Auto Grow, and Log File Auto Shrink.

(Note: if you happen to notice databases_data_file_size_changed or databases_log_file_size_changed in the list of XE events, do not use either one.  Both are deprecated and won't return any information.  Use database_file_size_change.)
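
As an aside, if file growths are what you're after, a minimal event session sketch using database_file_size_change might look like the following.  The session name, the file path, and the sql_text action are just my choices here, not anything prescriptive:

CREATE EVENT SESSION [Capture_FileSizeChanges]
ON SERVER
ADD EVENT sqlserver.database_file_size_change(
ACTION (sqlserver.sql_text) /* capture the statement that triggered the growth or shrink */
)
ADD TARGET package0.event_file(
SET filename=N'C:\Temp\Capture_FileSizeChanges.xel'
)
WITH (EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,STARTUP_STATE=OFF);
GO

ALTER EVENT SESSION [Capture_FileSizeChanges]
ON SERVER
STATE = START;
GO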

Anyway, those fun one-offs aside, you’ll see that the query and its output are pretty straightforward:

SELECT 
	[t].[trace_event_id] [TraceEventID], 
	[t].[name] [Trace_Event], 
	CASE	
		WHEN [t].[name] LIKE '%Audit%' AND [xe].[xe_event_name] IS NULL THEN 'This event is not implemented in Extended Events, it may be a Server Audit Event'
		WHEN ([t].[name] LIKE '%Audit Lo%' OR [t].[name] LIKE '%Audit Server%') AND [xe].[xe_event_name] IS NOT NULL THEN 'The XE Event [' + [xe].[xe_event_name] + '] exists, but use Server Audit for this event instead'
		WHEN [xe].[xe_event_name] IS NULL THEN 'No comparable XE Event'
		ELSE [xe].[xe_event_name]
	END	[XE_Event],
	[xe].[package_name] [XE_Package]
FROM [sys].[trace_events] [t] 
LEFT OUTER JOIN [sys].[trace_xe_event_map] [xe] ON [t].[trace_event_id] = [xe].[trace_event_id]
ORDER BY [t].[name] ;
GO 

You can run this on SQL Server 2012 or higher.  You could run it on SQL Server 2008/2008R2, but remember that there was not a comparable event in Extended Events for every event in Trace until SQL Server 2012.  Thus you will have some unmapped events in the 2008/2008R2 releases.

Something you'll notice in the query: I'm accounting for Audit events.  The Audit feature was introduced in SQL Server 2008, but it was Enterprise Edition only.  In SQL Server 2012, Server Audit is supported in all editions, so all Audit events from Trace should instead be captured with Server Audit (which leverages Extended Events behind the scenes).

Finally, don’t forget that if you’re looking to take an existing Trace and convert it to Extended Events, you can use Jonathan’s stored procedure to get the event session DDL.

Happy Extended Eventing!

 

SQL Trace versus Extended Events: How long does it take to start up a trace vs. an event session?

Yesterday I asked how everyone uses Trace and Profiler…  Jonathan and I have talked about creating a library of Extended Events scripts to help people migrate off of Profiler and Trace, and that post was designed to generate some ideas about what people might find useful.

I haven't gotten a ton of comments, but I did get a few (thank you to those who have responded!), and I decided to take one of them, create a Trace and an Extended Events session, and see how long each took.  Jonathan has mentioned before that he can create an XE session as fast as a Trace, and I've been thinking that I can as well, so I thought I'd test it.  It's a straightforward Trace versus Extended Events test.  Want to see what's faster?  Watch the video here.

 

Tell me how you use Profiler and Trace

Back in April I wrote a post asking why people tend to avoid Extended Events.  Many of you provided feedback, which I greatly appreciated.  A common theme in the responses was time.  Many people found that creating an event session in Extended Events that was comparable to one they would create in Trace took longer.  Ok, I get that…so as a follow-up, I am interested in knowing how you typically use Profiler and Trace.  If you would be willing to share that information, I would be extremely grateful.  At a minimum, I'd like to know what events you add, what columns you select for those events, what filters you add, and then how you view the data (e.g. do you run Profiler and watch the data live, or do you script it out as a server-side trace and write it to a file, then view it later for analysis?).  If it's easiest to post the trace definition (i.e. export it from the Profiler UI), that works too.  Again, any information you can share in the comments (or if you want to send me an email, that's great too) would be greatly appreciated.  My goal is to create a library of event session definitions for these "typical" traces that people create, and to share them publicly.  Stay tuned for another post once I (hopefully!) start getting some information.  Thanks again!

SQL Server Query Store: Default Settings

Last week I wrote a quick post about storage settings for Query Store.  Today I'm going to take a moment to discuss the default settings that are applied when you initially enable Query Store.  I'm not going to go through all the settings and what they mean – you can find that information here – but I do want to discuss where they originate.

When you enable Query Store through the UI, you have the option to change any of the options (e.g. Max Size (MB), Size Based Cleanup Mode).  The values for the settings are pulled from model, just like other database defaults (Recovery Model, Compatibility Level, etc.).  Now, what's interesting is that you can change these defaults in model, but only using T-SQL.  In fact, you can only enable Query Store for model and msdb using T-SQL; the Query Store page is not available for either database when you expand Databases, then System Databases, right-click on the database, and select Properties.

The Query Store documentation specifically states that it cannot be enabled for master and tempdb, which leads us to believe we can enable it for model and msdb.  Now, I haven’t figured out a use case for either (I have yet to troubleshoot query performance against model or msdb), but I figured I’d try it for both.  I started with model, just using the basic syntax which would pull the default values for the options:

USE [master];
GO
ALTER DATABASE [model] SET QUERY_STORE = ON;
GO
ALTER DATABASE [model] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

Then I checked to confirm the values that were set:

USE [model];
GO
SELECT actual_state_desc, readonly_reason, desired_state_desc, current_storage_size_mb,
max_storage_size_mb, flush_interval_seconds, interval_length_minutes,
stale_query_threshold_days, size_based_cleanup_mode_desc,
query_capture_mode_desc, max_plans_per_query
FROM sys.database_query_store_options;
GO
Output from sys.database_query_store_options for model

My output was empty.  Huh?

I tested msdb.

USE [master];
GO
ALTER DATABASE [msdb] SET QUERY_STORE = ON;
GO
ALTER DATABASE [msdb] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO
USE [msdb];
GO
SELECT actual_state_desc, readonly_reason, desired_state_desc, current_storage_size_mb,
max_storage_size_mb, flush_interval_seconds, interval_length_minutes,
stale_query_threshold_days, size_based_cleanup_mode_desc,
query_capture_mode_desc, max_plans_per_query
FROM sys.database_query_store_options;
GO

This time I had output:

Output from sys.database_query_store_options for msdb

Ok.  So…maybe I can only use model to change default values.  Let’s see if that works.  I edited the Query Store settings for model:

USE [master];
GO
ALTER DATABASE [model] SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 78),
DATA_FLUSH_INTERVAL_SECONDS = 900,
INTERVAL_LENGTH_MINUTES = 60,
MAX_STORAGE_SIZE_MB = 2048,
QUERY_CAPTURE_MODE = ALL,
SIZE_BASED_CLEANUP_MODE = AUTO,
MAX_PLANS_PER_QUERY = 200);
GO

Then I created a new database and enabled Query Store:

CREATE DATABASE [QSTest]
CONTAINMENT = NONE
ON  PRIMARY
( NAME = N'QSTest', FILENAME = N'C:\Databases\QSTest.mdf' , SIZE = 8192KB , FILEGROWTH = 65536KB )
LOG ON
( NAME = N'QSTest_log', FILENAME = N'C:\Databases\QSTest_log.ldf' , SIZE = 8192KB , FILEGROWTH = 65536KB );
GO
USE [master];
GO
ALTER DATABASE [QSTest] SET QUERY_STORE = ON;
GO
ALTER DATABASE [QSTest] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

I checked sys.database_query_store_options:

Output from sys.database_query_store_options for QSTest user database

It definitely picked up the changes I made to model.

So…if I can’t see the settings for model in sys.database_query_store_options, then is Query Store really enabled?  I ran a few queries against model, then checked the Query Store catalog views:

USE [model];
GO
SELECT *
FROM sys.all_objects;
GO
SELECT *
FROM sys.all_columns;
GO
SELECT *
FROM sys.database_files;
GO
SELECT qst.query_text_id, qsq.query_id, qsq.object_id, qsq.count_compiles, qst.query_sql_text, qst.statement_sql_handle, qsq.query_hash
FROM sys.query_store_query_text qst
JOIN sys.query_store_query qsq ON qst.query_text_id = qsq.query_text_id;
GO

Guess what?  There was no data in the Query Store tables.  So…let’s repeat the test against msdb.

USE [msdb];
GO
SELECT *
FROM sys.all_objects;
GO
SELECT *
FROM sys.all_columns;
GO
SELECT *
FROM sys.database_files;
GO
SELECT qst.query_text_id, qsq.query_id, qsq.object_id, qsq.count_compiles, qst.query_sql_text, qst.statement_sql_handle, qsq.query_hash
FROM sys.query_store_query_text qst
JOIN sys.query_store_query qsq ON qst.query_text_id = qsq.query_text_id;
GO

In this case, there WAS data in the Query Store tables.

So, what does this mean?  It means you can only truly enable and use Query Store against one system database: msdb.  It cannot be enabled for master or tempdb (you get an error: Msg 12438, Level 16, State 1, Line 19 – Cannot perform action because Query Store cannot be enabled on system database tempdb.), and while it can be enabled for model, that's only for the purposes of changing the default Query Store settings for new databases on the instance (I've confirmed this with the SQL Server team).  Is that something you want to do?  Perhaps…but I can change those settings on a per-database basis when I'm initially enabling Query Store, or later on if so desired.  I'm always wary of changing settings in model, though I do change the Recovery Model to SIMPLE so I don't end up with a ridiculously large transaction log…the default is FULL, and if someone doesn't set up transaction log backups then that log is going to grow until it consumes all the space on the disk.  I also set Auto Growth values to fixed, reasonable numbers.
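
For what it's worth, those model changes are simple T-SQL.  Here's a sketch of the two changes I just mentioned – the 256MB growth increments are only example values, and modeldev/modellog are model's default logical file names:

USE [master];
GO
ALTER DATABASE [model] SET RECOVERY SIMPLE;
GO
/* fixed, reasonable autogrowth values instead of the defaults; 256MB is just an example */
ALTER DATABASE [model] MODIFY FILE (NAME = N'modeldev', FILEGROWTH = 256MB);
GO
ALTER DATABASE [model] MODIFY FILE (NAME = N'modellog', FILEGROWTH = 256MB);
GO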

For each database where I enable Query Store, I'd consider the workload and then look at the settings.  I tend to think that the default value of 100MB for MAX_STORAGE_SIZE_MB is really low, and I would be inclined to bump up STALE_QUERY_THRESHOLD_DAYS from 30 to something a bit higher.  I'd also probably drop DATA_FLUSH_INTERVAL_SECONDS to something lower than 900 seconds (15 minutes) if my storage can support it.  This setting determines how often Query Store data is flushed to disk.  If it's every 15 minutes, then I could potentially lose 15 minutes of Query Store data if my server happened to crash before the data could be written to disk.  I'd also think about changing INTERVAL_LENGTH_MINUTES to a value smaller than 60 if I wanted to aggregate my query data over a smaller amount of time.  Sometimes interesting events happen within a 60-minute time frame, and they can get lost when data is aggregated across that window.  However, aggregating more frequently means I'm adding processing overhead to the system – there's a trade-off there to figure out.
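
As a concrete sketch only – the actual numbers depend entirely on your workload and storage – adjusting those settings for a user database might look something like this (QSTest is just the demo database from earlier):

USE [master];
GO
ALTER DATABASE [QSTest] SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
MAX_STORAGE_SIZE_MB = 2048,							/* well above the 100MB default */
CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 60),	/* keep history longer than 30 days */
DATA_FLUSH_INTERVAL_SECONDS = 300,					/* flush to disk more often than every 15 minutes */
INTERVAL_LENGTH_MINUTES = 30);						/* aggregate over smaller windows */
GO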

The take home message…you probably don’t need to enable Query Store for any system database, but if you want to change the default values that user databases pick up when you enable it, you can do that using model.

SQL Server Query Store: Data Storage Options (and a Request!)

Microsoft recently announced that SQL Server 2016 RTM will be available June 1, 2016, and it was also confirmed that Query Store will be available in ALL editions of SQL Server.  This is pretty exciting, as there was some concern that it would be Enterprise-only.  I’ve done a couple sessions on Query Store this year, and the response to the feature has been fantastic.  If you haven’t checked it out yet, I highly recommend it (download 2016 RC3 here), and in this post I want to talk about the Query Store data – where it’s located and how it’s retained.

Query Store Data Storage

Enabling Query Store is pretty simple.  It's done on a per-database basis (note that Query Store cannot be enabled for master or tempdb) either through the UI or via T-SQL.  I first restored a copy of the AdventureWorks2016 database (download from here).  Interestingly enough, Query Store was already enabled for the database.  But if it were not, to enable it you would right-click on the database and select Properties, go to the Query Store page, and then select Read Write for the Operation Mode (Requested).  This is the code that runs if you script it out from the UI:

USE [master]
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE = ON
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE (OPERATION_MODE = READ_WRITE)
GO

 

Multiple settings are then configured to the default values for Query Store – it’s analogous to running this code:

USE [master]
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE = ON
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
DATA_FLUSH_INTERVAL_SECONDS = 900,
INTERVAL_LENGTH_MINUTES = 60,
MAX_STORAGE_SIZE_MB = 100,
QUERY_CAPTURE_MODE = ALL,
SIZE_BASED_CLEANUP_MODE = AUTO,
MAX_PLANS_PER_QUERY = 200);
GO

 

The setting I want to point out today is MAX_STORAGE_SIZE.  Understand that with Query Store, the data is persisted *IN* the user database.  This means that if you backup and restore the database to another instance (or detach and attach it), the Query Store data goes with the database.  This is pretty handy.  You have the ability to set how much space within the user database can be used by Query Store with the MAX_STORAGE_SIZE setting.  The default value of 100MB is just a starting point.  Workloads vary, the data *you* might want to keep is different from what someone else wants to keep, and there aren't any recommendations as to what this size "should" be at this point.  It's something you're going to have to test out in your environment – just like figuring out the "best" size for the transaction log for a database, or the size of tempdb for an instance.  Note that the more varied the workload (more unique plans), the more space you might need.  Remus Rusanu points out in a recent post that the SQL Server team vastly under-estimated the space needed for a client workload – it consumed the allocated Query Store space in minutes rather than months.  Disk space is pretty cheap – I would probably start with a few GB of space for Query Store and monitor how much is used with this query:

SELECT
actual_state_desc,
current_storage_size_mb,
max_storage_size_mb,
flush_interval_seconds,
interval_length_minutes,
size_based_cleanup_mode_desc
FROM sys.database_query_store_options;
GO

SQL Server can automatically start to purge data from Query Store based on how much of that allocated space is used.  This is determined by the SIZE_BASED_CLEANUP_MODE setting (surfaced as size_based_cleanup_mode_desc in the view above), which defaults to AUTO.  With AUTO set, SQL Server will start purging data when the space used gets close to the max_storage_size_mb value.  If you have this set to OFF, then when current_storage_size_mb reaches max_storage_size_mb, Query Store will stop collecting data and actual_state_desc will switch to READ_ONLY.
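
If that happens, one option – just a sketch, using the AdventureWorks2016 database from earlier – is to bump the maximum size (or clear out old data) and then flip Query Store back to READ_WRITE:

USE [master];
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE (
MAX_STORAGE_SIZE_MB = 2048,		/* give Query Store more room; pick a value that fits your environment */
OPERATION_MODE = READ_WRITE);	/* resume collecting data */
GO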

Using Query Store to Find Query Regressions or Improvements

Ok, so now we're clear on those storage-related settings.  There's more of a conversation to have about how you're going to use this data to find regressed queries, etc…but that's an entirely separate post.  I want you now to think about the value of this data and how you can use it to measure the impact of code changes, SQL Server patches and upgrades, hardware changes, and so on.

When you restore a copy of your production database to your Test/Dev/QA/UAT environment (because you have at least one of those, right?), the Query Store data from production will come with it.  This is great because you then run your workload in that environment and do comparisons using the Query Store data.  Awesome.  But what about this scenario, proposed by an attendee of last week’s IEPTO2 class…

You have an automated process developed in-house to back up and restore production databases down to lower environments (like Test/Dev/QA/UAT) on a regular basis (daily, weekly).  Because the Query Store data lives in the user database, you actually want to clear the Query Store data once the database is restored in the lower environment.  Then you test new code, etc. against the database.
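
Clearing it is a one-liner; as a sketch, assuming the restored copy is named [ProdCopy] (a placeholder name I made up):

ALTER DATABASE [ProdCopy] SET QUERY_STORE CLEAR;	/* [ProdCopy] = the restored copy of production */
GO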

When the production data is restored again, all data collected during testing in Test/Dev/QA/UAT will be lost…unless there is a way to export it.  Meaning, let's save off the Query Store data generated during testing, before the refresh, so I can do comparisons between THOSE tests.

Now, there are catalog views that allow you to view the Query Store data.  You can copy that data into another database using SELECT INTO, and then do comparisons, but wouldn’t it be nice to have some kind of export option?  There’s a Connect item for that:

Export Query Store tables separately from the database tables: https://connect.microsoft.com/SQLServer/feedback/details/2620017/export-query-store-tables-separately-from-the-database-tables

If you think this is something that would be useful for Query Store, please up vote it!  Again, Query Store is available in ALL editions of SQL Server 2016 – this is definitely a feature you CAN use and will want to use!  This potential option won't make it into RTM, but with the change in how SQL Server is releasing CUs, the more important the SQL Server team considers this option (based on votes), the sooner it might make it into a release.
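
In the meantime, the SELECT INTO approach mentioned above works.  Here's a minimal sketch that copies the core Query Store views out before a refresh – AdventureWorks2016 is the source from earlier, and the [QSArchive] database and its table names are placeholders I made up (it needs to exist first):

USE [AdventureWorks2016];
GO
/* [QSArchive] is a placeholder archive database – create it before running this */
SELECT * INTO [QSArchive].[dbo].[qs_query_text] FROM sys.query_store_query_text;
SELECT * INTO [QSArchive].[dbo].[qs_query] FROM sys.query_store_query;
SELECT * INTO [QSArchive].[dbo].[qs_plan] FROM sys.query_store_plan;
SELECT * INTO [QSArchive].[dbo].[qs_runtime_stats] FROM sys.query_store_runtime_stats;
SELECT * INTO [QSArchive].[dbo].[qs_runtime_stats_interval] FROM sys.query_store_runtime_stats_interval;
GO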

If you haven’t looked into Query Store, I definitely recommend doing so!  When I demo’d the feature last week, one of our attendees said that he would consider skipping 2014 and going straight to 2016, just for this feature alone.  A feature that’s available in EVERY Edition.

Why do YOU avoid Extended Events?

On Monday this week I had an interesting exchange on Twitter with a bunch of folks who are die-hard Profiler/Trace users, and have no interest in using Extended Events.  To wit:

Tweet from Denny about Profiler (used with permission)

Now, Denny and I are good friends, and his tweet didn’t upset me in any way, it just got me thinking.  Why are DBAs and developers so resistant to using Extended Events?  I have some theories, but I realized I should collect some data.

Therefore, whether you:

  • Have never tried Extended Events,
  • Have tried Extended Events but would rather keep using Profiler/Trace,
  • Use Extended Events occasionally but still prefer Profiler/Trace,

I want to hear from you.  Whatever the reason, I want to know – so please leave a comment below.  My goal is to understand what the main challenges are so that I can then provide options and solutions, or create Connect items for the product team to address any gaps in functionality.

Extended Events *is* the replacement for Profiler/Trace; it’s not going away.  I really want people to be prepared for the time when Profiler and Trace are removed from the product.  And I want to provide feedback to the SQL Server product team to address limitations that people find in Extended Events.  If the feature is lacking something, we need to work together to create change.

Thanks in advance for your help, and if you haven’t tried XE, or are looking for a refresher, then please attend my webinar next Tuesday, April 5th at 12PM EDT: Kicking and Screaming: Replacing Profiler with Extended Events.  I’d love to see you there and can help get you started with XE!

EDIT 2:37PM EDT: If you are still running 2008 or 2008R2, then our advice has always been to stick with Trace and Profiler.  If you're running SQL Server 2012 and higher, then I recommend Extended Events.  Why?  Because it wasn't until SQL Server 2012 that every event from Trace had a comparable event in Extended Events.  So, if your argument is that you don't want to learn XQuery and XML because you're on 2008 or 2008R2, I'm right there with you and will tell you that's fine – wait until 2012 to use XE.

Instant File Initialization: Easier to Enable in SQL Server 2016 (and some updated numbers)

The ability to have SQL Server data files skip zero initialization when they are created or grown has been available since SQL Server 2005.  By default, when you create a new data file in SQL Server, or extend the size of an existing one, zeroes are written to the file.  Depending on the size of the file or its growth, and the type of storage, this can take a while.  With Instant File Initialization (IFI), space is allocated for the data file but no zeroes are written.  Prior to SQL Server 2016, to enable this feature you had to edit the Local Security Policy to give the account that runs the SQL Server service the "Perform volume maintenance tasks" right (from Start | Run, type secpol, then within the Local Security Policy expand Local Policies and then User Rights Assignment).  This was a task that DBAs had to perform separately from the SQL Server installation (or have a server admin do for them), and if you did not make the change before installing SQL Server, you had to restart SQL Server after making the change for it to take effect.  This has changed with the SQL Server 2016 installation: you can now select the option "Grant Perform Volume Maintenance Task privilege to SQL Server Database Engine Service" when you specify the service accounts, and the right is granted to the service account at that time.

Enable Instant File Initialization during SQL Server 2016 installation
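
If you're not sure whether IFI is actually in effect for your instance, one well-known way to check – sketched below with a throwaway database name I made up – is to temporarily enable trace flags 3004 and 3605 so that file-zeroing messages are written to the error log.  If you see "Zeroing" entries for the data (.mdf) file, IFI is not enabled; the log file is always zeroed regardless:

DBCC TRACEON (3004, 3605, -1);	/* log zero-initialization details to the error log */
GO
CREATE DATABASE [IFI_Check];	/* throwaway database just for the test */
GO
EXECUTE sp_readerrorlog 0, 1, N'Zeroing';	/* look for Zeroing entries for the data file */
GO
DBCC TRACEOFF (3004, 3605, -1);
GO
DROP DATABASE [IFI_Check];
GO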

There is a potential security risk to using this feature, which Kimberly discusses in her Instant Initialization – What, Why and How? post.  The information presented in her post is still valid and worth the read, but Glenn and I did re-run some tests recently, just to get some current numbers to show the benefits of IFI.  We ran the same four tests that Kimberly ran way back in 2007 (!) on four different sets of storage: two sets of 15K disks (one in a RAID10 array, the other in a RAID1 array) and two sets of flash storage (FusionIO cards).  More information on the storage at the end of the post.  The tests were:

 

1. Create a 20GB database
2. Grow an existing database by 10GB
3. Restore a 30GB empty database
4. Restore a 30GB database with 10GB of data

 

The tests were run on two different physical servers, both running SQL Server 2014.  Details for each storage system are listed below for reference, and the test results were as we expected:

 

Duration for file modification or database restore, with and without IFI

 

When IFI is not enabled, the time to zero out a file and write data is a function of sequential write performance on the drive(s) where the SQL Server data file(s) are located.  When IFI is enabled, creating or growing a data file is so fast that the time is of no significant consequence.  With IFI enabled, the time it takes to create or grow a file differs by only a few seconds across 15K, SSD, flash, and magnetic storage.  However, if you do not enable IFI, there can be drastic differences in create, grow, and restore times depending on storage.

Storage Details:

  • 15K RAID10 = Six (6) 300GB 15K disks in RAID 10
  • Flash Drive1 = 640GB Fusion-io ioDrive Duo
  • Flash Drive2 = 2.41TB Fusion-io ioDrive2 Duo
  • 15K RAID1 = Two (2) 300GB Seagate Savvio 15K drives in RAID 1

Note: This post was edited on April 13, 2016, to clarify the storage configuration based on a helpful comment.

SQL Server Setup Error: The directory name is invalid

A few weeks ago I got an email from someone who had attended our Accidental DBA IE class last year, and this person was getting the following error when trying to apply a cumulative update:

SQL Server Setup failure: The directory name is invalid.

The initial email didn’t have a lot of details, so I started asking questions to understand what version was being installed, the environment configuration, etc.  Turns out this was a two-node Windows Server Failover Cluster (WSFC) with multiple SQL Server 2012 instances installed, and one of the instances was still running on the node this person was trying to patch.  To be clear, the two nodes were SRV1 and SRV2, and the instances were PROD-A and PROD-B running on SRV1, and PROD-C which was running on SRV2.  This person was trying to install the cumulative update on SRV2.

Now, those of you that manage clusters may be thinking “Doesn’t this DBA know that the way you do rolling upgrades is by not having any instances running on the node you’re trying to patch?”  Well, not everyone is an experienced DBA, a lot of people are Accidental or Junior DBAs, and if this is the first cluster you’re supporting, you may not know that, or understand why.  Further, when you update a single node on a stand-alone server (one that’s NOT in a cluster) it’s not like you shut down the instance yourself and apply the CU, right?

We checked the summary installation log, located in C:\Program Files\Microsoft SQL Server\110\Setup Bootstrap\Log and found the following Exit message:

The directory ‘M:\a13e546ad3e0de04a828\’ doesn’t exist.

The M drive was a resource for PROD-C, along with the N drive.  There was also a quorum drive (Q) and the local C drive.  So how was M not available?

Well, it was available initially, when the install started.  When the installer runs, it puts the files on the first network drive that it finds (if it's an administrative installation), or on the drive with the most free space (see the ROOTDRIVE property).  In this case, the M drive met the criteria.  When the installer then stopped the instance and took the cluster disks offline, the M drive was suddenly gone, hence the invalid directory.

You could argue that this is a bug…maybe…but the solution I suggested was to move PROD-C over to the SRV1 node, then run the installation.  You could also specify the directory as part of a command-line install, therefore using a different disk, but downtime was permitted in this scenario, so the failover wasn’t a deal-breaker.  Once this was done, the installation ran fine, and the latest CU was applied on that node.  The DBA then went through the process of failing all the instances over to the patched node, and then applying the CU on SRV1.

As an aside, if you’re not sure of the current service pack, cumulative update, or hotfix available for your SQL Server version, I recommend this site which has all versions and releases and links to the downloads.  And, for those of you running SQL Server 2014, CU5 for SP1 just came out yesterday and has some interesting fixes (see https://support.microsoft.com/en-us/kb/3130926).

 

Collection of Baseline Scripts

The topic of baselines in SQL Server is one that I've had an interest in for a long time.  In fact, the very first session I ever gave back in 2011 was on baselines.  I still believe they are incredibly important, and most of the data I capture is still the same, but I have tweaked a couple of things over the years.  I'm in the process of creating a set of baseline scripts that folks can use to automate the capture of this information, in the event that they do not have/cannot afford a third-party monitoring tool (note: a monitoring tool such as SQL Sentry's Performance Advisor can make life WAY easier, but I know that not everyone can justify the need to management).  For now, I'm starting with links to all relevant posts, and then I'll update this post once I have everything finalized.

These scripts are just a starting point for what to monitor.  One thing I like to point out in our IEPTO2: Performance Tuning and Optimization course is that there is A LOT of data you can capture related to your SQL Server environment.  Your options include Performance Monitor (using Custom Data Collectors), queries via Extended Events or Trace (depending on version), and any data from the DMVs or system views within SQL Server.  You have to decide what to capture based on:

1) What problem you might be trying to solve in your environment, and

2) What information is most important for you to have.  Start simple, and then work your way up.

Figure out the one or two most critical things to capture, and start there, and then add on.
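
As a minimal sketch of what automated capture can look like – the BaselineData database and table name are just placeholders I made up, and wait statistics are only one of many things you could collect – something like this, with the INSERT scheduled via an Agent job, is a reasonable first step:

USE [BaselineData];
GO
/* one-time setup: a table to hold snapshots of cumulative wait statistics */
CREATE TABLE [dbo].[WaitStatsSnapshot] (
[CaptureDate] DATETIME2(0) NOT NULL DEFAULT (SYSDATETIME()),
[wait_type] NVARCHAR(60) NOT NULL,
[waiting_tasks_count] BIGINT NOT NULL,
[wait_time_ms] BIGINT NOT NULL,
[signal_wait_time_ms] BIGINT NOT NULL
);
GO
/* scheduled capture: snapshot the current wait stats */
INSERT INTO [dbo].[WaitStatsSnapshot] ([wait_type], [waiting_tasks_count], [wait_time_ms], [signal_wait_time_ms])
SELECT [wait_type], [waiting_tasks_count], [wait_time_ms], [signal_wait_time_ms]
FROM sys.dm_os_wait_stats;
GO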

If you find there’s something missing from my scripts, let me know and I’ll try to get it added!

Monitoring in general

Configuration Information

Disk and I/O

Maintenance

Errors

Performance

Pluralsight

Capture Blocking Information with Extended Events and the Blocked Process Report

I am a big fan of Adam Machanic's WhoIsActive script, and when customers have issues with performance it's one of the first tools I recommend because it's so simple to use and provides great information.  Very often it helps with quickly determining an issue, but sometimes there's a need to capture more information, particularly when locking and blocking are part of the problem.  Adam's script has options to include blocking information, for example including the [blocking_session_id] column in the output and using @find_block_leaders = 1 as a parameter.  But sometimes you need more information, like the blocked process report.  I've found one of the easiest ways to get that in SQL Server 2012 and higher is Extended Events.  If you're running SQL Server 2005 and higher, you can use Event Notifications to capture the blocked process report.  That option is nice because you are notified when the problem occurs.  For those of you using SQL Server 2008R2 and below, you also have the option of capturing the blocked process report event through a server-side Trace.  (Note: the blocked_process_report event does not exist in SQL Server 2008 or SQL Server 2008R2, which is why Trace is the method there.)  The drawback to Extended Events is that you don't get a notification that blocking occurred, but for those who are not as comfortable with Event Notifications – for whatever reason – Extended Events is a very simple alternative.

The Setup

In order to capture a blocked process report, you must have the blocked process threshold system configuration option enabled.  A good starting value is 15, which is the threshold in seconds at which the report is generated.  To set this value, run the following code:

EXECUTE sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
EXECUTE sp_configure 'blocked process threshold', 15;
GO
RECONFIGURE;
GO
EXECUTE sp_configure 'show advanced options', 0;
GO
RECONFIGURE;
GO

The following code will create the event session and then start it.  Note that you can create the event session and just have it defined in your system without running it.  Then, if you start to have blocking you can set the blocked process threshold and start the event session.

/*
check to see if the event session exists
*/
IF EXISTS ( SELECT  1
FROM    sys.server_event_sessions
WHERE   name = 'Capture_BlockedProcessReport' )
DROP EVENT SESSION [Capture_BlockedProcessReport] ON SERVER;
GO

/*
create the event session
edit the filename entry if C:\temp is not appropriate
*/
CREATE EVENT SESSION [Capture_BlockedProcessReport]
ON SERVER
ADD EVENT sqlserver.blocked_process_report
ADD TARGET package0.event_file(
SET filename=N'C:\Temp\Capture_BlockedProcessReport.xel'
)
WITH (MAX_MEMORY=8192 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,
TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF);
GO

/*
start the event session
*/
ALTER EVENT SESSION [Capture_BlockedProcessReport]
ON SERVER
STATE = START;
GO

Capturing Data

Once the event session is started, then you just wait until the blocking occurs.  The following code can be used to generate an example in your test/dev environment:

/*
create a table and insert
one row without committing
*/
USE [tempdb];
GO

CREATE TABLE [BlockingTest] (
[ID] INT IDENTITY(1,1) PRIMARY KEY,
[INFO] VARCHAR(10)
);
GO

BEGIN TRANSACTION
INSERT INTO [BlockingTest] ([INFO]) VALUES ('SQLskills');
GO

/*
run the following statement in a different window
*/
USE [tempdb];
GO
SELECT *
FROM [BlockingTest];

GO

After about 15 seconds, run the following code back in the original window:

/*
clean up (run in original window)
*/
USE [tempdb];
GO
COMMIT;
GO
DROP TABLE [BlockingTest];
GO

You can then stop the event session, and either leave it there until you need it again, or drop it entirely:

/*
stop the event session
*/
ALTER EVENT SESSION [Capture_BlockedProcessReport]
ON SERVER
STATE = STOP;
GO

/*
drop the event session
*/
DROP EVENT SESSION [Capture_BlockedProcessReport]
ON SERVER;
GO

Viewing the Report

To view the output from Extended Events you can open the .xel file in Management Studio or query the data using the sys.fn_xe_file_target_read_file function.  I typically prefer the UI, but there's currently no great way to copy the blocking report text and view it in the format you're used to.  But if you use the function to read and parse the XML from the file, you can…

SELECT
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    event_data.query('(event/data[@name="blocked_process"]/value/blocked-process-report)[1]') as [blocked_process_report]
FROM
(
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file('C:\Temp\Capture_BlockedProcessReport*.xel', NULL, NULL, NULL)
) AS sub;
GO

Depending on how long you let the blocking continue, you may have captured more than one event and therefore have multiple reports in the output:

Retrieving the blocked process report

 

You can click on the output for any row to see the blocked process in XML format, and then work through the blocking:

The blocked process report

[Huge thanks to Jonathan for help with the XML.  I don’t think XML and I will ever be friends.  Geez.]
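
If you'd rather pull details out of the report with T-SQL instead of clicking through the XML, a sketch like the following (building on the query above, with the same file path; the column aliases are just names I picked) grabs the blocked and blocking session ids and their input buffers from each report:

SELECT
    event_data.value('(event/data[@name="blocked_process"]/value/blocked-process-report/blocked-process/process/@spid)[1]', 'int') AS [blocked_spid],
    event_data.value('(event/data[@name="blocked_process"]/value/blocked-process-report/blocked-process/process/inputbuf)[1]', 'varchar(max)') AS [blocked_inputbuf],
    event_data.value('(event/data[@name="blocked_process"]/value/blocked-process-report/blocking-process/process/@spid)[1]', 'int') AS [blocking_spid],
    event_data.value('(event/data[@name="blocked_process"]/value/blocked-process-report/blocking-process/process/inputbuf)[1]', 'varchar(max)') AS [blocking_inputbuf]
FROM
(
    SELECT CAST(event_data AS XML) AS event_data
    FROM sys.fn_xe_file_target_read_file('C:\Temp\Capture_BlockedProcessReport*.xel', NULL, NULL, NULL)
) AS sub;
GO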

Summary

If you're in need of the blocked process report and running SQL Server 2012 or higher, you now have another option for getting that information.  If you're still new to Extended Events, check out the first two stairways in my XE series on SQLServerCentral.

Happy New Year!