As part of the 2008 DBA class we're teaching down here in Melbourne, I did a demo of using predicates and file targets with extended events, so I want to blog the script for people to play with.

For background info on extended events see:

In this scenario, I'd like to track queries that make heavy use of the CPU on my system.

First off I'll create a test database to play with.

USE MASTER;
GO
IF DATABASEPROPERTYEX ('production', 'Version') > 0 DROP DATABASE production;
GO

CREATE DATABASE production;
GO
USE production;
GO

CREATE TABLE t1 (c1 INT IDENTITY, c2 UNIQUEIDENTIFIER ROWGUIDCOL DEFAULT NEWID(), c3 CHAR (5000) DEFAULT 'a');
CREATE CLUSTERED INDEX t1_CL ON t1 (c1);
CREATE NONCLUSTERED INDEX t1_NCL ON t1 (c2);
GO

SET NOCOUNT ON;
INSERT INTO t1 DEFAULT VALUES;
GO 1000

-- Get the database ID to plug into the event session
SELECT DB_ID ('production');
GO

Notice that I grabbed the database ID of the new database I created, as I'll need that when defining the extended event predicate.

IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'EE_ExpensiveQueries')
   DROP EVENT SESSION EE_ExpensiveQueries ON SERVER;
GO

CREATE EVENT SESSION EE_ExpensiveQueries ON SERVER
ADD EVENT sqlserver.sql_statement_completed
   (ACTION (sqlserver.sql_text, sqlserver.plan_handle)
      WHERE sqlserver.database_id = 18 /*DBID*/  AND cpu > 10 /*total ms of CPU time*/)
ADD TARGET package0.asynchronous_file_target
   (SET FILENAME = N'C:\SQLskills\EE_ExpensiveQueries.xel', METADATAFILE = N'C:\SQLskills\EE_ExpensiveQueries.xem')
WITH (max_dispatch_latency = 1 seconds);
GO

I'm monitoring the event that fires whenever a statement completes, and I'm filtering by database ID (make sure you plug in the correct database ID when you try this yourself) and by the number of milliseconds of CPU time the statement used. Note: when using predicates you should always make sure that you order the predicate tests such that the most restrictive predicates (those most likely to evaluate to false) are first in the list, so the predicate evaluation 'short-circuits' as quickly as possible (a standard programming practice with boolean logic). I'm also using a file target, just to show how it can be done, instead of the ring buffer which is commonly used for extended events demos.
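Applying that ordering guideline to the session above: on a busy server the CPU test is likely the more restrictive of the two predicates, so it could go first. Here's a sketch of the reordered definition (the database ID of 18 is just the value from my system, and the file paths are the same ones used above — plug in your own):

```sql
-- Sketch: the same event session with the predicate tests reordered so
-- the (likely) most restrictive test -- cpu > 10 -- is evaluated first
-- and can short-circuit the rest. Substitute your own database ID.
CREATE EVENT SESSION EE_ExpensiveQueries ON SERVER
ADD EVENT sqlserver.sql_statement_completed
   (ACTION (sqlserver.sql_text, sqlserver.plan_handle)
      WHERE cpu > 10 /*total ms of CPU time*/ AND sqlserver.database_id = 18 /*DBID*/)
ADD TARGET package0.asynchronous_file_target
   (SET FILENAME = N'C:\SQLskills\EE_ExpensiveQueries.xel', METADATAFILE = N'C:\SQLskills\EE_ExpensiveQueries.xem')
WITH (MAX_DISPATCH_LATENCY = 1 SECONDS);
GO
```

Which predicate really is most restrictive depends on your workload — if almost all activity is in the one database but very few statements burn more than 10ms of CPU, this ordering wins; if the instance hosts many busy databases, the database ID test may filter more.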

Now I'm going to turn on the event session and do some queries in the production database.

ALTER EVENT SESSION EE_ExpensiveQueries ON SERVER STATE = START;
GO

USE production;
GO

SELECT COUNT (*) FROM t1 WHERE c1 > 500;
GO

SELECT SUM (c1) FROM t1 WHERE c3 LIKE 'a';
GO

ALTER INDEX t1_CL ON t1 REORGANIZE;
GO

ALTER INDEX t1_CL ON t1 REBUILD;
GO

That should have generated some long-running queries. Now I'll switch into the context of master (so the querying of the event session itself doesn't get captured by the event session) and see what I've captured.

USE master;
GO

SELECT COUNT (*) FROM sys.fn_xe_file_target_read_file
   ('C:\SQLskills\EE_ExpensiveQueries*.xel', 'C:\SQLskills\EE_ExpensiveQueries*.xem', NULL, NULL);
GO

In this case I've got 3 entries. I can pull these out using the following code:

SELECT data FROM
   (SELECT CONVERT (XML, event_data) AS data FROM sys.fn_xe_file_target_read_file
      ('C:\SQLskills\EE_ExpensiveQueries*.xel', 'C:\SQLskills\EE_ExpensiveQueries*.xem', NULL, NULL)
 ) entries;
GO

But I get three raw XML blobs back, which aren't particularly easy to read.

What's more useful is to pull everything out of the XML blob programmatically using the code below:

SELECT
   data.value (
      '(/event[@name=''sql_statement_completed'']/@timestamp)[1]', 'DATETIME') AS [Time],
   data.value (
      '(/event/data[@name=''cpu'']/value)[1]', 'INT') AS [CPU (ms)],
      CONVERT (FLOAT, data.value ('(/event/data[@name=''duration'']/value)[1]', 'BIGINT')) / 1000000
      AS [Duration (s)],
   data.value (
      '(/event/action[@name=''sql_text'']/value)[1]', 'VARCHAR(MAX)') AS [SQL Statement],
      SUBSTRING (data.value ('(/event/action[@name=''plan_handle'']/value)[1]', 'VARCHAR(100)'), 15, 50)
      AS [Plan Handle]
FROM
   (SELECT CONVERT (XML, event_data) AS data FROM sys.fn_xe_file_target_read_file
      ('C:\SQLskills\EE_ExpensiveQueries*.xel', 'C:\SQLskills\EE_ExpensiveQueries*.xem', null, null)
) entries
ORDER BY [Time] DESC;
GO

Time                    CPU (ms) Duration (s) SQL Statement                              Plan Handle
----------------------- -------- ------------ ------------------------------------------ --------------------------------------------------
2009-10-16 17:59:29.623 30       1.214875     ALTER INDEX t1_CL ON t1 REBUILD;           0x06001000EB672A07B8C0C807000000000000000000000000
2009-10-16 17:59:28.407 20       0.024076     ALTER INDEX t1_CL ON t1 REORGANIZE;        0x0600100003594903B8C0C807000000000000000000000000
2009-10-16 17:59:28.343 51       0.045144     SELECT SUM (c1) FROM t1 WHERE c3 LIKE 'a'; 0x06001000FAF5B11EB820C307000000000000000000000000

Now I can plug in one of the plan handles to a query of sys.dm_exec_query_plan to get the graphical query plan:

SELECT [query_plan] FROM sys.dm_exec_query_plan (0x06001000FAF5B11EB820C307000000000000000000000000);
GO

And clicking on the resulting XML 'link' in SSMS opens the graphical query plan.

And now I can tweak the production workload to potentially behave better.

Just for kicks I went back into the context of the production database and ran the XML parsing again to capture the query plan – try it yourself – pretty gnarly! :-)

Now, if I were going to make this more useful, I'd use a ring buffer target with a polling mechanism every few seconds, to make sure I can capture the graphical query plan for expensive queries before the plan is pushed out of cache. Maybe I'll get around to doing that and publishing the complete solution.
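As a rough sketch of what that could look like (this is not the complete solution — the session name EE_ExpensiveQueries_RB is one I've made up for illustration, and the database ID of 18 is again from my system):

```sql
-- Sketch: same event, but sent to a ring buffer target instead of a file.
CREATE EVENT SESSION EE_ExpensiveQueries_RB ON SERVER
ADD EVENT sqlserver.sql_statement_completed
   (ACTION (sqlserver.sql_text, sqlserver.plan_handle)
      WHERE sqlserver.database_id = 18 /*DBID*/ AND cpu > 10 /*total ms of CPU time*/)
ADD TARGET package0.ring_buffer
WITH (MAX_DISPATCH_LATENCY = 1 SECONDS);
GO

ALTER EVENT SESSION EE_ExpensiveQueries_RB ON SERVER STATE = START;
GO

-- Poll this every few seconds (e.g. from an Agent job), shred out the
-- plan handles, and feed them to sys.dm_exec_query_plan while the
-- plans are still in cache.
SELECT CONVERT (XML, xst.target_data) AS ring_buffer_contents
FROM sys.dm_xe_session_targets xst
JOIN sys.dm_xe_sessions xs
   ON xs.[address] = xst.event_session_address
WHERE xs.name = 'EE_ExpensiveQueries_RB'
   AND xst.target_name = 'ring_buffer';
GO
```

The XML that comes back can be shredded with the same XQuery pattern used earlier in the post; the only difference is that the events are read from the in-memory target rather than from the .xel files on disk.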

In the meantime, here's a zip file containing the entire script that you can download and play with: EE_ExpensiveQueries.zip (1.72 kb)

Enjoy!