T-SQL Tuesday #67 – monitoring log activity with Extended Events


On the second Tuesday of each month, many people in the SQL Server community join together to all blog on the same topic – a cool idea from Adam Machanic many years ago.

This month’s topic is Extended Events, hosted by Jes Borland (b | t) – you can see her original post here.

Yesterday in class I was discussing monitoring transaction log activity with a student, to show that when a transaction commits, there is a write to the transaction log file. This is easy to do with Extended Events.

I’m going to use the file_write_completed event to track writes occurring and the transaction_log event to watch log records being generated so we can see transactions committing.

First of all I’ll set up my simple scenario:

USE [master];
GO

CREATE DATABASE [Test] ON PRIMARY (
    NAME = N'Test_data',
    FILENAME = N'D:\SQLskills\Test_data.mdf')
LOG ON (
    NAME = N'Test_log',
    FILENAME = N'C:\SQLskills\Test_log.ldf',
    SIZE = 1MB,
    FILEGROWTH = 1MB);
GO

USE [test];
GO

-- Table name is illustrative
CREATE TABLE [TestTable] (
	c1 INT IDENTITY,
	c2 CHAR (1000) DEFAULT 'a');
GO

INSERT INTO [TestTable] DEFAULT VALUES;
GO


And here’s the simple Extended Events session:

-- Drop the session if it exists
IF EXISTS (
	SELECT * FROM sys.server_event_sessions
		WHERE [name] = N'MonitorLog')
	DROP EVENT SESSION [MonitorLog] ON SERVER;
GO

-- Create the event session
CREATE EVENT SESSION [MonitorLog] ON SERVER
	ADD EVENT [sqlserver].[file_write_completed],
	ADD EVENT [sqlserver].[transaction_log]
	ADD TARGET [package0].[ring_buffer]
		WITH (MAX_MEMORY = 50MB, MAX_DISPATCH_LATENCY = 1 SECONDS);
GO

-- Start the session
ALTER EVENT SESSION [MonitorLog] ON SERVER STATE = START;
GO
I ran both of those scripts, then executed the INSERT a few times.

Now if I go to Object Explorer, I can use the Live Data Viewer (in SQL Server 2012 onwards). Find the session, right-click on it and select Watch Live Data.


Choose the following columns to view (right click on any column name and select Choose Columns…): name, timestamp, database_id, file_id, size, operation.

And then run another INSERT (you might have to run two to make the first set of data show in the Viewer) and you’ll see data like below.


And then you can play around and watch stuff happening.
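If you’re not on SQL Server 2012 or don’t want to use the Viewer, you can also pull the raw event data out of the ring_buffer target with T-SQL. Here’s a sketch that just returns the target XML for the session above (shredding the XML into individual events is left as an exercise):

```sql
-- Grab the ring_buffer contents for the MonitorLog session
SELECT CAST ([t].[target_data] AS XML) AS [RingBufferXML]
FROM sys.dm_xe_session_targets [t]
INNER JOIN sys.dm_xe_sessions [s] ON
	[s].[address] = [t].[event_session_address]
WHERE [s].[name] = N'MonitorLog'
	AND [t].[target_name] = N'ring_buffer';
```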

Enjoy! (and check out the other posts from this T-SQL Tuesday)

November classes in Chicago

We’ve released our Fall lineup of classes for registration!

Our November classes in Chicago, IL will be:

  • IE0: Immersion Event for Junior/Accidental DBAs
    • November 16 – 18
  • IEPTO1: Immersion Event on Performance Tuning and Optimization – Part 1
    • November 16 – 20
  • IEPDS: Immersion Event on Practical Data Science
    • November 16 – 19

You can get all the logistical, registration, and curriculum details by drilling down from our main schedule page.

We hope to see you there!

New course: Index Fragmentation Internals, Analysis, and Solutions

My latest Pluralsight course has been published!

It’s called SQL Server: Index Fragmentation Internals, Analysis, and Solutions and is just over 2.5 hours long.

The modules are:

  • Introduction
  • Index Structure and Index Uses
  • Types of Index Fragmentation
  • Causes of Index Fragmentation
  • Detecting Index Fragmentation
  • Avoiding Index Fragmentation
  • Removing Index Fragmentation

Check it out here.

We now have more than 130 hours of SQLskills online training available, all for as little as $29/month through Pluralsight (including more than four thousand other developer and IT training courses). That’s unbeatable value that you can’t afford to ignore.

Finding a transaction in the log for a particular user

In the last IEHADR class we just had in Chicago, I was doing a demo of looking in the transaction log to find the point at which a table was dropped so a restore could be performed (as described in this blog post). One of the students asked how to find a transaction for a particular user, so I demo’d that and thought it would make a good little post.

This can be done using fn_dblog, or if the relevant log isn’t available on the system any more, using fn_dump_dblog, albeit more slowly.
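As a reminder, fn_dump_dblog has a very unwieldy calling convention. A sketch, assuming the usual 68-parameter signature where everything after the backup location is just DEFAULT (the backup file path is illustrative):

```sql
SELECT COUNT (*)
FROM fn_dump_dblog (
	NULL, NULL, N'DISK', 1, N'D:\SQLskills\TestLogBackup.bck', -- Path is illustrative
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
	DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT);
```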

The two pieces of information needed are the user and the rough time that the transaction occurred.

The user can’t be used to search in the log directly, but every LOP_BEGIN_XACT log record contains the SID of who ran the transaction. The SID can be obtained from the username using the SUSER_SID function (remember that it’s more complicated if someone’s used EXECUTE AS, as that can mask who they really are – details in this post).

For instance, on my laptop:
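A sketch of the lookup (substitute your own domain and user name; the SID value returned will of course be different on your system):

```sql
SELECT SUSER_SID (N'MyDomain\MyUser') AS [SID]; -- Login name is illustrative
```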


Then you can use that SID as a filter for fn_dblog (or fn_dump_dblog), like so:

SELECT
	[Current LSN],
	[Operation],
	[Transaction ID],
	[Begin Time],
	LEFT ([Description], 40) AS [Description]
FROM
	fn_dblog (NULL, NULL)
WHERE
	[Transaction SID] = SUSER_SID (N'MyDomain\MyUser'); -- Substitute the user's login
Current LSN             Operation                       Transaction ID Begin Time               Description
----------------------- ------------------------------- -------------- ------------------------ ----------------------------------------
00000021:0000049d:0001  LOP_BEGIN_XACT                  0000:000006c8  2015/06/03 11:18:13:790  Backup:CommitDifferentialBase;0x01050000
00000021:000004a5:0001  LOP_BEGIN_XACT                  0000:000006c9  2015/06/03 11:18:13:810  Backup:CommitLogArchivePoint;0x010500000
00000021:000004a5:0002  LOP_BEGIN_XACT                  0000:000006ca  2015/06/03 11:18:13:810  Backup:CommitLogArchivePoint;0x010500000
00000021:000004a7:0003  LOP_BEGIN_XACT                  0000:000006cb  2015/06/03 11:18:13:820  INSERT;0x0105000000000005150000008e888d4
00000021:000004a7:0004  LOP_BEGIN_XACT                  0000:000006cc  2015/06/03 11:18:13:820  AllocHeapPageSimpleXactDML;0x01050000000
00000021:000004a7:0007  LOP_BEGIN_XACT                  0000:000006cd  2015/06/03 11:18:13:820  AllocFirstPage;0x01050000000000051500000
00000021:000004ad:0002  LOP_BEGIN_XACT                  0000:000006ce  2015/06/03 11:18:13:820  INSERT;0x0105000000000005150000008e888d4
00000021:000004ae:0001  LOP_BEGIN_XACT                  0000:000006cf  2015/06/03 11:18:16:112  INSERT;0x0105000000000005150000008e888d4
00000021:000004af:0001  LOP_BEGIN_XACT                  0000:000006d0  2015/06/03 11:19:17:306  INSERT;0x0105000000000005150000008e888d4
00000021:000004b0:0001  LOP_BEGIN_XACT                  0000:000006d1  2015/06/03 11:22:35:451  DELETE;0x0105000000000005150000008e888d4
00000021:000004b1:0001  LOP_BEGIN_XACT                  0000:000006d2  2015/06/03 11:27:42:998  INSERT;0x0105000000000005150000008e888d4
00000021:000004b2:0001  LOP_BEGIN_XACT                  0000:000006d3  2015/06/03 11:29:56:044  DELETE;0x0105000000000005150000008e888d4


Obviously the transactions above are a contrived example. You can imagine the case of lots of transactions spread out over a few hours (or even over a day, investigated through log backups with fn_dump_dblog). To narrow things down to the transaction you want, you could look through the list manually for the rough time, or specify a time range on the SELECT using predicates on the Begin Time column in the fn_dblog output.

For example:

SELECT
	[Current LSN],
	[Operation],
	[Transaction ID],
	[Begin Time],
	LEFT ([Description], 40) AS [Description]
FROM
	fn_dblog (NULL, NULL)
WHERE
	[Transaction SID] = SUSER_SID (N'MyDomain\MyUser') -- Substitute the user's login
	AND ([Begin Time] > '2015/06/03 11:18:15' AND [Begin Time] < '2015/06/03 11:23:25');
Current LSN             Operation                       Transaction ID Begin Time               Description
----------------------- ------------------------------- -------------- ------------------------ ----------------------------------------
00000021:000004ae:0001  LOP_BEGIN_XACT                  0000:000006cf  2015/06/03 11:18:16:112  INSERT;0x0105000000000005150000008e888d4
00000021:000004af:0001  LOP_BEGIN_XACT                  0000:000006d0  2015/06/03 11:19:17:306  INSERT;0x0105000000000005150000008e888d4
00000021:000004b0:0001  LOP_BEGIN_XACT                  0000:000006d1  2015/06/03 11:22:35:451  DELETE;0x0105000000000005150000008e888d4

And if you knew what the operation was, you could narrow it down by the Description too.

Then it’s a simple case of taking the Current LSN of the LOP_BEGIN_XACT log record of the transaction you’re interested in, and restoring a copy of the database using the STOPBEFOREMARK trick (that I showed in my previous post on using this stuff) to restore to a point just before that transaction.
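Note that STOPBEFOREMARK takes the LSN in decimal form, not the hex form that fn_dblog displays, so each of the three parts of the Current LSN has to be converted, with the second and third parts zero-padded to 10 and 5 digits respectively. A sketch, assuming the DELETE transaction at LSN 00000021:000004b0:0001 above is the one to stop before (the database and backup file names are illustrative):

```sql
-- 0x21 = 33, 0x4b0 = 1200, 0x1 = 1 -> 'lsn:' + '33' + '0000001200' + '00001'
RESTORE LOG [Test2] -- Names and paths are illustrative
FROM DISK = N'D:\SQLskills\Test_Log.bck'
WITH
	STOPBEFOREMARK = 'lsn:33000000120000001',
	RECOVERY;
```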


New course: part 2 of Kimberly’s Optimizing Stored Procedure Performance

Kimberly’s latest Pluralsight course is live!

This is part 2 of her very popular course: SQL Server: Optimizing Stored Procedure Performance. The course is more than 3 hours long and has the following modules:

  • Introduction
  • Session Settings
  • Session Settings (section 2)
  • Special Considerations
  • Troubleshooting Techniques

Check it out here.

We now have more than 130 hours of SQLskills online training available, all for as little as $29/month through Pluralsight (including more than four thousand other developer and IT training courses). That’s unbeatable value that you can’t afford to ignore.


How would indexes on AG readable secondaries work?

Last weekend there was a suggestion on the MVP distribution list about having temporary nonclustered indexes on AG readable secondaries, in the same way that you can have temporary statistics on them. I replied that in my opinion it would be extremely difficult to do that, and said I’d blog about why. Here’s my explanation. Note that this isn’t an exhaustive list of all the problems, just the major ones that I see.

Where to store them?

There are several options that spring to mind for how to store these temporary indexes, with varying degrees of difficulty:

  1. Automatically create a temporary filegroup on the readable secondary
    1. This option has the problem that the readable secondary is read-only, and adding a filegroup would mean adding entries in a bunch of system tables (including sys.sysfiles1, sys.sysdbfiles, sys.sysbrickfiles, sys.sysprufiles). Even if this problem could be surmounted, there’s still the problem of…
    2. The readable secondary is read-only, so where to store all the information about the indexes themselves? There are a large number of system tables that have information about an index (including sys.sysallocunits, sys.sysrowsets, sys.sysrscols, sys.sysidxstats, sys.sysiscols). Even if this problem could be surmounted, there’s still the problem of maintaining the index (see below).
  2. Create an empty filegroup on the primary replica so there’s an empty filegroup to use on the readable secondary
    1. This only solves 1.a above.
  3. Store them in tempdb, the same as the temporary statistics
    1. This solves 1.a and 1.b, but then has the added difficulty of…
    2. Keeping track of the fact that there are indexes in tempdb that are really for another database, which already happens for temporary statistics, so that’s doable but there’s the much more complex problem of…
    3. Making the Storage Engine (specifically the Access Methods) get data from the tempdb index instead of an index in the real database. That’s not a code change in the Storage Engine (because the Access Methods just creates and uses a data set over whatever the Query Processor asks it to), but the query plan will have to know that the index it’s referencing which purports to be on a table in the real database is actually in tempdb, so it asks the Storage Engine to read from the correct place. I can imagine this being quite a challenge for things like a key lookup based on a nonclustered index seek/scan.
    4. This is the only potentially practical solution in my mind for where to create the temporary indexes.

That’s the easier part taken care of.

How to maintain them?

Assuming they are stored as in #3 above, then there’s the really hard problem of how to maintain the indexes, to keep them up-to-date with the underlying table (i.e. when an INSERT, UPDATE, or DELETE happens, make sure that all nonclustered indexes are updated accordingly). Creating the index would be relatively trivial using an online index build-like method, so I won’t go into that.

Back in SQL Server 2000, it would be much easier (although there weren’t any AGs back then :-) because the Storage Engine was responsible for real-time maintenance of nonclustered indexes. Since SQL Server 2005, however, this has been the purview of the Query Processor, so it drives maintaining indexes and the Storage Engine just does what it’s asked (insert a record here, update this column, etc.). There is functionality available to the Storage Engine to ask the Query Processor to manipulate nonclustered indexes without an operation occurring on the table – DBCC CHECKDB uses it for fixing missing or extra records in nonclustered indexes during repair operations.

Here’s the major problem with temporary indexes: the only data coming from the primary replica are physical log records. How to translate those physical log records into nonclustered index maintenance operations?

The only log records that are interesting are those related to the table itself (i.e. changes to the heap or clustered index). But log records are replayed by the recovery portion of the Storage Engine, not the Access Methods, so in the recovery portion there is no context at all about the table, its columns, indexes, and so on. To make this work, the following would be required:

  1. The recovery portion would have to know that the log records LOP_INSERT_ROWS, LOP_DELETE_ROWS, LOP_MODIFY_ROW, LOP_MODIFY_COLUMNS, plus anything else like truncating the table, changing from a heap to a clustered index and vice versa, for certain allocation units (i.e. those for any table with a temporary nonclustered index) need to be passed to a new piece of code to do the temporary index maintenance.
  2. The new piece of code would have to know about the new indexes and do the index maintenance. In other words, Storage Engine nonclustered index maintenance like in SQL Server 2000 would have to be built again. There’s a complication with table record modifications, as the LOP_MODIFY_ROW and LOP_MODIFY_COLUMNS log records don’t say which columns are changing – just the offset and length of the change (they’re physical log records remember). The log record could potentially be changed to have a bitmap of column IDs being changed, that’s only present when there’s a temporary index on a readable secondary. More conditional code.
  3. Another option is for the Query Processor on the primary replica to know that there’s a temporary index and generate special logical operation log records to aid in the maintenance of the index on the readable secondary (in the same way that transactional replication works). I think this would be easier than having to interpret regular log records and figure out what to do.
  4. All this new code to be executed would potentially slow down the replay of log records, with the slowdown increasing with each temporary index that’s added.


To summarize, there are possible solutions to how to store the temporary indexes and how to maintain them, but it’s a lot of tricky work and involves a big change to the recovery code, which is always fraught with danger as it’s such a critical portion of the Storage Engine. And I haven’t touched on partitioning, rebuild/reorganize, and other things you can do with indexes.

IMHO, the easiest solution is to not use temporary indexes and just create the indexes on the primary replica that you want to be there on the readable secondary. This doesn’t require any code changes and is available today. Of course there’s the potential downside of extra space being taken up, extra logging, and extra time to send that log to the secondaries, but I think this is the way to go.

Hope you found this interesting!

Survey: page life expectancy

It’s been a while since I’ve done a survey, and this one is inspired by a discussion I had with Matt Slocum (b | t) yesterday that led me to update my PLE post Page Life Expectancy isn’t what you think… with the discovery that on NUMA systems, Buffer Manager: Page Life Expectancy is a harmonic mean, not an arithmetic mean – big difference.
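To see why the distinction matters, consider a hypothetical two-node system where one node’s PLE is 10,000 and the other’s has dropped to 100. A quick illustration of the difference between the two means (the node values are made up):

```sql
SELECT
	(10000 + 100) / 2.0 AS [ArithmeticMean],            -- 5050: hides the problem node
	2.0 / ((1.0 / 10000) + (1.0 / 100)) AS [HarmonicMean]; -- ~198: dominated by the low value
```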

Anyway, in this survey I want some information about your page life expectancy and other supporting information:

  • SQL Server version
  • Number of logical processor cores
  • Buffer Node: Page Life Expectancy and Target/Total Node Memory for each buffer pool partition
  • Buffer Manager: Page Life Expectancy

Feel free to run the code below any way you want, and also add a single preceding column to the result set (e.g. server name or number) if you want, but PLEASE do not add any *rows* of data apart from what I’ve asked for, otherwise it makes the data processing very time consuming, especially if you send results from hundreds of servers. I know people who do that are trying to be helpful, but I really don’t need any other data apart from what I’ve asked for.

You can send me results in email in a text file or spreadsheet, or leave a comment below. The code will work on SQL Server 2005 onwards.

[Edit: 6/17/15 – the survey is closed – please don’t send any more results – thanks!]

I’ll editorialize the results in a week or two.


SELECT
	LEFT (@@version, 25),
	'Cores available:',
	COUNT (*)
FROM sys.dm_os_schedulers
WHERE [status] = N'VISIBLE ONLINE'; -- Only schedulers for logical cores

SELECT
	[object_name] AS [ObjectName],
	[counter_name] AS [CounterName],
	[cntr_value] AS [CounterValue]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Memory Node%'
	AND [counter_name] = 'Target Node Memory (KB)';

SELECT
	[object_name] AS [ObjectName],
	[counter_name] AS [CounterName],
	[cntr_value] AS [CounterValue]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Memory Node%'
	AND [counter_name] = 'Total Node Memory (KB)';

SELECT
	[object_name] AS [ObjectName],
	[counter_name] AS [CounterName],
	[cntr_value] AS [CounterValue]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Node%'
	AND [counter_name] = 'Page life expectancy';

SELECT
	[object_name] AS [ObjectName],
	[counter_name] AS [CounterName],
	[cntr_value] AS [CounterValue]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Buffer Manager%'
	AND [counter_name] = 'Page life expectancy';

Make sure you’re eligible to vote in the PASS elections this year!

Last year PASS introduced some eligibility requirements for voting in the 2014 PASS elections, but a lot of people didn’t get the message for various reasons (including me) before the deadline expired. PASS did the right thing and allowed people to become eligible by delaying the elections.

This year there are similar eligibility requirements and you’ve got until June 1st to make sure you’re eligible.

Check out the blog post from Bill Graziano (Immediate Past President of PASS) that explains what to do and how to verify your eligibility.

I’ve checked and I’m eligible (early this year! :-)



Identifying queries with SOS_SCHEDULER_YIELD waits

[Edit 2016: Check out my new resource – a comprehensive library of all wait types and latch classes – see here.]

One of the problems with the SOS_SCHEDULER_YIELD wait type is that it’s not really a wait type. When this wait type occurs, it’s because a thread exhausted its 4ms scheduling quantum and voluntarily yielded the CPU, going directly to the bottom of the Runnable Queue for the scheduler, bypassing the Waiter List. A wait has to be registered though when a thread goes off the processor, so SOS_SCHEDULER_YIELD is used.

You can read more about this wait type:

  • Here on my blog
  • Here in my post on the SQLPerformance.com blog

You want to investigate these waits if they’re a prevalent wait on your server, as they could be an indicator of large scans happening (of data that’s already in memory) where you’d really rather have small index seeks.

The problem is that it’s not a real wait type, so you can’t use my script to look at sys.dm_os_waiting_tasks and get the query plans of threads incurring that wait type; these threads aren’t waiting for a resource, so they don’t show up in the output of sys.dm_os_waiting_tasks!

The solution is to use the sys.dm_exec_requests DMV, as that will show the last_wait_type for all running requests. Below is a script you can use.

SELECT
	[er].[session_id],
	[est].[text] AS [QueryText],
	[eqp].[query_plan] AS [QueryPlan]
FROM sys.dm_exec_requests [er]
INNER JOIN sys.dm_exec_sessions [es] ON
	[es].[session_id] = [er].[session_id]
OUTER APPLY sys.dm_exec_sql_text ([er].[sql_handle]) [est]
OUTER APPLY sys.dm_exec_query_plan ([er].[plan_handle]) [eqp]
WHERE
	[es].[is_user_process] = 1
	AND [er].[last_wait_type] = N'SOS_SCHEDULER_YIELD';

That will give you the code and query plan of what’s happening, but even with that it might not be obvious which exact operator is causing that wait so you may need to resort to capturing SQL Server call stacks, as I explain in the first blog post link above.


New 4-day Immersion Event on Practical Data Science with Rafal Lukawiecki

We’re very excited to announce that we’re offering a new Immersion Event on Practical Data Science using Azure Machine Learning, SQL Data Mining, and R, presented by our great friend Rafal Lukawiecki.

We’ve known Rafal well since 2003, and apart from being an expert in BI and data mining, he’s a superb, highly-engaging speaker, who’s presented keynotes, workshops, and sessions to high acclaim at conferences around the world including TechEds and Microsoft’s brand new Ignite conference. In fact, you can watch Rafal’s 75-minute, 300-level session from Ignite earlier this week, on the same topic as the new Immersion Event, on Channel 9 here – it’s really interesting!

Rafal’s course will be four days long, with the following modules:

  1. Overview of Practical Data Science for Business
  2. Data
  3. Process
  4. Algorithm Overview
  5. Tools and Getting Started
  6. Segmentation
  7. Classification
  8. Basic Statistics
  9. Model Validation
  10. Classifier Precision
  11. Regressions
  12. Similarity Matching and Recommenders
  13. Other Algorithms (Brief Overview)
  14. Production and Model Maintenance

You can read through the detailed curriculum here.

The first offering of this new class will be in Chicago, at our usual location, from November 16-19, 2015 – all the details are here.

Cool stuff – hope to see you there!