What is the most worrying cause of log growth (log_reuse_wait_desc)?

Two weeks ago I kicked off a survey that presented a scenario and asked you to vote for the log_reuse_wait_desc value you’d be most worried to see on a critical database with a 24×7 workload.

Here are the results:

[Results chart: what is the most worrying cause of log growth (log_reuse_wait_desc)?]

Another very interesting spread of responses – as always, thanks to everyone who took the time to respond.

Remember that you have no knowledge about anything on the system except what the log_reuse_wait_desc is for the database with the constantly growing log.

Before I start discussing them, I’m going to explain the term log truncation that I’ll be using. This is one of the two terms used (the other is log clearing) to describe the mechanism that marks currently-active VLFs (virtual log files) in the log as inactive so they can be reused. When there are inactive VLFs, the log can wrap around (see this post for more details) and doesn’t have to grow. When there are no available VLFs because log truncation isn’t possible, the log has to grow. When log truncation can’t make any VLFs inactive, it records the reason why, and that’s what the log_reuse_wait_desc value in sys.databases gives us. You can read more about how the log works in this TechNet Magazine article I wrote back in 2009, and get all the log info you could ever want in my logging and recovery Pluralsight course.

We also need to understand how the log_reuse_wait_desc reporting mechanism works. It gives the reason why log truncation couldn’t happen the last time log truncation was attempted. This can be confusing – for instance if you see ACTIVE_BACKUP_OR_RESTORE and you know there isn’t a backup or restore operation running, this just means that there was one running the last time log truncation was attempted. You can also see some weird effects – for instance if you do a log backup, and then see LOG_BACKUP immediately as the log_reuse_wait_desc value. I blogged about the reason for that phenomenon here.
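For reference, checking the current value is a simple query against sys.databases – a minimal sketch, with the database name as a placeholder:

-- What prevented log truncation the last time it was attempted?
SELECT [name], [recovery_model_desc], [log_reuse_wait_desc]
FROM sys.databases
WHERE [name] = N'YourCriticalDB';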

Let’s quickly consider what each of the log_reuse_wait_desc values in the list above means (and this isn’t an exhaustive analysis of each one):

  • NOTHING: Just as it looks, this value means that SQL Server thinks there is no problem with log truncation. In our case though, the log is clearly growing, so how could we see NOTHING? Well, for this we have to understand how the log_reuse_wait_desc reporting works. The value is reporting what stopped log truncation last time it was attempted. If the value is NOTHING, it means that at least one VLF was marked inactive the last time log truncation occurred. We could have a situation where the log has a huge number of VLFs, and there are a large number of active transactions, with each one having its LOP_BEGIN_XACT log record in successive VLFs. If each time log truncation happens, a single transaction has committed, and it only manages to clear one VLF, it could be that the speed of log truncation is just waaaay slower than the speed of log record generation, and so the log is having to grow to accommodate it. I’d use my script here to see how many active transactions there are, monitor VLFs with DBCC LOGINFO, and also track the Log Truncations and Log Growths counters for the database in the Databases perfmon object (there’s a minimal sketch of these checks after this list). I need to see if I can engineer this case and repeatably see NOTHING.
  • CHECKPOINT: This value means that a checkpoint hasn’t occurred since the last time log truncation occurred. In the simple recovery model, log truncation only happens when a checkpoint completes, so you wouldn’t normally see this value. When it can happen is if a checkpoint is taking a long time to complete and the log has to grow while the checkpoint is still running. I’ve seen this on one client system with a very poorly performing I/O subsystem and a very large buffer pool with a lot of dirty pages that needed to be flushed when the checkpoint occurred.
  • LOG_BACKUP: This is one of the most common values to see, and says that you’re in the full or bulk_logged recovery model and a log backup hasn’t occurred. In those recovery models, it’s a log backup that performs log truncation. Simple stuff. I’d check to see why log backups are not being performed (disabled Agent job or changed Agent job schedule? backup failure messages in the error log?)
  • ACTIVE_BACKUP_OR_RESTORE: This means that there’s a data backup running or any kind of restore running. The log can’t be truncated during a restore, and is required for data backups so can’t be truncated there either.
  • ACTIVE_TRANSACTION: This means that there is a long-running transaction that is holding all the VLFs active. The way log truncation works is that it goes to the next VLF (#Y) after the last one (#X) that was made inactive the last time log truncation worked, and looks at that. If VLF #Y can’t be made inactive, then log truncation fails and the log_reuse_wait_desc value is recorded. If a long-running transaction has its LOP_BEGIN_XACT log record in VLF #Y, then no other VLFs can be made inactive either, even if all the other VLFs after VLF #Y have nothing to do with our long-running transaction – there’s no selective marking of VLFs as active vs. inactive. You can use this script to see all the active transactions.
  • DATABASE_MIRRORING: This means that the database mirroring partnership has some latency in it and there are log records on the mirroring principal that haven’t yet been sent to the mirroring mirror (called the send queue). This can happen if the mirror is configured for asynchronous operation, where transactions can commit on the principal before their log records have been sent to the mirror. It can also happen in synchronous mode, if the mirror becomes disconnected or the mirroring session is suspended. The amount of log in the send queue can be equated to the expected amount of data (or work) loss in the event of a crash of the principal.
  • REPLICATION: This value shows up when there are committed transactions that haven’t yet been scanned by the transaction replication Log Reader Agent job for the purpose of sending them to the replication distributor or harvesting them for Change Data Capture (which uses the same Agent job as transaction replication). The job could have been disabled, could be broken, or could have had its SQL Agent schedule changed.
  • DATABASE_SNAPSHOT_CREATION: When a database snapshot is created (either manually or automatically by DBCC CHECKDB and other commands), the database snapshot is made transactionally consistent by using the database’s log to perform crash recovery into the database snapshot. The log obviously can’t be truncated while this is happening and this value will be the result. See this blog post for a bit more info.
  • LOG_SCAN: This value shows up if a long-running call to fn_dblog (see here) is under way when the log truncation is attempted, or when the log is being scanned during a checkpoint.
  • AVAILABILITY_REPLICA: This is the same thing as DATABASE_MIRRORING, but for an availability group (2012 onward) instead of database mirroring.
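As mentioned in the NOTHING and ACTIVE_TRANSACTION descriptions above, here’s a minimal sketch of the kinds of checks I’d run next – the database name is a placeholder, and my active-transaction script linked above gives much richer output than DBCC OPENTRAN:

-- How many VLFs does the log have, and how many are still active (Status = 2)?
DBCC LOGINFO (N'YourCriticalDB');

-- Is there a single long-running transaction holding up truncation?
DBCC OPENTRAN (N'YourCriticalDB');

-- Or list all active transactions on the instance, oldest first
SELECT [st].[session_id], [at].[transaction_id], [at].[name], [at].[transaction_begin_time]
FROM sys.dm_tran_active_transactions AS [at]
INNER JOIN sys.dm_tran_session_transactions AS [st]
    ON [at].[transaction_id] = [st].[transaction_id]
ORDER BY [at].[transaction_begin_time];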

Ok – so maybe not such a quick consideration of the various values :-) Books Online has basic information about these values here.

So which one would *I* be the most worried to see for a 24×7, critical database?

Let’s think about some of the worst case scenarios for each of the values:

  • NOTHING: It could be the scenario I described in the list above, but that would entail a workload change having happened for me to be surprised by it. Otherwise it could be a SQL Server bug, which is unlikely.
  • CHECKPOINT: For a critical, 24×7 database, I’m likely using the full recovery model, so it’s unlikely to be this one unless someone’s switched to simple without me knowing…
  • LOG_BACKUP: This would mean something had happened to the log backup job so it either isn’t running or it’s failing. Worst case here would be data loss if a catastrophic failure occurred, plus the next successful log backup is likely to be very large.
  • ACTIVE_BACKUP_OR_RESTORE: As the log is growing, if this value shows up then it must be a long-running data backup. Log backups can run concurrently so I’m not worried about data loss if a problem occurs.
  • ACTIVE_TRANSACTION: Worst case here is that the transaction needs to be killed and then will take a long time to roll back, producing a lot more transaction log before the log stops growing.
  • DATABASE_MIRRORING: Worst case here is that a crash occurs and data/work loss occurs because of the log send queue on the principal, but only if I don’t have log backups that I can restore from. So maybe we’re looking at a trade-off between some potential data loss and some potential downtime (to restore from backups).
  • REPLICATION: The worst case here is that replication’s got itself badly messed up for some reason and has to be fully removed from the database with sp_removedbreplication, and then reconfigured again.
  • DATABASE_SNAPSHOT_CREATION: The worst case here is that there are some very long running transactions that are still being crash recovered into the database snapshot and that won’t finish for a while. It’s not possible to interrupt database snapshot creation, but it won’t affect anything in the source database apart from log truncation.
  • LOG_SCAN: This is very unlikely to be a problem.
  • AVAILABILITY_REPLICA: Same as for database mirroring, but we may have another replica that’s up-to-date (the availability group send queue could be for an asynchronous replica that’s running slowly, while a synchronous replica is up-to-date).

The most worrying value is going to depend on what is likely to be the biggest problem for my environment: potential data loss, reinitialization of replication, very large log backup, or lots *more* log growth from waiting for a long-running transaction to commit or roll back.

I think I’d be most worried to see ACTIVE_TRANSACTION, as it’s likely that the offending transaction will have to be killed, which might generate a bunch more log, cause more log growth, and mean more log to be backed up, mirrored, replicated, and so on. This could lead to disaster recovery problems too – if a long-running transaction exists and a failover occurs, the long-running transaction has to be rolled back before the database is fully available (even with fast recovery in Enterprise Edition).

So there you have it. 37% of you agreed with me. I discussed this with Glenn just after I posted it and we agreed on ACTIVE_TRANSACTION as the most worrying type (so I didn’t just pick this because the largest proportion of respondents did).

I’m interested to hear your thoughts on my pick and the scenario, but please don’t rant about how I’m wrong or the scenario is bogus, as there is no ‘right’ answer, just opinions based on experience.

And that’s the trick when it comes to performance troubleshooting – although it’s an art and a science, so much of how you approach it is down to your experience.

Happy troubleshooting!

New interview available online

The Simple Talk folks from Red Gate have been running a Geek of the Week interview series over the last few months and this week I’m their featured geek.

You can read the interview here.

Enjoy!

Survey: what is the most worrying cause of log growth?

Time for another fun survey, this time around factors causing the transaction log to grow.

Imagine you’re the DBA for a company and you come to work Monday morning and find that the transaction log on one of the most critical databases, which has a 24×7 workload, is growing repeatedly. You check the log_reuse_wait_desc field in sys.databases to get an idea what the problem is.

Without any other information, which of the following values for log_reuse_wait_desc would you be most concerned to see for this database? (And these are listed in the order they appear in Books Online…) Give it some serious thought – and look at the results of the previous survey I did about wait stats to help guide your thinking, as the most common value you’ve seen in your environment isn’t necessarily the most worrying one.

Edit: the survey is over – please see here for the results.



Once again it’s deliberate that there is no ‘it depends’ option, and no ‘Other’ option for you to type something, so please don’t leave comments complaining about that.

I’ll editorialize the results next week, along with what my pick would be and rationales for picking each of them.

Thanks!

What is the most worrying wait type?

Two weeks ago I kicked off a survey that presented a scenario and asked you to vote for the wait type you’d be most worried to see after a code roll-out to production.

Here are the results:

[Results chart: what is the most worrying wait type?]

A very interesting spread of responses – thank you to all who thought about it and responded.

Remember, you have no other information apart from what the top wait type is over the week since the new code was rolled out. No knowledge of the I/O subsystem, memory, workload, code, or anything else. And that makes it a bit hard to pick what the most worrying wait type would be.

Let’s quickly consider what each of the wait types listed above could mean (and this is by no means an exhaustive examination of each wait type):

  • ASYNC_NETWORK_IO: the classic cause of this wait type is RBAR (Row-By-Agonizing-Row) processing of results in a client, instead of caching the results client-side and telling SQL Server to send more. A common misconception is that this wait type is usually caused by network problems – that’s rarely the case in my experience.
  • CXPACKET: this wait type always accrues when parallelism happens, as the control thread in a parallel operation waits until all threads have completed. However, when parallel threads are given unbalanced amounts of work to do, the threads that finish early also accrue this wait type, leading to it maybe becoming the most prevalent. So this one could be benign, as the workload has lots of good parallelism, but could be malignant if there’s unwanted parallelism or problems causing skewed distribution of work among parallel threads.
  • LCK_M_IX: this wait type occurs when a thread is waiting for a table or page IX lock so that a row insert or update can occur. It could be from lock escalation to a table X or S lock causing all other threads to wait to be able to insert/update.
  • LCK_M_X: this wait type commonly occurs when lock escalation is happening. It could also be caused by using a restrictive isolation level like REPEATABLE_READ or SERIALIZABLE that requires S and IS locks to be held until the end of a transaction. Note that distributed transactions change the isolation level to SERIALIZABLE under the covers – something that’s bitten several of our clients before we helped them. Someone could also have inhibited row locks on a clustered index causing all inserts to acquire page X locks – this is very uncommon though.
  • PAGEIOLATCH_SH: this wait type occurs when a thread is waiting for a data file page to be read into memory. Common causes of this wait being the most prevalent are when the workload doesn’t fit in memory and the buffer pool has to keep evicting pages and reading others in from disk, or when query plans are using table scans instead of index seeks, or when the buffer pool is under memory pressure which reduces the amount of space available for data.
  • PAGELATCH_EX: the two classic causes of this wait type are tempdb allocation bitmap contention (from lots of concurrent threads creating and dropping temp tables combined with a small number of tempdb files and not having TF1118 enabled) and an insert hotspot (from lots of concurrent threads inserting small rows into a clustered index with an identity value, leading to contention on the index leaf-level pages). There are plenty of other causes of this wait type too, but none that would commonly lead to it being the leading wait type over the course of a week.
  • SOS_SCHEDULER_YIELD: the most common cause of this wait type is that the workload is memory resident and there is no contention for resources, so threads are able to repeatedly exhaust their scheduling quanta (4ms), registering SOS_SCHEDULER_YIELD when they voluntarily yield the processor. An example would be scanning through a large number of pages in an index. This may or may not be a good thing.
  • WRITELOG: this wait type is common to see in the first few top waits on servers as the transaction log is often one of the chief bottlenecks on a busy server. This could be caused by the I/O subsystem not being able to keep up with the rate of log flushing combined with lots of tiny transactions forcing frequent flushes of minimal-sized log blocks.

So which one would *I* be most worried about seeing after a week of running the new code? The biggest problem is that we don’t know what the pattern of wait statistics was before the code roll-out, so we don’t know what’s normal for this production system, and it’s always better to have a baseline (or multiple snapshots) of wait statistics to compare with. We also don’t know what *other* wait types were the next few most prevalent types.
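For context, here’s a minimal sketch of what I mean by the ‘top wait type’ – just the raw aggregates from sys.dm_os_wait_stats in descending order of total wait time, without the benign-wait filtering that my full script (see the PS below) does:

-- Top waits by total wait time since the stats were last cleared
SELECT TOP (10)
    [wait_type],
    [wait_time_ms] / 1000.0 AS [WaitS],
    [signal_wait_time_ms] / 1000.0 AS [SignalS],
    [waiting_tasks_count] AS [WaitCount]
FROM sys.dm_os_wait_stats
ORDER BY [wait_time_ms] DESC;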

I’d be unsurprised to see ASYNC_NETWORK_IO, CXPACKET, PAGEIOLATCH_SH, or WRITELOG as the top wait, and I’d be intrigued to see SOS_SCHEDULER_YIELD as the top wait as that’s very uncommon. None of these would be the most worrying IMHO. PAGELATCH_EX would mean there’s a big bottleneck somewhere, which may be easy to alleviate (if in tempdb) or not, but queries would still be progressing.

And that leaves the two locking waits. When I designed the survey and scenario, I had it in mind that LCK_M_IX would be the most worrying as it would likely be from a bunch of threads waiting for other long-running, many-row updates to complete under an escalated table X lock. Now, thinking more about LCK_M_X, there would have to be a lot of blocked lock escalations happening to cause this – but in the meantime other queries are progressing.

I sat back for ten minutes after writing the last paragraph and thought about it some more, and came to a conclusion: LCK_M_IX is what I’d be most worried about. Most of the other waits are likely to be very small in duration, and although they’d make the overall system be slower, the locking waits can easily cause individual queries to time out and fail, leading to a very poor user experience. Although the fix for the locking waits could be as simple as enabling read-committed snapshot isolation (and paying attention to tempdb configuration and performance), the overall user experience is my primary concern, hence my biggest worry is the wait that could have the biggest effect on someone’s use of the system.
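If enabling read-committed snapshot isolation does turn out to be the right fix, it’s a single statement – the database name is a placeholder, and remember it pushes row versioning into tempdb, so test first:

ALTER DATABASE [YourCriticalDB]
    SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;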

Now remember, this thought experiment was conducted with a very contrived scenario, and my reasoning is based on that. And it’s just my reasoning. More than 90% of you thought differently about your answer, which doesn’t mean either of us is wrong, just that we analyzed the scenario in a different way. A lot of it comes down to deciding what it is that makes you worried: user impact, overall workload performance, cost/complexity to alleviate, or something else. I picked user impact.

Some of you may cynically think I picked LCK_M_IX as it’s the answer that had the fewest responses, but if you ask Erin or Jon, they’ll tell you that I ran the survey past them (at lunch during our IE2 class two weeks ago) and told them my answer would be LCK_M_IX.

I’m interested to hear your thoughts on my pick and the scenario, but please don’t rant about how I’m wrong or the scenario is bogus, as there is no ‘right’ answer, just opinions based on experience.

And that’s the trick when it comes to performance troubleshooting – although it’s an art and a science, so much of how you approach it is down to your experience.

Happy troubleshooting!

PS Don’t forget to run the code from my previous post and get some free insight from me into your production wait statistics.

Send me your wait stats and get my advice and 30 days of free Pluralsight in return

[Edit: 3/25/14 No more codes left - thanks for all the data! - please don't send any more.]

Yes, you read that correctly. Call me crazy, but I’ve been wanting to do this for a while.

Here’s the deal:

  1. You run the code from this post that creates a 24-hour snapshot of the wait stats on your server. (You can do it on as many servers as you want! Or multiple times on the same server. Seriously, the more data the better.)
  2. You send me the wait stats info in email.
  3. I’ll give you some feedback and insight on your wait stats.
  4. I’ll send the first 500 of you a code to get 30 days of free access to all 100+ hours of SQLskills Pluralsight training classes.
  5. I’ll also editorialize the results (possibly over several posts) in April.

I was going to work with Pluralsight to get codes for just my SQL Server: Performance Troubleshooting Using Wait Statistics course, but they’re cool with giving away a month of all our courses. Nice!

What’s the catch? There is no catch. I get lots of real-life data, you get lots of real-life training. No credit-card required.

Why am I doing this? I *really* like looking at wait statistics data (kind of like Jonathan with his unhealthy obsession with analyzing deadlocks) and it’s been more than 3 years since my last huge wait stats survey. Also I *really* like helping people understand what wait statistics mean, I *really* like being sleep-deprived :-), and I’d *really* like to show you how awesome our online training courses are.

When does this offer end? You’ve got until the end of March to send me your wait stats info.

Rules? One code per email address that sends me wait stats, no matter how many servers you send in – I’ve got a limited supply. Please don’t post as a comment with the stats in (I’ll delete them) – email only. I’ll send you the code and info on how to redeem it.

Use the T-SQL code below please (or grab the zip file from here), and leave the WAITFOR set to 24 hours. Send me an email [Edit: the survey's done now - please don't send any more data] with the results in plain text or a spreadsheet. Everybody wins.

Look forward to hearing from you!

PS There is no perf hit from running this code, or any nasty side-effect. It simply creates two snapshots of the sys.dm_os_wait_stats output 24 hours apart.

PPS Thanks to those who pointed out the typo that crept in between coding and posting – sorry about that.

/*============================================================================
  File:     WaitStats2014.sql

  Summary:  24-hour snapshot of wait stats

  SQL Server Versions: 2005 onwards
------------------------------------------------------------------------------
  Written by Paul S. Randal, SQLskills.com

  (c) 2014, SQLskills.com. All rights reserved.

  For more scripts and sample code, check out

http://www.SQLskills.com

  You may alter this code for your own *non-commercial* purposes. You may
  republish altered code as long as you include this copyright and give due
  credit, but you must obtain prior permission before blogging this code.

  THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF
  ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED
  TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
  PARTICULAR PURPOSE.
============================================================================*/

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
	WHERE [name] = N'##SQLskillsStats1')
	DROP TABLE [##SQLskillsStats1];

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
	WHERE [name] = N'##SQLskillsStats2')
	DROP TABLE [##SQLskillsStats2];
GO

SELECT [wait_type], [waiting_tasks_count], [wait_time_ms],
       [max_wait_time_ms], [signal_wait_time_ms]
INTO ##SQLskillsStats1
FROM sys.dm_os_wait_stats;
GO

WAITFOR DELAY '23:59:59';
GO

SELECT [wait_type], [waiting_tasks_count], [wait_time_ms],
       [max_wait_time_ms], [signal_wait_time_ms]
INTO ##SQLskillsStats2
FROM sys.dm_os_wait_stats;
GO

WITH [DiffWaits] AS
(SELECT
-- Waits that weren't in the first snapshot
		[ts2].[wait_type],
		[ts2].[wait_time_ms],
		[ts2].[signal_wait_time_ms],
		[ts2].[waiting_tasks_count]
	FROM [##SQLskillsStats2] AS [ts2]
	LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
		ON [ts2].[wait_type] = [ts1].[wait_type]
	WHERE [ts1].[wait_type] IS NULL
	AND [ts2].[wait_time_ms] > 0
UNION
SELECT
-- Diff of waits in both snapshots
		[ts2].[wait_type],
		[ts2].[wait_time_ms] - [ts1].[wait_time_ms] AS [wait_time_ms],
		[ts2].[signal_wait_time_ms] - [ts1].[signal_wait_time_ms] AS [signal_wait_time_ms],
		[ts2].[waiting_tasks_count] - [ts1].[waiting_tasks_count] AS [waiting_tasks_count]
	FROM [##SQLskillsStats2] AS [ts2]
	LEFT OUTER JOIN [##SQLskillsStats1] AS [ts1]
		ON [ts2].[wait_type] = [ts1].[wait_type]
	WHERE [ts1].[wait_type] IS NOT NULL
	AND [ts2].[waiting_tasks_count] - [ts1].[waiting_tasks_count] > 0
	AND [ts2].[wait_time_ms] - [ts1].[wait_time_ms] > 0),
[Waits] AS
	(SELECT
		[wait_type],
		[wait_time_ms] / 1000.0 AS [WaitS],
		([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],
		[signal_wait_time_ms] / 1000.0 AS [SignalS],
		[waiting_tasks_count] AS [WaitCount],
		100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage],
		ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
	FROM [DiffWaits]
	WHERE [wait_type] NOT IN (
        N'BROKER_EVENTHANDLER',         N'BROKER_RECEIVE_WAITFOR',
        N'BROKER_TASK_STOP',            N'BROKER_TO_FLUSH',
        N'BROKER_TRANSMITTER',          N'CHECKPOINT_QUEUE',
        N'CHKPT',						N'CLR_AUTO_EVENT',
        N'CLR_MANUAL_EVENT',            N'CLR_SEMAPHORE',
        N'DBMIRROR_DBM_EVENT',          N'DBMIRROR_EVENTS_QUEUE',
        N'DBMIRROR_WORKER_QUEUE',       N'DBMIRRORING_CMD',
        N'DIRTY_PAGE_POLL',				N'DISPATCHER_QUEUE_SEMAPHORE',
        N'EXECSYNC',					N'FSAGENT',
        N'FT_IFTS_SCHEDULER_IDLE_WAIT',	N'FT_IFTSHC_MUTEX',
        N'HADR_CLUSAPI_CALL',           N'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
        N'HADR_LOGCAPTURE_WAIT',        N'HADR_NOTIFICATION_DEQUEUE',
        N'HADR_TIMER_TASK',				N'HADR_WORK_QUEUE',
        N'KSOURCE_WAKEUP',				N'LAZYWRITER_SLEEP',
        N'LOGMGR_QUEUE',				N'ONDEMAND_TASK_QUEUE',
        N'PWAIT_ALL_COMPONENTS_INITIALIZED',
        N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP',
        N'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP',
        N'REQUEST_FOR_DEADLOCK_SEARCH',	N'RESOURCE_QUEUE',
        N'SERVER_IDLE_CHECK',           N'SLEEP_BPOOL_FLUSH',
        N'SLEEP_DBSTARTUP',				N'SLEEP_DCOMSTARTUP',
        N'SLEEP_MASTERDBREADY',         N'SLEEP_MASTERMDREADY',
        N'SLEEP_MASTERUPGRADED',        N'SLEEP_MSDBSTARTUP',
        N'SLEEP_SYSTEMTASK',            N'SLEEP_TASK',
        N'SLEEP_TEMPDBSTARTUP',         N'SNI_HTTP_ACCEPT',
        N'SP_SERVER_DIAGNOSTICS_SLEEP',	N'SQLTRACE_BUFFER_FLUSH',
        N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
        N'SQLTRACE_WAIT_ENTRIES',       N'WAIT_FOR_RESULTS',
        N'WAITFOR',						N'WAITFOR_TASKSHUTDOWN',
        N'WAIT_XTP_HOST_WAIT',          N'WAIT_XTP_OFFLINE_CKPT_NEW_LOG',
        N'WAIT_XTP_CKPT_CLOSE',         N'XE_DISPATCHER_JOIN',
        N'XE_DISPATCHER_WAIT',          N'XE_TIMER_EVENT')
	)
SELECT
	[W1].[wait_type] AS [WaitType],
	CAST ([W1].[WaitS] AS DECIMAL (16, 2)) AS [Wait_S],
	CAST ([W1].[ResourceS] AS DECIMAL (16, 2)) AS [Resource_S],
	CAST ([W1].[SignalS] AS DECIMAL (16, 2)) AS [Signal_S],
	[W1].[WaitCount] AS [WaitCount],
	CAST ([W1].[Percentage] AS DECIMAL (5, 2)) AS [Percentage],
	CAST (([W1].[WaitS] / [W1].[WaitCount]) AS DECIMAL (16, 4)) AS [AvgWait_S],
	CAST (([W1].[ResourceS] / [W1].[WaitCount]) AS DECIMAL (16, 4)) AS [AvgRes_S],
	CAST (([W1].[SignalS] / [W1].[WaitCount]) AS DECIMAL (16, 4)) AS [AvgSig_S]
FROM [Waits] AS [W1]
INNER JOIN [Waits] AS [W2]
	ON [W2].[RowNum] <= [W1].[RowNum]
GROUP BY [W1].[RowNum], [W1].[wait_type], [W1].[WaitS],
	[W1].[ResourceS], [W1].[SignalS], [W1].[WaitCount], [W1].[Percentage]
HAVING SUM ([W2].[Percentage]) - [W1].[Percentage] < 95; -- percentage threshold
GO

-- Cleanup
IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
	WHERE [name] = N'##SQLskillsStats1')
	DROP TABLE [##SQLskillsStats1];

IF EXISTS (SELECT * FROM [tempdb].[sys].[objects]
	WHERE [name] = N'##SQLskillsStats2')
	DROP TABLE [##SQLskillsStats2];
GO

Last minute half-price offer for Sydney and Melbourne classes

As we draw close to our IE1 Immersion Events in Sydney and Melbourne, we have a few seats still available so we’re making a special offer: for each registration at the regular price, you can bring another person for half price (US$1995 + GST). First-come-first-served!

Drop us an email if you’re interested – here.

Thanks

Survey: what is the most worrying wait type?

Here’s another survey for you, and I’m really looking forward to seeing and discussing the results of this one.

Imagine you’re the new DBA for a company (which shouldn’t be hard for most of you :-) and a week ago a new release of the client and server code was rolled out, and you cleared out the wait statistics using DBCC SQLPERF. You’ve never monitored wait statistics on this production SQL Server before, although you’ve done so extensively in previous places you’ve worked. Today you look at the aggregate wait statistics from the last week, using a script that presents them in descending order of total wait time for that wait type. You’ve been on vacation for the last week and so don’t know if there have been any complaints about performance since the roll-out.
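For reference, clearing the aggregate wait statistics as described in the scenario is just this one statement:

-- Reset the instance's aggregate wait statistics
DBCC SQLPERF (N'sys.dm_os_wait_stats', CLEAR);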

Survey is now closed – see here for results

Without considering anything except the wait type name, which of the following wait types (listed in alphabetical order) would you be most concerned to see as the top wait type for the past week (i.e. the most aggregate wait time)?


It’s deliberate that there is no ‘it depends’ option, and no ‘Other’ option for you to type something, so please don’t leave comments complaining about that.

I’ll editorialize the results next week, along with what my pick would be and rationales for picking each of them.

Thanks!

Corruption recovery using DBCC WRITEPAGE

A couple of weeks ago I kicked off a survey about the extent of your experience with the DBCC WRITEPAGE command. Here are the results:

[Results chart: corruption recovery using DBCC WRITEPAGE]

The “Other” values are:

  • 9 x “Read your post on it, may practice it one day, but it’ll always be last resort to use.”
  • 6 x “I read your post on it.”
  • 4 x “I know about it (from your posts/immersion event) and it’s in my toolbox whenever I get called in to perform disaster recovery.”
  • 1 x “It’s like knowing the Unix recursive delete command as root, dangerous and powerful.”
  • 1 x “Not since 2k pages, and then PageEdit.exe was easier.”

For those of you who don’t know what DBCC WRITEPAGE is, it’s an undocumented command that allows you to make direct changes to data file pages. It’s an extremely powerful command that is very useful for creating corrupt databases, and in extreme cases, for helping to repair otherwise irretrievably-corrupt databases. I blogged about it a year ago in this post.

I never advocate using it to repair a corrupt database and then continue using that database in production, and in fact I refuse to do that if a client asks as there are just too many potential problems that could occur. I’ll only ever use it to help recover data from a damaged database, where there’s already a copy of the databases files in existence, and the client has signed an agreement acknowledging that any data file I use it on is a duplicate, is not on a production instance, and changes to it will in no way affect production.

There are all kinds of things I’ve used DBCC WRITEPAGE for to allow data recovery to progress, and in my Advanced Corruption Recovery Techniques course on Pluralsight I demonstrate one of them, which I’m going to describe here.

Imagine a database that’s suffered corruption and there’s an off-row LOB data value that’s inaccessible because the owning data row is corrupt. Your mission is to retrieve the data. And to make it impractical to manually piece together the value using DBCC PAGE dumps, the LOB value is a few megabytes, so it’s stored on hundreds of linked text pages.

The process (with a hedged sketch of the final steps after this list):

  • Create another row with an off-row LOB value (the ‘dummy’ row)
  • Find out the dummy row’s page and offset on the page
  • Calculate the offset of the dummy LOB value’s in-row root in the variable-length portion of the record
  • Calculate the offset of the dummy LOB in-row root’s size, text timestamp, and pointer fields
  • Find the corrupt row’s LOB in-row root’s size, text timestamp, and pointer field values
  • Use DBCC WRITEPAGE to overwrite the dummy LOB in-row root fields
  • Select from the dummy row and you’ll get back the LOB value you wanted to save
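To make those last steps a bit more concrete, here’s a heavily hedged sketch – every value below (file ID, page ID, offset, byte count, hex data) is hypothetical and has to be worked out with DBCC PAGE for the specific database, and as I said above, this is only ever done on a copy of the damaged database, never in production:

-- Dump the page holding the dummy row to locate the in-row root fields
DBCC TRACEON (3604);
DBCC PAGE (N'CopyOfDamagedDB', 1, 12345, 3);

-- Overwrite part of the dummy row's LOB in-row root with the values harvested
-- from the corrupt row (repeat for each field; all numbers here are made up)
DBCC WRITEPAGE (N'CopyOfDamagedDB', 1, 12345, 4171, 8, 0x0000000001000100);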

Cool, eh? Desperate times call for clever measures, and with procedures like this I regularly recover data from client databases that Microsoft and other data recovery firms either can’t help with or have given up on. It’s not a fast process as it can take quite a while to figure out exactly what to modify to make things work, but when the data comes back, the clients are always ecstatic.

And with some knowledge of the data structures in a data file, careful use of DBCC WRITEPAGE, and plenty of practice, you can do it too.

Or call us to help you :-)

More on using Transaction SID from the transaction log

Back in 2012 I blogged about using fn_dblog and fn_dump_dblog to figure out the point at which something occurred that you’d like to restore to just before (e.g. a table drop). I also mentioned that you can use the SUSER_SNAME () function on the [Transaction SID] column for the LOP_BEGIN_XACT log record of the operation to find out who performed the operation.

Yesterday in our IE2 Performance Tuning class in Tampa, someone asked me what the [Transaction SID] column would show if someone had run EXECUTE AS. As I wasn’t 100% certain, I decided to test and write a quick blog post.

First off I’ll set up a database and table to use:

USE [master];
GO
CREATE DATABASE [Test];
GO
ALTER DATABASE [Test] SET RECOVERY SIMPLE;
GO
USE [Test];
GO
CREATE TABLE [TestTable] ([c1] INT IDENTITY);
GO
INSERT INTO [TestTable] DEFAULT VALUES;
GO 5

Next I’ll create a Kimberly user for a SQL login, and a Katelyn user for a Windows login:

-- Create Kimberly login and user
CREATE LOGIN [KimberlyLogin] WITH PASSWORD = 'NiceWife';
CREATE USER [KimberlyUser] FOR LOGIN [KimberlyLogin];
EXEC sp_addrolemember N'db_owner', N'KimberlyUser';
GO

-- Create Katelyn user
CREATE USER [KatelynUser] FOR LOGIN [APPLECROSS\Katelyn];
EXEC sp_addrolemember N'db_owner', N'KatelynUser';
GO

Now I’ll delete a single row as me and each of the users and logins:

-- Delete as me
DELETE FROM [TestTable] WHERE [c1] = 1;
GO

-- Now delete as Kimberly user
EXECUTE AS USER = N'KimberlyUser';
DELETE FROM [TestTable] WHERE [c1] = 2;
REVERT;
GO

-- Now delete as Kimberly login
EXECUTE AS LOGIN = N'KimberlyLogin';
DELETE FROM [TestTable] WHERE [c1] = 3;
REVERT;
GO

-- Now delete as Katelyn user
EXECUTE AS USER = N'KatelynUser';
DELETE FROM [TestTable] WHERE [c1] = 4;
REVERT;
GO

-- Now delete as Katelyn login
EXECUTE AS LOGIN = N'APPLECROSS\Katelyn';
DELETE FROM [TestTable] WHERE [c1] = 5;
REVERT;
GO

Finally I’ll pull the [Transaction SID] for each of the delete operations and pass it into SUSER_SNAME ():

SELECT
	[Operation], [Transaction Name], [Transaction SID],
	SUSER_SNAME ([Transaction SID]) AS [WhoDidIt?]
FROM fn_dblog (NULL, NULL)
WHERE [Operation] = N'LOP_BEGIN_XACT'
AND [Transaction Name] = 'DELETE';
GO
Operation       Transaction Name  Transaction SID                                             WhoDidIt?
--------------- ----------------- ----------------------------------------------------------- -------------------
LOP_BEGIN_XACT  DELETE            0x0105000000000005150000003A5014D05A957BF8F5C8882EE8030000  APPLECROSS\paul
LOP_BEGIN_XACT  DELETE            0x9A9A69BEACF67E4994E2F2DEE35BC02F                          KimberlyLogin
LOP_BEGIN_XACT  DELETE            0x9A9A69BEACF67E4994E2F2DEE35BC02F                          KimberlyLogin
LOP_BEGIN_XACT  DELETE            0x0105000000000005150000003A5014D05A957BF8F5C8882EFE030000  APPLECROSS\Katelyn
LOP_BEGIN_XACT  DELETE            0x0105000000000005150000003A5014D05A957BF8F5C8882EFE030000  APPLECROSS\Katelyn

So the answer is that the log record contains the SID of who you’re executing as. The only way to tell who is really running the code would be through auditing.

Enjoy!

Learning to listen

Fellow MVP Chris Shaw (b | t) wrote a blog post earlier this week on learning from mistakes and tagged me to write a similar post – so here it is.

Let me start with a story first. Back in April 1999 I was working for Microsoft on the SQL team as a developer in the Storage Engine team. I’d just joined a few months earlier after spending 5 years at DEC and certainly wasn’t a rookie developer, but I was new at Microsoft. I was writing the DBCC INDEXDEFRAG code and upgrading DBCC SHOWCONTIG (and merging in the functionality of the old, undocumented TVF fn_indexinfo), and I was pretty excited by it (and I still am – defrag/reorg is cool :-).

I was so excited that after I’d had the code for one major part of the project code-reviewed by a fellow dev, I checked it in. I remember standing in the corridor outside my office talking with that dev and a tester, when our boss came along and the conversation turned to my project, and I said something like ‘Yup, I just checked in the next part’. My boss’s face started to turn pink and he said ‘come into my office…’

I then got bawled out because I wasn’t supposed to check in before the boss had reviewed the code too. I messed up because I hadn’t been listening properly in our team meeting where code review policies had been explained. It wasn’t a nice experience and it was entirely my fault. I decided to start listening. Really listening.

Listening is a critical skill in life – both at work (interacting with clients and colleagues/friends) and at home (interacting with family and friends) – and in my opinion is the most important of the various communication skills (I talk about this and a lot more in my Pluralsight course Communications: How to Talk, Write, Present, and Get Ahead).

As much as possible I like to use what’s called ‘active listening’, where you’re making a conscious effort to focus on what is being said. This involves doing the following:

  • Stop typing and make sure the person can tell you’re paying attention. Even if you can type and listen at the same time, it’s rude and gives the impression that you’re not listening properly.
  • Make eye contact with the person (even just occasionally if that makes you feel uncomfortable). A lot of how humans express themselves involves using the face and eyes so nuanced conversations (e.g. business negotiations, spousal arguments) can be misunderstood without picking up on the visual clues.
  • Remove distractions that prevent you from listening (e.g. step away from a source of noise, turn your back on a television).
  • Ask for a repeat of something that’s not clear so you can make sure you understand it.
  • Summarize what you’ve just heard to make sure you heard what they think they said. This is a great way of being able to get someone to explain something again.

All of these things will make you a better listener, leading to a smoother work and home life. At least that’s the plan :-)

Occasionally I slip up and realize I didn’t listen properly to something important, and then I make doubly sure I’m doing it right for the next few times.

To end with, here are two quotes I like that have to do with listening:

“I like to listen. I have learned a great deal from listening carefully. Most people never listen.” – Ernest Hemingway
“We have two ears and one mouth so that we can listen twice as much as we speak.” – Epictetus