Here's a little script I knocked up this afternoon to tell me who has open transactions on the server - not just the single oldest active transaction that DBCC OPENTRAN returns.

It gives back:

  • session ID
  • transaction begin time
  • how many log records have been generated by the transaction
  • how much log space has been taken up by those log records
  • how much log space has been reserved in case the transaction rolls back
  • the last T-SQL that was executed in the context of the transaction
  • the last query plan that was executed (only for currently executing plans)

It's ordered by the transaction begin time. I had some trouble using CROSS APPLY with the sys.dm_exec_query_plan DMV - if the plan isn't available, it blows out the entire result-set for that transaction. After messing around for ten minutes I discovered the joys of the OUTER APPLY operator - which is the same as CROSS APPLY but allows NULL values from the function being cross-applied.

Also thanks to fellow MVP Aaron Bertrand (twitter|blog) for pointing out a mistake in the way I was calling sys.dm_exec_query_plan.

Here's the script with some example output:

SELECT s_tst.[session_id],
   s_es.[login_name] AS [Login Name],
   S_tdt.[database_transaction_begin_time] AS [Begin Time],
   s_tdt.[database_transaction_log_record_count] AS [Log Records],
   s_tdt.[database_transaction_log_bytes_used] AS [Log Bytes],
   s_tdt.[database_transaction_log_bytes_reserved] AS [Log Reserved],
   s_est.[text] AS [Last T-SQL Text],
   s_eqp.[query_plan] AS [Last Query Plan]
FROM sys.dm_tran_database_transactions s_tdt
   JOIN sys.dm_tran_session_transactions s_tst
      ON s_tst.[transaction_id] = s_tdt.[transaction_id]
   JOIN sys.[dm_exec_sessions] s_es
      ON s_es.[session_id] = s_tst.[session_id]
   JOIN sys.dm_exec_connections s_ec
      ON s_ec.[session_id] = s_tst.[session_id]
   LEFT OUTER JOIN sys.dm_exec_requests s_er
      ON s_er.[session_id] = s_tst.[session_id]
   CROSS APPLY sys.dm_exec_sql_text (s_ec.[most_recent_sql_handle]) AS s_est
   OUTER APPLY sys.dm_exec_query_plan (s_er.[plan_handle]) AS s_eqp
ORDER BY [Begin Time] ASC;
GO

session_id Login Name        Begin Time              Log Records Log Bytes Log Reserved Last T-SQL Text                      Last Query Plan
---------- ----------------- ----------------------- ----------- --------- ------------ ------------------------------------ ---------------
54         ROADRUNNERPR\paul 2010-02-01 15:28:48.560 2           236       8550         begin tran insert into t1 values (1) NULL
55         ROADRUNNERPR\paul 2010-02-01 16:38:18.373 3           356       8852         insert into t1 values (3)            NULL

(For the hardware setup I'm using, see this post. For the baseline performance measurements for this benchmark, see this post.)

In my previous post in the series, I described the benchmark I'm optimizing - populating a 1-TB clustered index as fast as possible using default values. I proved to you that I had an IO bottleneck because the IOs to the data file (from checkpoints) and the transaction log file (from transactions committing) were causing contention.

Several people commented that I might have mis-configured the iSCSI IO subsystem - so first off I want to look at that. Fellow MVP Denny Cherry (twitter|blog), who knows a lot more than me about IO subsystems, volunteered to discuss my iSCSI setup with me to make sure I hadn't goofed anywhere (many thanks Denny!). It seems like I haven't. I'm using a single iSCSI array right now, with a single NIC on the server dedicated to the iSCSI traffic - once I move to multiple volumes, I'll add in more NICs.

Looking at Task Manager in the middle of a 6-hour test run to see the network utilization through the NIC shows that it's not saturated, as shown below.

 

I ran the DELL smcli utility for two hours during the most recent test to see what peak throughput I'm getting, using the following command:

smcli -n Middle_MD3000 -c "set session performanceMonitorInterval=5 performanceMonitorIterations=1440;save storageArray performanceStats file=\"c:\\MiddlePerfStats.csv\";"

I saw around 101MBytes/sec. and earlier when testing the smcli settings I'd seen 106MBytes/sec. I'm sure once I remove some of the contention that this will get better.

On to the test!

The first thing I wanted to try was optimizing my use of the transaction log - i.e. doing as few and as large IOs as possible to the log. My hypothesis is that by changing the batch size from one to, say, 10 or 100 SQL Server can make more efficient use of the log.

I changed my T-SQL script to take a batch size parameter and use explicit transactions inserting <batch-size> records. Everything else remained the same as the baseline. I picked 128 concurrent threads as my test point. In the baseline, the 128-thread insert test took 8 hours 8 minutes 27 seconds to complete (29307 seconds). My simple T-SQL code changed to (with $(rows) and $(batch) being SQLCMD parameters to the script):

DECLARE @counter BIGINT;
DECLARE @inner   SMALLINT;
DECLARE @start   DATETIME;
DECLARE @end     DATETIME;

SELECT @counter = 0;
SELECT @start = GETDATE ();

WHILE (@counter < $(rows))
BEGIN
   SELECT @inner = 0;
   BEGIN TRAN;
   WHILE (@inner < $(batch))
   BEGIN
      INSERT INTO MyBigTable DEFAULT VALUES;
      SELECT @inner = @inner + 1;
   END
   COMMIT TRAN;
   SELECT @counter = @counter + $(batch);
END;

SELECT @end = GETDATE ();

INSERT INTO msdb.dbo.Results VALUES (CONVERT (INTEGER, DATEDIFF (second, @start, @end)));
GO

Below are the results for 128 threads with batch sizes varying from 10 to 10000:

Increasing the batch size from 1 record to 10 records caused the elapsed time to drop from 29307 seconds to 21167 seconds - a 28% improvement! However, increasing the batch size further didn't improve things any more. Hmmm.

Next I tried different numbers of concurrent connections with a batch size of ten to see if the improvement was universal:

 

So moving from a single-record batch to a ten record batch has the same dramatic effect on performance for varying numbers of connections.

What's going on?

The answer involves some deep internals of the transaction log structure and IO behavior. The transaction log is split up internally into sections called virtual log files (VLFs for short). These VLFs are used to manage which parts of the log can be overwritten with new log records. If this is all unfamiliar, go read my TechNet Magazine article on Understanding Logging and Recovery in SQL Server which explains about the log in more detail and about transactions, otherwise what's coming next may not make much sense.

Whenever a transaction commits, all the transaction log records up to and including the LOP_COMMIT_XACT log record for that transaction (including log records from other transactions that are inter-mingled with those from the one that just committed) must be written out to disk, so the transaction is durable (called write-ahead logging). But it's not individual log records that are written to disk, the unit of IO for the transaction log is a log block. VLFs are split internally into dynamically sized log blocks, ranging from 4KB to 60KB in 512-byte increments, as needed. There are algorithms to figure out how big log blocks should be based on what log records are being flushed out.

With the single record batch, the amount of log generated in the transaction totals 4952 bytes. You can find this by doing something like:

CHECKPOINT;  -- to force the log in the SIMPLE recovery model to clear
GO

INSERT INTO MyBigTable DEFAULT VALUES;
GO

SELECT * FROM fn_dblog (NULL, NULL); -- show me all active log
GO

Then add up the values in the Log Record Length column for all records from the first LOP_BEGIN_XACT log record to the LOP_COMMIT_XACT with the matching Transaction ID.

Anyway, a single-record transaction generates 4952 bytes of transaction log, which will be flushed out in our case in a log block that's 5120 bytes (the nearest multiple of 512 above 4952), with a bunch of zero-padding at the end. For ten single-record transactions, that's 10 small IOs.

The ten-record transaction generates 48688 bytes of transaction log and will be flushed out in a log block that's 49152 bytes (the nearest multiple of 512 above 48688). This is clearly more efficient than 10 smaller IOs and is why changing to batches makes things go faster.

A 100-record transaction generates 489628 bytes of transaction log, which is clearly more than the 60KB log-block maximum, so it actually covers multiple log blocks (varying from 14-16 in my tests - I don't remember the exact algorithms). You can see the log block changing when the middle number in the Current LSN column of the fn_dblog output increases. It looks like it's jumping up, and it is - this is the offset of the log block within the current VLF divided by 512.

Because the 100-record transaction is split into multiple log blocks, there's no real IO gain during log flushes over the 10-record transaction - which is illustrated by the results I saw.

Now, even with this speed increase from the increased batch size, the average-disk-write-queue-length is still anywhere from 20-40 when running the 128-connection test, so there's still a significant bottleneck there. In fact, the transaction log grew significantly still during these tests (up to 23GB in one case) - for an explanation of this phenomenon, see Interesting case of watching log file growth during a perf test. I picked up some wait stats queries from fellow-MVP Glenn Berry (twitter|blog) to run while the tests were running. I'm using his as they're published and anyone can download and use them (see his blog post for the queries, and Bob Ward's CSS blog post that's starting to document the wait types).

His queries tell me that with the 10-record per batch and 128-connections:

  • Average write-stall to the data file is 37ms
  • Average write-stall to the log file is 18ms
  • Top wait types are PAGELATCH_EX (55% of all waits), PAGELATCH_SH (28% of all waits), and WRITELOG (14% of all waits)

The first two waits are the Storage Engine waiting for the buffer pool to kick out dirty pages so that buffers can be used to hold newly created pages, and the third is the log manager waiting for log block flushes to disk to complete.

Clearly IO IO IO.  Let's look at a a perfmon capture during the 10-records per batch test with 128 connections:

 

Let's go through each counter (top to bottom in the list in the image) and explain what's going on. I deliberately picked this time-slice, as it really simply shows the effect of contention:

  • Pages Allocated/sec: this is the light blue line and is the Access Methods part of the Storage Engine (the dev team I used to run) creating new data and index pages for the clustered index we're populating.
  • Checkpoint pages/sec: this is the pink line at the bottom left and bottom right of the capture. This is the buffer pool writing out dirty pages during a periodic checkpoint.
  • Avg. Disk sec/Write: this is the dark blue line that's tracking just above the thick black line. It's the amount of time in ms for a write to complete. You can see that it has a minimum around 12:51:00 and then varies wildly, hitting as high as 50+ms for a single write.
  • Avg. Disk Write Queue Length: this is the highlighted line in thick black. It also has a minimum around 12:51:00 and varies wildly the rest of the time.
  • Disk Write Bytes/sec: this is the dark green line at the top that shows the number of bytes being written to disk from all IO sources. Same story around 12:51:00.
  • Log Growths: A simple counter since the database was created/server started. It's over 100 and off the chart.
  • Log Bytes Flushed/sec: this is the red, fairly constant line around 1/3 the way up and is log blocks being flushed to disk because of transaction commits or checkpoints.
  • Lazy writes/sec: this is the light green line at the bottom and is the buffer pool having to force data pages to be written to disk (along with all transaction log flushed up to the point of the last log record to change the page being written) to make space for images of newly created pages.

This time-slice is really cool in that it shows what happens when contention goes away. Just before 12:51:00, a checkpoint ends and the lazywriter has nothing to do - so the only IOs hitting the disks are those coming from the transaction log flushing out log blocks as transactions commit. You can see the Avg. Disk Write Queue Length drop down to 2-3, the Avg. Disk sec/Write drop to about 5ms, and most beautiful of all (look, I'm a big geek ok? :-), the Disk Write Bytes/sec (the green line) drops down to be exactly equal to the Log Bytes Flushed/sec - proving that it's just log flushes hitting the disk. This is the no-contention case. It happens again for brief spell about 10 seconds later - the lazywriter most likely created a temporary surfeit of empty buffers. All the rest of the time, the lazywriter and checkpoints play havoc with the write throughput on the drives by causing contention.

It's clearly time to try some separation of files to relieve the contention - and that's what I'll cover in the next post in the series.

Hope you're enjoying the series - these take a long time to write up!

Back in September I blogged about an old 2005 bug that prevented DBCC CHECKFILEGROUP checking the partitions of an object on the specified filegroup unless *all* partitions of the object are on the specified filegroup (not a smart way to set things up!). The bug was fixed ages ago in 2005 but has only just been fixed in 2008. The post which explains the bug in more detail is DBCC CHECKFILEGROUP bug on SQL Server 2008. This is an important feature to be able to split the consistency checks of a partitioned VLDB over a series of days - see CHECKDB From Every Angle: Consistency Checking Options for a VLDB for more details.

SQL Server 2008 SP1 Cumulative Update 6 (which you can get here) has the bug fix for 2008 finally. The KB article which describes the bug is 975991.

You're all running regular consistency checks, right?

(For the hardware setup I'm using, see this post.)

As part of my new benchmarking series I first wanted to play around with different configurations of data files and backup files for a 1-TB database to see what kind of performance gains I can get invoking the parallelism possible when backing up and restoring the database. To do that I need a way to quickly populate a 1-TB database so that I can mess around with different numbers of files and so on. It doesn't matter what the data in the database is, as backup doesn't care - as long as there's a terabyte of it. Why a terabyte? It's a nice round number, it's a common enough database size right now, and I have the storage to play around with it.

But then my plans got derailed. In figuring out how to most quickly populate a 1-TB database, I realized that in itself would be an interesting benchmark to explore, so that's what I'm doing first.

My aim is to give you improvements you can use in real life. If you think this is going to be boring, skip down to the end of the post where I show a detailed perfmon and explain what's going on in my overloaded IO subsystem, then come back up here :-)

The baseline for this benchmark is contrived - I'm going to populate a 1-TB clustered index (so I can do other experiments with the index) as quickly as I can. The interesting part is that I'm starting with a brain-dead database layout, configuration, and method of populating the table, and then I'm going to alter different things to see the effect on the system. The effects and gains will be the interesting part as it will expose parts of how SQL Server works which *WILL* be applicable to real-life situations and workloads - the whole point of me doing all of this is to show you improvements, why they work, and how they could be useful to you.

When doing any kind of performance testing it's *essential* to have a baseline with which to compare - otherwise how do you know what effect a variation is having? This post describes my baseline setup, measurements, and limitations I start to notice.

First let me describe the baseline setup:

  • Single data file and log file stored on the same volume, on an 8-drive RAID-10 array (each drive is a 300GB 15k SCIS drive), connected through iSCSI to the server
  • Data file is created to be 1-TB, with instant file initialization enabled
  • Log file is created to be 256MB
  • Database is using the SIMPLE recovery model

Yes, I'm deliberately putting the data and log on the same RAID array. I *want* to see some contention so I can prove to you how separation of data and log can reduce contention and improve performance sometimes.

Database and table creation script:

CREATE DATABASE BackupPerfTest ON
(NAME = 'BackupPerfTest_Data',
   FILENAME = 'K:\BackupPerfTest.mdf',
   SIZE = 1TB,
   FILEGROWTH = 1GB)
LOG ON
(NAME = 'BackupPerfTest_Log',
   FILENAME = 'K:\BackupPerfTest.ldf',
   SIZE = 256MB,
   FILEGROWTH = 50MB);
GO

ALTER DATABASE BackupPerfTest SET RECOVERY SIMPLE;
GO

USE BackupPerfTest;
GO

CREATE TABLE MyBigTable (c1 BIGINT IDENTITY, c2 CHAR (4100) DEFAULT 'a');
GO
CREATE CLUSTERED INDEX MyBigTable_cl ON MyBigTable (c1);
GO

I figured the fastest way to fill the database is to have a single table with one row per page, and that having SQL Server create the large CHAR column inside itself, rather than me doing a REPLICATE, would be quickest.

For the actual population of the table, I worked out that 134217728 table rows gives me a terabyte, with each row taking up a single 8KB page.

The baseline setup of the table population is:

  • Varying numbers of concurrent connections (16, 32, 64, 128, 256) to the server (8-way DELL PowerEdge 1950)
  • Each connection runs a simple script that inserts 134217728 / number-of-connections rows into the table, and then logs the elapsed time into a results table
  • Each insert is done as a single-statement implicit transaction (if I don't do an explicit BEGIN TRAN/COMMIT TRAN, SQL Server does it for me)
  • A monitor connection pings the results table every 5 minutes checking to see if number-of-connections results are there, and sending me email if so

Yes, I'm deliberately using this method to insert each row. Again, I want to be able to make improvements and see the effect of the changes.

Each connection will be running this script:

SET NOCOUNT ON;
GO

DECLARE @counter BIGINT;
DECLARE @start   DATETIME;
DECLARE @end     DATETIME;

SELECT @counter = 0;
SELECT @start = GETDATE ();

WHILE (@counter < $(rows))
BEGIN
   INSERT INTO MyBigTable DEFAULT VALUES;
   SELECT @counter = @counter + 1;
END;

SELECT @end = GETDATE ();

INSERT INTO msdb.dbo.Results VALUES (CONVERT (INTEGER, DATEDIFF (second, @start, @end)));
GO

This is run through SQLCMD, and the number of rows to insert is passed into the T-SQL script using:

sqlcmd -S(local)\SQLDev01 -dBackupPerfTest -i"C:\SQLskills\Populate1TBTest\PopulateTable.sql" -v rows=%1

%1 in the line above is passed from a master CMD that kicks off number-of-connections CMD windows, each of which just runs the SQLCMD line above.

So - a simple setup.

Here's a graph of the results:

 

The elapsed times for all connections to complete their work (as there could be up to an hour between the first and last to complete) were:

  • 16-way: 15 hours 25 minutes 5 seconds
  • 32-way: 13 hours 50 minutes 18 seconds (11% faster)
  • 64-way: 10 hours 12 minutes 48 seconds (27% faster)
  • 128-way: 8 hours 8 minutes 27 seconds (20% faster)
  • 256-way: 7 hours 24 minutes 21 seconds (9% faster)

More connections clearly leads to a faster runtime, but the improvements from doubling the number of threads clearly aren't directly proportional to the number of threads. The biggest improvement was from 32 to 64 threads, and then the percentage gain started to tail off. Let's look at the page allocations per second for each experiment too:

 

As I'd expect, the pace at which pages are being allocated in the database increases with more threads and the percentage improvements line pretty much matches that of the elapsed time graph above. There's a slight difference in the 128 and 256 gains here because the graph is show what the perf counter number was after the experiment reached a steady state. I noticed that the log grew enormously for the last few tests, which caused the steady-state number to not be reached for a while. I've already blogged about that phenomenon in Interesting case of watching log file growth during a perf test.

I also watched what was happening in perfmon to see if there were any obvious performance problems going on. Here's a perfmon graph for the 64-way test once it reached steady-state and the log wasn't growing:

 

Analysis: 

  • The black line is above 0 when a checkpoint is occuring.
  • The green line represents how much data is being written to the K: volume, where the data and log file are. You can see there's a constant amount of data being written all the time (transaction log records when the implicit transactions commit) with large spikes whenever a checkpoint occurs and causes the actual data pages to be written out.
  • The light blue line is the pages allocated per second. You can see that it takes a major nose dive whenever a checkpoint occurs. I'm speculating that this is because of disk contention preventing the transaction log being written to disk (thus slowing down the transaction throughput) while the checkpoint is occuring and writing out data pages
  • The dark blue line at the bottom is the average disk seconds per write. You can see that it hovers around 4-5 milliseconds and spikes to 16-17 when a checkpoint occurs.
  • The brown line in the middle is the average disk write queue length. It hovers around 18-19 and spikes to around 25 when a checkpoint occurs.

Observations:

  • This system is clearly bottlenecked in the I/O subsystem
  • There is a direct correlation between checkpoints occuring and: increased disk queue length, reduced transaction throughput

This is what I would expect to see on a system with lots of updates where the log and data are on the same volume. Remember that I've got them on a fast RAID-10 array. This debunks the theory I've often heard that contention doesn't matter on RAID arrays. Yes, it does. You can still max out the throughput capabalities of any IO subsystem - it just depends what you're trying to do with it. Imagine if I had nonclustered indexes on this table too - more logging and more pages being allocated - much worse performance...

So now I've got my baseline and there are some obvious things I can try to improve things:

  • Data and log file separation
  • Varying number of data files
  • Varying placement of data files (e.g. different RAID arrays)
  • Manual checkpoints
  • Pre-sizing the log
  • Using explicit transactions with varying batch insert sizes
  • Using -E to get larger allocations

I'm not saying that all of these things are going to help, but over the next few weeks I'm going to try them all and report on the results. You'll be able to clearly see the effect of changing these on my benchmark, running on production-quality hardware, rather than just taking people's words for it.

I hope you're going to find these benchmarks and experiments useful - I'll be learning (hopefully) as I go along too.

Let me know if there's anything else you'd like to see me try, and if you're following along (I don't want to spend all this time if no-one's reading the series!)

Thanks!

Okay, this post is describing some new services we provide (we are a consulting firm after all :-), but there's a promotion at the bottom where you can save $$$ on one of the new services.

We're starting to offer a set of standalone auditing services that we can perform remotely, greatly reducing the cost of having someone come on-site to evaluate your SQL Server environment. The new standalone audit services we are providing are:

  • Database maintenance and operations audit of your SQL Server environment
  • Disaster recovery and high-availability audit of your SQL Server environment

The remote audits take the form of us giving you some non-intrusive scripts to run on each server, plus a detailed questionnaire to fill in. We analyze the results and send you a list of recommendations, with supporting explanations and links to deeper information.

As part of a regular remote consulting engagement, we'd also be engaging with the DBA team through web-meetings, phone calls, email conversations to undertake performance tuning, design work, and so on. However, many people just want to buy a small, limited block of time to run a quick audit and get the OK or a set of changes to implement to improve operations - this is where the standalone audits come in.

The standalone audit services we offer take a much shorter time (depending on environment size) than a regular consulting engagement, as they don't involve web-meetings and in-depth interaction with the DBA team, so they're a lot more cost-effective when on a tight budget. Of course, we also provide remote performance tuning, available in small blocks of time rather than an open-ended engagement, and the traditional 'figure out all sorts of problems' consulting engagements.

These audits are usually billed at our regular offsite consulting rate, but as an initial promotion I'll perform a database maintenance audit for $500 to the first 10 new customers who sign up before the end of January, limited to 4 hours of my time (a 50% saving over our regular rates).

Shoot me an email (through the contact button at the top of the page) if you're interested in any of these services and/or want to be one of the lucky few to take part in the promotion.

We look forward to working with you!

Categories:
Auditing | Consulting

I'm running some performance tests on the hardware we have (more details on the first of these tomorrow) and I was surprised to see some explosive transaction log growth while running in the SIMPLE recovery model with single row insert transactions!

Without spoiling tomorrow's thunder too much, I've got a setup with varying numbers of connections populating a 1TB table with default values, with the single data and log files on the same 8-drive RAID-10 array (again more details on why tomorrow). I was looking at perfmon while the test was running with 128 concurrent connections, and noticed some weird-looking behavior - basically what looked like a never-ending checkpoint occuring. So I dug in deeper and discovered the transaction log had grown over 7GB since the start of the test. Huh?

I restarted the test from scratch, with a 1TB data file and a 256MB log and watched what happened in perfmon; here's the screenshot of the first 90 seconds:

 

This is *so* interesting. The black line is the number of log growths, so you can see the log grows every time the line level goes up. The bright blue line is the number of pages being allocated per second to hold all the table rows my 128 connections are inserting. You can clearly see that every time there's a log growth, the allocations take a nose-dive - because no transactions can commit while the new portion of the log is being zeroed (remember that instant file initialization does not apply to the log - see this blog post). The green line (disk write bytes per second) spikes when the log grows because of the zeroes being written out to the log file by SQL Server (remember that SQL Server does the zeroing).

But why is the log growing in the first place? I'm in the SIMPLE recovery model and doing single-row implicit transactions (yes, I'm deliberately doing this) so there shouldn't be anything stopping the log from clearing during a checkpoint, right?

Wrong. The log starts out small (256MB) so one of the thresholds for triggering a checkpoint gets hit pretty fast (70% of the log file is used). So a checkpoint occurs (you can see the checkpoints occuring when the pink line at the bottom of the perfmon screen is above zero), and starts writing out to the data file, which is on the same disk as the log (see How do checkpoints work and what gets logged for an explanation of checkpoint mechanics), but it can't write fast enough (because of disk contention) to get to the log-clearing part before the log fills up completely and has to grow (because transactions are continuing at break-neck speed). So the log grows, and the insert transactions stop while the log is zeroed. And then the log starts to fill up again very quickly and another checkpoint is triggered, and so on and so on.

Eventually a steady state is reached where there's enough free log space during a checkpoint that no new log is required for the concurrent transactions to commit. You might ask why the checkpoint is so slow in the first place? Because I deliberately put the data file on the same RAID array as the log file, and both are being steadily hammered with writes - classic disk contention. Even though the RAID array is RAID-10 with 8x300GB 15k SCSI drives, the average disk queue write length is over 20 most of the time during the 128-way test because I'm simply trying to do too much.

The point of my blog post? Just because you don't have any of the classic causes of transaction log growth going on, doesn't mean you're going to be immune. In this case my (deliberate) poor physical layout of the database files and workload growth up to 128 concurrent connections caused the log to grow. What started out working when I was running 16 connections didn't work any more at 128 (actually I went back and re-ran some of the earlier tests and even with only 64 connections, the log grew to over 1GB before reaching steady-state).

Interesting eh?

Tomorrow I'll be continuing the perf/benchmarking series by creating my first benchmark and then tweaking the setup to see how I can improve performance (for instance with multiple data files, separation of log and data files - all the things I preach but have never *demonstrated*), but this behavior merited a post all on its own.

At the start of the year I decided to get seriously back into playing with electronics after a long haitus. I agonized about where to blog about all this stuff (as I like to blog a *lot*, as you all well know) and I initially thought it would be cool to have everything intermingled on this blog. Then I started to play and realized that I have a *ton* of stuff I want to blog about around electronics and microcontrollers - photos, videos, code - and that it would really interfere with the purpose of this blog - disseminating SQL Server info. Couple that with the fact that this blog is picked up by lots of SQL Server blog aggregators, and I started to worry that people would get a little miffed by *lots* of non-SQL stuff coming from this blog.

Soooo I decided to create a completely new blog on the site today, with a new 'theme'. I now know lots about CSS style sheets, ASP.NET, IIS, and BlogEngine.Net - probably way more than I really wanted to. But I have a nice shiny new blog dedicated to my electronics adventures. I've deleted the two posts since 1/1/10 about electronics and re-posted on the new blog (sorry to the 3 people who'd commented!). There's a link to the new blog at the top-left of this one - feel free to subscribe, drop in from time to time, or ignore it completely :-) 

You can get to the new blog at Paul's Electronics (in homage to the name of the electronics kit my Dad made me when I was 10).

Normal service will be resumed here very shortly...

Categories:
Personal

Yesterday I was discussing life-direction-changing-events with Kimberly and our great friend Libby and I thought it would be interesting to find out from people how their lives brought them to where they are today. I'm often tagged in these kinds of blog posts from other people, so today I'm going to start one of my own (I feel the power tingling through my veins!!! Er, ahem).

I'll go first. There are three very clear events/decisions in my life that brought me to where I am today.

Event #1 is the single most important one, as it shaped the way I developed my intellectual interests forever. On my tenth birthday in 1982, my wonderful Dad gave me an electronics kit and an electronics book. The book was Simple Electronics, a British Ladybird series book from 1979, aimed at helping kids build little electronics projects. Now that in itself wasn't wonderful, but Dad actually made me the kit from scratch over a few weeks in his shed (think 'shop' if you're in the US), including a cool box to keep the book, my wooden circuit boards (with screws to hold the components in place), and all the components, sorted in little bags and boxes with labels on. I think this is the best present I've ever been given - thanks Dad!

Pictures of the book and the box are below (with this blog post in the background - click for bigger images):

 

Sadly the book isn't the original one he gave me, but I bought one on Ebay a couple of years ago for posterity. The box, of course, is the original one that I still treasure. This kit changed my life completely - I became totally absorbed in electronics (see my recent blog posts here and here for the start of the revival), leading on to 8-bit computers, and my engineering degree in computer science and electronics from the University of Edinburgh. This birthday present put me on the course to being an engineer. But I almost didn't go to college. That happened because of event #2.

Event #2 was what stopped me joining the Royal Navy as a Weapons Electronics Officer. With my Dad having been in the Royal Navy as an engineer (he used to teach nuclear reactor theory and control electronics at the Navy's engineering school - H.M.S. Sultan - smart guy - the person I look up to most in the world), our family life was entwined with the Navy and it's way of life. I loved it and planned to follow my Dad's footsteps. I was in the Navy Cadets at school and did all kinds of summer courses with the Navy. Then when I was 17, I spent a week on H.M.S. Glasgow, a Type-42 Destroyer that saw serious action in the 1982 Falklands War, sailing from Edinburgh, around the top of Scotland through the Pentland Firth and The Minch, down to Glasgow on the opposite coast.

Here's a picture of H.M.S. Glasgow (click for a bigger image):

 

This *is* the original one, but sadly *isn't* mine. Lot's of scope for rampant megalomania with one of these to play with :-)

It was a fantastic experience but I discovered that I get badly seasick (and that lots of the equipment was pretty old and battle-hardened (i.e. 'simple')) so that put paid to my plans to join the Navy. Off to college...

Event #3 happened quite a bit later, and was my decision to accept a job offer from Microsoft, move to the US from Scotland and start being a SQL geek instead of a VMS file-system geek. The story behind that is explained in the first few Q&As in an email interview I did last year with Tom LaRock (twitter|blog). Without that decision, I'd never have been into SQL Server, started presenting, met Kimberly, etc etc and you wouldn't be reading this blog post.

So, those are my three life-changing events. What do you think are yours? It's not what got you to be a DBA, it what got you to where you are in your life.

In the great spirit of the SQL Community, I'm going to tag a few of my friends to have a go next who I think will provide interesting answers in an entertaining way (no pressure!) - if you blog in this series, please link back here so I can make sure I read them.

Enjoy!

Categories:
Personal

I finally had some time over the last two days to play with the Arduino board I picked up late last year. The Arduino is a pretty neat concept - wrapping a microcontroller up in a neat board that makes playing with sensors, displays, motors, etc and prototyping very simple. It's all open source and you can read more about it on their homepage (http://www.arduino.cc/) which also has a freeware IDE to use for programming. The board I have uses Amtel's ATmega328P processor, with 32K of flash memory and can do 20MIPS. They're very popular and opening up electronics and gadget hacking to non-techies.

Here's the 2009 rev of the Arduino Duemilanove board (image from their website, click for larger version):

 

They cost about $30 - I got mine as part of a kit from the Nuts'n'Volts magazine store but loads of online stores have them.  I just picked up some accessories yesterday from SparkFun who have the full range, including the Arduino Mega which has 54 IO pins - can't wait for that to arrive!

The possibilities for this are just endless. The IDE provides a full C++ environment with a bunch of helper classes already defined, which takes a lot out of the tedium of programming microcontrollers. If you're going to play with this, I recommend using some of the samples that come with the IDE and on their very extensive web site.

My current interest is with making things light up in clever ways so I thought I'd start off by writing a simple program to play with an LED array. The circuit's very simple: pins 2-11 from the Arduino connected through 220ohm resistors to the LED array, which is connected to ground on the other side. Future, more complex projects will include a circuit diagram (once I find a nice freeware program to do it).

Here's a photo of the board connected up and a close up of the very simple circuit (click for larger versions):

  

I put together two easy programs - one to move the lit LED from right-to-left and back again and one to move the lit LEDs from the middle out to the two sides and back in again. Kind of Knightrider-esque, but also the way the old SUN machines I used at university had their status lights on the back of the machine.

The code for the first one is:

/*
  10-bar LED array
 
  Connect the LED array to pins 11-2 through a 220R resistor, and to ground on the other side.
  01/14/2010
*/

void setup ()
{
  for (int loop = 2; loop < 12; loop++)
  {
    pinMode (loop, OUTPUT);    // Set the pin IO mode
    digitalWrite (loop, LOW);
  }
}

void loop ()
{
  // The loop has to start at 2 because 1 isn't an IO pin.
  // Go from right to left
  for (int loop = 2; loop < 12; loop++)
  {
    // Switch the LED on for 10 milliseconds
    digitalWrite (loop, HIGH);
    delay (10);
    digitalWrite (loop, LOW);
    delay (20);
  }
 
  // And from left to right again
  for (int loop = 11; loop > 1; loop--)
  {
    digitalWrite (loop, HIGH);
    delay (10);
    digitalWrite (loop, LOW);
    delay (20);
  }
}

And the second one changes the loop to be:

void loop ()
{
  // Start on LED 6 and go up to 10 (remember, shifted by 1)
  for (int loop = 7; loop < 12; loop++)
  {
    // Light the LED on the LHS of middle and the matching one on the RHS
    // This will light LEDs 6-5, 7-4, 8-3, 9-2, 10-1 (shifted by 1)

    digitalWrite (loop, HIGH);

    digitalWrite (13 - loop, HIGH);
    delay (10);
    digitalWrite (loop, LOW);
    digitalWrite (13 - loop, LOW);
    delay (20);
  }

  // And back down to the middle again
  for (int loop = 11; loop > 6; loop--)
  {
    digitalWrite (loop, HIGH);
    digitalWrite (13 - loop, HIGH);
    delay (10);
    digitalWrite (loop, LOW);
    digitalWrite (13 - loop, LOW);
    delay (20);
  }
}

Note there's no main() function - it's all taken care of. The wrapper calls your setup() function and calls the loop() function in an infinite loop. You only have to provide these functions and you can use all the C++ programming constructs (if you want to) or keep it pretty simple.

I took some short (5 second) movies of each program running - you can get them from SideToSide movie and MiddleToSides movie (1.9MB each).

This was just my introduction to the environment, the real fun will start with some of the projects I have lined up:

  • 3-D LED cubes (3x3x3, 4x4x4, 8x8x8 and maybe higher - I have 1000 3mm red LEDs winging their way towards me - $29.95 from Rackmount-Devices.com)
  • panel of 8x8 LED arrays controlled through MAX7219 chips
  • object recognition with a simple CMOS camera
It's going to be a fun year! This is going to satisfy not only my desire to get back into electronics and start tinkering, but also to write some nifty low-level code.

As always, let me know if this is interesting, you're doing something similar, want something explained, or you have an idea for a cool project.

Enjoy!

Categories:

A couple of years ago I blogged about 3rd-party file system filter drivers and how if they're not coded correctly to cope with NTFS alternate streams they will cause all kinds of weird corruption errors to be reported when DBCC CHECKDB is running. At the time I'd only seen the issue a couple of times so thought no more about it until a few weeks ago.

I was settling down to watch a movie with my laptop nearby when I got involved helping fellow-MVP Jonathan Kehayias (blog|twitter). As things progressed it dawned on me that his CHECKDBs were fine using regular database snapshots, but failed using the automatically-created snapshots, which use NTFS alternate streams (see this blog post where I explain about them). The errors produced were as below (borrowed from Jonathan's blog post):

2009-12-10 01:35:44.04 spid75      Error: 17053, Severity: 16, State: 1.
2009-12-10 01:35:44.04 spid75      E:\SQLData\MSSQL.1\DatabaseName.mdf:MSSQL_DBCC16: Operating system error 1784(The supplied user buffer is not valid for the requested operation.) encountered.
2009-12-10 01:35:44.07 spid98      DBCC CHECKDB (DatabaseName) WITH all_errormsgs, no_infomsgs, data_purity executed by Domain\UserName found 0 errors and repaired 0 errors. Elapsed time: 0 hours 0 minutes 9 seconds.
2009-12-10 01:35:44.45 spid98      Error: 926, Severity: 21, State: 6.
2009-12-10 01:35:44.45 spid98      Database 'DatabaseName' cannot be opened. It has been marked SUSPECT by recovery. See the SQL Server errorlog for more information. 

Lo-and-behold, it turned out to be a Diskeeper 10 filter driver that had a bug in it. You can read Jonathan's more in-depth recounting of the story on his blog here.

Diskeeper issued a fix a couple of weeks back, and Microsoft Product Support also put out a blog post with some details. Links are:

Hope this doesn't happen to you!

I've blogged about ghost records and the ghost cleanup task a couple of time before (the only place it is really explained AFAIK), but one of my fellow MVPs was asking me some questions about it today for a customer of theirs and couldn't find the trace flag to turn it off.

My previous blog posts about this are:

These explain what ghost records are and how the ghost cleanup process works.

On big systems it's possible for the ghost cleanup process to fall behind the rest of the system, with no hope of it catching up. It's a single-threaded task - so imagine a 16-way box with lots of deletes occuring, and a single CPU spending a few seconds every 5 seconds trying to remove all the ghosted records resulting from the deletes of all the other CPUs. It's pretty obvious that the ghost cleanup process is going to lag behind.

The problem with this scenario is that the ghost cleanup process will still pop up every 5 seconds (every 10 on 2008) and start removing ghost records, potentially causing performance issues by keeping pages in the buffer pool, generating log records, and causing physical IOs. The ghost cleanup task is also one of the background processes that can cause IOs to occur on what looks like a totally quiescent system.

There is a way to turn off the ghost cleanup task, using trace flag 661, as documented in KB 920093. Be careful though!!! If you disable the ghost cleanup task, the space taken up by deleted records will *NOT* be released for reuse by SQL Server until you do something else to remove it, like rebuilding an index.

One method people sometimes consider is to force ghost cleanup to clean everything by performing a table or index scan (thus queuing all the deleted records up for the ghost cleanup task). Although this is an alternative, it still uses the ghost cleanup task to do the work, and on a very busy system with a very large number of deletes (warning: generalization! :-) it can be much more efficient to remove the deleted-but-not-yet-reclaimed records using index reorganize or index rebuild.

Turning on this trace flag can be useful as a performance gain on systems with very heavy delete workloads, but only as long as you're careful about it. It's not something that's generally recommended but it may be useful to you.

Enjoy!

This is a quick post to clarify an article I saw on SQLServerCentral this morning that seemed to state that transaction rollbacks push data into database snapshots. This is absolutely not true.

A database page is copied into a database snapshot before it is changed in the source database. Although the mechanism is commonly called copy-on-write, it's more technically accurate to call it copy-before-write (but this makes it a bit harder to understand for many people). Once a page has been copied into the database snapshot, it is never removed from the database snapshot, and won't ever be copied into it again, as the database snapshot already has the correct point-in-time copy of the updated page. (For more info on database snapshots in general, see the Books Online entry Database Snapshots.)

A transaction makes one or more changes to the database, updating one or more pages. These pages will be copied into the database snapshot if they're not already there, so the pre-change image (that existed at the time the database snapshot was created) is preserved.

If the transaction rolls back, the rollback occurs by generating the reverse operations that the transaction performed and applying them to the database (e.g. an insert will be rolled back by the generation and application of a delete; an update will be rolled back by replacing the updated parts of the record with the pre-update values). I'll explain more about this in a future blog post.

These rollback operations will occur on the same pages that the initial transaction operations occured on. This means that no other pages will be changed by the rollback operations and so no further pages will be copied into the database snapshot by a rollback. Pages can't be removed from the database snapshot when the transaction rolls back because they have still changed in the source database (although the net effect of the transaction+rollback is no logical changes to the data, the page headers will have changed to have an updated Log Sequence Number on), and so the copy in the database snapshot is still required to preserve the point-in-time view of the database (at the physical level) as of the time the database snapshot was created.

I'll prove this to you with a simple script that you can play around with to convince yourself also.

USE master;
GO

DROP DATABASE SnapRollbackTest_Snapshot;
GO
DROP DATABASE SnapRollbackTest;
GO

CREATE DATABASE SnapRollbackTest;
GO
USE SnapRollbackTest;
GO

CREATE TABLE MyTable (c1 INT);
CREATE CLUSTERED INDEX MyTable_CL ON MyTable (c1);
GO

SET NOCOUNT ON;
GO

DECLARE @a INT;
SELECT @a = 1;
WHILE (@a < 100001)
BEGIN
    INSERT INTO MyTable (c1) VALUES (@a);
    SELECT @a = @a + 1;
END;
GO

CREATE DATABASE SnapRollbackTest_Snapshot ON
    (NAME = N'SnapRollbackTest', FILENAME = N'C:\SQLskills\test\SnapRollbackTest.mdfss')
AS SNAPSHOT OF SnapRollbackTest;
GO

-- Initial size
SELECT size_on_disk_bytes AS [Initial Size (bytes)] FROM sys.dm_io_virtual_file_stats (DB_ID ('SnapRollbackTest_Snapshot'), 1);
GO

-- Start transaction
BEGIN TRAN
GO
UPDATE MyTable SET c1 = 42;
GO
CHECKPOINT; -- to make sure absolutely everything is flushed to disk
GO

SELECT size_on_disk_bytes AS [After Transaction (bytes)] FROM sys.dm_io_virtual_file_stats (DB_ID ('SnapRollbackTest_Snapshot'), 1);
GO

-- Rollback
ROLLBACK TRAN;
GO

SELECT size_on_disk_bytes AS [After Rollback (bytes)] FROM sys.dm_io_virtual_file_stats (DB_ID ('SnapRollbackTest_Snapshot'), 1);
GO

Initial Size (bytes)
--------------------
196608

After Transaction (bytes)
-------------------------
1835008

After Rollback (bytes)
----------------------
1835008

You can clearly see that the size of the database snapshot did NOT increase at all because of the transaction rollback. Using the script above you can try this using a heap, clustered index, various combinations of row size and number of rows - the result will be the same - the database snapshot will not increase in size because of a transaction rollback. I tried a bunch of different combinations, all with the same result.

In the back of my head there's a niggly feeling that there's a funky, rare, pathalogical case where some weird combination of operations results in a page split when rolled-back, but I can't engineer it.

Bottom line - transaction rollbacks do not cause the database snapshot to increase in size, as the rollback operates on the database pages that have already been copied into the snapshot because they changed due to the operations of the transaction itself.

Hope this helps!

PS As one of the commenters pointed out, the initial snapshot size can be affected by the crash-recovery that is run when the snapshot is created. I go into details on that process in this post as it can be confusing when CHECKDB runs.

Due to some Microsoft date changes, the Spring SQL Connections show had to move to the week of April 12th. As a knock-on effect - and believe me, there were a bunch for us! - the public class Immersion Event we're doing in Boston had to move out of that week, and is now two weeks earlier in March. The locations remain the same though. Our Upcoming Events page has all the details and links to the registration sites (don't forget the Early-Bird specials for both!), but in a nutshell, it's:

  • 5-day intense public class on internals, maintenance, performance tuning, disaster recovery with Kimberly and I, March 29 - April 2, Boston
  • Spring SQL Connections, April 12 - 16, Las Vegas

Both events are going to be very cool - hope to see you at one (or both!) of them!

Categories:
Classes | Conferences

Happy New Year! 

The January edition of TechNet Magazine is available on the web now and has the latest installment of my regular SQL Q&A column.

This month's topics are:

  • Using backups for corruption recovery
  • Why snapshot isolation is required when using change tracking, and its performance implications
  • Is DBCC CHECKDB a really comprehensive integrity check?
  • The difference between data file and log file shrinking

Check it out at http://technet.microsoft.com/en-us/magazine/ee914612.aspx.

(This is my last blog post for 2009 - thanks to everyone who reads my blog and takes part in the SQL community - hope you have a Happy and Prosperous New Year!) 

Every so often you have to challenge yourself with a goal that actually stretches your abilities and tests your stamina. At the start of 2009 I set myself the goal of reading 50 books during the year. By the time January was over I'd already read 18 books so I upped the goal to 100, thinking it would be easily achievable. How wrong I was!

I'm very proud that I stuck with this through the year and met my goal, finishing the 100th book on December 29th. I deliberately chose the final book to be Scotland: The Story of a Nation, by Magnus Magnusson. He used to be the host of the UK quiz show Mastermind (that I loved as a teenager), and his catch phrase was "I've started, so I'll finish!". An appropriate statement on my undertaking this year!

If you held a gun to my head and forced me to pick from all the book's I've read this year, my #1 favorite book is Cormac McCarthy's The Road (no, I haven't seen the movie). Incredibly powerful, haunting, and ultimately sad book - I get a lump in my throat just thinking about the story and it's ending. If you only read one book next year, read that one.

Overall, it was an excellent experience and I recommend everyone to try something similar at some point in their lives. Many people have expressed an interest in seeing the complete list plus my favorites for the year, so this blog post is my summary for you all (and as a neat way of getting closure for me too). It's divided into three parts: data, top-10, and the complete list.

I hope you enjoy reading this as much as I've enjoyed putting it together, and it inspires you to try some of these books, or even to set yourself a reading goal next year. Do let me know what you think. And I'll leave you with the saying that's governed 2009 for me:

In omnibus requiem quaesivi, et nusquam inveni nisi in angulo cum libro!

Analysis of What I Read

I read a total of 39674 pages, or about 109 pages on average every day, and a book every 3.65 days. Of course some days I didn't read anything and some days I read 500 pages, depending on what I was doing. You might ask - how the hell did you make the time for that with everything else you do? Well, I flew 138000 miles during the year and spent quite a few days sitting by pools in hot places getting on time zones before teaching classes, mostly in India (2 trips) and Thailand (4 trips) - that's a lot of time right there. I made time when at home, reading pages here and there while cooking, taking a break from work, in bed, etc. It also helps that I love reading, and I read quickly (I don't speed-read, or skip sections, every word is read and digested).

Several people through the year poo-poo'd my goal, saying I must only be reading small books, or 'fast reads'. No. I picked a general range from my library (I've got 900+ books - I don't like electronic readers, and one of my favorite past-times is buying books). Here are two charts: the first shows the number of pages in each book, in order that I read them; the second shows the proportion of books in each genre I read.

 

The average book length was 397 pages, and as you can see, I'm a huge history buff, so 42% of all books were either hard history, or historical fiction. Make fun of me for producing charts if you want, I don't care :-)

The Top-10 

Now on to the top-10. I tried very hard get down to 10 and couldn't - so you get my top-11. It's just impossible for me to order them so I'll present them in the order I read them, along with a little picture of the cover, and my mini-review from my Facebook page (I always post a little review when I finish a book, the first 3 I read before Facebook got it's evil claws into my psyche). The fact that only one hard history book is on the list does not imply that the others I read weren't good - they almost all were excellent, but just not as hugely entertaining or enthralling as the fiction I read.

  #25 The Kite Runner; Khaled Hosseini; 400pp; Fiction; February 14th (From what I remember: My first exposure to life in Afghanistan. Follows the life of a kite-flying boy and his friend in Kabul as it's torn apart by conflict between the Taliban and other warlords. Very well written and highly recommended, as is the sequel A Thousand Splendid Suns that became my #38.)

  #29 The Road; Cormac McCarthy; 287pp; Fiction; February 26th (From what I remember: Follows a father and son heading west through post-apocalyptic USA and their encounters with other survivors. As I said above, my favorite book of the year. Incredibly powerful - a masterpiece. Go read it.)

  #37 Riding the Iron Rooster: By Train Through China; Paul Theroux; 480pp; Travel; March 30th (From what I remember: I love Theroux's travel writing - his knack for portraying people he meets and irreverent appraisals of places he passes through. In this book he explores the Chinese rail network. I'd love to follow in his footsteps on my next trip to China - been twice, but didn't take any trains.)

  #51 The White Tiger; Aravind Adiga; 304pp; Fiction; July 1st (Winner of last year's Man Booker Prize. Excellent story about a driver in India - I can really relate to it after our two trips to India earlier this year being driven around the streets. Quick read - recommended.)

  #59 The Enchantress of Florence; Salman Rushdie; 368pp; Historical Fiction; July 25th (My first Salman Rushdie book turned out to be a real page-turner. Excellent story, steeped in 16th Century history of the Mughals and Florence (a real favorite city of mine - planning a week-long trip next year). Richly told story, great twist at the end. Looking forward to reading a bunch more of his, on a UPS truck towards me already :-) Highly recommended!)

  #64 Shadow of the Silk Road; Colin Thubron; 400pp; Travel; August 6th (Terrific account of following the 4000-mile Silk Road from Xian to Antioch. Central Asia really seems to be "a paradise or hell of mingled ethnicities" with borders that don't really divide the peoples of the area. Very strongly recommended - although the book has engendered some serious wanderlust in this reader!)

  #78 The Name of the Rose; Umberto Eco; 552pp; Historical Fiction; September 21st (I love Eco's works (this is my 4th of his) - they're hugely involved and heavy going to read, with long sections of complex prose. His stories are always involved and erudite, and this is no exception - a murder-mystery set in an early 14th century Italian monastery, amidst the Imperial vs. Papal backed theological struggles of the time. Unfortunately I'd seen the movie so knew the end, but the book was excellent - lots of pithy, syllogistic discussion. Highly recommended, but not for the casual reader.)

  #80 The Elegance of the Hedgehog; Muriel Barbery; 336pp; Fiction; September 29th (Translated from French, this wonderful book concerns a concierge of an upper-class apartment building in Paris. She's low-born but very intelligent, which she hides from the vacuous residents of the building. The other major character is a 12-yr old girl, also hyper-intelligent, but unhappy and suicidal, with startling insights on life. Life changes for them both. Beautiful book, highly recommended.)

  #85 Sea of Poppies; Amitav Ghosh; 560pp; Historical Fiction; October 19th (I've got a real thing going for writers portraying life in India right now. This book follows the stories of a bunch of people around the time of the Opium Wars, who are linked into the trade in India - both Indians and Westerners. Various calamities befall the Indians and they end up on a schooner, the Ibis, heading down to Mauritius. A very compelling story, expertly told and I'll be picking up some more of his novels from Amazon. Highly recommended.)

  #93 The Meaning of Night: A Confession; Michael Cox; 720pp; Historical Fiction; December 7th (Superlative story telling! Been reading this (long) one for a few months on and off. Compelling tale of a man discovering his true origins and trying to win back what is his, with twists and turns along the way - written as a confession from the point of view of the man himself. Dark and brooding, mixed in with life in England in the 1850s. Highly recommended.)

  #100 Scotland: The Story of a Nation; Magnus Magnusson; 752pp; History; December 29th (I deliberately chose my goal-meeting final book of the year to be Magnusson's magnum opus: his 700pp work on the history of Scotland. Extraordinarily well-written and comprehensively researched, I strongly recommend this book to anyone with Scottish roots.)

The Complete List

And now, for completeness, here's the entire list of all 100 books I read, with links to Amazon.com so you can explore further.

  1. Mademoiselle Boleyn; Robin Maxwell; 355pp; Historical fiction; January 2nd
  2. Ghostwalk; Rebecca Stott; 368pp; Fiction; January 7th
  3. The Old Patagonian Express:By Train Through The Americas; Paul Theroux; 404pp; Travel; January 8th
  4. Persian Fire: The First World Empire and the Battle for the West; Tom Holland; 464pp; History; January 9th
  5. Eternity; Greg Bear; 416pp; Science fiction; January 10th
  6. Queen Isabella: Treachery, Adultery, and Murder in Medieval England; Alison Weir; 512pp; History; January 11th
  7. Our Dumb World: The Onion's Atlas of Planet Earth; The Onion; 256pp; Humor; January 13th
  8. Dead Reckoning: Tales of the Great Explorers 1800-1900; Helen Whybrow (Editor); 576pp; Travel; January 15th
  9. If You Liked School, You'll Love Work; Irvine Welsh; 320pp; Fiction; January 17th
  10. The Professor and the Madman:A Tale of Murder; Insanity, and the making of the O.E.D.; Simon Winchester; 288pp; History; January 18th
  11. Isaac Newton; James Gleick; 288pp; History; January 20th
  12. Twilight (The Twilight Saga, Book1); Stephanie Meyers; 544pp; Fiction; January 21st
  13. New Moon (The Twilight Saga, Book 2); Stephanie Meyers; 608pp; Fiction; January 22nd
  14. Brunelleschi's Dome: How a Renaissance Genius Reinvented Architecture; Ross King; 208pp; History; January 24th
  15. Eclipse (The Twilight Saga, Book 3); Stephanie Meyers; 640pp; Fiction; January 25th
  16. Breaking Dawn (The Twilight Saga, Book 4); Stephanie Meyers; 756pp; Fiction; January 27th
  17. The Secret Diary of Anne Boleyn; Robin Maxwell; 281pp; Historical Fiction; January 29th
  18. Brideshead Revisited; Evelyn Waugh; 368pp; Fiction; January 31st
  19. Signora da Vinci; Robin Maxwell; 448pp; Historical Fiction; February 1st
  20. Chasm City; Alastair Reynolds; 640pp; Science Fiction; February 6th
  21. A Short History of Byzantium; John Julius Norwich; 496pp; History; February 9th
  22. Eleanor of Aquitaine: A Life; Alison Weir; 441pp; History; February 11th
  23. To The Tower Born; Robin Maxwell; 320pp; Historical Fiction; February 12th
  24. Virgin: Prelude to the Throne; Robin Maxwell; 243pp; Historical Fiction; February 12th
  25. The Kite Runner; Khaled Hosseini; 400pp; Fiction; February 14th
  26. Hellboy Library Edition, Vol. 1: Seed of Destruction and Wake the Devil; Mike Mignola; 278pp; Comics; February 21st
  27. The Year 1000: What Life Was Like at the Turn of the First Millennium; Robery Lacey; 240pp; History; February 22nd
  28. Accelerando; Charles Stross; 415pp; Science Fiction; February 23rd
  29. The Road; Cormac McCarthy; 287pp; Fiction; February 26th
  30. The Last Apocalypse: Europe at the Year 1000 A.D.; James Reston Jr.; 336pp; History; February 28th
  31. The First Crusade: A New History: The Roots of Conflict between Christianity and Islam; Thomas Asbridge; 448pp; History; March 1st
  32. Hellboy Library Edition, Vol. 2: The Chained Coffin, The Right Hand of Doom, and Others; Mike Mignola; 278pp; Comics; March 3rd
  33. Lighthousekeeping; Jeanette Winterson; 252pp; Fiction; March 4th
  34. Marvel 1602; Neil Gaiman; 248pp; Comics; March 8th
  35. Eternals; Neil Gaiman; 256pp; Comics; March 8th
  36. The Absolute Sandman, Volume 4; Neil Gaiman; 608pp; Comics; March 13th
  37. Riding the Iron Rooster: By Train Through China; Paul Theroux; 480pp; Travel; March 30th
  38. A Thousand Splendid Suns; Khaled Hosseini; 432pp; Fiction; April 1st
  39. The Complete Memoirs of George Therston; Siegfried Sassoon; 656pp; History; April 12th
  40. A Tale of Two Cities; Charles Dickens; 544pp; Historical Fiction; April 21st
  41. Michelangelo and the Pope's Ceiling; Ross King; 384pp; History; April 28th
  42. Augustus: The Life of Rome's First Emperor; Anthony Everitt; 432pp; History; April 29th
  43. Genghis Khan and the Making of the Modern World; Jack Weatherford; 352pp; History; May 9th
  44. The Killer Book of Serial Killers; Tom Philbin; 352pp; History; May 13th
  45. Why We Suck; Denis Leary; 240pp; Non-Fiction; May 23rd
  46. Holy Terrors: Gargoyles on Medieval Buildings; Janette Rebold Benton; 140pp; History; May 28th
  47. Knights Templar: The Essential History; Stephen Howarth; 321pp; History; Mat 31st
  48. Dark Star Safari: Overland from Cairo to Capetown; Paul Theroux; 496pp; Travel; June 17th
  49. The Thirteenth Tale; Diane Setterfield; 432pp; Fiction; June 27th
  50. A Short History of Nearly Everything; Bill Bryson; 560pp; History; June 30th
  51. The White Tiger; Aravind Adiga; 304pp; Fiction; July 1st
  52. Anil's Ghost; Michael Ondaatje; 307pp; Fiction; July 2nd
  53. Time Bandit; Andy Hillstrand; 240pp; Non-Fiction; July 3rd
  54. The Sea; John Banville; 195pp; Fiction; July 4th
  55. The Glassblower of Murano; Marina Fiorato; 368pp; Historical Fiction; July 5th
  56. The Book of Unholy Mischief; Elle Newmark; 384pp; Historical Fiction; July 8th
  57. The Bookseller of Kabul; Asne Seierstad; 320pp; Non-Fiction; July 11th
  58. The Forge of Christendom: The End of Days and the Epic Rise of the West; Tom Holland; 512pp; History; July 18th
  59. The Enchantress of Florence; Salman Rushdie; 368pp; Historical Fiction; July 25th
  60. The Curious Incident of the Dog in the Night-Time; Mark Haddon; 226pp; Fiction; July 28th
  61. The Gathering; Anne Enright; 260pp; Fiction; July 29th
  62. Saving Fish From Drowning; Amy Tan; 528pp; Fiction; August 1st
  63. The Story of Tibet: Conversations with the Dalai Lama; Thomas Laird; 496pp; History; August 3rd
  64. Shadow of the Silk Road; Colin Thubron; 400pp; Travel; August 6th
  65. The Brief Wondrous Life of Oscar Wao; Junot Diaz; 352pp; Fiction; August 8th
  66. The Cloud Forest; Peter Matthiessen; 320pp; Travel; August 12th
  67. Redemption Ark; Alastair Reynolds; 656pp; Science Fiction; August 17th
  68. Year of Wonders; Geraldine Brooks; 336pp; Historical Fiction; August 18th
  69. Orpheus Rising; Bateman; 480pp; Fiction; August 20th
  70. The Catholic Church through the Ages: A History; John Vidmar; 384pp; History; August 24th
  71. Edward the Confessor; Frank Barlow; 408pp; History; August 28th
  72. The Places In Between; Rory Stewart; 320pp; Travel; August 30th
  73. The Forever War; Dexter Filkins; 384pp; Non-Fiction; September 5th
  74. Mogadishu!; Heroism and Tragedy; Kent DeLong; 144pp; Non-Fiction; September 10th
  75. The Temporal Void; Peter F. Hamilton; 736pp; Science Fiction; September 14th
  76. One Hundred Years of Solitude; Gabriel Garcia Marquez; 448pp; Fiction; September 15th
  77. Slaughterhouse Five; Kurt Vonnegut; 288pp; Fiction; September 17th
  78. The Name of the Rose; Umberto Eco; 552pp; Historical Fiction; September 21st
  79. People of the Book; Geraldine Brooks; 400pp; Historical Fiction; September 27th
  80. The Elegance of the Hedgehog; Muriel Barbery; 336pp; Fiction; September 29th
  81. The Lost Heart of Asia; Colin Thubron; 400pp; Travel; October 2nd
  82. So Young, Brave, and Handsome; Leif Enger; 272pp; Fiction; October 4th
  83. The Mysterious Flame of Queen Loana; Umberto Eco; 480pp; Fiction; October 7th
  84. A Time of Gifts: On Foot To Constantinople; Patrick Leigh Fermor; 344pp; Travel; October 10th
  85. Sea of Poppies; Amitav Ghosh; 560pp; Historical Fiction; October 19th
  86. A Conspiracy of Paper; David Liss; 480pp; Historical Fiction; October 23rd
  87. Breakfast of Champions; Kurt Vonnegut; 303pp; Fiction; October 27th
  88. Dogs of God: Columbus, the Inquisition, and the Defeat of the Moors; James Reston Jr; 400pp; History; October 27th
  89. A Spectacle of Corruption; David Liss; 396pp; Historical Fiction; November 5th
  90. Absolution Gap; Alastair Reynolds; 704pp; Science Fiction; November 15th
  91. Spawn Collection, Volume 4; Todd MacFarlane; 480pp; Comics; November 27th
  92. Parallel Worlds; Michio Kaku; 448pp; Non-Fiction; December 1st
  93. The Meaning of Night: A Confession; Michael Cox; 720pp; Historical Fiction; December 7th
  94. Fine Just the Way It Is: Wyoming Stories 3; Annie Proulx; 240pp; Fiction; December 8th
  95. The Shadow Lines; Amitav Ghosh; 256pp; Fiction; December 10th
  96. The Lemon Table; Julian Barnes; 256pp; Fiction; December 16th
  97. Jackson Pollock; Leonhard Emmerling; 96pp; Non-Fiction; December 21st
  98. Fire and Steam: How the Railways Transformed Britain; Christian Wolmar; 384pp; History; December 22nd
  99. The Bedford Hours: A Medieval Masterpiece; Eberhard Konig; 144pp; History; December 23rd
  100. Scotland: The Story of a Nation; Magnus Magnusson; 752pp; History; December 29th

Categories:
Books | Personal

Earlier today there was a thread on Twitter asking about what degrees and academic background people have who work on SQL Server. I volunteered to put together a reading list for those wanting to know more of the theory behind a relational database management system, rather than just how to use one.

Here I present a reading list that will take you from how to program well up to how to architect multi-threaded database servers. I've read all of these at some point between getting my CS/EE degree in Edinburgh and stopping dev work in 2005, and they're sitting on my bookshelf as I type this. They're all the best books I could find on the subject at the time, and they're all absolutely excellent. I've included Amazon.com links to the most up-to-date editions (because I'm nice like that Smile).

Programming

Underneath the RDBMS

Concepts

RDBMS architecture

You should also checkout the ACM Special Interest Group on Management of Data (SIGMOD), and the VLDB Conference - these are the premier academic conferences to do with database management systems.

This should keep you busy.. happy reading!

Categories:
General

Earlier today there was a question on SQL Server Central where someone wanted to know what could be causing so many reads on their transaction log. I was asked to chime in by fellow MVP Jonathan Kehayias (who also sent me some questions that I've answered in this post - thanks Jon!), so I did, with a list of everything I could think of. I thought it would make for a good post, so here it is, with a few more things I remembered while writing the post.

Before I start, if you're not comfortable talking log records and transaction log architecture, see my TechNet Magazine article on Understanding Logging and Recovery, which explains everything clearly, including how having too many VLFs can affect operations on the log that have to scan VLFs.

Each of these things can cause reads of the log:

  • Transaction rollback: when a transaction has to roll back (either because you say ROLLBACK TRAN or something goes wrong and SQL Server aborts the transaction), the log records describing what happened in the transaction have to be read so that their effects can be removed from the database. This is explained in the TechNet Magazine article. Note that it doesn't matter if you're using explicit transactions or not (i.e. BEGIN TRAN), SQL Server always starts a transaction for you (called an implicit transaction) so that it can put a boundary on what needs to be rolled back in case of a failure.
  • Crash recovery: crash recovery must read the transaction log to figure out what to do with all the log records in the active portion of the log (all the way back to the earlier of the most recent checkpoint or the start of the oldest active transaction). The log is read twice - once going forward from that oldest point (called the REDO phase) and then going backwards (called the UNDO phase). Again, this is explained in great depth in the article.
  • Creating a database snapshot: a database snapshot is a point-in-time view of a database. What's more, it's a transactionally consistent point-in-time view of a database - which means that, essentially, crash recovery must be run on the real database to create the transactionally consistent view. The crash recovery is run into the database snapshot, the real database isn't affected - apart from having all the active transaction log read so that crash recovery can run.
  • Running DBCC CHECKDB: creates a database snapshot by default on 2005 onwards, and runs the consistency checks on the snapshot. See above. There's a much more detailed description, including how this worked in 2000, in the first part of the 10-page blog post CHECKDB From Every Angle: Complete description of all CHECKDB stages.
  • Transaction log backups: this one's kind of obvious. A transaction log backup contains all the transaction log records generated since the last log backup finished (or since the log backup chain was established). To back up the log it has to read it. What's not so obvious is that a log backup will also scan through all the VLFs in the log to see if any active ones can be made inactive (called clearing or truncating the log - both misnomers as nothing is cleared and nothing is truncated). See my TechNet Magazine article on Understanding SQL Server Backups and in the blog post Importance of proper transaction log size management.
  • Any kind of data backup: (full/differential backup of a file/filegroup/database). Yup - data backups always include transaction log - so the backup can be restored and give you a transactionally consistent view of the database. See Debunking a couple of myths around full database backups and More on how much transaction log a full backup includes for details if you don't believe me.
  • Transactional replication: transactional replication works by harvesting committed transactions from the transaction log of the publication database (and then sending them to the subscriber(s) via the distribution database - beyond the scope of this post). This is done by the Log Reader Agent job, running from the Distributor. It needs to read all the log records generated in the publication database, even if they're nothing to do with the publications. More log equals more reads. My whitepaper on combining database mirroring and transactional replication in 2008 has more details on this stuff, as does Books Online.
  • Change data capture (in 2008): CDC uses the transactional replication log reader agent to harvest changes from the transaction log. See above. This means the CDC can cause the log to not be able to clear properly, just like transactional replication or database mirroring - see my blog post Search Engine Q&A #1: Running out of transaction log space for more details. Note the I didn't say Change Tracking - it uses a totally different mechanism - see my TechNet Magazine article on Tracking Changes in Your Enterprise Database for more details.
  • Database mirroring: DBM works by sending physical log records from the principal to the mirror database. If the mirroring sessions drops out of the SYNCHRONIZED state, then the log records won't be able to be read from memory and the mirroring subsystem will have to get them from disk - causing log reads. This can happen if you're running asynchronous mirroring (where you're specifically allowing for this), or if something went wrong while running synchronous mirroring (e.g. the network link between the principal and mirror dropped out, and a witness wasn't configured or the principal could still see the witness - again, beyond the scope of this post). Regardless, this is called having a SEND queue on the principal.
  • Restoring a backup: whenever backups are restored, even is you've said WITH NORECOVERY, the REDO portion of recovery is run for each restore, which reads the log.
  • Restoring a log backup using WITH STANDBY: in this case, you've essentially said you'd like recovery to run, but not to affect the transaction log itself. Running recovery has to read the log. For more info on using WITH RECOVERY, NORECOVERY, or STANDBY, see my latest TechNet Magazine article on Recovering from Disasters Using Backups, which explains how restores work.
  • A checkpoint, in the SIMPLE recovery mode only: see my blog post How do checkpoints work and what gets logged for a description of what checkpoints are and what they do. In the SIMPLE recovery mode, checkpoints are responsible for clearing the log (described with links above) so must read through all the VLFs to see which can be marked inactive.
  • When processing a DML trigger (on 2000): (thanks to Clay Lenhart for the comment that reminded me of this). In SQL Server 2000, the before and after tables that you can process in a DML trigger body are actually found from looking at the log records generated by the operation that caused the trigger to fire. My dev team changed this in 2005 to store the before and after tables using the version store, giving a big perf boost to DML trigger processing. 
  • Manually looking in the log (with DBCC LOG or the table-valued function fn_dblog): this one's pretty obvious.

Phew - a lot of things can cause log reads, the trick is knowing which one it is!

As you can see, there could be a lot of activity reading from your log as well as writing to it, which could cause an IO bottleneck. Make sure that the IO subsystem on which you place the log file (note: you don't get ANY performance benefit from having multiple log files) can handle the read and write workload the log demands. RAID 1 or RAID 10 with a bunch of spindles to spread the IOs out (note/warning/achtung: that's a big generalization - don't reply with a comment saying it's wrong because you've seen something different - different scenarios have different demands), and a proper RAID configuration (64k multiple for a stripe size, NTFS allocation unit size, volume partition alignment).

The first of our public classes (what we call Immersion Events) in 2010 is now officially open for registrations!

We've teamed up with our good friend Adam Machanic to bring a week-long custom class to the Boston area. In this class Kimberly and I take turns teaching modules and we're both on hand to answer questions, do research, and try things out (and banter too!). The tag-team approach works *really well* and make the class very enjoyable and relaxed for those attending (and for us!).

The class will be April 12-16 2010 in the Le Meridien hotel in Cambridge, MA and is $3100 for the week. If you use the discount code EARLYBIRD you'll get $600 off for registering before February 1st.

The class will cover:

  • On-disk structures: how the data is stored
  • Index internals: how the data is organized
  • Logging and recovery: how the data is protected
  • Choosing the RIGHT Data Type
  • Table & Index Partitioning Strategies
  • Data Access
  • Indexing Strategies
  • Data and log file provisioning and management
  • Tempdb
  • Index and statistics maintenance
  • Using backup and restore (plus internals)
  • Consistency checking and disaster recovery

You can see a much more detailed course outline here and full details including how to register at Adam's Boston SQL Training website.

We hope to see you there!

Categories:
Classes

I made them up. Yup.

I'm talking about the guidance which is:

  • if an index has less than 1000 pages and is in memory, don't bother removing fragmentation
  • if the index has:
    • less than 10% logical fragmentation, don't do anything
    • between 10% and 30% logical fragmentation, reorganize it (using DBCC INDEXDEFRAG or ALTER INDEX ... REORGANIZE)
    • more than 30% logical fragmentation, rebuild it (using DBCC DBREINDEX or ALTER INDEX ... REBUILD)

These numbers are made up. They can and will vary for you, but they're a good starting point to work from.

There's been some discussion since PASS, when I confessed publicly on Twitter (during Grant Fritchey's session) to making them up, about whether I really said that etc etc. Yes - I really did make them up.

Back in 1999/2000 when I wrote DBCC INDEXDEFRAG and DBCC SHOWCONTIG for SQL Server 2000, customers wanted *some* guidance on what the thresholds should be where they should care about fragmentation or not, and how to remove it. We had to put *something* into Books Online (my favorite "it depends!" wouldn't have been too helpful), so I talked to some customers, inside and outside Microsoft, and chose these numbers as most appropriate at the time.

They're not set in stone - they're a big generalization, and there are a ton of other factors that may affect your choice of threshold and fragmentation removal method (e.g. recovery model, high-availability technologies in use, log backup schedule, query workload, disk space, buffer pool memory, and so on). I wish Microsoft would update the old whitepaper on fragmentation - they keep promising me they'll get around to it.

In the meantime, take those numbers with a pinch of salt and don't treat them as absolute.

Categories:
Fragmentation

This is an interesting performance bug concerning a broken query optimizer rule in 2008. Thanks to Dan Shargel (Twitter) for sending me info on this and letting me use some of the stuff he sent.

The scenario involves using MIN or MAX in a query against a partitioned view. In 2005 the query plan includes a TOP (1) operator which uses the right index, but in 2008 the optimizer rule was broken and the plan turned into a stream aggregate, much more expensive in this case.

Here's the 2005 query plan:

 

and here's the 2008 query plan (before the bug fix):

 

You can get the fix in CU4 for 2008 SP1 (or later) and read a bit more about it in KB 973255.

Note that you have to turn on trace flag 4199 to enable the fix - that requirement will be removed in SQL11.

Theme design by Nukeation based on Jelle Druyts