10 years ago today…

…the second stage of my life ended, when I left Microsoft on August 31st, 2007.

The first stage of my life was education, ending in June 1994, with a Bachelor of Engineering (Hons.) degree in Computer Science and Electrical Engineering from the University of Edinburgh.

The second stage of my life was being an employee.

The first company I worked for was Digital Equipment Corporation (a.k.a. DEC, a.k.a. Digital), working on the OpenVMS/VMS file system (including the VMS chkdsk equivalent, disk/tape initialization, and general mag tape functionality) from August 1994 to January 1999. Working for DEC taught me how to be a real software engineer, and I’m very thankful that my first boss and mentor was Duncan McLaren – a damn fine engineer, a sincere geek, and a good friend. I consider myself extremely lucky to have worked at DEC with Duncan to start my career, and I really loved working there.

And that led me to the second and last company I worked for: Microsoft, from February 1999 to August 2007, including the jump from Scotland to Redmond, WA. See here for an explanation of how that happened. It was at Microsoft that I grew into a principal software engineer and learned how to be an engineering manager, both of software engineers and of program managers, and to lead product-wide projects. But the most important things I learned at Microsoft were how to lead and mentor people, and the value of a technical community.

Of all the people I worked with and got to know at Microsoft, the person who stands out the most is my great friend Bob Ward, who I first met in early 1999. We worked very closely over the years, and labored tirelessly together on the supportability (and PSS training) of the entire SQL Server box for the SQL Server 2005 release. You won’t find a more passionate advocate for solving customer problems, a more capable SQL Server debugger, or a more honest and sincere human being.

I absolutely *adored* working at Microsoft, but in the end I couldn’t do what I wanted to do in 2007: work with Kimberly and spend my time teaching and consulting. So I made the decision to leave in June 2007 and gave the SQL Server group leaders 10 weeks’ notice, so I could transition everything properly.

Now I’m in the third stage of my life – owning and running a company: SQLskills, with my wonderful wife Kimberly, who’d been running the company by herself from when she started it in 1995 until I joined in 2007. Kimberly taught me how to be a great presenter, how to run a successful small business, and how to be more empathetic in the ‘real world’ outside Microsoft.

This has been, and continues to be, a blast and I wouldn’t trade my life now for anything else. I’ve lost count of the number of classes I’ve taught, conferences I’ve presented at, and blog posts I’ve done here (ok, that I can put a number to: 947, including this one), plus I naturalized as a U.S. citizen in 2012. I must say that part of what makes this job so enjoyable is the superb team we have: Jon (since 3/15/11…), Glenn (5/1/12…), Erin (8/1/12…), Tim (1/19/15…), and our long-suffering assistant Libby (since forever…).

However, the thing I enjoy the most about my job now is helping people – through my blog, email, Twitter, my waits library, our classes and conferences, and more – and I don’t see that changing any time soon!

What will the next ten years bring, I wonder? Stay tuned to find out :-)

 

SQLskills SQL101: Readable secondary performance problems

As Kimberly blogged about earlier this year, SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

Yesterday I blogged about log shipping performance issues and mentioned a performance problem that can be caused by using availability group readable secondaries, and then realized I hadn’t blogged about the problem, only described it in our Insider newsletter. So here’s a post about it!

Availability groups (AGs) are pretty cool, and one of their most useful features is the ability to read directly from one of the secondary replicas. Before, with database mirroring, the only way to access the mirror database was through the creation of a database snapshot, which only gave a single, static view of the data. Readable secondaries are constantly updated from the primary, so they are far more versatile as a reporting or non-production querying platform.
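For comparison, here’s a minimal sketch of the old database mirroring approach, where the only way to report off the mirror was to create a static database snapshot of it (the database, logical file, and table names below are hypothetical):

```sql
-- Minimal sketch: reporting against a database mirror required a static snapshot.
-- MirrorDB, MirrorDB_Data, and dbo.SomeTable are hypothetical names.
CREATE DATABASE MirrorDB_Reporting_Snapshot
ON (NAME = MirrorDB_Data, FILENAME = N'D:\Snapshots\MirrorDB_Data.ss')
AS SNAPSHOT OF MirrorDB;
GO

-- Reports query the snapshot, which only shows the data as of the moment it was created
SELECT COUNT(*) FROM MirrorDB_Reporting_Snapshot.dbo.SomeTable;
```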

But I bet you didn’t know that using this feature can cause performance problems on your primary replica.

As with most things in life, you don’t get anything for free. Readable secondaries are really useful, but there is a performance trade-off you need to be aware of. All queries that are executed against a readable secondary are automatically run using snapshot isolation. This means they do not require share locks and so will not block any database changes being replayed from the primary replica (i.e. the constant redo of log records on the secondary replica that have been sent from the primary replica).

Doing this requires the versioning system, where (simplistically) pre-change versions of records are copied into the version store in tempdb, and queries work out which version of a record is the correct one for them to process, based on the query’s start time. All records that change get a 14-byte tag added to the end of the record, which allows a query to check whether this is the correct version of the record and, if not, to follow a pointer to the previous version in the version store. This has been the mechanism since snapshot isolation and read-committed snapshot isolation were introduced in SQL Server 2005.
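If you’re curious how much space the version store is taking up, a simple check (not specific to readable secondaries) is to look at tempdb’s file space usage:

```sql
-- Rough check of how much space the version store is currently using in tempdb
USE tempdb;
GO
SELECT SUM(version_store_reserved_page_count) * 8 / 1024 AS [VersionStoreReservedMB]
FROM sys.dm_db_file_space_usage;
```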

Now consider this: all AG replicas are exact copies of the primary replica. So how can versioning work on the readable secondary, adding 14-byte tags to some records? That must break the ‘exact copy’ rule, right?

Well, yes, it would… if the primary replica didn’t also change.

When a readable secondary is configured in an AG environment, all changing records on the primary replica start getting empty 14-byte versioning tags added to them. This is so that the 14 bytes of extra space on each record are noted in the transaction log and replayed on the secondary replicas, allowing the readable secondary to make use of the empty 14-byte space to store the versioning tag it needs.

This doesn’t break the ‘exact copy’ rule because the 14 bytes aren’t used for anything to do with recovery; there just have to be 14 bytes there.

So versioning tags start getting added to changing records on the primary (to be clear, this doesn’t turn on versioning on the primary), which means table and index records start to get 14 bytes longer. And what happens when records get longer on pages where there isn’t enough space? Page splits in your indexes (and forwarded records in heaps – but I’ll concentrate on indexes here), leading to low page densities (wasted disk space and buffer pool memory), logical fragmentation (poor scan performance), and a bunch of extra, expensive log record generation from the page splits themselves.
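To see whether this is happening to your indexes, something like the following sketch shows page density and logical fragmentation for a table (dbo.MyTable is a hypothetical name, and SAMPLED or DETAILED mode is needed for the page-density column to be populated):

```sql
-- Run in the context of the database in question; dbo.MyTable is a hypothetical table name.
SELECT
    index_id,
    avg_page_space_used_in_percent,   -- low values = wasted space from page splits
    avg_fragmentation_in_percent      -- high values = logical fragmentation (poor scan performance)
FROM sys.dm_db_index_physical_stats(
    DB_ID(), OBJECT_ID(N'dbo.MyTable'), NULL, NULL, N'SAMPLED');
```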

To counteract this, you’ll need to implement (and/or possibly lower existing) fill factors on your indexes and even potentially start doing index maintenance on indexes that may not have required it previously. Quite an insidious problem that can be hard to figure out unless you know what’s going on under the covers!
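As an illustration, here’s a sketch of rebuilding an index with a lower fill factor – the index and table names are hypothetical, and 90 is just an example value; the right number depends on your data and update pattern:

```sql
-- Leave 10% free space on each leaf page so the extra 14 bytes per changed record
-- can be absorbed without splitting the page. 90 is only an illustrative starting point.
ALTER INDEX IX_MyTable_MyColumn ON dbo.MyTable
REBUILD WITH (FILLFACTOR = 90);
```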

See the following resources for more info:

This MSDN page has more general information, and this whitepaper from Microsoft explains the various performance impacts of using readable secondaries in more depth: AlwaysOn Solution Guide: Offloading Read-Only Workloads to Secondary Replicas.

If you’re implementing readable secondaries in your AG configuration, make sure that you also investigate and implement index fill factors in the database so that the versioning tags that are added under the covers don’t start causing page splits and fragmentation.

SQLskills SQL101: Log shipping performance problems

As Kimberly blogged about earlier this year, SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

One question I’m asked regularly is this: “When our log shipping secondary is applying log backups, sometimes it takes a lot longer than usual. Any idea why this might be the case?”

Log shipping has been around forever, and it’s still a hugely applicable and useful feature for very simply maintaining one or more secondary copies of a database. You can also use a secondary copy for reporting, where the restore of the log backup uses the WITH STANDBY option, leaving the secondary database in an accessible, but read-only state (when the logs aren’t being applied).
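As a reminder, a standby-mode log restore looks something like this (the database name, backup path, and undo file path are all hypothetical – log shipping’s restore job normally issues this for you):

```sql
-- Restore the next log backup in standby mode, leaving the database read-only between
-- restores; the undo file holds the log records needed to reverse the UNDO next time.
RESTORE LOG MyDatabase
FROM DISK = N'\\BackupShare\MyDatabase\MyDatabase_log.trn'
WITH STANDBY = N'D:\LogShipping\MyDatabase_undo.dat';
```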

Restoring a log backup using WITH STANDBY works as follows:

  1. Make sure all users are disconnected from the secondary database
  2. Write all the log records from the log backup into the secondary database’s log file
  3. Perform the REDO part of recovery (ensuring that all operations from committed transactions are present in the secondary database)
  4. Perform the UNDO part of recovery (ensuring that all operations from uncommitted transactions are not present in the secondary database)

Step 4 writes all the log records generated by the UNDO operations into a special file called the undo file. This means that the secondary database is in read-only mode and is transactionally consistent so that users can access it. The reason the log records are written into the undo file is so that the transaction log of the secondary database is not altered in any way, allowing subsequent log backups to be restored. If this weren’t the case, the UNDO log records would advance the secondary database’s LSN (Log Sequence Number), meaning that subsequent log backup restore operations would fail.

When the restore process begins on the secondary database, if an undo file exists, there is another step that is performed before steps 2-4 above. This additional step needs to take all the log records in the undo file and undo the effects of them – essentially putting the secondary database back into the state as of the end of step 3 from the previous restore. This database state is the same as if the previous log backup had been restored using WITH NORECOVERY instead of WITH STANDBY.

The occasional long-running restore problem happens when a log backup is restored that contains a long-running transaction that does not commit before the end of the log backup. That transaction must be completely undone as part of restoring the log backup (step 4), resulting in a very large undo file. This in itself can make restoring a log backup take a lot longer than usual. When the next log backup is restored, the additional step that undoes all the log records in the undo file has a very large undo file to process and takes much, much longer than usual. And if the log backup being restored also contains an uncommitted, long-running transaction, then it’s the perfect storm, as step 4 will also take a long time. These steps are all made even longer if the log file has too many VLFs (known as VLF fragmentation).
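Checking the VLF count is easy. On SQL Server 2016 SP2 and later you can use the sys.dm_db_log_info DMF (on older versions, DBCC LOGINFO gives similar information); the database name here is hypothetical:

```sql
-- A high VLF count (e.g. many thousands) slows down log-related operations such as restores
SELECT COUNT(*) AS [VLFCount]
FROM sys.dm_db_log_info(DB_ID(N'MyDatabase'));
```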

The situation where I’ve seen this most often is when the primary database is undergoing index maintenance and a log backup finishes near the end of a very long-running index rebuild operation of a large clustered index. The initial restore of that log backup on the secondary database takes much longer than usual to complete because of step 4 in the restore process. The next log backup on the primary also completes just before an index rebuild completes. When it is restored on the secondary, the whole of the large undo file has to be undone again, then the log restore occurs, and then another large undo file is generated to undo the second uncommitted index rebuild.

This is a possibility you have to be aware of if the secondary database must be available 24×7 for reporting, with only minimal downtime when each log backup is restored. In that case I would carefully augment the index maintenance operations on the primary with log backups to ensure that only complete, committed index rebuilds are present in the log backups being restored on the secondary database. Similar precautions should be taken if you have other, occasional, long-running operations.
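For example, here’s a simplistic sketch of that idea – bracketing a large rebuild with log backups so the whole rebuild commits within a single log backup (all names and paths are hypothetical, and this assumes no scheduled log backup fires in the middle of the rebuild):

```sql
-- Log backup immediately before the rebuild starts...
BACKUP LOG MyDatabase TO DISK = N'\\BackupShare\MyDatabase\PreRebuild_log.trn';
GO
-- ...the long-running rebuild of a hypothetical large clustered index...
ALTER INDEX CIX_MyBigTable ON dbo.MyBigTable REBUILD;
GO
-- ...and another log backup as soon as the rebuild commits, so the entire rebuild,
-- including its commit, is contained in one log backup and the standby restore
-- never has to undo it.
BACKUP LOG MyDatabase TO DISK = N'\\BackupShare\MyDatabase\PostRebuild_log.trn';
```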

An alternative would be to move from log shipping to database mirroring or availability groups, where the log records are continually being sent from the principal to the mirror database (or primary to secondary replica databases, in availability group terms) and there are no extra steps involving undoing log operations multiple times. With database mirroring, the drawback is that reporting would have to use database snapshots, so there’s a complexity trade-off involved. With availability groups, reporting would have to use a readable secondary, which can lead to index fragmentation on the primary replica, but that can be compensated for with index fill factors (see here for more details).

So there you have it. Another example where understanding how SQL Server performs common operations can make it much easier to diagnose performance problems.