Are I/O latencies killing your performance?

In this post I explain some methods for investigating and reducing high tempdb and transaction log I/O latencies that can severely hamper the performance of your workload. Back at the end of August I kicked off a survey asking you to run some code to calculate average I/O latencies and send me the results. I […]

Cool free tool to parse and analyze SQLIO results

During every one of our Immersion Events, we designate Thursday evening as 'open mic' night where anyone can do a 15-minute presentation on anything they want (to do with SQL Server) to the class. We usually have 4 or 5 people who entertain us with interesting talks, and our recent classes in Chicago were no […]

Performance issues from wasted buffer pool memory

Back in April I kicked off a survey where I asked you all to send me some information about your buffer pools – how much memory is being used for data file pages and how much of that memory is storing empty space. I got back data from 1394 servers around the world – thanks! […]

How to examine IO subsystem latencies from within SQL Server

Over the last few months I’ve been lecturing at classes and conferences about getting SQL Server’s view of the I/O subsystem and what latencies it is experiencing, so time for a blog post to help everyone else. Most SQL Server’s today are I/O bound – that’s a generally agreed-on statement by DBAs and consultants in […]

Unbelievable tale of disaster and recovery

A few days ago one of my new blog readers (a pretty smart cookie, as you'll see) sent me a tale of database catastrophe and an excellent recovery that I’d like to share with you. The story’s been made anonymous and is published with full permission of the author (highlights in bold are mine). Hey […]

TechNet Magazine: January 2011 SQL Q&A column

The January edition of TechNet Magazine is available on the web now and has the latest installment of my regular SQL Q&A column. This month's topics are: Diagnosing I/O subsystem bottlenecks Capacity planning for transaction logs Why there are no non-logged operations in user databases Check it out at http://technet.microsoft.com/en-us/magazine/gg552991.aspx.

Survey results around purchase and use of SSDs

Back at the start of July I kicked off a survey around your plans for SSDs (see here) and now I present the results to you. There's not much to editorialize here, but the numbers are interesting to see.   The "other" answers were (verbatim): 3 x 'have bought and am trying them out' 3 […]

Benchmarking: do multiple data files make a difference?

Many times I'm asked whether having multiple data files can lead to an improvement in performance. The answer, as with all things SQL (except concerning auto-shrink) is a big, fat "it depends." It depends on what you're using the database for, and the layout of the files on the IO subsystem, and the IO subsystem […]

TechNet Magazine: March 2010 SQL Q&A column

The March edition of TechNet Magazine is available on the web now and has the latest installment of my regular SQL Q&A column. This month's topics are: Distributed transactions and database mirroring – why they don't work together Background processes that can cause I/Os even with no connections to the server How to restore backups […]

New hardware to play with: Fusion-io SSDs

Christmas comes but once a year… really? Then mine just came early on this afternoon's UPS truck. The very nice folks at Fusion-io just sent me two of their fully-loaded top-of-the-line ioDrive Duos with 640GB of solid-state flash memory in each. This is really extra-nice of them because on Dell's Small Business website they're currently […]

Using diskpart and wmic to check disk partition alignment

(Be sure to join our community to get our monthly newsletter with exclusive content, advance notice of classes with discount codes, and other SQL Server goodies!)  A while ago I blogged about disk partition alignment, and how the default alignment of 31.5Kb on Windows Server 2003 can lead to enormous I/O performance problems (see Are […]

Benchmarking: 1-TB table population (part 1: the baseline)

(For the hardware setup I'm using, see this post.) As part of my new benchmarking series I first wanted to play around with different configurations of data files and backup files for a 1-TB database to see what kind of performance gains I can get invoking the parallelism possible when backing up and restoring the database. […]

Adding geo-redundancy to failover clustering

A few weeks ago I kicked off a survey on how you add geo-redundancy to a failover cluster (see here for the survey). The results as of 8/26/09 are as follows:   So why is this interesting? Well, many people will suggest failover clustering as the best way to provide high-availability for a database (or […]

SQL Quiz #5: SANs and mirroring

There's another SQL Quiz (from Chris Shaw) doing the rounds where people blog the answer and then tag someone. This I got tagged by two people (Jason Massie and Gail Shaw) in the same day for the same quiz (albeit over a week ago). They either think I'm going to say something profound or funny, […]

Weekly survey: how often do you run consistency checks?

In this week's survey I'd like to know how often you run consistency checks on your *most critical* production database, regardless of *how* you run them (we did that survery already – see Importance of how you run consistency checks). I'll report on the results around July 4th. I'd only like you to answer for […]

Issues around DBCC CHECKDB and the use of hidden database snapshots

There are a couple of issues that I’ve heard of in the last few weeks (one while onsite at a customer) and I think they might bite some people so I’d like to share them with you. DBCC CHECKDB in 2005 onwards uses a hidden database snapshot to create the transactionally-consistent point-in-time view of the database that […]

Are your disk partition offsets, RAID stripe sizes, and NTFS allocation units set correctly?

I've just been setting up some of our new hardware, and wanted to do some background reading to ensure I use the correct disk partition offset, RAID stripe size, and NTFS allocation unit size to enable the best possible performance for the volumes I'm creating. You may not of heard about this (or your disk […]

Misconceptions around corruptions: can they disappear?

This has come up a few times now, most recently in an email question this morning - subsequent runs of DBCC CHECKDB show varying numbers of corruptions, and sometimes no corruptions – what's going on? Even more strange – a maintenance job runs a DBCC CHECKDB, which shows errors, but then in the morning – no […]

New research paper: Migrating Server Storage to SSDs: Analysis of Tradeoffs

At the last few conferences I've presented at, there have been questions about using SSDs (Solid-State Drives) for enterprise storage and whether that will change some of the database maintenance practices. My answer to that is "I don't know" (ha – bet you thought I was going to say "It depends!") because adoption of SSDs […]

A little-known sign of impending doom: error 825

There are two pretty well-known I/O errors – 823, and 824 – but there's also one called 825 which most DBAs do*not* know about, and definitely should. From SQL Server 2005 onwards, if you ever see an 823 or 824, SQL Server has actually tried that I/O a total of 4 times before it finally […]

Search Engine Q&A #26: Myths around causing corruption

Every so often I’ll see posts on the various data corruption forums discussing causes of corruption. In this post I want to debunk some of the myths around what can cause corruption. There are really two types of corruption to deal with, physical corruption and logical corruption. Physical corruption This is where something has altered […]

Conference Questions Pot-Pourri #7: How to create Agent alerts

Many times I've been asked to do a blog post about creating Agent alerts, and given that today I demo'd it as part of our Accidental DBA workshop at Connections, it seemed a good time to do the blog post too! I demo this in the context of alerting a DBA when an 823 or […]

Inside The Storage Engine: Does turning on page checksums discard any torn-page protection?

This is a really interesting question that came up in the Microsoft Certified Architect class I’m teaching at present – if a database has torn-page protection enabled, and page checksums are enabled, is all the existing torn-page detection lost? This is an important question, because enabling page checksums doesn’t suddenly make all allocated pages be […]

Search Engine Q&A #14: Beware 3rd party file-system drivers with DBCC CHECKDB

Here’s an issue that I thought was a one-off but it just popped up again over the weekend so I want to publicize it. DBA runs a DBCC CHECKDB and gets output like the following: Wow! Looks like something’s seriously wrong with that database. Until you find that this happens on *all* the databases on […]

Search Engine Q&A #12: Should you create multiple files for a user DB on a multi-core box?

There’s been a very interesting discussion going on over at SQLServerCentral.com about whether to create multiple files for a user database because the server has multiple CPUs – see the thread here. I wrote a couple of long responses during the thread that I wanted to duplicate here as I think it’s of broad interest. My […]

Whitepaper on Predeployment I/O Best Practices, plus a few urban legends around SQL Server IO

There’s a new whitepaper on TechNet that I’ve just come across (even though it was published 6 months ago!) called Predeployment I/O Best Practices. It’s really good – discussing the following: Guidelines for determining I/O capacity Disk configuration best practices and common pitfalls Using SQLIO to determine capacity and interpreting its results Using System Monitor […]

SQL Server 2008: Automatic Page Repair with Database Mirroring

One of the hottest features in SQL Server 2005 is database mirroring, and it's helped many companies implement successful and relatively inexpensive high-availability strategies. In SQL Server 2008, Database Mirroring has been enhanced in several ways – one of which is the ability to automatically repair corrupt pages! This feature is based on the fact that the […]

How to tell if the IO subsystem is causing corruptions?

  One of the comments I received recently is below: Hi Paul, If the corruption happens to be related to I/O Erros and there is nothing in the Event log or anywhere that points to I/O related issues, is there any Trace flag that we can enable when performing checkdb or checktable operations that can […]

Example 2000/2005 corrupt databases and some more info on backup, restore, page checksums and IO errors

This was originally posted as two posts on the SQL Server Storage Engine site. It was very popular so I’ve combined the two posts together and added a bunch more commentary – especially on page checksums and IO errors. You may also notice some color differences between the scripts from earlier in the year and today’s post […]