Memory Error Recovery in SQL Server 2012

One under-publicized new feature in SQL Server 2012 is called Memory Error Recovery. This feature allows SQL Server 2012 to repair clean pages in the buffer pool that have been corrupted by “soft” memory errors by reading the pages again from disk. This works because a clean page has not been modified since it was read, so the copy on disk is still current. Soft errors are caused by electrical or magnetic interference inside a server that causes single bits inside DRAM chips to flip to the opposite state. The main cause of this is background radiation from cosmic rays.

A presentation at TechEd 2012 called “The Path to Continuous Availability with Windows Server 2012” described this as a new feature in Windows Server 2012, which implies that you will need to be running SQL Server 2012 on top of Windows Server 2012 to get this functionality.

In Windows Server 2012, the feature is called Application Assisted Memory Error Recovery. It requires the application (such as SQL Server 2012) to register for notifications of bad memory page events using CreateMemoryResourceNotification(), and then to call the QueryWorkingSetEx() API to scan its memory for bad pages.

It is likely an Enterprise Edition-only feature, but I have not confirmed this assumption yet.

You will also need ECC RAM and a processor with a memory controller that supports this feature. I don’t have a list of supported processors yet, but I am working on it. If I had to guess, I would say that an Intel Nehalem or newer processor, or an AMD Magny-Cours or newer processor, will probably be required.

If you have the hardware support, along with both Windows Server 2012 and SQL Server 2012, you will see a message like this in your SQL Server error log:

Machine supports memory error recovery. SQL memory protection is enabled to recover from memory corruption.
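
If you want to check for this message without opening the log by hand, here is a minimal Python sketch that searches error log lines for it. The function name, the sample lines, and their timestamp prefixes are made up for illustration; only the message text itself comes from the log entry quoted above.

```python
from typing import Iterable

# Leading part of the message quoted above. Real log lines usually carry a
# timestamp and source prefix, so we match on a substring, not the whole line.
RECOVERY_MESSAGE = "Machine supports memory error recovery"


def memory_error_recovery_enabled(log_lines: Iterable[str]) -> bool:
    """Return True if any line reports memory error recovery support."""
    needle = RECOVERY_MESSAGE.lower()
    return any(needle in line.lower() for line in log_lines)


if __name__ == "__main__":
    # Hypothetical sample lines; the prefixes are invented for this example.
    sample = [
        "2012-08-14 10:00:00.00 Server      Detected 16 CPUs.",
        "2012-08-14 10:00:00.01 Server      Machine supports memory error "
        "recovery. SQL memory protection is enabled to recover from memory "
        "corruption.",
    ]
    print(memory_error_recovery_enabled(sample))
```

To run this against a real error log file, you could pass an open file object instead of a list. I am assuming the file is UTF-16 encoded, which is typical for SQL Server error logs, so something like open(path, encoding="utf-16") should work, but check your own environment.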

There are a few prerequisites that you must satisfy, but this is still an interesting feature. It is one more argument that you can use when you are trying to make the case to upgrade to SQL Server 2012, on a new server with the latest version of Windows Server.

Hardware 101 Presentation in Bellevue, WA – August 14, 2012

I recently had the opportunity to give a one-hour presentation called Hardware 101: An Introduction to Database Hardware in the evening, after a full day of SQLskills Immersion Event 2 (IE2) training.

Even though it was an evening event, after almost 10 hours of intense training that day, nearly all of the students stayed to listen to me speak, which was very nice to see. I got a lot of positive feedback after I was done, which is always a good thing.