A SQL Server DBA myth a day: (1/30) in-flight transactions continue after a failover

(Look in the Misconceptions blog category for the rest of the month’s posts and check out the 60-page PDF with all the myths and misconceptions blog posts collected together when you join our Insider list, plus my online Myths and Misconceptions training course on Pluralsight.)

I’ve got a session coming up at SQL Connections in a couple of weeks called DBA Mythbusters. And I’ve got a whole stack of DBA myths and misconceptions around SQL Server to choose from (I came up with 69 just in an hour on the plane last weekend) and I won’t be able to cover them all in the session, so I thought it would be cool to do a Debunk-A-Myth-A-Day blog post series in April. And so here we are.

Every day (hopefully!) I’ll pick a myth and lay it to rest. They won’t be long posts and they may be links to previous posts, but hopefully they’ll be useful.

Myth #1: After a failover, any in-flight transactions are continued.

FALSE

Any time a failover occurs, some form of crash recovery has to occur. If the transaction hadn’t committed on the server that the connection was to when the server/database went down, there’s no way for SQL Server to automatically recreate all the transaction context and resurrect the in-flight transaction on the failover server – whether the failover used clustering, mirroring, log-shipping, or SAN replication.

For failover clustering, when a failover occurs, the SQL instance is restarted on another cluster node. All the database go through crash recovery – which will roll-back all uncommitted transactions. (See my TechNet Magazine article from February 2009 for an explanation of crash recovery and how it works: Understanding Logging and Recovery in SQL Server.)

For database mirroring, the transaction log on the mirror database is constantly being used to perform redo operations (of the log records coming from the principal). When the mirror switches to being the principal, the transaction log switches into crash-recovery mode and recovers the database as if a crash had occured – and then lets connections into the database.

For log shipping, transaction log backups are being replayed on the log shipping secondary database(s) periodically. When a failure on the primary occurs, the DBA brings the log shipping secondary online by completing the restore sequence (either bringing the database immediately online, or replaying all possible transaction log backups to bring the database as up-to-date as possible). The final part of any restore sequence is to run the undo portion of crash recovery – rolling back any uncommitted transactions.

With SAN replication, I/Os to the local SAN are shipped to the remote SAN to be replayed. When a failure occurs, the system connected to the remote SAN comes online and the databases go through crash recovery, in just the same way as for failover clustering (this is an overly simplistic explanation – but you get the idea).

The *only* technology that allows unbroken connections to a database when a failure occurs, is using virtualization with a live migration feature, where the VM comes up and the connections don’t know they’re talking to a different physical host server of the VM.

But whatever mechanism you’re using – if a *connection* is broken, the in-flight transaction is lost. So your application needs to be coded to gracefully cope with broken connections and some form of transaction retry.

1 down – 29 to go!

13 thoughts on “A SQL Server DBA myth a day: (1/30) in-flight transactions continue after a failover”

Adam Machanic says:

April 1, 2010 at 10:53 am

But, but, but!!! The marketing people told us that mirroring would allow for "seamless" failover! Are you trying to say they didn’t tell us the entire truth???

Reply
Jonathan Gardner says:

April 1, 2010 at 10:56 am

The entire post all I could think to my self I hope he talks about how developers should develop to handle broken connection so I can send this article to my developers and say, "See someone way smarter than myself said it too". Thanks for the ammo.

Reply
Ozziemedes says:

April 20, 2010 at 7:49 pm

Hi Paul,

I know SQL Server itself can’t preserve transaction state, but it’s not particularly difficult to catch an exception client side, wait until the SQL Server is available again, then replay the transaction. Isn’t this why MSDTC gets enrolled in the SQL Server resource group? i.e. To support the clients that do implement this persist and retry pattern?

Cheers

JH

Reply
Dan Fugett says:

July 30, 2013 at 2:24 pm

Paul – you seem to be recommending virtual on production system. What is your response to this guy who says he can prove virtual machines can lose data (high availability doesnt mean much if I can lose data)?? http://www.sqlsolutions.com/articles/articles/SQL_Server_and_VMware-A_Potentially_Fatal_Combination.htm

Reply
1. Paul Randal says:
  
  August 13, 2013 at 1:50 pm
  
  Nonsense – VMs are used in production very commonly in the SQL Server world.
  
  Reply
Gabriele Del Giovine says:

September 3, 2013 at 6:54 am

Sure, yoy can use a virtualized SQLServer in production and rely only in Vmotion….But what happen when you need to update the OS or the SQLServer instance? Do you offer to your clients a cup of tea, a big one :-)

Reply
1. Paul Randal says:
  
  September 3, 2013 at 7:19 am
  
  You use a virtual cluster – or virtualize mirroring or AG. No tea required.
  
  Reply
tslate says:

October 17, 2014 at 8:51 am

If a connection is reset by peer or the connection socket is closed by the server (in our case a random Winsock reset which seems to be a known issue) how exactly does one recover? I was hoping there was a socket level solution but apparently not. This is not a typical OLTP where get pool connection, one row is updated, commit, put connection back to pool. These are processes that accumulate 1000’s of transactions easily in seconds but randomnly although infrequently we can get socket closed on the connection. Do you know of any way to recover those uncommitted CUD operations on a closed socket connection?

Thanks

Reply
1. Paul Randal says:
  
  October 17, 2014 at 12:39 pm
  
  No – once the connection closes, SQL Server has no choice but to roll back any uncommitted transactions.
  
  Reply
Martyn Lawrenson says:

January 27, 2018 at 1:00 am

Thanks Paul, is there anyway to failover an ag manually in order to perform maintenance on one of the nodes, but wait until uncommitted transactions have completed?

Reply
1. Paul Randal says:
  
  January 27, 2018 at 1:52 pm
  
  No – I don’t believe so.
  
  Reply
Satinder says:

July 26, 2018 at 9:53 am

Thank you. Paul – for all this

Will in-flight transactions continue after a AAG failover ?
Same DB backup jobs are scheduled on both Primary & Secondary ( where backup preference is Primary only )
DB backup job is running on the Primary Instance, and after few minutes AG failover to Secondary will that Backup job continue to the new Primary and it is not then you to handle this ….

Reply
1. Paul Randal says:
  
  July 29, 2018 at 3:23 pm
  
  No – in-flight transactions are always rolled back after any failover provided by SQL Server technology.
  
  Reply

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

A SQL Server DBA myth a day: (1/30) in-flight transactions continue after a failover

13 thoughts on “A SQL Server DBA myth a day: (1/30) in-flight transactions continue after a failover”

Leave a Reply Cancel reply

Other articles

The Curious Case of… missing 832 errors during memory corruption

The Curious Case of… recovering from a ransomware attack

30%-off health checks and self-service via credit card – no waiting!

The Curious Case of…. alcoholism

The Mentoring Class of 2023

Mentoring 2023 – want some help from me?

Imagine feeling confident enough to handle whatever your database throws at you.