[Edit 3/12/2019: it looks like this bug, or a similar one, is also in SQL Server 2016. I’ll post an update once I have it.]
[Edit 1/20/2020: still in 2016 SP2 CU8 as of today.]
Over the years I’ve discussed log space reservation, which is when SQL Server automatically reserves some free space in the transaction log so that in-flight transactions can always be rolled back without the log having to grow. This is because rolling back a transaction requires generating more log records, and so there needs to be guaranteed space for them. If this did not happen, the log could fill up, in-flight transactions would begin to roll back, a log grow attempt might fail, and the database then goes suspect or into recovery because the in-flight transactions are essentially stuck.
The algorithm is pretty conservative, and I fixed a few bugs in it before SQL Server 2005 shipped.
There hasn’t been a case of it failing to reserve enough space until SQL Server 2012, when a bug was introduced. That bug was discovered by someone I was working with in 2015 (which shows just how rare the circumstances are), and at the time it was thought that the bug was confined to the log of tempdb filling up, rollback failing, and the server shutting down.
However, just last week I was contacted by someone running SQL Server 2012 SP3 who’d seen similar symptoms but for a user database this time, and the user database went into recovery. An example of the error log messages is below (altered for anonymity):
During under of a logged operation in database 'mydb', an error occurred at log record ID (2445:89001:23). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database. The log for database 'mydb' is not available. Check the event log for related error messages. Resolve any errors and restart the database. Error during rollback, shutting down database (location: 1) Database mydb was shutdown due to error 3314 in routine 'XdesRMReadWrite::RollbackToLsn'. Restart for non-snapshot databases will be attempted after all connections to the database are aborted. The transaction log for database 'mydb' is full due to 'ACTIVE_TRANSACTION'. The transaction log for database 'mydb' is full due to 'ACTIVE_TRANSACTION'. The transaction log for database 'mydb' is full due to 'ACTIVE_TRANSACTION'. 'D:\Logs\mydb.ldf: Operating system error 112(There is not enough space on the disk.) encountered.
I suggested that they’d hit the known bug, and they confirmed that with Microsoft.
And if you hit it for tempdb, the server will shut down, as tempdb being unavailable means SQL Server has no choice but to stop.
The bug is described in KB article 2963384 and is included in SQL Server 2012 SP4 and SQL Server 2014 SP1. If you’re running 2012 SP3 then you should install SP4, and if you’re running 2014 RTM then you should install the latest 2014 SP.
I didn’t blog about the bug back in 2015 as only one person had hit it and the circumstances seemed incredibly rare, but now that seems to not be the case.
Stay safe out there!
24 thoughts on “Bug: database/server ‘shutdown due to error 3314’”
I read many times the algorithm will estimate how much log space must be reserved and it is pretty conversative about it, but I don’t get why it is a hard thing. I suppose the reservation is calculated by the time the log record is already created, so I thought it will be possible to know how much space it will need to rollback.
Is there any kind of difficulty which makes the calculation impossible, or is it just too expensive to be accurate that it’s better to aim high and reserve more than is actually needed?
Thanks a lot,
Because the undo can make decisions at run-time about log records, so sometimes it’s impossible to know *exactly* how much space is required for undo during the forward phase of a transaction. All failures to use enough are bugs.
Thanks a lot for the additional info. Any change you meant “All failures to *reserve* enough are bugs” in the last sentence?
Use for the reservation, or you could say reserve.
Thanks for this, just a small correction:
Microsoft confirmed that this has been fixed in 2012 SP2 but you mentioned as SP4
Here is the URL:
Yep hit this multiple times on SQL 2012 in 2015. Worst it was production and the log was multiple TBs so took about 6 hours to come online (compounded by slow disk issues). Then failed again. And again. That was a bad few weeks.
There were a few contributing factors and I don’t remember which were specifically involved.
One was the developers deleting TBs of data in massive transactions, and increased the number of concurrent bulk inserts 4x beyond the 26 core count because everything was so slow and they just wanted stuff faster…
The other was poor disk capacity planning where both the log drive and the backup share were fixed in size, slow, and would repeatedly and easily fill. Sigh.
I don’t remember if I logged a Connect item about it because it was documented somewhere at the time and seemed as-designed to me but I did talk pretty sternly to the Infrastructure team and devs. After that it went away.
We just had this very instance occur with our SQL Server 2017 Standard Edition server. We thought it was a dev wreaking havoc on our server and locking up tables with some uncommitted transaction but the exact issue happened (at the same time) in both our UAT and PROD environments.
I’m wondering if reducing the autogrowth property on the tran log would benefit here.
Correction 2016 box – not 2017. (Have a totally unrelated regression bug on our 2017 instance…)
You should definitely contact Product Support to let them know about this.
Hi Paul, we recently had this on server running with version 11.0.7462 which is already in sql 2012 SP4, so this issue still pertain to 2012? Please advise.
In that case you should call Microsoft CSS for assistance as it could be different issue, or the bug isn’t fixed in all cases.
The bug is still alive and kicking on Microsoft SQL Server 2016 (SP2-CU8) (KB4505830) – 13.0.5426.0 (X64).
Just opened a case to MS Support
I’ll update the post.
Our problem was on a user database and we ran out of space on the transaction log disk (disk full).
The MS CSS told us that is normal after a situation like this for the database to go in Recovery.
That’s not correct – there should be enough transaction log reserved space for all active transactions to roll back without the database going into recovery.
Same happened to me in SQL server 2016 RTM.
We are on Microsoft SQL Server 2012 (SP4-GDR) and still facing the bug.
2020-03-15 12:37:06.83 spid60 The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x000080f242a000 in file ‘E:\Data2\XXX.mdf:MSSQL_DBCC17’. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
2020-03-15 12:37:06.85 spid60 Error: 3314, Severity: 17, State: 3.
2020-03-15 12:37:06.85 spid60 During undoing of a logged operation in database XXX, an error occurred at log record ID (1487705:1847695:92). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.
2020-03-15 12:37:06.86 spid60 Database XXX was shutdown due to error 3314 in routine ‘XdesRMReadWrite::RollbackToLsn’. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.
We are running Checkdb with physical only as the database is approx. 2TB and Data2 throwing the error has over 600GB free space.
Looks like you need to upgrade from 2012.
We are getting similar error and we update our sql server 2014 version with SP3 and CU4 but still we could see the issue. Can you please suggest us what we have to do??
It’s a bug – you’ll need to contact Microsoft for assistance – nothing else you can do.
Hi, the bug is still alive on Microsoft SQL Server 2016 (SP2-CU12)
We just upgraded a windows 2016 server with SQL 2019 to cu8 and started getting this error with our SSISDB (130 compat). Do you know if it was ever fixed and or if its a confirmed bug now in 2019 as well?
I never heard if it was fixed, but if you’re seeing it, there’s still a bug, because 100% of these failures are a logging bug. I would open a support case with Microsoft.
We just hit this bug on SQL2016 SP3 (13.0.6419.1).