Earlier this morning I noticed a discussion on the SQL MCM distribution list (that all the original MCM instructors are part of) that was trying to make sense of a huge disparity between tempdb data file usage and log file usage. I explained the answer and thought I'd share it with you all too.
The situation was a tempdb where the data files grew from 30GB to 120GB, running out of space on the disk, but the tempdb log file did not grow at all from its initial size of 1GB! How could that be?
One of the things to consider about tempdb is that logging in tempdb is ultra-efficient. Log records for updates in tempdb, for instance, only log the before image of the data instead of logging both before and after images. There is no need to log the after image – as that is only used for the REDO portion of crash recovery. As tempdb never gets crash-recovered, REDO never occurs. The before image *is* necessary, however, because transactions can be rolled back in tempdb, just like other databases, and so the before image of an update must be available to be able to successfully roll back the update.
Getting to the question though, I can easily explain the observed behavior by considering how a sort spill happens with tempdb.
I can simulate this using a gnarly query on the SalesDB database you can download from our resources page (see the top of the page for the sample databases to download).
I'm going to do a join of my Sales and Products tables and then sort the multi-million row result set by product name:
SELECT S.*, P.* from Sales S
JOIN Products P ON P.ProductID = S.ProductID
ORDER BY P.Name;
The query plan for this is (using Plan Explorer):
I know that the sort is going to spill out of memory into tempdb in this case. First I checkpoint tempdb (to clear out the log) and then after running the query, I can analyze the transaction log for tempdb.
Looking at the operation in the log:
[Log Record Length],
FROM fn_dblog (null, null);
Current LSN Operation Context Transaction ID Len Description
———————- ————— ——– ————– — ———————————————————-
000000c0:00000077:0001 LOP_BEGIN_XACT LCX_NULL 0000:00005e4d 120 sort_init;<snip>
000000c0:00000077:0002 LOP_BEGIN_XACT LCX_NULL 0000:00005e4e 132 FirstPage Alloc;<snip>
000000c0:00000077:0003 LOP_SET_BITS LCX_GAM 0000:00005e4e 60 Allocated 1 extent(s) starting at page 0001:0000aa48
000000c0:00000077:0004 LOP_MODIFY_ROW LCX_PFS 0000:00005e4e 88 Allocated 0001:0000aa48;Allocated 0001:0000aa49;
000000c0:00000077:0005 LOP_MODIFY_ROW LCX_PFS 0000:00005e4d 80 Allocated 0001:00000123
000000c0:00000077:0006 LOP_FORMAT_PAGE LCX_IAM 0000:00005e4d 84
000000c0:00000077:0007 LOP_SET_BITS LCX_IAM 0000:00005e4e 60
000000c0:00000077:0009 LOP_COMMIT_XACT LCX_NULL 0000:00005e4e 52
000000c0:00000077:000a LOP_BEGIN_XACT LCX_NULL 0000:00005e4f 128 soAllocExtents;<snip>
000000c0:00000077:000b LOP_SET_BITS LCX_GAM 0000:00005e4f 60 Allocated 1 extent(s) starting at page 0001:0000aa50
000000c0:00000077:000c LOP_MODIFY_ROW LCX_PFS 0000:00005e4f 88 Allocated 0001:0000aa50;Allocated 0001:0000aa51;<snip>
000000c0:00000077:000d LOP_SET_BITS LCX_IAM 0000:00005e4f 60
000000c0:00000077:000e LOP_SET_BITS LCX_GAM 0000:00005e4f 60 Allocated 1 extent(s) starting at page 0001:0000aa58
000000c0:00000077:000f LOP_MODIFY_ROW LCX_PFS 0000:00005e4f 88 Allocated 0001:0000aa58;Allocated 0001:0000aa59;<snip>
000000c0:00000077:0010 LOP_SET_BITS LCX_IAM 0000:00005e4f 60
000000c0:00000077:0011 LOP_SET_BITS LCX_GAM 0000:00005e4f 60 Allocated 1 extent(s) starting at page 0001:0000aa60
000000c0:00000077:0012 LOP_MODIFY_ROW LCX_PFS 0000:00005e4f 88 Allocated 0001:0000aa60;Allocated 0001:0000aa61;<snip>
000000c0:00000077:0013 LOP_SET_BITS LCX_IAM 0000:00005e4f 60
000000c0:00000077:0014 LOP_COMMIT_XACT LCX_NULL 0000:00005e4f 52
000000c0:00000077:0015 LOP_BEGIN_XACT LCX_NULL 0000:00005e50 128 soAllocExtents;<snip>
000000c0:00000077:0016 LOP_SET_BITS LCX_GAM 0000:00005e50 60 Allocated 1 extent(s) starting at page 0001:0000aa68
000000c0:00000077:0017 LOP_MODIFY_ROW LCX_PFS 0000:00005e50 88 Allocated 0001:0000aa68;Allocated 0001:0000aa69;<snip>
000000c0:00000077:0018 LOP_SET_BITS LCX_IAM 0000:00005e50 60
000000c0:00000077:0019 LOP_SET_BITS LCX_GAM 0000:00005e50 60 Allocated 1 extent(s) starting at page 0001:0000aa70
000000c0:00000077:001a LOP_MODIFY_ROW LCX_PFS 0000:00005e50 88 Allocated 0001:0000aa70;Allocated 0001:0000aa71;<snip>
000000c0:00000077:001b LOP_SET_BITS LCX_IAM 0000:00005e50 60
000000c0:00000077:001c LOP_SET_BITS LCX_GAM 0000:00005e50 60 Allocated 1 extent(s) starting at page 0001:0000aa78
000000c0:00000077:001d LOP_MODIFY_ROW LCX_PFS 0000:00005e50 88 Allocated 0001:0000aa78;Allocated 0001:0000aa79;<snip>
000000c0:00000077:001e LOP_SET_BITS LCX_IAM 0000:00005e50 60
000000c0:00000077:001f LOP_SET_BITS LCX_GAM 0000:00005e50 60 Allocated 1 extent(s) starting at page 0001:0000aa80
000000c0:00000077:0020 LOP_MODIFY_ROW LCX_PFS 0000:00005e50 88 Allocated 0001:0000aa80;Allocated 0001:0000aa81;<snip>
000000c0:00000077:0021 LOP_SET_BITS LCX_IAM 0000:00005e50 60
000000c0:00000077:0022 LOP_COMMIT_XACT LCX_NULL 0000:00005e50 52
000000c0:00000077:0023 LOP_BEGIN_XACT LCX_NULL 0000:00005e51 128 soAllocExtents;<snip>
000000cd:00000088:01d3 LOP_SET_BITS LCX_GAM 0000:000078fc 60 Deallocated 1 extent(s) starting at page 0001:00010e50
000000cd:00000088:01d4 LOP_COMMIT_XACT LCX_NULL 0000:000078fc 52
000000cd:00000088:01d5 LOP_BEGIN_XACT LCX_NULL 0000:000078fd 140 ExtentDeallocForSort;<snip>
000000cd:00000088:01d6 LOP_SET_BITS LCX_IAM 0000:000078fd 60
000000cd:00000088:01d7 LOP_MODIFY_ROW LCX_PFS 0000:000078fd 88 Deallocated 0001:00010e68;Deallocated 0001:00010e69;<snip>
000000cd:00000088:01d8 LOP_SET_BITS LCX_GAM 0000:000078fd 60 Deallocated 1 extent(s) starting at page 0001:00010e68
000000cd:00000088:01d9 LOP_COMMIT_XACT LCX_NULL 0000:000078fd 52
000000cd:00000088:01da LOP_MODIFY_ROW LCX_PFS 0000:00005fac 80 Deallocated 0001:00000109
000000cd:00000088:01db LOP_SET_BITS LCX_SGAM 0000:00005fac 60 ClearBit 0001:00000108
000000cd:00000088:01dc LOP_SET_BITS LCX_GAM 0000:00005fac 60 Deallocated 1 extent(s) starting at page 0001:00000108
000000cd:00000088:01dd LOP_COMMIT_XACT LCX_NULL 0000:00005fac 52
(I snipped out a few extraneous log records plus the 6 extra 'Allocated' and 'Deallocated' for each of the PFS row modifications.)
One of the things I notice is that the sort spill space is allocated in extents, and almost the entire sort – from initialization, through allocating all the extents, to deallocating them – is contained in a few very large transactions. But the transactions aren't actually that large.
Look at the soAllocExtents transaction with Transaction ID 00005e50. It's allocating 4 extents – i.e. 256KB – in a single system transction (4 x mark an extent as unavailable in the GAM, 4 x bulk set the 8 PFS bytes for the 8 pages in the extent, 4 x mark an extent allocated in the IAM). The total size of the log records for this transaction is 1012 bytes. (The first soAllocExtents system transaction only allocates 3 extents, all the others allocate 4 extents.)
When the sort ends, the extents are deallocated one-at-a-time in system transactions called ExtentDeallocForSort. An example is the transaction with Transaction ID 000078fd. It generates log records totalling 400 bytes. This means each 256KB takes 4 x 400 = 1600 bytes to deallocate.
Combining the allocation and deallocation operations, each 256KB of the sort that spills into tempdb generates 2612 bytes of log records.
Now let's consider the original behavior that I explained. If the 90GB was all sort space:
90GB is 90 x 1024 x 1024 = 94371840KB, which is 94371840 / 256 = 368640 x 256KB chunks.
Each 256KB chunk takes 2612 bytes to allocate and deallocate, so our 90GB would take 368640 x 2612 = 962887680 bytes of log, which is 962887680 / 1024 / 1024 = ~918MB of log.
And this would explain the observed behavior. 90GB of tempdb space can be allocated and used for a sort spill with roughly 918MB of transaction log, give or take a bit from my rough calculations.
Tempdb logs things very efficiently – especially things that spill out of memory. The next stop in debugging such a problem would be regularly capturing the output of sys.dm_db_task_space_usage to figure out who is using all the space and then digging in from there.
Hope this helps explain things!
12 thoughts on “Understanding data vs log usage for spills in tempdb”
Excellent post! Thanks a lot for taking the the time in writing this out
Very Informative. Paul, Can you point to some article or a reference on how to read the fn_dblog output?
Thanks Paul! You are really a SQL Server guru. I’m a fan of you!
Hi Adithya – there’s nothing out there except what people have blogged about (like me) – as it’s an undocumented tool.
very informative post, many thank’s !
I encoutered on a SQL SERVER 2008 SP2 instance a TempDB Tlog full, with error message into the sql server log like ‘The transaction log for database ‘tempdb’ is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases’.
And then error number 3449 fired into sql server log with the following message :
2012-11-25 16:36:56.27 spid289 Error: 3449, Severity: 21, State: 1.
2012-11-25 16:36:56.27 spid289 SQL Server must shut down in order to recover a database (database ID 2). The database is either a user database that could not be shut down or a system database. Restart SQL Server. If the database fails to recover after another startup, repair or restore the database.
Microsoft support explaind me that when the tempdb Tlog becomes full, the sql server instance had to shutdown, and this is “by design”. It’s hard for me to believe it has I can’t reproduce the case. I know there was an issue with hotfix on similar problem in SQL Server 2005 (http://support.microsoft.com/kb/916086), but I’ve found out nothing regarding SQL Server 2008.
Do you have any idea about this ?
Thank you for your advices
This is absolutely not true. The only time you need to bounce the instance is if something has caused corruption in tempdb such that open transactions cannot roll back. That happened in your case.
I can reproduce the case with following script, on a very small Tlog for testing purpose (TLog is 2MB withour autogrowth, on a Microsoft SQL Server 2008 SP3) :
create table #test (id int, test1 char(4000))
set nocount on
declare @count int
set @count = 0
While @count <80000
insert into #test VALUES (@count, Replicate('a', 4000))
set @count = @count + 1
drop table #test
2013-04-12 14:48:01.60 spid16s Error: 9002, Severity: 17, State: 4.
2013-04-12 14:48:01.60 spid16s The transaction log for database 'tempdb' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases
2013-04-12 14:48:01.66 spid16s Error: 3314, Severity: 21, State: 4.
2013-04-12 14:48:01.66 spid16s During undoing of a logged operation in database 'tempdb', an error occurred at log record ID (284:349:270). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.
2013-04-12 14:48:01.76 spid16s Error: 3449, Severity: 21, State: 1.
2013-04-12 14:48:01.76 spid16s SQL Server must shut down in order to recover a database (database ID 2). The database is either a user database that could not be shut down or a system database. Restart SQL Server. If the database fails to recover after another startup, repair or restore the database.
2013-04-12 14:48:01.76 spid16s SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.
It's interesting to notice that with a smaller char variable, tlog can be full with 9002 error into error log, but without the 3314 error and so without stopping instance.
It's just like that sometimes, the engine doesn't have enough place into the tlog for the rollback purpose and decide to stop….
Hi Olivier – yes, I was discussing this with Bob Ward last week at SQLintersection. You’ve discovered a bug – a 9002 in tempdb should never result in a shutdown.
Hi. I appreciate your always nice posts, thank you!
I did and recommend your Transaction Log virtual training but I would like to ask you a question.
Im sorry, Im not able to reproduce it right now but, please, is it correct to say that because of the first record of the sort “000000c0:00000077:0001 LOP_BEGIN_XACT LCX_NULL 0000:00005e4d 120 sort_init;” the min LSN stay in the first VLF so the automatic checkpoint doesn’t truncate when TLog get 70% full? The Log cannot be truncated because of this active transaction, that is commited just at the end, right?
Before read your post at all I though that was because of the automatic checkpoints – lol.
Thank you so much!
Yes, correct. If there is an active transaction, all the log from the VLF containing the active transaction’s LOP_BEGIN_XACT log record onward cannot be truncated until that active transaction commits or rolls back.
Nice post Paul. I have an ongoing issue where tempdb log file alone grows up to 40 GB in a day (there is no growth for tempdb data file). I noticed some open transaction running against tempdb during business hours. Occupied space by tempdb log is not getting released after the business hours though there are no transactions against it at that time. I tried to track the SQL statement that is running by the open transaction, unfortunately it shows as “SET TEXTSIZE -1” all the time. Current SQL server version is 2012 SP2 standard. We have logged a case with Application vendor but still its in vain.
Tempdb log file data is as followed. Can you please throw some light on this issue and help me in understanding this.
Current LSN Operation Context Transaction ID Log Record Length Description
0000079d:00024460:0173 LOP_BEGIN_XACT LCX_NULL 0000:09d1be36 144 user_transaction;0x010500000000000515000000f269f05a0f557862f375d651b2550000
0000079d:00024460:0174 LOP_SHRINK_NOOP LCX_NULL 0000:09d1be36 32
0000079d:00024460:0175 LOP_INSERT_ROWS LCX_CLUSTERED 0000:09d1be36 384
0000079d:00024460:0176 LOP_INSERT_ROWS LCX_INDEX_LEAF 0000:09d1be36 344
0000079d:00024460:0177 LOP_INSERT_ROWS LCX_INDEX_LEAF 0000:09d1be36 344
0000079d:00024460:0178 LOP_INSERT_ROWS LCX_INDEX_LEAF 0000:09d1be36 84
0000079d:00024460:0179 LOP_BEGIN_XACT LCX_NULL 0000:09d1be37 132 TVQuery;0x010600000000000550000000dca88f14b79fd47a992a3d8943f829a726066357
0000079d:00024460:017a LOP_BEGIN_XACT LCX_NULL 0000:09d1be38 164 AllocHeapPageSysXactDML;0x010600000000000550000000dca88f14b79fd47a992a3d8943f829a726066357
0000079d:00024460:017b LOP_MODIFY_ROW LCX_PFS 0000:09d1be38 80 Allocated 0001:00002e68
0000079d:00024460:017c LOP_MODIFY_ROW LCX_IAM 0000:09d1be38 92
0000079d:00024460:017d LOP_HOBT_DELTA LCX_NULL 0000:09d1be38 64 Action 0 (HOBT_PAGE_COUNT) on rowset 576461717281177600. Leaf page delta: 1, Reserved page delta: 1, Used page delta: 1
0000079d:00024460:017e LOP_FORMAT_PAGE LCX_HEAP 0000:09d1be38 84
0000079d:00024460:017f LOP_COMMIT_XACT LCX_NULL 0000:09d1be38 84
0000079d:00024460:0180 LOP_INSERT_ROWS LCX_HEAP 0000:09d1be37 72
0000079d:00024460:0181 LOP_COMMIT_XACT LCX_NULL 0000:09d1be37 84
0000079d:00024460:0182 LOP_BEGIN_XACT LCX_NULL 0000:09d1be39 136 TVTruncate;0x010600000000000550000000dca88f14b79fd47a992a3d8943f829a726066357
0000079d:00024460:0183 LOP_MODIFY_ROW LCX_IAM 0000:09d1be39 92
0000079d:00024460:0184 LOP_MODIFY_ROW LCX_PFS 0000:09d1be39 80 Deallocated 0001:00002e68
0000079d:00024460:0185 LOP_HOBT_DELTA LCX_NULL 0000:09d1be39 64 Action 0 (HOBT_PAGE_COUNT) on rowset 576461717281177600. Leaf page delta: -1, Reserved page delta: -1, Used page delta: -1
0000079d:00024460:0186 LOP_HOBT_DELTA LCX_NULL 0000:09d1be39 64 Action 2 (ROWSET_ROW_COUNT) on rowset 576461717281177600. Row count delta: -1.
0000079d:00024460:0187 LOP_COMMIT_XACT LCX_NULL 0000:09d1be39 84
0000079d:00024460:0188 LOP_MODIFY_ROW LCX_CLUSTERED 0000:09d1be36 92
0000079d:00024460:0189 LOP_INSERT_ROWS LCX_CLUSTERED 0000:09d1be36 140
0000079d:00024460:018a LOP_INSERT_ROWS LCX_INDEX_LEAF 0000:09d1be36 104
Could be anything. Use my script to track open transactions with text and plans and that’ll tell you the spid etc.