[Edit January 2015: Everything in this post applies to SQL Server 2012 and 2014 still.]
This blog post is part of the monthly T-SQL Tuesday series that fellow-MVP Adam Machanic (twitter|blog) organizes. This month’s T-SQL Tuesday is being run by Michael Coles (twitter|blog) and is on the subject of reporting – see this blog post for details.
The theme of this month’s T-SQL Tuesday is LOB data so I’m going to stretch things a little and explain why LOB data makes shrink performance really suck (that’s a technical term :-)
You all know that I really don’t like data file shrinking (although it can sometimes be absolutely necessary. See these posts for my soap-box diatribes:
The way that data file shrink works is that it picks up pages from the end of the data file and tries to move them as far towards the beginning of the data file as it can. As part of doing this, it has to fix up any structures that page is part of.
For example, if the page is part of an index, there will be other pages in the index that have physical links to the page being moved – these linkages have to be fixed up. The left- and right-hand neighboring pages in the same level of the index are easy to fix, as the page being moved points to them.
However, the page in the next level up towards the root page in the index is more tricky, as there’s no back-link pointing up to it – a scan of the next level in the index is required to find the page that points down to the page being moved. This should be pretty fast as the index levels get much smaller pretty quickly as you move up towards the root page (the magnitude change in size depends on the fanout of the index – I explain this in the post Inside sys.dm_db_index_physical_stats).
Important point: Whenever a page is being moved and there’s some portion of a structure on the page that doesn’t have a backlink to what is pointing to it, a scan is involved.
LOB values can be stored in-row or off-row (i.e. as part of the data or index record, or in a tree of text records on different pages altogether). Legacy LOB types (text, ntext, image) are stored off-row by default. New LOB types (varchar(max), nvarchar(max), XML, varbinary(max)) are stored in-row by default, up to a maximum size of 8000 bytes as long as there is space in the record.
When a LOB value is stored off-row, there is a complex pointer stored in the data or index row (called a blob root) that contains a pointer to the physical location of the top of the tree of text records that make up the LOB value, the size of the first record being pointed at, and a timestamp (not a time, a timestamp data-type value). The text record being pointed to also contains the same timestamp.
Let’s take a look:
CREATE TABLE TextTest (c1 INT, c2 TEXT);
INSERT INTO TextTest VALUES (1, ‘a’);
Now figure out the first page in the table using DBCC IND or my sp_AllocationMetadata script, or whatever, and run DBCC PAGE on it:
DBCC TRACEON (3604);
DBCC PAGE (TestDB, 1, 737, 3);
Slot 0 Column 0 Offset 0x4 Length 4
c1 = 1
c2 = [Textpointer] Slot 0 Column 1 Offset 0xf Length 16
TextTimeStamp = 131137536 RowId = (1:152:0)
If we follow the pointer to page (1:152) and run DBCC PAGE on it, we see:
DBCC TRACEON (3604);
DBCC PAGE (foo, 1, 152, 3);
Blob row at: Page (1:152) Slot 0 Length: 84 Type: 0 (SMALL_ROOT)
Small Blob Id: 131137536 Size:1
6855C074: 61 a
Notice that the two numbers in bold match. Notice also that there’s no backlink from the off-row LOB value back to the ‘owning’ data record.
When a text page is moved by data file shrink, a table scan must be performed for each LOB value to find the ‘owning’ data record so the pointer can be updated. (If the LOB value is from an INCLUDEd column in a nonclustered index, then a similar scan must be performed of the nonclustered index. If the LOB value is lower down in the text tree for a >8000 byte LOB value, all text pages for that table or index must be scanned.)
Very slow. Very, very slow.
There’s a similar issue if DBCC CHECKDB finds an orphaned text record – it can’t tell who is supposed to own it so it has to go back and rescan all tables in its current batch to figure it out – which can make DBCC CHECKDB take a lot longer than usual too.
During SQL Server 2005 development when the new LOB types were implemented, I pushed for a backlink to alleviate these problems, but the engineering effort wasn’t worth it. So we’re stuck with it.
Yet one more reason that shrink is nasty!