TechNet Magazine: July 2010 SQL Q&A column

The July edition of TechNet Magazine is available on the web now and has the latest installment of my regular SQL Q&A column (and I forgot to blog about it a couple of weeks back).

This month's topics are:

  • Why shrink runs slower on some databases than others
  • Is it true that autogrow should always be turned off?
  • LOB storage considerations 
  • How long consistency checks will take 

Check it out at http://technet.microsoft.com/en-us/magazine/ff808322.aspx.

Why LOB data makes shrink run slooooowly (T-SQL Tuesday #006)

[Edit January 2015: Everything in this post applies to SQL Server 2012 and 2014 still.]

This blog post is part of the monthly T-SQL Tuesday series that fellow-MVP Adam Machanic (twitter|blog) organizes. This month’s T-SQL Tuesday is being run by Michael Coles (twitter|blog) and is on the subject of reporting – see this blog post for details.

The theme of this month’s T-SQL Tuesday is LOB data so I’m going to stretch things a little and explain why LOB data makes shrink performance really suck (that’s a technical term :-)

You all know that I really don’t like data file shrinking (although it can sometimes be absolutely necessary. See these posts for my soap-box diatribes:

The way that data file shrink works is that it picks up pages from the end of the data file and tries to move them as far towards the beginning of the data file as it can. As part of doing this, it has to fix up any structures that page is part of.

For example, if the page is part of an index, there will be other pages in the index that have physical links to the page being moved – these linkages have to be fixed up. The left- and right-hand neighboring pages in the same level of the index are easy to fix, as the page being moved points to them.

However, the page in the next level up towards the root page in the index is more tricky, as there’s no back-link pointing up to it – a scan of the next level in the index is required to find the page that points down to the page being moved. This should be pretty fast as the index levels get much smaller pretty quickly as you move up towards the root page (the magnitude change in size depends on the fanout of the index – I explain this in the post Inside sys.dm_db_index_physical_stats).

Important point: Whenever a page is being moved and there’s some portion of a structure on the page that doesn’t have a backlink to what is pointing to it, a scan is involved.

LOB values can be stored in-row or off-row (i.e. as part of the data or index record, or in a tree of text records on different pages altogether). Legacy LOB types (text, ntext, image) are stored off-row by default. New LOB types (varchar(max), nvarchar(max), XML, varbinary(max)) are stored in-row by default, up to a maximum size of 8000 bytes as long as there is space in the record.

When a LOB value is stored off-row, there is a complex pointer stored in the data or index row (called a blob root) that contains a pointer to the physical location of the top of the tree of text records that make up the LOB value, the size of the first record being pointed at, and a timestamp (not a time, a timestamp data-type value). The text record being pointed to also contains the same timestamp.

Let’s take a look:

CREATE TABLE TextTest (c1 INT, c2 TEXT);

GO

INSERT INTO TextTest VALUES (1, ‘a’);

GO

Now figure out the first page in the table using DBCC IND or my sp_AllocationMetadata script, or whatever, and run DBCC PAGE on it:

DBCC TRACEON (3604);

DBCC PAGE (TestDB, 1, 737, 3);

GO

<snip>

Slot 0 Column 0 Offset 0x4 Length 4

c1 = 1                              

c2 = [Textpointer] Slot 0 Column 1 Offset 0xf Length 16

TextTimeStamp = 131137536            RowId = (1:152:0)

If we follow the pointer to page (1:152) and run DBCC PAGE on it, we see:

DBCC TRACEON (3604);

DBCC PAGE (foo, 1, 152, 3);

GO

<snip>

Blob row at: Page (1:152) Slot 0 Length: 84 Type: 0 (SMALL_ROOT)

Small Blob Id: 131137536 Size:1

6855C074:        61                               a

Notice that the two numbers in bold match. Notice also that there’s no backlink from the off-row LOB value back to the ‘owning’ data record.

When a text page is moved by data file shrink, a table scan must be performed for each LOB value to find the ‘owning’ data record so the pointer can be updated. (If the LOB value is from an INCLUDEd column in a nonclustered index, then a similar scan must be performed of the nonclustered index. If the LOB value is lower down in the text tree for a >8000 byte LOB value, all text pages for that table or index must be scanned.)

Very slow. Very, very slow.

There’s a similar issue if DBCC CHECKDB finds an orphaned text record – it can’t tell who is supposed to own it so it has to go back and rescan all tables in its current batch to figure it out – which can make DBCC CHECKDB take a lot longer than usual too.

During SQL Server 2005 development when the new LOB types were implemented, I pushed for a backlink to alleviate these problems, but the engineering effort wasn’t worth it. So we’re stuck with it.

Yet one more reason that shrink is nasty!

2005 corruption bug with XML index rebuild fixed in latest CU

2005 SP3 CU6 contains a fix for a corruption bug that can happen after rebuilding an XML index that contains XML elements greater than 8000 bytes.

The bug was fixed in 2008 but hadn’t made it back to 2005 until the last CU was released. I had email from someone back in August experiencing corruption and I thought it looked like a bug – so I advised them to call PSS, who told them they were the first to report it on 2005 and it was a bug. Cool! (if you see what I mean).

If you run a DBCC CHECKDB or a DBCC CHECKTABLE after rebuilding an XML index, you’ll see 8964 errors like the one below:

DBCC results for 'sys.xml_index_nodes_980198542_32000'.
Msg 8964, Level 16, State 1, Line 2
Table error: Object ID 909246294, index ID 1, partition ID 72057594130530304, alloc unit ID 72057594149535744 (type LOB data). The off-row data node at page (1:621), slot 0, text ID 1302003712 is not referenced.

If the XML index contains XML elements that are larger than 8000 bytes, they will be pushed off-row as row overflow columns, stored in LOB pages. The XML index rebuild isn’t cleaning up its LOB pages properly so leaves orphaned ones lying around. Not really nasty corruption, but enough to set off alarm bells in your monitoring.

You can get the CU at http://support.microsoft.com/kb/974648/, and a high-level description of the bug at http://support.microsoft.com/kb/974985.