This is a question that comes up every so often, most recently this morning while teaching a private class (and Kimberly's teaching now): how large is the forwarded record back-pointer? (And I haven't posted anything geeky for a while…)
In a heap it is possible to get forwarding and forwarded records. They occur when a record in a heap expands such that it no longer fits on the page it currently resides on. In this case, the record is moved to a new page, and a small forwarding record is left in the original location. The forwarding record points to the new location of the record, which is known as a forwarded record. This is done as a performance optimization so that all the nonclustered indexes on the heap do not have to be altered with the new location of the heap record.
As an aside, when a query uses a nonclustered index to satisfy a query, but needs more columns from the table record to fill the result set column list, it must go to the actual data record to retrieve the extra columns. It does this by using the table record locator that is stored in the nonclustered index record. If the table is a heap, the record locator is the physical location of the data record in the heap.
If the table is a clustered index (remember that a table can be organized as a heap OR a clustered index, not both), the record locator is the set of cluster keys of the data record. Both of these record locators are guaranteed to be unique. For a heap record locator, the record lookup (commonly called a bookmark lookup) goes directly to the physical location of the record. For a clustered index record locator, the record lookup uses the cluster keys to navigate down through the clustered index to the leaf level.
If a forwarding record occurs in a heap, when the record locator points to that location, the Storage Engine gets there and says Oh, the record isn't really here – it's over there! And then it has to do another (potentially physical) I/O to get to the page with the forwarded record on. This can result in a heap being less efficient that an equivalent clustered index.
There has been lots of discussion about whether it's better to have a heap or a clustered index – generally we recommend a clustered index, but there are special cases where a heap may be fine. This post isn't about that and I won't get drawn into a Comments argument about it. Go bug Kimberly :-)
Back to the point of the post. So the record has moved to a new location and there's a small record left in the original location which helps bookmark lookups.
Oh yeah, one more aside. The Storage Engine, when scanning the heap, will not process the forwarded record UNLESS it has reached it by following the forwarding record. This prevents race conditions where the record could be processed twice – if the forwarding operation occurs during a large table scan.
Ok, really back to the point of the post. What happens if the original record grows again and has to move again? Does it leave ANOTHER forwarding record when it moves to the second new location – creating a chain of forwarding records?
The answer is no. The *original* forwarding record is updated with the new location of the forwarded record. This can only be done if the forwarded record points *back* to the forwarding record – which it does.
The question becomes- how big is the forwarding record back-pointer? It's not dumped out by DBCC PAGE (sorry, when I rewrote DBCC PAGE for SQL 2005 I forgot to put that in there). Let's work it out using a script.
First off I'm going to create a database and table to play with.
CREATE DATABASE DBMaint2008;
CREATE TABLE DbccPageTest (intCol1 INT IDENTITY, intCol2 INT, vcharCol VARCHAR (8000), lobCol VARCHAR (MAX));
INSERT INTO DbccPageTest VALUES (1, REPLICATE ('Row1', 600), REPLICATE ('Row1Lobs', 1000));
INSERT INTO DbccPageTest VALUES (2, REPLICATE ('Row2', 600), REPLICATE ('Row2Lobs', 1000));
Using DBCC IND ('DBMaint2008', 'DbccPageTest', -1) , I find that the data page is page ID (1:154). If I dump out that page using DBCC PAGE, I can see both of the records fully contained on the page.
Now I'm going to update the second row to make it 8000+ bytes – forcing it to move to a new page.
UPDATE DbccPageTest SET vcharCol = REPLICATE ('LongRow2', 1000) WHERE intCol2 = 2;
And now looking at page (1:154) with DBCC PAGE again, I can see that the second row has been replaced with:
Slot 1 Offset 0x137a Length 9
Record Type = FORWARDING_STUB Record Attributes =
Memory Dump @0x66F4D37A
00000000: 049d0000 00010000 00†††††††††††††††††………
Forwarding to = file 1 page 157 slot 0
The record has been replaced with a forwarding record – pointing to the new location of the record on page (1:157).
Now DBCC PAGE doesn't dump out the back-pointer in the forwarded record – so how can we tell how large it is? I'm going to see how large the record is, and then create a clustered index on the table. This will remove the back-pointer from the record without changing anything else about the record. The difference in size will be the back-pointer size.
Doing DBCC PAGE on page (1:157) I get (partial results):
Slot 0 Offset 0x60 Length 8057
Record Type = FORWARDED_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Now creating a clustered index and then dumping the page with the second row on it (left as an exercise for the reader :-)):
CREATE CLUSTERED INDEX Dbcc_CL ON DbccPageTest (intCol1);
<figure out which page to look at>
DBCC PAGE ('DBMaint2008', 1, 169, 3);
Slot 0 Offset 0x60 Length 8047
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
[Edit 062210 – added a more in-depth explanation]
The difference in size is 10 bytes, which is the size of the back-pointer. This isn't entirely obvious, but here's the explanation for the size change:
- The back-pointer in the heap record is a certain size, plus two bytes for the offset in the variable-length column offset array (see Search Engine Q&A #27: How does the storage engine find variable-length columns? for more info)
- The clustered index doesn't have the back-pointer at all, but it's a nonunique clustered index, so does have an empty uniquifier column in every record, which takes at least two bytes for the offset in the variable-length column offset array
- Given that changing from a heap to a nonunique clustered index gives a record size change of 10 bytes, with the removal of one column and addition of another, the 10-byte difference must be the back-pointer
Note: if you create a unique clustered index above, you'll see a record size change of 12 bytes – as the uniquifier column is not generated.
And it is. And here's how it breaks down:
2 bytes for the special column ID (1024) at the start of the back-pointer signifying that this is a back-pointer
8 bytes for the record location (2-byte file ID, 4-byte page-in-file, 2-byte slot ID)
Hope this helps!