Search Engine Q&A #27: How does the storage engine find variable-length columns?

This question came up in a class I was teaching a few weeks ago: given that the variable-length column offset array in a record only stores offsets, how does the storage engine find a particular variable-length column?

The question comes because there's nothing in the record to say which variable-length column is which – so how does it work? The answer is a combination of the null bitmap in the record, plus the metadata for the table/index stored in the system tables. All variable-length columns have a fixed 'location' within the variable-length portion of the record when they are non-null. Let's see what I mean.

First off I'll create a test table with a single record, with all variable-length columns null, and dump out the record using DBCC PAGE:

CREATE TABLE vartest (c1 INT, c2 VARCHAR (100), c3 VARCHAR (100), c4 varchar (100));
GO
INSERT INTO vartest VALUES (1, NULL, NULL, NULL);
GO

DBCC TRACEON (3604);
DBCC PAGE (test, 1, 152, 1);
GO

Slot 0, Offset 0x60, Length 11, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP
Memory Dump @0x66F4C060

00000000: 10000800 01000000 0400fe†††††††††††††………..

The 0x0400fe in bold-underlined is the null bitmap. The 0x0400 reverses to 0x0004 – which is the count of columns in the record. The 0xfe is the actual bitmap, which is 11111110 in binary. So all columns in the record except the first are null (although the null bitmap only needs to store bits for 4 columns, all bits in the null bitmap that aren't used are set to indicate a null). You can also see that there's no variable-length column offset array as *all* the variable-length columns are null.

Now I'll make the middle variable-length column non-null and dump out the record using DBCC PAGE:

UPDATE vartest SET c3 = 'c3c3c3c3';
GO
DBCC PAGE (test, 1, 152, 1);
GO

Slot 0, Offset 0x60, Length 25, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Memory Dump @0x67FEC060

00000000: 30000800 01000000 0400fa02 00110019 †0……………
00000010: 00633363 33633363 33†††††††††††††††††.c3c3c3c3

We can see the null bitmap (bold-underlined) has changed to 0xfa in hex, or 11111010 in binary, to reflect that the first and third columns are non-null. As soon as a single variable-length column in the record is non-null, the variable-length column offset array is populated for all variable-length columns up to and including the last non-null column. The array has the count of entries in the array (the 0200, which reverses to 0x0002), and then for each column it stores the offset to the start of the following column, to avoid storing the length too. The difference between successive start-of-column offsets is the length, with the first length calculated using the end of the array itself as the starting point of the first column. In the record above, the array has two entries, ending at offsets 0x0011-1 and 0x0019-1 in the record. The array itself finishes at 0x0010, so the first entry in the array is essentially pointing at an empty value, which we know to be actually a null value (instead of a non-null empty value) using the null bitmap.

Now if I make the first variable-length column non-null, watch what happens to the record:

UPDATE vartest SET c2 = 'c2c2c2c2';
DBCC PAGE (test, 1, 152, 1);
GO

Slot 0, Offset 0x60, Length 33, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Memory Dump @0x6714C060

00000000: 30000800 01000000 0400f802 00190021 †0…………..!
00000010: 00633263 32633263 32633363 33633363 †.c2c2c2c2c3c3c3c
00000020: 33†††††††††††††††††††††††††††††††††††3

The null bitmap has changed from 0xfa to 0xf8 to reflect the newly non-null column. The variable-length portion is re-ordered so that the newly non-null column is in its correct place (you can see the c2c2c2c2c2 comes before the c3c3c3c3 in the hex dump of the record) and the offsets have been updated accordingly. The array still only hold two values though. Now if I set the last column to be non-null:

UPDATE vartest SET c4 = 'c4c4c4c4';
DBCC PAGE (test, 1, 152, 1);
GO

Slot 0, Offset 0x81, Length 43, DumpStyle BYTE
Record Type = PRIMARY_RECORD Record Attributes = NULL_BITMAP VARIABLE_COLUMNS
Memory Dump @0x66FCC081

00000000: 30000800 01000000 0400f003 001b0023 †0…………..#
00000010: 002b0063 32633263 32633263 33633363 †.+.c2c2c2c2c3c3c
00000020: 33633363 34633463 346334†††††††††††††3c3c4c4c4c4

The null bitmap now indicates all 4 columns are non-null. The count of columns in the variable-length column offset array has increased to 3 and the new entry is added on the end, as it is the last variable-length column.

You can conceptually think of the algorithm to obtain a variable-length column value as merging the relevant parts of the null bitmap with the variable-length column offset array, and then returning the value if it's non-null.

Hope this explains things!

One thought on “Search Engine Q&A #27: How does the storage engine find variable-length columns?

  1. Thanks for explanation!!! It confirmed my knowledge about var-length columns.

    >> and then for each column it stores the offset to the start of the following column, to avoid storing the length too.
    Could it be that var-length column offset array stores the offset to the start of the current column? I have such cases, and I think this is due to th edifferent number of columns in the table’s description and in the record.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.