A SQL Server DBA myth a day: (6/30) three null bitmap myths

(Look in the Misconceptions blog category for the rest of the month’s posts and check out the 60-page PDF with all the myths and misconceptions blog posts collected together when you join our Insider list, plus my online Myths and Misconceptions training course on Pluralsight.)

Today’s myth is a multi-parter especially for the folks in the Designing for Performance class that Kimberly’s teaching on the MS campus today – hellooooo!

The null bitmap keeps track of which columns in a record are null or not. It exists as a performance optimization to allow the Storage Engine to avoid having to read all of a record into the CPU when null columns are part of the SELECT list – thus minimizing CPU cache line invalidations (check out this link for details of how CPU memory caches work and the MESI protocol). There are three pervasive myths to debunk here:

Myth #6a: The null bitmap isn’t always present.

TRUE

The null bitmap is *always* present in data records (in heaps or the leaf-level of clustered indexes) – even if the table has no nullable columns – except in the special case where the record is only made up of non-NULL SPARSE columns. The null bitmap is *not* always present in index records (leaf level of nonclustered indexes, and non-leaf levels of clustered and nonclustered indexes).

Here’s a simple script to prove it:

CREATE TABLE [NullTest] ([c1] INT NOT NULL);
CREATE NONCLUSTERED INDEX [NullTest_NC] ON [NullTest] ([c1]);
GO
INSERT INTO [NullTest] VALUES (1);
GO

EXEC sp_AllocationMetadata N'NullTest';
GO

You can get my sp_AllocationMetadata script from my post Inside The Storage Engine: sp_AllocationMetadata – putting undocumented system catalog views to work.

Use the page IDs from the First Page column in the script’s output. Do the following:

DBCC TRACEON (3604);
DBCC PAGE (foo, 1, 152, 3); -- page ID from SP output where Index ID = 0
DBCC PAGE (foo, 1, 154, 1); -- page ID from SP output where Index ID = 2
GO

From the first DBCC PAGE dump of the heap data record:

Slot 0 Offset 0x60 Length 11

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP
Memory Dump @0x685DC060

From the second DBCC PAGE dump of the nonclustered index record:

Slot 0, Offset 0x60, Length 13, DumpStyle BYTE

Record Type = INDEX_RECORD           Record Attributes = No null bitmap
Memory Dump @0x685DC060

Myth #6b: The null bitmap only contains bits for nullable columns.

FALSE

The null bitmap, when present, contains bits for all columns in the record, plus ‘filler’ bits for non-existent columns to make up complete bytes in the null bitmap. I already debunked this one with a long internals blog post last May – see Misconceptions around null bitmap size.
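The arithmetic behind those ‘filler’ bits is easy to check. Here’s a minimal sketch (in Python, purely to illustrate the arithmetic – the 2-byte column-count field and one-bit-per-column layout are from the record format described above; this is not SQL Server code):

```python
import math

def null_bitmap_bytes(total_columns: int) -> int:
    """On-disk size of the null bitmap portion of a record:
    a 2-byte column count, then one bit per column (nullable
    or not), rounded up to whole bytes - the round-up is the
    'filler' bits."""
    return 2 + math.ceil(total_columns / 8)

# 1 through 8 columns all fit in one bitmap byte;
# a 9th column forces a second byte.
for cols in (1, 8, 9, 16, 17):
    print(cols, null_bitmap_bytes(cols))
```

So a table with 8 columns and one with a single column pay the same 3 bytes, and the cost only grows by a byte at each multiple-of-8 boundary.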

Myth #6c: Adding another column to the table always results in an immediate size-of-data operation.

FALSE

The only times that adding a column to a table results in a size-of-data operation (i.e. an operation that modifies every row in a table) are:

  • When the new column has a non-null default and is not nullable
  • When the new column has a non-null default, IS nullable, and you’re using SQL Server 2008 R2 or earlier

In all other cases, the Storage Engine remembers that there are one or more additional columns that may not actually be present in the records themselves. I explained this in a little more depth in the blog post Misconceptions around adding columns to a table.
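The idea can be sketched with a toy model (Python, with made-up names like `ToyTable` – this is an illustration of the concept, not SQL Server’s actual record handling): rows written before the ALTER simply lack the new column, and reads fill in the default from metadata.

```python
class ToyTable:
    """Toy model of a metadata-only column add: existing rows
    are never rewritten; the new column's default is materialized
    at read time from table metadata."""
    def __init__(self, columns):
        self.columns = dict(columns)  # column name -> default value
        self.rows = []                # each row stores only the columns present at insert

    def insert(self, **values):
        self.rows.append(values)

    def add_column(self, name, default=None):
        # Metadata-only change: no row is touched.
        self.columns[name] = default

    def select(self):
        # Columns missing from a stored row are filled in on read.
        return [{c: row.get(c, d) for c, d in self.columns.items()}
                for row in self.rows]

t = ToyTable({"c1": None})
t.insert(c1=1)
t.add_column("c2", default=0)  # no size-of-data pass over t.rows
print(t.select())              # the old row reads back with c2 = 0
```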

Back to a single myth-a-day for tomorrow – today was a bulk-bin special!

6 thoughts on “A SQL Server DBA myth a day: (6/30) three null bitmap myths”

  1. Hi Paul,

    Can you please explain why the non-clustered index record is taking 13 bytes? Shouldn’t it take 16 bytes?

    4 bytes for the record header
    4 bytes for the fixed length column (C1)
    8 bytes for the RID

    Also, is it true that the index record will take 0 bytes for null bitmap and variable length columns since there is no null bitmap and no variable length columns?

  2. Hi Paul,

    I have a question about the NULL bitmaps & Sparse columns

    “The null bitmap is *always* present in data records (in heaps or the leaf-level of clustered indexes) – even if the table has no nullable columns – except in the special case where the record is only made up of non-NULL SPARSE columns”

    Since sparse columns have to be NULLable, I am guessing that the values in the sparse columns themselves need to be not NULL?

    So if the record of a nullable sparse column has to be without NULLs to get no NULL_BITMAP on the record, and I see this happening for the DATA PAGES, but not for the index pages of the NC index, why is that?

    CREATE TABLE dbo.NonNullSparse( VAL INT SPARSE NULL)
    CREATE NONCLUSTERED INDEX IX_NonNullSparse_VAL
    on dbo.NonNullSparse(Val);

    INSERT INTO dbo.NonNullSparse(VAL)
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
    FROM MASTER..spt_values spt1;

    Data page:
    Record Type = PRIMARY_RECORD Record Attributes = VARIABLE_COLUMNS

    Index page:
    Record Type = INDEX_RECORD Record Attributes = NULL_BITMAP

  3. Paul,
    Given that the null bitmap always contains all columns, what are the pros/cons of making a CHAR(2) (not varchar) column nullable?

    In my app, empty string & null are treated the same, so that’s not a deciding concern.

    Since the overhead of a varchar(2) is 2 bytes + 2 chars (always 2 chars or empty in this data set) = 4 bytes, would I be better off switching to varchar if the table has over 50% of the rows with null?

    Besides the storage consideration, what about CPU/execution of queries/indexes, the table has 13 columns (only ints, numeric, small chars) but 10s of millions of rows.

    Thoughts?

    1. It’s extremely hard to say without prototyping both methods along with load testing of the usual range of queries, but I would be tempted to leave it as a char(2).

