How are default column values stored?

An interesting question came up in class yesterday: how is a default column value stored, and what if some rows exist when a column is added and then the default value changes?

An example scenario is this:

  • Step 1: Create a table with two columns
  • Step 2: Add 10 rows
  • Step 3: Add a third column to the table with a non-null default
  • Step 4: Drop the default for the third column
  • Step 5: Add a new default for the third column

And selecting the initial 10 rows can be demonstrated to return the 3rd column using the initial default set in step 3. (It makes no difference if any rows are added between steps 3 and 4.)

This means that there *must* be two default values stored when a new column is added: one for the set of already-existing rows that don’t have the new column and one for any new rows. Initially these two default values will be the same, but the one for new rows can change (e.g. in steps 4 and 5 above) with breaking the old rows. This works because after the new column is added (step 3 above), it’s impossible to add any more rows that *don’t* have the new column.

And this is exactly how it works. Let’s investigate!

Firstly I’ll create a simple database and test table and insert 10 rows. I’ll use the simple recovery model so I can clear the log by doing a checkpoint:

CREATE DATABASE [Company];
ALTER DATABASE [Company] SET RECOVERY SIMPLE;
GO
USE [Company];
GO

-- Create a test table to use
CREATE TABLE [Test] ([c1] INT IDENTITY, [c2] INT AS ([c1]));
GO
INSERT INTO [Test] DEFAULT VALUES;
GO 10
SELECT * FROM [Test];
GO
c1          c2
----------- -----------
1           1
2           2
3           3
4           4
5           5
6           6
7           7
8           8
9           9
10          10

Now I’ll clear the log, add the third column with the default, and look to see which system tables were added to because of addition:

CHECKPOINT;
GO

-- Add column with default value
ALTER TABLE [Test] ADD [c3] CHAR (6) NOT NULL CONSTRAINT [OriginalDefault] DEFAULT 'BEFORE';
GO

SELECT [AllocUnitName] FROM fn_dblog (NULL, NULL)
WHERE [Operation] = 'LOP_INSERT_ROWS';
GO
AllocUnitName
-----------------------
sys.syscolpars.clst
sys.syscolpars.nc
sys.sysrscols.clst
sys.sysseobjvalues.clst
sys.sysschobjs.clst
sys.sysschobjs.nc1
sys.sysschobjs.nc2
sys.sysschobjs.nc3
sys.sysobjvalues.clst

Cool. These system tables have the following functions:

  • sys.syscolpars: table column definitions (relational metadata)
  • sys.sysrscols: rowset column definitions (Storage Engine metadata – info to allow interpretation of record structures on pages)
  • sys.seobjvalues: various Storage Engine values of different uses
  • sys.sysschobjs: relational objects (e.g. tables, constraints)
  • sys.sysobjvalues: various relational values of different uses

I’m going to look in these to find how the inserted rows relate to our table. I’m going to need a few IDs first (using my handy procedure to do that):

EXEC sp_allocationmetadata N'Test';
GO
Object Name Index ID Partition ID      Alloc Unit ID     Alloc Unit Type First Page Root Page First IAM Page
----------- -------- ----------------- ----------------- --------------- ---------- --------- --------------
Test        0        72057594040549376 72057594045792256 IN_ROW_DATA     (1:247)    (0:0)     (1:288)

And now I can query those system tables. Note that they’re ‘hidden’ system tables so you can’t query them unless you connect using the Dedicated Admin Connection. Easiest way to do that is to prefix your SSMS connection string with ‘admin:’ (and if you’re connecting to a remote server, the server needs to have sp_configure option remote admin connections enabled). Make sure you use the correct database after connecting, as the DAC drops you into master.

SELECT * FROM sys.syscolpars WHERE [id] = OBJECT_ID (N'Test');
GO
id        number colid name xtype utype length prec scale collationid status maxinrow xmlns dflt        chk idtval
--------- ------ ----- ---- ----- ----- ------ ---- ----- ----------- ------ -------- ----- ----------- --- ---------
245575913 0      1     c1   56    56    4      10   0     0           5      4        0     0           0   0x0A000000010000000100000000
245575913 0      2     c2   56    56    4      10   0     0           209    4        0     0           0   NULL
245575913 0      3     c3   175   175   6      0    0     872468488   3      6        0     261575970   0   NULL

These are the relational definitions of the columns in the table, and you can see that c3 is listed as having a default constraint, with ID 261575970.

SELECT * FROM sys.sysschobjs WHERE [id] = 261575970;
GO
id          name            nsid nsclass status type pid       pclass intprop created                 modified                status2
----------- --------------- ---- ------- ------ ---- --------- ------ ------- ----------------------- ----------------------- -------
261575970   OriginalDefault 1    0       131072 D    245575913 1      3       2017-04-26 13:37:42.463 2017-04-26 13:37:42.463 0

This is the constraint named OriginalDefault with type D (default) and the default value has ID 245575913.

SELECT * FROM sys.sysobjvalues WHERE [objid] = 261575970;
GO
valclass objid     subobjid valnum value imageval
-------- --------- -------- ------ ----- ----------------------
1        261575970 0        0      2     0x28274245464F52452729

And the imageval column has the default value as hex-encoded ASCII values. Using the ASCII map on Wikipedia, the value is (‘BEFORE’), including the parentheses.

So that’s the default value for new rows. What about the default value for rows that already exist?

SELECT * FROM sys.sysrscols WHERE [rsid] = 72057594040549376;
GO
rsid              rscolid hbcolid rcmodified ti   cid       ordkey maxinrowlen status offset nullbit bitpos colguid
----------------- ------- ------- ---------- ---- --------- ------ ----------- ------ ------ ------- ------ -------
72057594040549376 1       1       10         56   0         0      0           128    4      1       0      NULL
72057594040549376 2       2       10         56   0         0      0           128    8      2       0      NULL
72057594040549376 3       3       0          1711 872468488 0      0           640    12     3       0      NULL

These are the Storage Engine definitions of the columns in the table. The status value indicates that the value may not be in the row, and where to get the default value from.

SELECT * FROM sys.sysseobjvalues WHERE [id] = 72057594040549376;
GO
valclass id                subid valnum value  imageval
-------- ----------------- ----- ------ ------ --------
1        72057594040549376 3     0      BEFORE NULL

And there is the Storage Engine storage for the default value for the c3 column for those rows that existed before c3 was added.

Now I’ll checkpoint, drop the default constraint, and look to see what happened in the log:

CHECKPOINT;
GO

ALTER TABLE [Test] DROP CONSTRAINT [OriginalDefault];
GO

SELECT [AllocUnitName] FROM fn_dblog (NULL, NULL)
WHERE [Operation] = 'LOP_DELETE_ROWS';
GO
AllocUnitName
---------------------
sys.sysobjvalues.clst
sys.sysschobjs.nc1
sys.sysschobjs.nc2
sys.sysschobjs.nc3
sys.sysschobjs.clst

So that’s the relational default value being deleted, in the reverse order from how it was added. Note that the Storage Engine default value wasn’t deleted.

Now I’ll create a new default constraint for the c3 column:

CHECKPOINT;
GO

ALTER TABLE [Test] ADD CONSTRAINT [NewDefault] DEFAULT 'AFTER' FOR [c3];
GO

SELECT [AllocUnitName] FROM fn_dblog (NULL, NULL)
WHERE [Operation] = 'LOP_INSERT_ROWS';
GO
AllocUnitName
---------------------
sys.sysschobjs.clst
sys.sysschobjs.nc1
sys.sysschobjs.nc2
sys.sysschobjs.nc3
sys.sysobjvalues.clst

And doing the various queries again gets us to the new relational column stored default value of (‘AFTER’), including the parentheses.

So just to prove what I said before investigating, I’ll add ten new rows, which will have the c3 value AFTER, and then query the table and I’ll see that the initial ten rows that don’t have c3 in will be given the original default value of BEFORE:

INSERT INTO [Test] DEFAULT VALUES;
GO 10

SELECT * FROM [Test];
GO
c1          c2          c3
----------- ----------- ------
1           1           BEFORE
2           2           BEFORE
3           3           BEFORE
4           4           BEFORE
5           5           BEFORE
6           6           BEFORE
7           7           BEFORE
8           8           BEFORE
9           9           BEFORE
10          10          BEFORE
11          11          AFTER 
12          12          AFTER 
13          13          AFTER 
14          14          AFTER 
15          15          AFTER 
16          16          AFTER 
17          17          AFTER 
18          18          AFTER 
19          19          AFTER 
20          20          AFTER 

Hope you found this interesting! (And don’t forget to drop your DAC connection.)

4 thoughts on “How are default column values stored?

  1. Hello Paul,
    Had a quick question, but not exactly related to this topic.From a table, we have deleted huge amounts of data and unused space seems to be over 200 GB. Is there a way we can reclaim this space ?

    1. If the unused space is in a table, rebuild the clustered index. If it’s in a database, you could shrink or rebuild everything into another filegroup and drop the old one.

    1. Things like default column values – just like in regular databases. My guess is you’re creating and dropping lots of temp tables through concurrent connections. There was also a bug that was fixed in 2017 CU5 that you may be hitting.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.