How are per-column modification counts tracked?

Earlier today there was a question on the MVP mailing list asking how SQL Server keeps track of per-column modification counts.

From 2008 onwards, the hidden system table sys.sysrscols tracks modifications to table columns using the rcmodified column. The hidden system tables (introduced in 2005 when we rewrote the entire metadata management system) are only accessible when connected using the Dedicated Administrator Connection (DAC), as I've blogged about many times before. This means connecting using SQLCMD -A or with the prefix 'admin:' on your connection string.

It's also exposed in the sys.system_internals_partition_columns catalog view too, which doesn't require the DAC. (I'd forgotten this – thanks to my friend Remus Rusanu for reminding me!)

Bear in mind, however, that this is inferred behavior based on my background knowledge and observations and may well change completely in future versions – it's all undocumented and so you really shouldn't build anything that relies on this programmatically. If I don't say that then my good friend Conor Cunningham will slap me next time he sees me :-)

Here's an example using a simple table:

CREATE TABLE t1 (c1 INT, c2 INT, c3 INT);
GO

And we can look at the per-column modification counters using the DAC query:

SELECT
    p.[object_id],
    p.[index_id],
    rs.[rscolid],
    rs.[rcmodified]
FROM sys.sysrscols rs
JOIN sys.partitions p
    ON rs.[rsid] = p.[partition_id]
WHERE p.[object_id] = OBJECT_ID ('t1');
GO

And we get:

object_id   index_id    rscolid     rcmodified
———– ———– ———– ———–
277576027   0           1           0
277576027   0           2           0
277576027   0           3           0

And you can use the sys.system_internals_partition_columns view as well:

SELECT
    p.[object_id],
    p.[index_id],
    pc.[partition_column_id],
    pc.[modified_count]
FROM sys.system_internals_partition_columns pc
JOIN sys.partitions p
    ON pc.[partition_id] = p.[partition_id]
WHERE p.[object_id] = OBJECT_ID ('t1');
GO 

I'll continue with the DAC query against sysrscols directly.

If we do a modification and run the DAC query again:

INSERT INTO t1 VALUES (1, 1, 1);
GO

object_id   index_id    rscolid     rcmodified
———– ———– ———– ——————–
277576027   0           1           0
277576027   0           2           0
277576027   0           3           0

Huh? It didn't show up yet. This is because updates to some system tables are not flushed from the in-memory metadata cache until a checkpoint occurs.

Let's try that, and then run the DAC query again:

CHECKPOINT;
GO

object_id   index_id    rscolid     rcmodified
———– ———– ———– ——————–
277576027   0           1           1
277576027   0           2           1
277576027   0           3           1

And to prove it, I'll update just c2 twice, checkpoint, and run the DAC query again.

UPDATE t1 SET c2 = 2;
UPDATE t1 SET c2 = 3;
CHECKPOINT;
GO

object_id   index_id    rscolid     rcmodified
———– ———– ———– ——————–
277576027   0           1           1
277576027   0           2           3
277576027   0           3           1

Cool eh?

What about the sysindexes column rowmodctr? How is that tracked/calculated?

It's calculated as the delta of the sysrscols.rcmodified counter for the leading column in the index since the index column statistics were last rebuilt (or initially created).

Here are some simple indexes on our table, plus a sysindexes query to get the rowmodctr for it:

CREATE NONCLUSTERED INDEX t1_c1_c2 ON t1 (c1, c2);
CREATE NONCLUSTERED INDEX t1_c3 ON t1 (c3);
GO

SELECT
    [name],
    [rowmodctr]
FROM sysindexes
WHERE [id] = OBJECT_ID ('t1');
GO

name             rowmodctr
—————- ———–
NULL             3
t1_c1_c2         0
t1_c3            0

The first row is for the heap, as I didn't created a clustered index. 

Let's make some changes and see how the sysindexes.rowmodctr and sysrscols.rcmodified columns change:

UPDATE t1 SET c1 = 4;
UPDATE t1 SET c1 = 5;
UPDATE t1 SET c1 = 6;
UPDATE t1 SET c2 = 2;
UPDATE t1 SET c2 = 3;
UPDATE t1 SET c3 = 2;
CHECKPOINT;
GO

object_id   index_id    rscolid     rcmodified
———– ———– ———– ——————–
277576027   0           1           4
277576027   0           2           5
277576027   0           3           2
277576027   2           1           0
277576027   2           2           0
277576027   2           3           0
277576027   3           1           0
277576027   3           2           0

name             rowmodctr
—————- ———–
NULL             5
t1_c1_c2         3
t1_c3            1

Since the nonclustered indexes were created, I performed 3 updates to c1, 2 updates to c2, and 1 update to c3. The sysrscols.rcmodified counter for each of these columns increased by the correct number, but you can see that it isn't tracked for the columns in the nonclustered indexes themselves. Btw, the final column in each of the nonclustered indexes is referring to the hidden physical RID 'column' in the index that points back to the corresponding heap data record.

However, the sysindexes.rowmodctr changed unintuitively. I performed 5 separate modifications to the columns contained in the t1_c1_c2 index and yet its rowmodctr is only set to 3. This is because the algorithm for rowmodctr is to take the per-column delta of sysrscols.rcmodified for the leading index column since the index column statistics were last rebuilt (or initially created).

I'll prove it by updating the statistics, doing 2 modifications to c1, 4 modifications to c2 and checkpointing. We should see the sysrscols.rcmodified for c1 go to 6, for c2 go to 9, and the sysindexes.rowmodctr for t1_c1_c2 changes to 2.

UPDATE STATISTICS t1;
GO

UPDATE t1 SET c1 = 7;
UPDATE t1 SET c1 = 8;
UPDATE t1 SET c2 = 4;
UPDATE t1 SET c2 = 5;
UPDATE t1 SET c2 = 6;
UPDATE t1 SET c2 = 7;
CHECKPOINT;
GO

object_id   index_id    rscolid     rcmodified
———– ———– ———– ——————–
277576027   0           1           6
277576027   0           2           9
277576027   0           3           2
277576027   2           1           0
277576027   2           2           0
277576027   2           3           0
277576027   3           1           0
277576027   3           2           0

name             rowmodctr
—————- ———–
NULL             9
t1_c1_c2         2
t1_c3            0

Like so. Even though we updated c2 4 times, the sysindexes.rowmodctr for t1_c1_c2 is only set to 2, clearly using the sysrscols.rcmodified delta for c1.

You can play around with these examples to further convince yourself of the current behavior.

Hope this has helped remove some confusion about this. There are unfortunately many, many twists to this and when counts are modified…

Enjoy playing around!

18 thoughts on “How are per-column modification counts tracked?

  1. Hi Kimberly,
    I’ve been using the sysindexes.rowmodctr unware that the column level mod counter was also exposed in sys.system_internals_partition_columns.
    When we last talked about auto-update stats you pointed out how the sysrscols.rcmodified is the source for the auto-update stats threshold, but any custom calculations on that figure had to be done in the DAC, defeating my purpose.
    So great stuff! I’ll put it to good use!

  2. 1000 apologies Paul!
    I know why I thought it was Kimberly, I saw this shared with her pic next to it.
    When I read thru, I didn’t notice the blog header. Sorry. Still, the thanks remain! :-)

  3. Thank you very much. I have been using rowmodctr, but now I guess I can use system_internals_partition_columns to find if a column is modified.

  4. Just keep in mind that like any undocumented feature, it may be removed in a future version (as Paul stated above).

  5. Not sure how the sysindexes.rowmodctr gets modified for the heap. in your final example, you modify 4 times, however the value only jumps from 5 to 9.
    When I run the example, i notice that the values change every time I update the any record.
    Just trying to find how rowmodctr would work for any modifications of a heap.

  6. Hello Paul, i need urgent Response!
    In sql server 2005 we have Column Id in the “sys.columns” view, but the same column id is not referenced in “sys.system_internals_partition_columns” view, my goal is to find the “leaf_offset” and “leaf_null_bit” for a particular column, i can’t get it through column’s id. This problem is only in sql 2005, in onward version of sql server the column’s leaf_offset can be found through its ID.
    Please i need assistance in this scenario, any body?? thanks in advance!

  7. Hi Paul,

    Sql Server internals book mentions
    Statement Changes to colmodctr Values
    UPDATE If the update is to nonkey columns: colmodctr values for modified columns
    are increased by 1 for each row updated. If the update is to key columns: colmodctr
    values are increased by 2 for all the columns in the table, for each row updated.
    but this is not observed.

    Regards,

    1. I’m not surprised – there are lots of things in all the internals books that are incorrect. I can only speak to the correctness of the DBCC chapters in the SQL Server 2008/2012 Internals books that I wrote.

  8. Hi Paul,

    Would I be correct in stating that for a Heap the row returned shown in the RowModCtr column from sysindexes is the number of rows in the heap – it is not indicative of how many rows have changed since the last time Stats were updated?

    I was running some tests today comparing the rows returns from sysindexes and the new sys.dm_db_stats_properties DMV.

    When I update all of the Statistics in a table and run the SELECT * FROM sysindexes WHERE id = OBJECT_ID(”); command I can see 0’s in the rowmodctr column for all of the indexes but for the indid = 0 row (Heap) I see the row count.

    Thanks!

    1. No – it should be the highest rowmodctr for all columns in the table, as per the example. I’d need to play around with it to confirm as it’s been a while since I’ve done that.

  9. Hi Paul,

    I just have 1 question regarding the value that is returned from sys.sysindexes in the rowmodctr column for HEAPS.

    I have been doing some testing and I can see when I update all of the statistics on a HEAP (that is not being used) the value in this column goes to 0 for all Statistics rows for the table but never for the Heap record itself (indid = 0). The value returned is either the same as the rowcnt or near enough.

    I was passed a script to use to check when statistics on a table need updating which uses this column and I noticed then when I ran it even after updating all statistics on a heap table it always brings back the indid = 0 record as the value in rowmodctr is always close to the number of rows in the table.

    I prefer to use the sys.dm_db_stats_properties dmv anyways but would just still would like to find out out how this is defined – couldn’t find the answer online.

    Thanks as always for your time.

  10. Just to follow up I think I can see why stepping through the example the value of rowmodctr is 3 after the the Insert and 2 Updates but then it goes to 5 after the 6 Update statements and then it goes to 9 after you do the Update Stats and then next 6 Inserts.

    I guess I am just looking for clarification on how this 9 figure is generated.

    Thanks again.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.