sqlskills-logo-2015-white.png

Which LOB pages are associated with a specific columnstore segment?

Chalk this post up as an “exploratory” mission based on the following question…

How do I track which LOB pages are associated with a specific columnstore index segment?

Jonathan Kehayias and I discussed this the other day and hashed out a few options for tracking this.

To illustrate this topic and keep it at a granular level – I used a simple table named FactIDDemo with a bigint FactID column that had a unique clustered index on it.  Again – exploratory and not intended to be a realistic implementation pattern.

I loaded the table with 1,048,576 rows. And the segment statistics were as follows:

SELECT  [partition_id],
[segment_id],
[row_count],
[min_data_id],
[max_data_id]
FROM    sys.[column_store_segments]
WHERE   [partition_id] = 72057594043236352 ;

image

How many LOB pages were allocated for this single segment?

SELECT  [partition_id],
[object_id],
[index_id],
[partition_number],
[in_row_used_page_count],
[lob_used_page_count],
[used_page_count],
[row_count]
FROM sys.[dm_db_partition_stats]
WHERE [partition_id] = 72057594043236352;

image

We see 351 used LOB pages and executing DBCC IND confirms this as well, outputting the page ids accordingly:

DBCC IND('BigFactTable', 'FactIDDemo', 2);

That command returned 351 rows – one of which was the IAM page and the remainder text pages.

I also created the following XE session to validate page access (testing on a cold cache using my session ID as well):

CREATE EVENT SESSION [Columnstore Page Access] ON SERVER
ADD EVENT sqlserver.physical_page_read(
ACTION(sqlserver.session_id)
WHERE ([sqlserver].[session_id]=(57)))
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,
MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,
TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO

And I used the following test query to initiate the physical page read events:

SELECT  COUNT([FactID]) AS [FactIDCount]
FROM [dbo].[FactIDDemo]
WHERE [FactID] BETWEEN 1 AND 1048576;
GO

This query resulted in 349 events related directly to the columnstore index access.  That is different from the 351 page count from sys.[dm_db_partition_stats] and DBCC IND output.  The XE event didn’t capture the IAM page reference (in my case, page 1870780) – and it also didn’t retrieve page 1870777 which was a pagetype 3 (LOB page) and when I looked at it via DBCC PAGE, didn’t show BLOB_FRAGMENT sections.

Segments are the unit of transfer for columnstore index access.  While this is the logical unit of transfer, a segment is stored as one or more LOB pages – and to track that physical access, it seems that the sqlserver.physical_page_read is a viable way of doing so.

2 thoughts on “Which LOB pages are associated with a specific columnstore segment?

  1. Hi Joe,

    I created an XE Session for the physical read event using SSMS and monitored the physical page reads for a table in which there is a columnstore index. This table has a LOB_Used_Page_Count of 1018. There are no other large objects other than ColumnStore Index data.

    After cleaning the buffers, When I ran a query which selects all the columns in the columnstore index, and watch the live data, the physical read events vary anywhere between 788 to 1148. I filtered to made sure that no other session’s data is coming in and could clearly see that these events are created due to the execution of the query using columnstore index.

    Can you please explain why are the physical reads varying each time I execute the query after cleaning the buffers, if the same data residing in each page is read. Am I missing something?

    Thanks,
    Jayashree

    1. Hi Jayashree,

      A few questions:

      Is there concurrent activity on this SQL Server instance or is this an isolated environment?

      Is the variation easily reproducible? And if so – can you share your test framework and I can look at it firsthand?

      Cheers,

      Joe

Comments are closed.

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.