I was interested in exploring various questions about columnstore indexing this morning – and I realized I should probably blog about what I observed. This truly was just an exploration and not a formal test (call it semi-structured fiddling). Also, some of my “questions” were really just confirmations. While I believe what I read, I like to see things firsthand whenever possible.
Today I thought I would take a few minutes to explore columnstore index behavior on a 123,695,104 row fact table. My previous tests had been limited to < 10 million rows, which doesn’t really showcase columnstore index capabilities. So I thought I would experiment with higher numbers – although I still want to start the billion row testing scenarios at some point.
The experimentations took place on a SQL Server instance capped at 8GB of RAM max server memory and 8 logical processors. I used the FactInternetSales table from AdventureWorksDWDenali – increasing it to 123,695,104 rows.
I covered the entire fact table for my initial exploration (although as you’ll see, I made a few changes both to this index and also moving from a heap table to a clustered index):
CREATE NONCLUSTERED COLUMNSTORE INDEX [NonClusteredColumnStoreIndex-20120225-092018] ON [dbo].[FactInternetSales]
) WITH (DROP_EXISTING = OFF)
Remember that we can have up to 1,024 columns and that the concept of key columns and INCLUDE don’t apply here. Nor does the ASC or DESC keywords or creation of the columnstore index as clustered (only nonclustered is supported at this point).
Regarding the data types, we can cover the standard “business” ones – but not the “max”, LOB and xml flavors.
<quick aside> By the way – a shout out to Eric Hanson’s Columnstore Index references on the TechNet Wiki. These have been invaluable in understanding columnstore indexing and also revealing tools to help fish for your own answers. I love seeing this kind of high quality content coming directly from the Program Managers. I really hope that more PMs start updating their associated feature areas on TechNet Wiki in the future. When I first heard about TechNet Wiki I was more than a little dubious. We already have blogs, forums, BOL, papers, videos, etc. Why do we need another information channel? Seeing Eric’s content has totally shifted my opinion on this. The information is timely, fresh (you see it get updated frequently) and easily discoverable. It also allows for community participation and questions – which, if managed well – only helps the associated quality of the content. I don’t believe TechNet Wiki replaces BOL or other information channels, but I feel it has definitely earned a seat at the table. Again, I hope other PMs follow this lead – or alternatively use their blogs to heavily describe their feature areas (much like the great blog post series on the AlwaysOn Readable Secondary feature from Sunil Agarwal). Just doing periodic updates on their pet features can help do wonders for feature adoption and comprehension. </quick aside>
Back on task… The following are the various questions I asked and the various observations made….
Was tempdb used in creating the columnstore index on a heap table?
I intentionally had tempdb configured to an initial small size with auto-growth enabled in order to see if it was used at all during the CREATE COLUMNSTORE INDEX against the 123 million row table (I could have looked through counters and other ways too – this was just a side effect observation). The answer? It was not used (staying at a very small value of 10 MBs).
How do things look in sys.indexes and sys.index_columns?
Nothing terribly interesting – except that the is_included_column is a value of “1”:
Anything interesting in sys.column_store_dictionaries?
Most definitely… This was a pretty illuminating catalog to query. A few of the observations:
· There were 149 rows across the 26 columns defined in the index.
· 25 columns had one row – and 1 column had 125 rows. Not surprisingly – the SalesOrderNumber was the column with 125 entries. This is also the column I populated with a random GUID that I converted to nvarchar(20) for new entries to the table. The values for this column, while not constrained, were nearly unique in my test (with the exception of the base data).
· The on disk (dictionary) storage for SalesOrderNumber was by far the worst (not surprisingly). It was 2,328,411,337 in bytes. Compare that with 21,384 bytes for an integer column, 74,584 bytes for a datetime column and 1,088 bytes for a money data type column. So this would point to a design decision around not including 100% unique values as part of the index – or if you do, choosing a much more compression friendly data type mapped to an associated dimension table.
· The entry_count column_store_dictionaries column was also very interesting. For example, the int data type ProductKey had 158 entries. The ShipDate datetime column had 1,126 entries. The SalesOrderNumber had 123,662,365 entries. No magic here (and the difference between those entries and the 123,695,104 row count was that the initial data populated didn’t use my random GUID values).
· The type column from the results told me about the type of dictionary… For the columns with “1”, that maps to hash dictionary for integer based columns. A type of “4” represented my float columns. Type “3” represented the string values.
· The flags (“internal use only”) column from column_store_dictionaries was “0” for all columns except for UnitPriceDiscountPct (float), DiscountAmount (float), OrderDate/DueDate/ShipDate (datetime). The flags column was “2” for the CustomerPONumber (nvarchar(25)) and CarrierTrackingNumber (nvarchar(25)). Flags was 0 for everything else, so I’m definitely curious about this.
Before I continued, I dropped and recreated the columnstore index, this time without the SalesOrderNumber. It wasn’t going to provide any value for my next set of explorations.
With the SalesOrderNumber column removed, leaving the other columns on the columnstore index, what was the on-disk footprint?
310,400 bytes for the dictionary size – for 123 million rows (per sys.column_store_dictionaries). I double checked sys.dm_db_partition_stats as well and the lob_used_page_count was 304,579. So the page count is around 2.3 GB – and the dictionary size is 310 KB. Again, this is for 123,695,104 rows. The heap itself is 2,876,461 pages – or roughly 22.4 GB.
Anything interesting in sys.column_store_segments?
Another great catalog view to explore with many interesting ways to slice and dice this data:
· All 26 columns had 120 segments associated with them. But what is interesting is that this is for 26 columns – even though my columnstore index was recreated with just 25 columns. So it was segmenting based on each column of the heap (seemingly). If you enable trace flag 646 and 3605, you’ll see during index creation a message like “Column 25 was omitted from VertiPaq during column index build.” So it made me wonder if this really did represent 120 segments for the omitted column?
· The average row count per segment was 1,030,792 rows and a maximum of 1,048,576.
· The minimum rows for a segment was 2,416.
· The 2,416 row count was for all columns on segment number 118. This was not the last segment. Segment 119 was the last segment (first segment is 0-based) for each column and each one of these had 1,040,352. Eric Hanson gave an explanation on a forum which talks about segments being built on parallel threads and when hitting the end of the table, not having enough rows to completely fill segments.
· As for encoding_type, that was also interesting. Some columns had both an encoding_type of 1 AND 2. Only the first five columns of the table had an encoding type of “1” and then columns 1 through 25 had an encoding type of “2”. Columns 21 and 22 (Freight and CarrierTrackingNumber) had an encoding of 3. Freight is money data type and CarrierTrackingNumber is nvarchar(25). Column 26 (ShipDate datetime) had an encoding type of “4”. In BOL – encoding_type is defined as “type of encoding used for that segment”. Sigh.
· The min_data_id and max_data_id value rangers were also interesting. Sometimes they reflected the actual column data ranges (for example in the date columns) and sometimes not. Looking at “Understanding Segment Elimination” – Eric Hanson describes this behavior – saying that the values in these ranges can also reference “into a dictionary”.
Anything interesting in sys.column_store_index_stats?
The only noteworthy value was the number_of_segments (3,120).
Can I see some segment elimination for my queries? (for non-string types on a heap table)
Remember that I created a columnstore index on a heap. Let’s say we look at the segment ranges for the ProductKey column (column id 1). Here are the associated ranges:
SELECT column_id, min_data_id, max_data_id, segment_id
WHERE column_id = 1
ORDER BY column_id, min_data_id, max_data_id, segment_id;
All 120 segments have identical min and max ranges:
What about other columns – like the DueDate (datetime)?
Same thing applies – each segment, with the exception of a trailing row, have the same ranges.
What about segment elimination on a table with a clustered index?
So next I dropped the columnstore index, and created a clustered index on ProductKey. I’m not saying this is a great choice for the clustered index key – but rather, I’m just trying to understand the behavior in comparison to a heap and also look at any potential segment elimination.
In the following query, I’m pulling total order quantity for three products:
— Segment elimination written to error log
DBCC TRACEON(3605, –1);
DBCC TRACEON(646, –1);
GROUP BY ProductKey, DueDateKey
HAVING ProductKey IN
(478, 343, 574);
I first validated that batch mode was indeed being used:
I then looked at the SQL Error Log (per the trace flag) – and I saw the storage engine skipped 28 “row group” (segments):
If I look at the segment meta data though for column_id 1, and I see blocks of segments covering the same ranges (instead of the same ranges across all segments):
SELECT segment_id, min_data_id, max_data_id
WHERE column_id = 1
ORDER BY segment_id
By the way, the 28 segment elimination I saw in the error log for column 1 translated to 27,226,112 rows across (row_count from sys.column_store_segments). The query itself takes less than 1 second to return 569 rows – from a 120 million+ row table.
Will I get segment elimination on other columns not part of the clustered index key?
I tried the following query that filtered on DueDateKey:
GROUP BY ProductKey, DueDateKey
HAVING DueDateKey = 20040522
Nope. While the query still ran in less than a second – using batch instead of row mode for the columnstore index scan, segment elimination event did not occur. Looking at the min and max ranges from sys.column_store_segments helps answer why segment elimination wasn’t possible:
SELECT column_id, min_data_id, max_data_id, segment_id
WHERE column_id = 3 AND
20040522 BETWEEN min_data_id AND max_data_id
This returns all 120 rows:
What if the DueDateKey is a secondary key column on the clustered index?
I dropped the columnstore index and recreated the clustered index with ProductKey and DueDateKey. I then recreated the columnstore index.
Checking the DueDateKey range in sys.column_store_segments, I see the ranges are no longer identical:
That seemed promising, so I executed the query filtering by DueDateKey:
Sure enough – I had segment elimination, even though this was defined as the second column in the clustered index key. And what’s more, my ProductKey column segment elimination was still working (I tested the earlier queries again). This raises some interesting questions around new clustered index key strategies when you know your table will primarily be used with a columnstore index.
Okay – that’s enough for today. I still have many other scenarios I’d like to try out, and I’ll share here when I get the opportunity.
3 Responses to Exploring Columnstore Index Metadata, Segment Distribution and Elimination Behaviors
Great write up. We just implemented columnstore indexes and I’ve been scouring the web to understand exactly what is going on behind the scenes, tuning opportunities, etc. In our environment we have the columnstore index on a partitioned table. When I review the execution plan I noticed that the partitions scanned is now 0. I’m wondering why this is the case. Does it have something to do with partition elimination working with segmentation elimination that a scan is just not necessary any longer because now SQL will know exactly where to go. Or is the batch vs row processing? The same query using the cluster without the columnstore will show how many partitions it had to scan. Do you have any feedback on this behavior?
Per your observation of the partitions scanned being 0 – this doesn’t surprise me given that this is not a b-tree structure. The partition elimination concept as applied to table partitions, applies to traditional structures, not the columnstore index. If you are doing partition elimination on the B-Tree (not on the columnstore index), then I would expect stats to appear for this.
And I don’t believe that this is articulated specifically in any documentation, so I’m just deriving this from observation and assumption.