Warning: Constant WP_TEMP_DIR already defined in /var/www/html/blogs/joe/wp-config.php on line 93

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902

Warning: Cannot modify header information - headers already sent by (output started at /var/www/html/blogs/joe/wp-config.php:93) in /var/www/html/blogs/joe/wp-includes/rest-api/class-wp-rest-server.php on line 1902
{"id":912,"date":"2013-02-04T13:50:49","date_gmt":"2013-02-04T21:50:49","guid":{"rendered":"http:\/\/3.209.169.194\/blogs\/joe\/?p=912"},"modified":"2013-02-06T17:49:41","modified_gmt":"2013-02-07T01:49:41","slug":"columnstore-segment-population-skew","status":"publish","type":"post","link":"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/","title":{"rendered":"Columnstore Segment Population Skew"},"content":{"rendered":"

My last post about nonclustered Columnstore indexes was written back on February 25th, 2012 (Exploring Columnstore Index Metadata, Segment Distribution and Elimination Behaviors<\/a>).\u00a0 Amazed by how quickly time passes.<\/p>\n

Anyhow, this is a quick post on segment population skew based on parallel nonclustered Columnstore index creations.<\/p>\n

I\u2019ll use the same 123,695,104 row FactInternetSales table I used almost a year ago to demonstrate.\u00a0 I\u2019ll create the following nonclustered Columnstore index just on one column, to keep things simple:<\/p>\n

\r\nCREATE NONCLUSTERED COLUMNSTORE INDEX [NCSI_FactInternetSales]\r\nON [dbo].[FactInternetSales]\r\n(\r\n[ProductKey]\r\n);\r\n <\/pre>\n

The index takes 31 seconds to create on my laptop and it was created using 8 threads (which I can confirm via the SQL Server execution plan in, this case, SQL Sentry Plan Explorer):<\/p>\n

\"image\"<\/a><\/p>\n

Adding up the actual rows by thread, we get the 123,695,104 row count.<\/p>\n

Now if we look at sys.column_store_segments, we can see that the last few segments were populated with less than the maximum 1,048,576 rows:<\/p>\n

\r\nSELECT\u00a0 [partition_id],\r\n[column_id],\r\n[segment_id],\r\n[row_count]\r\nFROM sys.column_store_segments\r\nWHERE [row_count] = 1048576 AND\r\n[column_id] = 2;\r\n<\/pre>\n

\"image\"<\/a><\/p>\n

Now the purpose of this short post is to show what happens if we remove parallelism from the overall Columnstore index build (aside from increasing build time and reducing the memory grant):<\/p>\n

 \r\nDROP INDEX [NCSI_FactInternetSales]\u00a0 ON [dbo].[FactInternetSales];\r\nGO\r\n\r\nCREATE NONCLUSTERED COLUMNSTORE INDEX [NCSI_FactInternetSales]\r\nON [dbo].[FactInternetSales]\r\n(\r\n[ProductKey]\r\n)WITH (DROP_EXISTING = OFF, MAXDOP = 1);\r\nGO\r\n<\/pre>\n

Now instead of running in 31 seconds with 8 schedulers, this serial index build took (not surprisingly) 2 minutes and 10 seconds to build.<\/p>\n

How many segments fell beneath the 1,048,576 row count?<\/p>\n

\"image\"<\/a><\/p>\n

This time, just one segment, the last one to be populated.\u00a0 With 117 segments (segment_id 0 through segment_id 117) populated at 1,048,576 per segment, and 123,695,104 rows \u2013 our 118th segment has the remaining 1,011,712 rows.<\/p>\n

Should the more tightly packed segments provide meaningful performance gains versus the parallel-built, partially filled version?\u00a0 I haven\u2019t tested this yet, but I will at some point.\u00a0 Let me know if you get a chance to do so before I do.\u00a0 My wild guess would be that the benefit would be minor, at best \u2013 but as with most things I would like to see for myself.<\/p>\n","protected":false},"excerpt":{"rendered":"

My last post about nonclustered Columnstore indexes was written back on February 25th, 2012 (Exploring Columnstore Index Metadata, Segment Distribution and Elimination Behaviors).\u00a0 Amazed by how quickly time passes. Anyhow, this is a quick post on segment population skew based on parallel nonclustered Columnstore index creations. I\u2019ll use the same 123,695,104 row FactInternetSales table I […]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16,24,28],"tags":[],"class_list":["post-912","post","type-post","status-publish","format-standard","hentry","category-columnstore-indexes","category-indexing","category-performance"],"yoast_head":"\nColumnstore Segment Population Skew - Joe Sack<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Columnstore Segment Population Skew - Joe Sack\" \/>\n<meta property=\"og:description\" content=\"My last post about nonclustered Columnstore indexes was written back on February 25th, 2012 (Exploring Columnstore Index Metadata, Segment Distribution and Elimination Behaviors).\u00a0 Amazed by how quickly time passes. Anyhow, this is a quick post on segment population skew based on parallel nonclustered Columnstore index creations. I\u2019ll use the same 123,695,104 row FactInternetSales table I […]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/\" \/>\n<meta property=\"og:site_name\" content=\"Joe Sack\" \/>\n<meta property=\"article:published_time\" content=\"2013-02-04T21:50:49+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2013-02-07T01:49:41+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-content\/uploads\/2013\/02\/image_thumb3.png\" \/>\n<meta name=\"author\" content=\"Joseph Sack\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Joseph Sack\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/\",\"name\":\"Columnstore Segment Population Skew - Joe Sack\",\"isPartOf\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/#website\"},\"datePublished\":\"2013-02-04T21:50:49+00:00\",\"dateModified\":\"2013-02-07T01:49:41+00:00\",\"author\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/#\/schema\/person\/533eb0113a15fb5a6e8067a49e4ae648\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Columnstore indexes\",\"item\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/category\/columnstore-indexes\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Columnstore Segment Population Skew\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/#website\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/\",\"name\":\"Joe Sack\",\"description\":\"SQL Server Performance Tuning, High Availability and Disaster Recovery Blog\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/#\/schema\/person\/533eb0113a15fb5a6e8067a49e4ae648\",\"name\":\"Joseph Sack\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/a4b39a7719a6bfff1add3ec00527810734579ee114d6d983e8e68f937b77be96?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/a4b39a7719a6bfff1add3ec00527810734579ee114d6d983e8e68f937b77be96?s=96&d=mm&r=g\",\"caption\":\"Joseph Sack\"},\"description\":\"Joe Sack is a Principal Consultant with SQLskills. He has worked as a SQL Server professional since 1997 and has supported and developed for SQL Server environments in financial services, IT consulting, manufacturing, retail and the real estate industry. Prior to joining SQLskills he worked at Microsoft as a Premier Field Engineer supporting very large enterprise customer environments. He was responsible for providing deep SQL Server advisory services, training, troubleshooting and ongoing solutions guidance. His areas of expertise include performance tuning, scalability, T-SQL development and high-availability. In 2006 Joe earned the \u201cMicrosoft Certified Master: SQL Server 2005\u201d certification and in 2008 he earned the \u201cMicrosoft Certified Master: SQL Server 2008\u201d certification. In 2009 he took over responsibility for the entire SQL Server Microsoft Certified Master program and held that post until 2011. He was given the SQL Server MVP award in 2013.\",\"sameAs\":[\"http:\/\/3.209.169.194\/blogs\/joe\",\"https:\/\/twitter.com\/https:\/\/twitter.com\/josephsack\"],\"url\":\"https:\/\/www.sqlskills.com\/blogs\/joe\/author\/joe\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Columnstore Segment Population Skew - Joe Sack","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/","og_locale":"en_US","og_type":"article","og_title":"Columnstore Segment Population Skew - Joe Sack","og_description":"My last post about nonclustered Columnstore indexes was written back on February 25th, 2012 (Exploring Columnstore Index Metadata, Segment Distribution and Elimination Behaviors).\u00a0 Amazed by how quickly time passes. Anyhow, this is a quick post on segment population skew based on parallel nonclustered Columnstore index creations. I\u2019ll use the same 123,695,104 row FactInternetSales table I […]","og_url":"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/","og_site_name":"Joe Sack","article_published_time":"2013-02-04T21:50:49+00:00","article_modified_time":"2013-02-07T01:49:41+00:00","og_image":[{"url":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-content\/uploads\/2013\/02\/image_thumb3.png"}],"author":"Joseph Sack","twitter_misc":{"Written by":"Joseph Sack","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/","url":"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/","name":"Columnstore Segment Population Skew - Joe Sack","isPartOf":{"@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/#website"},"datePublished":"2013-02-04T21:50:49+00:00","dateModified":"2013-02-07T01:49:41+00:00","author":{"@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/#\/schema\/person\/533eb0113a15fb5a6e8067a49e4ae648"},"breadcrumb":{"@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/columnstore-segment-population-skew\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.sqlskills.com\/blogs\/joe\/"},{"@type":"ListItem","position":2,"name":"Columnstore indexes","item":"https:\/\/www.sqlskills.com\/blogs\/joe\/category\/columnstore-indexes\/"},{"@type":"ListItem","position":3,"name":"Columnstore Segment Population Skew"}]},{"@type":"WebSite","@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/#website","url":"https:\/\/www.sqlskills.com\/blogs\/joe\/","name":"Joe Sack","description":"SQL Server Performance Tuning, High Availability and Disaster Recovery Blog","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.sqlskills.com\/blogs\/joe\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/#\/schema\/person\/533eb0113a15fb5a6e8067a49e4ae648","name":"Joseph Sack","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.sqlskills.com\/blogs\/joe\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/a4b39a7719a6bfff1add3ec00527810734579ee114d6d983e8e68f937b77be96?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a4b39a7719a6bfff1add3ec00527810734579ee114d6d983e8e68f937b77be96?s=96&d=mm&r=g","caption":"Joseph Sack"},"description":"Joe Sack is a Principal Consultant with SQLskills. He has worked as a SQL Server professional since 1997 and has supported and developed for SQL Server environments in financial services, IT consulting, manufacturing, retail and the real estate industry. Prior to joining SQLskills he worked at Microsoft as a Premier Field Engineer supporting very large enterprise customer environments. He was responsible for providing deep SQL Server advisory services, training, troubleshooting and ongoing solutions guidance. His areas of expertise include performance tuning, scalability, T-SQL development and high-availability. In 2006 Joe earned the \u201cMicrosoft Certified Master: SQL Server 2005\u201d certification and in 2008 he earned the \u201cMicrosoft Certified Master: SQL Server 2008\u201d certification. In 2009 he took over responsibility for the entire SQL Server Microsoft Certified Master program and held that post until 2011. He was given the SQL Server MVP award in 2013.","sameAs":["http:\/\/3.209.169.194\/blogs\/joe","https:\/\/twitter.com\/https:\/\/twitter.com\/josephsack"],"url":"https:\/\/www.sqlskills.com\/blogs\/joe\/author\/joe\/"}]}},"_links":{"self":[{"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/posts\/912","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/comments?post=912"}],"version-history":[{"count":0,"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/posts\/912\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/media?parent=912"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/categories?post=912"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/joe\/wp-json\/wp\/v2\/tags?post=912"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}