You may have already heard that SQL Server 2012 adds automatic generation of statistics for AlwaysOn availability group readable secondaries, storing the statistics in tempdb.  This functionality was also extended to other areas beyond just AlwaysOn availability groups as I’ll demonstrate next…

Using the AdventureWorksDWDenali database, I’ll start by checking the current statistics on the FactInternetSales table:

USE [AdventureWorksDWDenali];

GO

-- Confirm statistics on FactInternetSales

EXEC sp_helpstats 'FactInternetSales';

/*

statistics_name statistics_keys

_WA_Sys_00000008_1273C1CD  SalesTerritoryKey

_WA_Sys_0000000A_1273C1CD  SalesOrderLineNumber

_WA_Sys_0000000B_1273C1CD  RevisionNumber

_WA_Sys_0000000C_1273C1CD  OrderQuantity

_WA_Sys_0000000D_1273C1CD     UnitPrice

_WA_Sys_0000000E_1273C1CD  ExtendedAmount

_WA_Sys_0000000F_1273C1CD  UnitPriceDiscountPct

_WA_Sys_00000010_1273C1CD  DiscountAmount

_WA_Sys_00000011_1273C1CD  ProductStandardCost

_WA_Sys_00000012_1273C1CD  TotalProductCost

_WA_Sys_00000013_1273C1CD  SalesAmount

_WA_Sys_00000014_1273C1CD  TaxAmt

_WA_Sys_00000015_1273C1CD  Freight

_WA_Sys_00000016_1273C1CD  CarrierTrackingNumber

_WA_Sys_00000017_1273C1CD  CustomerPONumber

_WA_Sys_00000018_1273C1CD  OrderDate

_WA_Sys_00000019_1273C1CD  DueDate

_WA_Sys_0000001A_1273C1CD  ShipDate

*/

Next, I’m going to add a UnitPrice value to an existing row (creating just one row out of the table with this value):

UPDATE TOP (1) dbo.FactInternetSales

SET UnitPrice = 777.77;

After that, I’ll drop the statistics for the UnitPrice column.  Why?  I want demonstrate a scenario where automatic statistics may not have already been generated due to incoming query requests – but may still be needed for queries against the database snapshot:

-- UnitPrice stats

DROP STATISTICS FactInternetSales._WA_Sys_0000000D_1273C1CD;

GO

Now I’ll create a database snapshot on the database:

-- Create the database snapshot

CREATE DATABASE AdventureWorksDWDenali_AWDW_S1

ON

( NAME = AdventureWorksDWDenali_Data,

FILENAME =

'C:\Program Files\Microsoft SQL Server\MSSQL11.CAESAR\MSSQL\Data\AdventureWorksDWDenali_AWDW_S1.ss' )

AS SNAPSHOT OF AdventureWorksDWDenali;

GO

What happens if I use the database snapshot to query for a specific UnitPrice value?  There are no statistics on the read-write database and now that I’m using the database snapshot, I’m not triggering automatic-statistics on the source database:

USE AdventureWorksDWDenali_AWDW_S1;

GO

SELECT ProductKey, ProductStandardCost

FROM dbo.FactInternetSales

WHERE UnitPrice = 777.77;

Looking at the actual plan for this query, I see that the estimated rows vs. actual do indeed match (and as an aside – the optimizer “guess” for this predicate would have been 3,852.72 without the supporting stats):

clip_image001

But I can also check to see if there are automatic statistics generated by qualifying is_temporary = 1:

SELECT name, stats_id

FROM sys.stats

WHERE OBJECT_ID = object_id('FactInternetSales')

     AND is_temporary = 1;

This returns:

clip_image003

The stats name even gave me an additional hint of its origins via the “readonly” naming convention.  If I look up the column for the automatically generated temporary stats, I see the following:

SELECT     object_name(object_id) tblname,

           COL_NAME(object_id, column_id) columnname

FROM sys.stats_columns

WHERE stats_id = 512000;

clip_image005

If you need a supporting index, you still need to provide this structure in the read-write database, but when a query can be optimized using statistics alone - this is definitely a welcome new feature.

I gave a presentation on columnstore indexing last week and one question I received was regarding the performance of a (hypothetical) narrow, supporting nonclustered index versus a columnstore index.  We discussed how nonclustered indexes were still going to be more effective for specific types of queries (for example – singleton operations, smaller range scans, etc.).  But what about a situation where we’re aggregating data across the entire fact table? Was there any way that a narrow nonclustered index could measure up against a columnstore index?

The ability to contrast the performance is of course very much dependent on the query construction, table design, data distribution and other factors.  With that said, I thought I would test out a scenario as follows…

First of all, I used FactInternetSales - pumped up to 123,695,104 rows.  The query I used for comparing performance was as follows (with a few hint-modifications to force index and parallelism caps that I used later):

SELECT     p.ProductLine,

           SUM(f.SalesAmount) TotalSalesAmount

FROM [dbo].[FactInternetSales] f

INNER JOIN [dbo].[DimProduct] p ON

     f.ProductKey = p.ProductKey

GROUP BY p.ProductLine

ORDER BY p.ProductLine;

 

I created the following nonclustered indexes (index on joining keys – and then an INCLUDE on the SalesAmount which is being aggregated):

CREATE NONCLUSTERED INDEX [NCI_FactInternetSales_ProductKey]

ON [dbo].[FactInternetSales]

([ProductKey] ASC)

INCLUDE ( [SalesAmount]) ON [PRIMARY];

GO

 

CREATE NONCLUSTERED INDEX [NCI_DimProduct_ProductKey_ProductLine] ON [dbo].[DimProduct]

(    [ProductKey] ASC,

     [ProductLine] ASC

)ON [PRIMARY];

GO

 

I also created a columnstore index that covered all columns in the fact table.  To force my query to ignore that index for the comparison test, I added the following line:

OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);

 

So how did the performance compare between a supporting nonclustered index versus the columnstore index? (By the way, I used a warm cache for both tests – executing twice):

Index Support

Elapsed Time

CPU Time

Columnstore

256 ms.

1045 ms.

Nonclustered index

34189 ms.

33322 ms

 

The columnstore index plan was as follows:

clip_image002

The nonclustered index plan was as follows:

clip_image004

One obvious difference is that the columnstore index execution leveraged parallelism.  The query also benefited from batch execution mode:

clip_image006

If I hobble the columnstore index by capping max degree of parallelism, the results (in comparison) are as follows (third row represents the test):

Intended Index Support

Elapsed Time

CPU Time

Mode

Columnstore

256 ms.

1045 ms.

Batch

Nonclustered index

34189 ms.

33322 ms.

Row

Columnstore index (MAXDOP 1) – ending up using NCI

33540 ms.

34560 ms.

Row

 

Looking at the plan, we see the columnstore index wasn’t used at all:

clip_image008

If I force use of the columnstore index with MAXDOP 1, I see:

Index Support

Elapsed Time

CPU Time

Mode

Columnstore

256 ms.

1045 ms.

Batch

Nonclustered index

34189 ms.

33322 ms.

Row

Columnstore index (MAXDOP 1) – ending up using NCI

33540 ms.

34560 ms.

Row

Columnstore index (MAXDOP 1) – forced columnstore index

29859 ms.

30107 ms.

Row

 

The plan shows the index is used, but it is row mode (due to capping parallelism):

clip_image010

clip_image012

Looping back to the inciting discussion – even with a supporting nonclustered index, this specific query which scanned all 123 million rows and returned a smaller result set with aggregated values performed significantly better when using a columnstore index in conjunction with batch execution mode.

When moving from SQL Server 2008+ to SQL Server 2012, be aware that the sampling algorithm has changed.

For example, I created a partitioned clustered index on two identical tables (same schema, rows and distribution), one in SQL Server 2008 R2 SP1 and the other in SQL Server 2012 RC0. 

Below shows the partial output (STAT_HEADER) of a DBCC SHOW_STATISTICS on SQL Server 2008 R2 SP1:

clip_image001

All rows were evaluated for the statistics generation.

Now compare this to SQL Server 2012 RC0:

clip_image002

SQL Server 2012 now uses default sampling for the partition creation and rebuild.

Is this good or bad?

Certainly if you’ve required a full scan in the past in order to get higher quality stats, this may not be a good thing.  Otherwise you may not notice anything at all from a query performance perspective if the default is good (or accurate) enough.

You can see the number of steps decreased in my examples in this post between 2008 R2 SP1 to SQL Server 2012.  My histograms for this identical table also changed of course.  For example, moving from average rows on one step of 5504 down to 5305, distinct range rows from 343 up to 361... The real question is, will these changes be enough to turn a good plan bad. 

Post-upgrade, if you see changes in query performance against partitioned tables, evaluate the histograms and test performance of the workload after the default sampling versus after a full scan update of the statistics (via UPDATE STATISTICS for example).  

I was interested in exploring various questions about columnstore indexing this morning – and I realized I should probably blog about what I observed.  This truly was just an exploration and not a formal test (call it semi-structured fiddling).  Also, some of my “questions” were really just confirmations.  While I believe what I read, I like to see things firsthand whenever possible.

Today I thought I would take a few minutes to explore columnstore index behavior on a 123,695,104 row fact table.  My previous tests had been limited to < 10 million rows, which doesn’t really showcase columnstore index capabilities.  So I thought I would experiment with higher numbers – although I still want to start the billion row testing scenarios at some point.

The experimentations took place on a SQL Server instance capped at 8GB of RAM max server memory and 8 logical processors.  I used the FactInternetSales table from AdventureWorksDWDenali – increasing it to 123,695,104 rows. 

I covered the entire fact table for my initial exploration (although as you’ll see, I made a few changes both to this index and also moving from a heap table to a clustered index):

CREATE NONCLUSTERED COLUMNSTORE INDEX [NonClusteredColumnStoreIndex-20120225-092018] ON [dbo].[FactInternetSales]

(

       [ProductKey],

       [OrderDateKey],

       [DueDateKey],

       [ShipDateKey],

       [CustomerKey],

       [PromotionKey],

       [CurrencyKey],

       [SalesTerritoryKey],

       [SalesOrderNumber],

       [SalesOrderLineNumber],

       [RevisionNumber],

       [OrderQuantity],

       [UnitPrice],

       [ExtendedAmount],

       [UnitPriceDiscountPct],

       [DiscountAmount],

       [ProductStandardCost],

       [TotalProductCost],

       [SalesAmount],

       [TaxAmt],

       [Freight],

       [CarrierTrackingNumber],

       [CustomerPONumber],

       [OrderDate],

       [DueDate],

       [ShipDate]

) WITH (DROP_EXISTING = OFF)

GO

Remember that we can have up to 1,024 columns and that the concept of key columns and INCLUDE don’t apply here.  Nor does the ASC or DESC keywords or creation of the columnstore index as clustered (only nonclustered is supported at this point). 

Regarding the data types, we can cover the standard “business” ones – but not the “max”, LOB and xml flavors.

<quick aside> By the way – a shout out to Eric Hanson’s Columnstore Index references on the TechNet Wiki.  These have been invaluable in understanding columnstore indexing and also revealing tools to help fish for your own answers.  I love seeing this kind of high quality content coming directly from the Program Managers.  I really hope that more PMs start updating their associated feature areas on TechNet Wiki in the future.  When I first heard about TechNet Wiki I was more than a little dubious.  We already have blogs, forums, BOL, papers, videos, etc.  Why do we need another information channel?  Seeing Eric’s content has totally shifted my opinion on this.  The information is timely, fresh (you see it get updated frequently) and easily discoverable.  It also allows for community participation and questions – which, if managed well – only helps the associated quality of the content.   I don’t believe TechNet Wiki replaces BOL or other information channels, but I feel it has definitely earned a seat at the table.  Again, I hope other PMs follow this lead – or alternatively use their blogs to heavily describe their feature areas (much like the great blog post series on the AlwaysOn Readable Secondary feature from Sunil Agarwal).   Just doing periodic updates on their pet features can help do wonders for feature adoption and comprehension. </quick aside>

Back on task… The following are the various questions I asked and the various observations made….

Was tempdb used in creating the columnstore index on a heap table?

I intentionally had tempdb configured to an initial small size with auto-growth enabled in order to see if it was used at all during the CREATE COLUMNSTORE INDEX against the 123 million row table (I could have looked through counters and other ways too – this was just a side effect observation). The answer? It was not used (staying at a very small value of 10 MBs).

How do things look in sys.indexes and sys.index_columns?

Nothing terribly interesting - except that the is_included_column is a value of “1”:

clip_image002

Anything interesting in sys.column_store_dictionaries?

Most definitely… This was a pretty illuminating catalog to query.  A few of the observations:

·        There were 149 rows across the 26 columns defined in the index.

·        25 columns had one row – and 1 column had 125 rows.  Not surprisingly – the SalesOrderNumber was the column with 125 entries.  This is also the column I populated with a random GUID that I converted to nvarchar(20) for new entries to the table.  The values for this column, while not constrained, were nearly unique in my test (with the exception of the base data).

·        The on disk (dictionary) storage for SalesOrderNumber was by far the worst (not surprisingly).  It was 2,328,411,337 in bytes.  Compare that with 21,384 bytes for an integer column, 74,584 bytes for a datetime column and 1,088 bytes for a money data type column.  So this would point to a design decision around not including 100% unique values as part of the index – or if you do, choosing a much more compression friendly data type mapped to an associated dimension table.

·        The entry_count column_store_dictionaries column was also very interesting.  For example, the int data type ProductKey had 158 entries.  The ShipDate datetime column had 1,126 entries.  The SalesOrderNumber had 123,662,365 entries. No magic here (and the difference between those entries and the 123,695,104 row count was that the initial data populated didn’t use my random GUID values).

·        The type column from the results told me about the type of dictionary… For the columns with “1”, that maps to hash dictionary for integer based columns.  A type of “4” represented my float columns. Type “3” represented the string values.

·        The flags (“internal use only”) column from column_store_dictionaries was “0” for all columns except for UnitPriceDiscountPct (float), DiscountAmount (float), OrderDate/DueDate/ShipDate (datetime).  The flags column was “2” for the CustomerPONumber (nvarchar(25)) and CarrierTrackingNumber (nvarchar(25)).  Flags was 0 for everything else, so I’m definitely curious about this.

Before I continued, I dropped and recreated the columnstore index, this time without the SalesOrderNumber.    It wasn’t going to provide any value for my next set of explorations.

With the SalesOrderNumber column removed, leaving the other columns on the columnstore index, what was the on-disk footprint?

310,400 bytes for the dictionary size – for 123 million rows (per sys.column_store_dictionaries).  I double checked sys.dm_db_partition_stats as well and the lob_used_page_count was 304,579.  So the page count is around 2.3 GB – and the dictionary size is 310 KB.  Again, this is for 123,695,104 rows.  The heap itself is 2,876,461 pages – or roughly 22.4 GB.

Anything interesting in sys.column_store_segments?

Another great catalog view to explore with many interesting ways to slice and dice this data:

·        All 26 columns had 120 segments associated with them. But what is interesting is that this is for 26 columns – even though my columnstore index was recreated with just 25 columns.  So it was segmenting based on each column of the heap (seemingly).  If you enable trace flag 646 and 3605, you’ll see during index creation a message like “Column 25 was omitted from VertiPaq during column index build.” So it made me wonder if this really did represent 120 segments for the omitted column?

·        The average row count per segment was 1,030,792 rows and a maximum of 1,048,576. 

·        The minimum rows for a segment was 2,416. 

·        The 2,416 row count was for all columns on segment number 118. This was not the last segment.  Segment 119 was the last segment (first segment is 0-based) for each column and each one of these had 1,040,352.  Eric Hanson gave an explanation on a forum which talks about segments being built on parallel threads and when hitting the end of the table, not having enough rows to completely fill segments.

·        As for encoding_type, that was also interesting.  Some columns had both an encoding_type of 1 AND 2.  Only the first five columns of the table had an encoding type of “1” and then columns 1 through 25 had an encoding type of “2”.  Columns 21 and 22 (Freight and CarrierTrackingNumber) had an encoding of 3.  Freight is money data type and CarrierTrackingNumber is nvarchar(25).  Column 26 (ShipDate datetime) had an encoding type of “4”.  In BOL – encoding_type is defined as “type of encoding used for that segment”.  Sigh.

·        The min_data_id and max_data_id value rangers were also interesting.  Sometimes they reflected the actual column data ranges (for example in the date columns) and sometimes not.  Looking at “Understanding Segment Elimination” – Eric Hanson describes this behavior – saying that the values in these ranges can also reference “into a dictionary”.

Anything interesting in sys.column_store_index_stats?

The only noteworthy value was the number_of_segments (3,120).   

Can I see some segment elimination for my queries? (for non-string types on a heap table)

Remember that I created a columnstore index on a heap.  Let’s say we look at the segment ranges for the ProductKey column (column id 1).  Here are the associated ranges:

SELECT column_id, min_data_id, max_data_id, segment_id

FROM sys.column_store_segments

WHERE column_id = 1

ORDER BY column_id, min_data_id, max_data_id, segment_id;

All 120 segments have identical min and max ranges:

clip_image003

What about other columns – like the DueDate (datetime)?

clip_image004

Same thing applies – each segment, with the exception of a trailing row, have the same ranges. 

What about segment elimination on a table with a clustered index?

So next I dropped the columnstore index, and created a clustered index on ProductKey.  I’m not saying this is a great choice for the clustered index key – but rather, I’m just trying to understand the behavior in comparison to a heap and also look at any potential segment elimination.

In the following query, I’m pulling total order quantity for three products:

-- Segment elimination written to error log

DBCC TRACEON(3605, -1);

GO

DBCC TRACEON(646, -1);

GO

SELECT ProductKey,

             [DueDateKey],

             SUM(OrderQuantity) Total_OrderQuantity

FROM [dbo].[FactInternetSales]

GROUP BY ProductKey, DueDateKey

HAVING ProductKey IN

             (478, 343, 574);

I first validated that batch mode was indeed being used:

clip_image005

I then looked at the SQL Error Log (per the trace flag) – and I saw the storage engine skipped 28 “row group” (segments):

clip_image007

If I look at the segment meta data though for column_id 1, and I see blocks of segments covering the same ranges (instead of the same ranges across all segments):

SELECT segment_id, min_data_id, max_data_id

FROM sys.column_store_segments

WHERE column_id = 1

ORDER BY segment_id

clip_image009

By the way, the 28 segment elimination I saw in the error log for column 1 translated to 27,226,112 rows across (row_count from sys.column_store_segments).  The query itself takes less than 1 second to return 569 rows – from a 120 million+ row table. 

Will I get segment elimination on other columns not part of the clustered index key?

I tried the following query that filtered on DueDateKey:

SELECT ProductKey,

             [DueDateKey],

             SUM(OrderQuantity) Total_OrderQuantity

FROM [dbo].[FactInternetSales]

GROUP BY ProductKey, DueDateKey

HAVING DueDateKey = 20040522

Nope.  While the query still ran in less than a second – using batch instead of row mode for the columnstore index scan, segment elimination event did not occur.  Looking at the min and max ranges from sys.column_store_segments helps answer why segment elimination wasn’t possible:

SELECT column_id, min_data_id, max_data_id, segment_id

FROM sys.column_store_segments

WHERE column_id = 3 AND

       20040522 BETWEEN min_data_id AND max_data_id

This returns all 120 rows:

clip_image011

What if the DueDateKey is a secondary key column on the clustered index? 

I dropped the columnstore index and recreated the clustered index with ProductKey and DueDateKey.  I then recreated the columnstore index.

Checking the DueDateKey range in sys.column_store_segments, I see the ranges are no longer identical:

clip_image013

That seemed promising, so I executed the query filtering by DueDateKey:

clip_image015

Sure enough – I had segment elimination, even though this was defined as the second column in the clustered index key.  And what’s more, my ProductKey column segment elimination was still working (I tested the earlier queries again).  This raises some interesting questions around new clustered index key strategies when you know your table will primarily be used with a columnstore index.

Okay – that’s enough for today.  I still have many other scenarios I’d like to try out, and I’ll share here when I get the opportunity.

Last month I wrote a post called Hash Partitioning with SQL Server 2012’s SEQUENCE object and CYCLE argument and I was asked a good question by Eric Humphrey (twitter) about whether there was any advantage of using a sequence object for hash partitioning over an identity/modulus/computed-column technique. 

At that point I hadn’t compared the two techniques, but I wanted to perform a test eventually, so here is what I found…  This was a quick-and-dirty test, by the way, as I really should have been working on something else (first), but I needed a little “fun” time.

Before testing, I made a few minor modifications to the schema used in my original post:

1.      I split the range of the partition function as follows:

 

ALTER PARTITION FUNCTION pfFactInternetSales () SPLIT RANGE (0)

 

2.      I changed the sequence object to be zero-based instead of one-based:

 

CREATE SEQUENCE dbo.Seq_FactInternetSales

    AS int

               START WITH 0

               INCREMENT BY 1

    MINVALUE 0

    MAXVALUE 10

    CYCLE

    CACHE 10;

That is all that was changed with the sequence object technique I used before. As for a modulus technique, I created two additional columns to another version of the FactInternetSales table:

ProductID int NOT NULL IDENTITY(1,1),

               PartitionBucketKey AS ProductID %11  PERSISTED,

 

After adding the columns I referenced the PartitionBucketKey table as the partition key for the clustered index and imported the data from an INSERT…SELECT (as I did with the sequence object method).

 

So did the performance differ between the two techniques?

 

Here are the results of eight separate tests (each executed with a cold data cache for the SELECT, truncated destination table and pre-sized data and log files):

 

Test number

Rows loaded

Buckets

Hash Partition

Method

1

1,207,960 rows

11 buckets

26 seconds

Sequence

2

1,207,960 rows

11 buckets

26 seconds

Sequence

3

1,207,960 rows

11 buckets

29 seconds

Sequence

4

1,207,960 rows

11 buckets

26 seconds

Sequence

5

1,207,960 rows

11 buckets

27 seconds

Modulus

6

1,207,960 rows

11 buckets

24 seconds

Modulus

7

1,207,960 rows

11 buckets

25 seconds

Modulus

8

1,207,960 rows

11 buckets

26 seconds

Modulus

So the modulus method seems to perform ever-so-slightly faster than the sequence method.  Stylistically, the sequence method involved a more direct way to create X number of hash buckets (basically no calculation necessary).  With that said, the execution plans are nearly identical.

Sequence Plan (estimated cost – 17.4357)

clip_image002

Modulus Plan (estimated cost – 17.635)

clip_image004

I’m sure there are other variations of this test that I could try out, but that’s all for now. I’m curious about other techniques people may be using (like a CLR function to do the hashing), so please post them here if you have used an interesting variation on this post’s theme.

When I first heard about SQL Server 2012’s SEQUENCE object – I thought it was an interesting feature to be added and one that I have been asked about by customers in the past (from those who had worked on different database platforms).  But when I looked at the CYCLE argument of SEQUENCE, that’s when I really got interested.

I wondered if it could be used in the service of implementing hash partitioning (of sorts) – allowing me to evenly distribute rows across a set number of partitions based on a hash key.  In this scenario I want the distribution to be evenly spread out, but NOT partition based on other business keys (like a datetime column or other attribute that has business or application meaning).

So will a column with a sequence default also work as a partition key? 

I started off by creating a new table based on AdventureWorkDWDenali’s FactInternetSales table:

-- Create demonstration Fact table, no constraints, indexes, keys

-- Tested on version 11.0.1750 (SQL Server 2012 RC0)

USE [SequenceDemo];

GO

CREATE TABLE [dbo].[FactInternetSales](

       [ProductKey] [int] NOT NULL,

       [OrderDateKey] [int] NOT NULL,

       [DueDateKey] [int] NOT NULL,

       [ShipDateKey] [int] NOT NULL,

       [CustomerKey] [int] NOT NULL,

       [PromotionKey] [int] NOT NULL,

       [CurrencyKey] [int] NOT NULL,

       [SalesTerritoryKey] [int] NOT NULL,

       [SalesOrderNumber] [nvarchar](20) NOT NULL,

       [SalesOrderLineNumber] [tinyint] NOT NULL,

       [RevisionNumber] [tinyint] NOT NULL,

       [OrderQuantity] [smallint] NOT NULL,

       [UnitPrice] [money] NOT NULL,

       [ExtendedAmount] [money] NOT NULL,

       [UnitPriceDiscountPct] [float] NOT NULL,

       [DiscountAmount] [float] NOT NULL,

       [ProductStandardCost] [money] NOT NULL,

       [TotalProductCost] [money] NOT NULL,

       [SalesAmount] [money] NOT NULL,

       [TaxAmt] [money] NOT NULL,

       [Freight] [money] NOT NULL,

       [CarrierTrackingNumber] [nvarchar](25) NULL,

       [CustomerPONumber] [nvarchar](25) NULL,

       [OrderDate] [datetime] NULL,

       [DueDate] [datetime] NULL,

       [ShipDate] [datetime] NULL,

)ON [PRIMARY];

GO

Next I created the sequence object (increment by 1, with a min of 1, max of 10, caching of 10 at a time and a cycling of values):

CREATE SEQUENCE dbo.Seq_FactInternetSales

    AS int

    START WITH 1

    INCREMENT BY 1

    MINVALUE 1

    MAXVALUE 10

    CYCLE

    CACHE 10;

After that, I added a new column to the Fact table called PartitionBucketKey and associated it with the new sequence object:

ALTER TABLE [dbo].[FactInternetSales]

ADD PartitionBucketKey int DEFAULT

(NEXT VALUE FOR dbo.Seq_FactInternetSales);

Next, I created a partition function and scheme:

-- Create a new partition function

CREATE PARTITION FUNCTION pfFactInternetSales (int)

AS RANGE LEFT FOR VALUES (1,2,3,4,5,6,7,8,9);

-- Create a new partition scheme

-- And yes, being lazy about the FGs, as I just want to see whether the

-- individual partitions fan-out the way I want...

CREATE PARTITION SCHEME psFactInternetSales

AS PARTITION pfFactInternetSales

ALL TO ( [PRIMARY] );

Next up, I created a clustered index on the table referencing the PK columns used in the original version of this table but then referencing the PartitionBucketKey in partition scheme:

-- Create it on the new column referencing the sequence

CREATE CLUSTERED INDEX IX_FactInternetSales

ON  dbo.FactInternetSales(SalesOrderNumber, SalesOrderLineNumber)

ON psFactInternetSales (PartitionBucketKey);

It’s show time.  Now I went ahead and populated 60,398 rows from the original table.  Not much for this test I realize, but this was just an initial proof-of-concept:

INSERT dbo.FactInternetSales

(ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey,

PromotionKey, CurrencyKey, SalesTerritoryKey, SalesOrderNumber,

SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice,

ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost,

TotalProductCost, SalesAmount, TaxAmt, Freight, CarrierTrackingNumber,

CustomerPONumber, OrderDate, DueDate, ShipDate)

SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey,

PromotionKey, CurrencyKey, SalesTerritoryKey, SalesOrderNumber,

SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice,

ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost,

TotalProductCost, SalesAmount, TaxAmt, Freight, CarrierTrackingNumber,

CustomerPONumber, OrderDate, DueDate, ShipDate

FROM [AdventureWorksDWDenali].[dbo].[FactInternetSales];

Now I’ll check if the 60,398 rows were divided up evenly over the 10 partitions:

SELECT partition_number, row_count

FROM sys.dm_db_partition_stats

WHERE object_id = object_id('[dbo].[FactInternetSales]')

clip_image001

It worked.  And if you look at the individual rows, you’ll see the cycle of sequence values were defined based on the PK composite key (SalesOrderNumber, SalesOrderLineNumber):

SELECT SalesOrderNumber, SalesOrderLineNumber, PartitionBucketKey

FROM [dbo].[FactInternetSales]

ORDER BY SalesOrderNumber, SalesOrderLineNumber

clip_image003

Okay, so it works.  But is this a wise thing to do? 

I don’t know yet.  I have other questions about this technique and I’d like to do more testing on various scenarios.  But I do like the fact that I’m able to leverage a native engine feature in service of another native engine feature.  Time will tell if this is a viable pattern or a known anti-pattern.

Categories:
SQL Server 2012

The SQL Server 2012 contained database feature has an interesting behavior when it comes to collation considerations between the SQL Server instance default collation and a user database collation.  I see this new behavior as a benefit, but rather than tell you about it, I’ll step through a demonstration instead.

First of all, this demonstration is on SQL Server 11.0.1750 (SQL Server 2012 RC0).  I’ll start by executing the following in order to determine the default collation of the instance:

SELECT SERVERPROPERTY('Collation')

This returns SQL_Latin1_General_CP1_CI_AS.

Next I’ll create a database that does NOT allow containment, so you can see the pre-2012 behavior:

 

CREATE DATABASE [PCDBExample_No_CDB]

 CONTAINMENT = NONE

 COLLATE French_CS_AI

GO

 

Notice that in addition to designating CONTAINMENT = NONE, I used a collation that was different from the SQL Server instance default. 

And next, I’m going to create two tables – one regular table and one temporary in the newly created database, and then insert identical rows:

 

USE [PCDBExample_No_CDB]

GO

 

CREATE TABLE [DemoCollation]

 

               (DemoCollationNM varchar(100))

GO

 

CREATE TABLE #DemoCollation

               (DemoCollationNM varchar(100))

 

INSERT dbo.DemoCollation

(DemoCollationNM)

VALUES ('Test Join')

 

INSERT #DemoCollation

(DemoCollationNM)

VALUES ('Test Join')

 

Now I’ll execute a query that joins the two tables based on the column name:

 

SELECT p.DemoCollationNM

FROM dbo.DemoCollation p

INNER JOIN #DemoCollation d ON

               p.DemoCollationNM = d.DemoCollationNM

 

This returns the following error message:

 

Msg 468, Level 16, State 9, Line 4

Cannot resolve the collation conflict between "SQL_Latin1_General_CP1_CI_AS" and "French_CS_AI" in the equal to operation.

 

Next I’ll look at sp_help for each table (regular and temporary):

 

EXEC sp_help [DemoCollation]

 

USE tempdb

EXEC sp_help #DemoCollation

 

As our error suggested, the DemoCollation had a collation of French_CS_AI for the DemoCollationNM varchar data type column, but a SQL_Latin1_General_CP1_CI_AS collation for the DemoCollationNM column in the #DemoCollation table.

 

Now let’s see what happens for a contained database.  In order to demonstrate a partially contained database, I’ll execute sp_configure as follows:

EXEC sp_configure 'contained database authentication', 1

RECONFIGURE

 

The following code creates a partially contained database and sets up the same test (different database):

 

CREATE DATABASE [PCDBExample_CDB]

 CONTAINMENT = PARTIAL

 COLLATE French_CS_AI

GO

 

USE [PCDBExample_CDB]

GO

 

CREATE TABLE [DemoCollation]

               (DemoCollationNM varchar(100))

GO

 

CREATE TABLE #DemoCollation2

               (DemoCollationNM varchar(100))

 

INSERT dbo.DemoCollation

(DemoCollationNM)

VALUES ('Test Join')

 

INSERT #DemoCollation2

(DemoCollationNM)

VALUES ('Test Join')

 

Now I’ll test the join:

 

SELECT p.DemoCollationNM

FROM dbo.DemoCollation p

INNER JOIN #DemoCollation2 d ON

               p.DemoCollationNM = d.DemoCollationNM

 

This time I get results instead of a collation error:

 

clip_image001

 

If I execute sp_help for #DemoCollation2 in tempdb, I also see that the Collation is French_CS_AI.  So the containment setting changed the default collation to the user-database’s default instead of the SQL Server instance level default.

Categories:
SQL Server 2012

This blog post applies to SQL Server 2012 CTP3 (11.0.1440).

SQL Server 2012 introduces the “indirect checkpoints” feature.  At a high level, it allows you to adjust the target recovery time of a specific database, rather than relying entirely on the SQL Server instance-level ‘recovery interval (min)’ setting. To demonstrate the impact, I’ll walk through a scenario where we’re performing large batch inserts into a data warehousing Fact table.  In this demo I’m using the Codeplex “AdventureWorksDWDenali” database.

As a first step, I’ll back up the database so that I can reset the baseline in order to compare the default checkpoint behavior with the new indirect checkpoint setting:

USE master

GO

 

BACKUP DATABASE [AdventureWorksDWDenali]

TO  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\IndirectCheckpoint_AWDW.bak'

 WITH NOFORMAT, NOINIT,  NAME = N'AdventureWorksDWDenali-Full Database Backup', CHECKSUM

GO

Now with that out of the way I’ll check the recovery interval of the SQL Server instance:

SELECT value_in_use 

FROM sys.configurations

WHERE name = 'recovery interval (min)'

 

The value in use is “0” – the default.  So basically the estimated checkpoint will be every minute for an active database (although I’m over-simplifying things here – since automatic checkpoint frequency varies based on other factors as well - but for this demo you'll still see a clear difference in the before-and-after).

 

Now I’ll check the indirect checkpoint time for the AdventureWorksDWDenali database:

 

SELECT target_recovery_time_in_seconds

FROM sys.databases

WHERE name = 'AdventureWorksDWDenali'

 

This returns “0” – meaning we haven’t configured this new SQL Server 2012 database option (yet) and so the SQL Server instance level setting is in effect.

 

Now I’ll create an extended events session to track the individual checkpoint events for the database (database_id "5" happens to be the AdventureWorksDWDenali database on my instance): 

 

CREATE EVENT SESSION [track_checkpoints] ON SERVER

ADD EVENT sqlserver.checkpoint_end(

    ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.is_system)

    WHERE ([sqlserver].[database_id]=(5)))

ADD TARGET package0.ring_buffer(SET max_memory=(2048))

GO

 

 

(By the way - I updated max_memory down to 2MB from the 100MB per Jonathan Kehayias' heads up on potential limits - not to mention that this was more memory than needed for this example).

 

 Next I’ll turn it on:

 

ALTER EVENT SESSION [track_checkpoints] ON SERVER STATE=START

GO

 

And now I’m going to push some I/O by creating a new table and populating it multiple times from the FactInternetSales table:

 

USE AdventureWorksDWDenali

GO

 

SELECT *

INTO dbo.CheckPoint_Test_FactInternetSales

FROM dbo.FactInternetSales

GO

 

INSERT dbo.CheckPoint_Test_FactInternetSales

SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, LEFT(CAST(NEWID() AS NVARCHAR(36)),20), SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount,

TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate

FROM dbo.FactInternetSales

GO 10

 

This shows the following rows affected:

 

(60398 row(s) affected)

Beginning execution loop

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

Batch execution completed 10 times.

 

Now I’ll query the track_checkpoints Extended Event session (** and thanks to my colleague Jonathan Kehayias for the following query which I just modified to query against track_checkpoints and then change a few column names **):

 

SELECT

    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,

               n.value('(event/@timestamp)[1]', 'datetime') as event_datetime

FROM

(    SELECT td.query('.') as n

    FROM

    (

        SELECT CAST(target_data AS XML) as target_data

        FROM sys.dm_xe_sessions AS s   

        JOIN sys.dm_xe_session_targets AS t

            ON s.address = t.event_session_address

        WHERE s.name = 'track_checkpoints'

          AND t.target_name = 'ring_buffer'

    ) AS sub

    CROSS APPLY target_data.nodes('RingBufferTarget/event') AS q(td)

) as tab

GO

 

This returns the following checkpoint event data:

 

event_name       event_datetime

checkpoint_end 2011-10-18 20:54:46.200

checkpoint_end 2011-10-18 20:54:46.650

checkpoint_end 2011-10-18 20:54:47.063

checkpoint_end 2011-10-18 20:54:47.433

checkpoint_end 2011-10-18 20:54:47.647

checkpoint_end 2011-10-18 20:54:48.200

checkpoint_end 2011-10-18 20:54:48.597

checkpoint_end 2011-10-18 20:54:49.157

checkpoint_end 2011-10-18 20:54:49.620

checkpoint_end 2011-10-18 20:54:50.127

 

So we see 10 checkpoints for the default behavior.

 

And now I’m going to stop the event session and restore the database back to its original state to show you the comparison to the SQL Server 2012 indirect checkpoint option:

 

ALTER EVENT SESSION [track_checkpoints] ON SERVER STATE=STOP

 

-- Restoring the database back to the original format

USE master

GO

 

RESTORE DATABASE [AdventureWorksDWDenali]

FROM  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\IndirectCheckpoint_AWDW.bak'

WITH  FILE = 1

GO

 

Next I’m going to change the database’s TARGET_RECOVERY_TIME to 5 minutes (this is the SQL Server 2012 indirect checkpoints feature I was talking about):

 

ALTER DATABASE AdventureWorksDWDenali

SET TARGET_RECOVERY_TIME = 5 MINUTES

 

Just to confirm that it “took” – I’ll validate sys.databases:

 

SELECT target_recovery_time_in_seconds

FROM sys.databases

WHERE name = 'AdventureWorksDWDenali'

 

This returns 300 seconds instead of the 0 seconds we saw before – 300 representing the new target recovery time I just set in minutes.

And now I’m going to enable the event session again and re-populate the table:

 

ALTER EVENT SESSION [track_checkpoints] ON SERVER STATE=START

GO

 

USE AdventureWorksDWDenali

GO

 

SELECT *

INTO dbo.CheckPoint_Test_FactInternetSales

FROM dbo.FactInternetSales

GO

 

INSERT dbo.CheckPoint_Test_FactInternetSales

SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, LEFT(CAST(NEWID() AS NVARCHAR(36)),20), SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount,

TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate

FROM dbo.FactInternetSales

GO 10

 

How many checkpoints will we see now?  With all things equal between tests, we now only see four checkpoints:

 

event_name       event_datetime

checkpoint_end 2011-10-18 21:06:52.697

checkpoint_end 2011-10-18 21:06:53.520

checkpoint_end 2011-10-18 21:06:54.717

checkpoint_end 2011-10-18 21:06:56.277

 

I tested this before-and-after scenario three times, and each time I saw the identical change in the number of checkpoints.  So I saw a consistent result given an identical test setup.

 

That’s all for now – but two closing thoughts:

 

·        I like how we’re able to adjust this target recovery time at the database level.  This allows us to take into account the recovery time requirements per application if we need to – adjusting upward or downward based on different requirements or issues.

·        I didn’t measure the actual performance impact against the load, but if I had I imagine we would have seen some performance improvement with the second scenario of less frequent checkpoints (but with a trade-off of longer recovery time - a nontrivial consideration).

This post covers examples from Denali  SQL Server 2012 CTP3, version 11.0.1440.

In my last post I hinted towards some interesting findings regarding parallelism and columnstore indexes.  I’ll talk about what I observed in this post.

Just as a quick level-set, columnstore indexing provides two new areas of functionality within SQL Server 2012 that can potentially improve query performance for typical relational data warehouse workloads.  The features contributing to this potential performance gain include the Vertipaq columnstore technology (which uses compression and stores the data by column instead of row) and the new method of query processing that processes blocks of column data in batches.  You can benefit from from the columnstore storage alone  – but you may also see further query performance improvements if the query runs in "batch" execution mode versus "row".  

One key point to underscore– just because you add a columnstore index to a table doesn’t mean that your query will use “batch” execution mode. 

Let’s walk through an example of row vs. batch using the AdventureWorksDWDenali database.  The test server used in this illustration has four visible schedulers (not including the DAC) and 8GB of RAM.

Also – in order to increase the cost of the queries against this table and make it worth the columnstore index's time, I put together the below query to bump up the row count in FactInternetSales from 60,398 rows to 483,184 rows (SalesOrderNumber was part of the primary key, so I populated it with NEWID() and kept the other column values static from the existing rows and then used "GO 3" to populate it a few times):

INSERT dbo.FactInternetSales
(ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
)

SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, LEFT(CAST(NEWID() AS NVARCHAR(36)),20), SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount
,
TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate,
ShipDate
FROM
dbo.FactInternetSales
GO 3


 

The following query is executed against a table with no columnstore index (I’ve also enabled the actual Execution Plan to show afterwards – and also execute the query twice in order to measure results with the data in cache):

SET STATISTICS IO ON
SET STATISTICS
TIME
ON


SELECT
c.CommuteDistance

       d.CalendarYear
,
       SUM(f.SalesAmount)
TotalSalesByCommuteDistance
FROM
dbo.FactInternetSales as f
INNER JOIN dbo.DimCustomer as c
ON
           f.CustomerKey =
c.CustomerKey
INNER JOIN dbo.DimDate d
ON
           d.DateKey =
f.OrderDateKey
GROUP BY c.CommuteDistance
,
         d.CalendarYear 

The results for STATISTICS IO were as follows:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'FactInternetSales'. Scan count 1, logical reads 20289, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimCustomer'. Scan count 1, logical reads 979, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimDate'. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

And the results of the SET STATISTICS TIME ON were as follows:

SQL Server Execution Times:

   CPU time = 562 ms,  elapsed time = 625 ms.

 

As for the execution plan, if we hover over the Clustered Index Scan for FactInternetSales you’ll see the “Actual Execution Mode”.

clip_image002[4]

So we see a value of “Row” (the execution mode we’re used to) and “Actual Number of Batches” of 0.

 

Now let’s create a columnstore index:

 

CREATE NONCLUSTERED COLUMNSTORE INDEX [CSI_FactInternetSales] ON [dbo].[FactInternetSales]
(
[OrderDateKey]
,
[CustomerKey]
,
[SalesAmount]
)


GO

Re-executing the query after creating the index, I see the following STATISTICS IO output:

 

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'FactInternetSales'. Scan count 1, logical reads 597, physical reads 0, read-ahead reads 1167, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimCustomer'. Scan count 1, logical reads 979, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimDate'. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

That’s a reduction in logical reads against FactInternetSales from 20289 down to 597.  Very likeable.

What about STATISTICS TIME?

 

SQL Server Execution Times:

   CPU time = 453 ms,  elapsed time = 485 ms.

 

Okay so CPU time was 562 ms now it is 453 ms.  Elapsed time was 625 ms and now it is 485 ms. 

 

What about the execution plan, were we using batch execution mode?

 

clip_image004[4]

No.  Still using “Row” with a columnstore index scan.   So things brings me to what I observed a few days ago. 

One thing I didn’t mention is that in my test environment -  my max degree of parallelism was set to “1”. 

 

What happens if I uncap the max degree of parallelism for my 4-scheduler server and re-execute the query?

clip_image005[4]

 

After lifting the cap on max degree of parallelism, the query executes using a parallel plan and also switches from “row” to “batch” mode.  Notice also the “Actual Number of Batches” – which shows a value of 1,482.

 

This is all nice – but what about the query performance and I/O.  Any further reductions?

 

The I/O results are as follows:

Table 'DimCustomer'. Scan count 5, logical reads 1072, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimDate'. Scan count 3, logical reads 56, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'FactInternetSales'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

What’s this? Zero for all stats related to FactInternetSales?  I even dropped clean buffers to double-check, and it still displayed all zeros.  (More on this for another day.)

 

What about the execution statistics?

 

SQL Server Execution Times:

 

                 CPU time = 78 ms,  elapsed time = 114 ms.

 

Now we're talking big improvements. CPU time was 562 ms without columnstore, then 453 ms with columnstore + row processing and then 78 ms for columnstore + batch processing.  Elapsed time was 625 ms without columnstore and then 485 ms for columnstore + row processing and then 114 ms for columnstore + batch processing.  So more non-trivial reductions.

 

So a few key findings in closing:

 

·        Columnstore + row processing can provide a performance boost (getting column based storage and compression)

·        Columnstore + batch processing can provide additional performance improvements on top of the columnstore index I/O benefits

·        When I capped parallelism, I didn’t see batch processing – so check your execution plan and be aware that there are other factors that impact whether batch processing vs. row processing as well

Theme design by Nukeation based on Jelle Druyts