Hash Partitioning with SQL Server 2012’s SEQUENCE object and CYCLE argument

When I first heard about SQL Server 2012’s SEQUENCE object – I thought it was an interesting feature to be added and one that I have been asked about by customers in the past (from those who had worked on different database platforms). But when I looked at the CYCLE argument of SEQUENCE, that’s when I really got interested.

I wondered if it could be used in the service of implementing hash partitioning (of sorts) – allowing me to evenly distribute rows across a set number of partitions based on a hash key. In this scenario I want the distribution to be evenly spread out, but NOT partition based on other business keys (like a datetime column or other attribute that has business or application meaning).

So will a column with a sequence default also work as a partition key?

I started off by creating a new table based on AdventureWorkDWDenali’s FactInternetSales table:

— Create demonstration Fact table, no constraints, indexes, keys
— Tested on version 11.0.1750 (SQL Server 2012 RC0)
USE [SequenceDemo];
GO
CREATE TABLE [dbo].[FactInternetSales](
[ProductKey] [int] NOT NULL,
[OrderDateKey] [int] NOT NULL,
[DueDateKey] [int] NOT NULL,
[ShipDateKey] [int] NOT NULL,
[CustomerKey] [int] NOT NULL,
[PromotionKey] [int] NOT NULL,
[CurrencyKey] [int] NOT NULL,
[SalesTerritoryKey] [int] NOT NULL,
[SalesOrderNumber] [nvarchar](20) NOT NULL,
[SalesOrderLineNumber] [tinyint] NOT NULL,
[RevisionNumber] [tinyint] NOT NULL,
[OrderQuantity] [smallint] NOT NULL,
[UnitPrice] [money] NOT NULL,
[ExtendedAmount] [money] NOT NULL,
[UnitPriceDiscountPct] [float] NOT NULL,
[DiscountAmount] [float] NOT NULL,
[ProductStandardCost] [money] NOT NULL,
[TotalProductCost] [money] NOT NULL,
[SalesAmount] [money] NOT NULL,
[TaxAmt] [money] NOT NULL,
[Freight] [money] NOT NULL,
[CarrierTrackingNumber] [nvarchar](25) NULL,
[CustomerPONumber] [nvarchar](25) NULL,
[OrderDate] datetime NULL,
[DueDate] datetime NULL,
[ShipDate] datetime NULL
)ON [PRIMARY];
GO

Next I created the sequence object (increment by 1, with a min of 1, max of 10, caching of 10 at a time and a cycling of values):

CREATE SEQUENCE dbo.Seq_FactInternetSales
AS int
START WITH 1
INCREMENT BY 1
MINVALUE 1
MAXVALUE 10
CYCLE
CACHE 10;

After that, I added a new column to the Fact table called PartitionBucketKey and associated it with the new sequence object:

ALTER TABLE [dbo].[FactInternetSales]
ADD PartitionBucketKey int DEFAULT
(NEXT VALUE FOR dbo.Seq_FactInternetSales);

Next, I created a partition function and scheme:

— Create a new partition function
CREATE PARTITION FUNCTION pfFactInternetSales(int)
AS RANGE LEFT FOR VALUES (1,2,3,4,5,6,7,8,9);
— Create a new partition scheme
— And yes, being lazy about the FGs, as I just want to see whether the
— individual partitions fan-out the way I want…
CREATE PARTITION SCHEME psFactInternetSales
AS PARTITION pfFactInternetSales
ALL TO ( [PRIMARY] );

Next up, I created a clustered index on the table referencing the PK columns used in the original version of this table but then referencing the PartitionBucketKey in partition scheme:

— Create it on the new column referencing the sequence
CREATE CLUSTERED INDEX IX_FactInternetSales
ON dbo.FactInternetSales(SalesOrderNumber, SalesOrderLineNumber)
ON psFactInternetSales(PartitionBucketKey);

It’s show time. Now I went ahead and populated 60,398 rows from the original table. Not much for this test I realize, but this was just an initial proof-of-concept:

INSERT dbo.FactInternetSales
(ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey,
PromotionKey, CurrencyKey, SalesTerritoryKey, SalesOrderNumber,
SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice,
ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost,
TotalProductCost, SalesAmount, TaxAmt, Freight, CarrierTrackingNumber,
CustomerPONumber, OrderDate, DueDate, ShipDate)
SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey,
PromotionKey, CurrencyKey, SalesTerritoryKey, SalesOrderNumber,
SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice,
ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost,
TotalProductCost, SalesAmount, TaxAmt, Freight, CarrierTrackingNumber,
CustomerPONumber, OrderDate, DueDate, ShipDate
FROM [AdventureWorksDWDenali].[dbo].[FactInternetSales];

Now I’ll check if the 60,398 rows were divided up evenly over the 10 partitions:

SELECT partition_number, row_count
FROM sys.dm_db_partition_stats
WHERE object_id = object_id(‘[dbo].[FactInternetSales]’);

It worked. And if you look at the individual rows, you’ll see the cycle of sequence values were defined based on the PK composite key (SalesOrderNumber, SalesOrderLineNumber):

SELECT SalesOrderNumber, SalesOrderLineNumber, PartitionBucketKey
FROM [dbo].[FactInternetSales]
ORDER BY SalesOrderNumber, SalesOrderLineNumber;

Okay, so it works. But is this a wise thing to do?

I don’t know yet. I have other questions about this technique and I’d like to do more testing on various scenarios. But I do like the fact that I’m able to leverage a native engine feature in service of another native engine feature. Time will tell if this is a viable pattern or a known anti-pattern.

4 thoughts on “Hash Partitioning with SQL Server 2012’s SEQUENCE object and CYCLE argument”

Eric Humphrey says:

January 25, 2012 at 4:51 am

Can’t you do something similar just using an identity, the modulus operator, and a computed column? Does this gain you anything over that approach?

Thanks,
Joe Sack says:

January 25, 2012 at 4:56 am

Good questions Eric! I haven’t tried that technique before. I’d be interested in the CPU overhead of that versus this technique (and also would like to compare execution plans).

One thing I’m also wondering is if the "cache" option of the sequence will provide a benefit for large data loads. It would definitely be interesting to test several options. I was also curious to compare it to a UDF (T-SQL and CLR). We’ll see if I can get to that.

If you have time to validate, let me know and share here if you can.

Thanks!
Pedro Lopes says:

September 20, 2013 at 12:22 am

Hi Joe,
Came across this post when searching for a related matter, and thought you would like to check another cool scenario using sequences, using bit reversal instead of partitioning to tackle with latch contention. It’s in the second half of the post http://blogs.msdn.com/b/blogdoezequiel/archive/2013/05/23/pagelatch-ex-waits-and-heavy-inserts.aspx
Cheers!
1. Joseph Sack says:
  
  September 20, 2013 at 3:51 am
  
  Thank you Pedro, I’ll check that post out!

Comments are closed.

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

Hash Partitioning with SQL Server 2012’s SEQUENCE object and CYCLE argument

4 thoughts on “Hash Partitioning with SQL Server 2012’s SEQUENCE object and CYCLE argument”

Other articles

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 2″

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 1”

SQLIntersection Post-Conference Session

Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator

Finding Timeout-Prone Code with max_elapsed_time

Deck from “Practical SQL Server Cardinality Estimation”

Imagine feeling confident enough to handle whatever your database throws at you.