Last month I wrote a post called Hash Partitioning with SQL Server 2012’s SEQUENCE object and CYCLE argument, and Eric Humphrey (twitter) asked a good question: is there any advantage to using a sequence object for hash partitioning over an identity/modulus/computed-column technique?
At that point I hadn’t compared the two techniques, but I wanted to perform a test eventually, so here is what I found… This was a quick-and-dirty test, by the way, as I really should have been working on something else (first), but I needed a little “fun” time.
Before testing, I made a few minor modifications to the schema used in my original post (a sketch of the resulting zero-based layout follows the list):
1. I split the range of the partition function as follows:
ALTER PARTITION FUNCTION pfFactInternetSales () SPLIT RANGE (0)
2. I changed the sequence object to be zero-based instead of one-based:
CREATE SEQUENCE dbo.Seq_FactInternetSales
AS int
START WITH 0
INCREMENT BY 1
MINVALUE 0
MAXVALUE 10
CYCLE
CACHE 10;
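
For reference, here is a minimal sketch of what the zero-based layout might look like after the split. The partition function name comes from the original post; the partition scheme name, the RANGE LEFT boundaries, and the use of [PRIMARY] are assumptions for illustration, not the exact definitions used there.

-- Sketch only: pfFactInternetSales is from the original post; psFactInternetSales,
-- RANGE LEFT, and [PRIMARY] are assumptions to show the 11-bucket layout.
CREATE PARTITION FUNCTION pfFactInternetSales (int)
AS RANGE LEFT FOR VALUES (0, 1, 2, 3, 4, 5, 6, 7, 8, 9);  -- 11 partitions for bucket values 0 through 10

CREATE PARTITION SCHEME psFactInternetSales
AS PARTITION pfFactInternetSales
ALL TO ([PRIMARY]);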
That is all that was changed with the sequence object technique I used before. As for the modulus technique, I added two columns to another version of the FactInternetSales table:
ProductID int NOT NULL IDENTITY(1,1),
PartitionBucketKey AS ProductID % 11 PERSISTED,
After adding the columns, I referenced the PartitionBucketKey column as the partitioning key for the clustered index and imported the data with an INSERT…SELECT (as I did with the sequence object method).
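To make the modulus setup concrete, here is a minimal sketch of the table, the partitioned clustered index, and the load. The table name, the trimmed column list, the index name, and the psFactInternetSales scheme are assumptions for illustration; only ProductID, PartitionBucketKey, and the general INSERT…SELECT approach come from the post.

-- Sketch only: table/index/scheme names and the trimmed column list are assumptions.
CREATE TABLE dbo.FactInternetSales_Modulus
(
    ProductID          int NOT NULL IDENTITY(1,1),
    PartitionBucketKey AS ProductID % 11 PERSISTED,  -- hash bucket 0 through 10
    OrderDateKey       int NOT NULL,
    SalesAmount        money NOT NULL
    -- ...remaining FactInternetSales columns elided...
);

-- Clustered index keyed and partitioned on the computed bucket column
CREATE CLUSTERED INDEX CIX_FactInternetSales_Modulus
ON dbo.FactInternetSales_Modulus (PartitionBucketKey)
ON psFactInternetSales (PartitionBucketKey);

-- Load via INSERT...SELECT; ProductID is generated by IDENTITY and
-- PartitionBucketKey is computed, so neither appears in the column list
INSERT INTO dbo.FactInternetSales_Modulus (OrderDateKey, SalesAmount)
SELECT OrderDateKey, SalesAmount
FROM dbo.FactInternetSales;

The sequence version’s load would differ only in that the bucket value comes from NEXT VALUE FOR dbo.Seq_FactInternetSales in the INSERT…SELECT rather than from a computed column.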
So did the performance differ between the two techniques?
Here are the results of eight separate tests (each executed with a cold data cache for the SELECT, a truncated destination table, and pre-sized data and log files):
Test number | Rows loaded | Buckets | Duration (seconds) | Hash partition method
1 | 1,207,960 | 11 | 26 | Sequence
2 | 1,207,960 | 11 | 26 | Sequence
3 | 1,207,960 | 11 | 29 | Sequence
4 | 1,207,960 | 11 | 26 | Sequence
5 | 1,207,960 | 11 | 27 | Modulus
6 | 1,207,960 | 11 | 24 | Modulus
7 | 1,207,960 | 11 | 25 | Modulus
8 | 1,207,960 | 11 | 26 | Modulus
So the modulus method seems to perform ever-so-slightly faster than the sequence method. Stylistically, the sequence method is a more direct way to define the number of hash buckets (the bucket value cycles on its own, with no modulus calculation required). With that said, the execution plans are nearly identical.
Sequence Plan (estimated cost – 17.4357)
Modulus Plan (estimated cost – 17.635)
I’m sure there are other variations of this test that I could try out, but that’s all for now. I’m curious about other techniques people may be using (like a CLR function to do the hashing), so please post them here if you have used an interesting variation on this post’s theme.
2 Responses to SEQUENCE Object versus Modulus Hash Partitioning
Interesting… I wonder how that compares to a simple row_number().
Try increasing the cache size to 11 on the sequence and throw a (TABLOCKX) hint on the insert!
Thanks Joe. So the upshot is that we can do this in versions prior to 2012. The downside is that the computed column, since it is persisted, takes up additional space.