Last month I wrote a post called Hash Partitioning with SQL Server 2012’s SEQUENCE object and CYCLE argument and I was asked a good question by Eric Humphrey (twitter) about whether there was any advantage of using a sequence object for hash partitioning over an identity/modulus/computed-column technique. 

At that point I hadn’t compared the two techniques, but I wanted to perform a test eventually, so here is what I found…  This was a quick-and-dirty test, by the way, as I really should have been working on something else (first), but I needed a little “fun” time.

Before testing, I made a few minor modifications to the schema used in my original post:

1.      I split the range of the partition function as follows:

 

ALTER PARTITION FUNCTION pfFactInternetSales () SPLIT RANGE (0)

 

2.      I changed the sequence object to be zero-based instead of one-based:

 

CREATE SEQUENCE dbo.Seq_FactInternetSales

    AS int

               START WITH 0

               INCREMENT BY 1

    MINVALUE 0

    MAXVALUE 10

    CYCLE

    CACHE 10;

That is all that was changed with the sequence object technique I used before. As for a modulus technique, I created two additional columns to another version of the FactInternetSales table:

ProductID int NOT NULL IDENTITY(1,1),

               PartitionBucketKey AS ProductID %11  PERSISTED,

 

After adding the columns I referenced the PartitionBucketKey table as the partition key for the clustered index and imported the data from an INSERT…SELECT (as I did with the sequence object method).

 

So did the performance differ between the two techniques?

 

Here are the results of eight separate tests (each executed with a cold data cache for the SELECT, truncated destination table and pre-sized data and log files):

 

Test number

Rows loaded

Buckets

Hash Partition

Method

1

1,207,960 rows

11 buckets

26 seconds

Sequence

2

1,207,960 rows

11 buckets

26 seconds

Sequence

3

1,207,960 rows

11 buckets

29 seconds

Sequence

4

1,207,960 rows

11 buckets

26 seconds

Sequence

5

1,207,960 rows

11 buckets

27 seconds

Modulus

6

1,207,960 rows

11 buckets

24 seconds

Modulus

7

1,207,960 rows

11 buckets

25 seconds

Modulus

8

1,207,960 rows

11 buckets

26 seconds

Modulus

So the modulus method seems to perform ever-so-slightly faster than the sequence method.  Stylistically, the sequence method involved a more direct way to create X number of hash buckets (basically no calculation necessary).  With that said, the execution plans are nearly identical.

Sequence Plan (estimated cost – 17.4357)

clip image002 thumb SEQUENCE Object versus Modulus Hash Partitioning

Modulus Plan (estimated cost – 17.635)

clip image004 thumb SEQUENCE Object versus Modulus Hash Partitioning

I’m sure there are other variations of this test that I could try out, but that’s all for now. I’m curious about other techniques people may be using (like a CLR function to do the hashing), so please post them here if you have used an interesting variation on this post’s theme.