OK, I've talked about the clustering key many times. Here, I thought I'd bring together one final series of details (and links) to help you really understand why I'm so adamant about DESIGNING a clustering key and not just letting SQL Server pick it for you (for example when it defaults to making the primary key clustered). Just because SQL Server defaults to making the primary key clustered doesn't actually mean it's a GOOD clustering key!

The key things that I've always recommended about a GOOD clustering key is that it is: unique, narrow, static and ever-increasing. For more details on the reasons behind this, check out these posts:

And, today, there are two additional items that I want to add to this list: your clustering key should be non-nullable and fixed-width!

First, why non-nullable?

In a regular data row you will ALWAYS have a null block. This is 2 bytes for a column count (in that row) and 1 bit per column (to store the actual null values). If you want to get more details on the internals of a data row, see Paul's post: Inside the Storage Engine: Anatomy of a record. However, the btree of the clustered index and the nonclustered index leaf/non-leaf levels do NOT have to have this "null block" of information if the columns in the index do NOT allow nulls. While this may be only 3 or 4 bytes (depending on the number of columns), it's still 3 or 4 bytes that would have to be added to EVERY nonclustered index's leaf level (for every row). And, it just doesn't have to be there. Why waste space when you don't have to!

Second, and along the same lines, is the need to use fixed-width columns!

Just like in data rows, index rows only have a variable block (offsets + end of row) when there are variable-width columns. If a clustered index is chosen that has one or more variable-width columns then you are adding at least 4 bytes to every row that might not have otherwise needed to be there. In fact, often people forget that uniquifiers are stored in the variable block portion of the row and as a result, a uniquifier really needs 8 bytes (4 bytes for the int + 2 bytes in the variable block for the offset + 2 bytes for the end of row marker).

To show you this, I've created four tables each with the same 3 columns DATA TYPES (table 4 has col1 as an int that's nullable and you can't do that with an identity so I used INSERT/SELECT to copy the data over):

col1 int identity not null,
col2 datetime2(7) not null default sysdatetime(),
col3 datetime2(7) null default sysdatetime()

And, each of the tables has these four nonclustered indexes:

  • Non-unique nonclustered on col2
  • Unique nonclustered on col2
  • Non-unique nonclustered on col3
  • Unique nonclustered on col3

The difference is solely within the definition of the clustering key as well as weather or not the values are unique (or nullable)!

CLTable1 is clustered and non-nullable but NOT defined as unique (the values are unique)

CLTable2 is clustered and non-nullable but there are duplicate values in the clustering key

CLTable3 is defined as a unique clustered index

CLTable4 is clustered, nullable and there are duplicate values in the clustering key

The end result is the sizes of all of the keys and the wasted space from different keys!

For CLTable1, CLTable2 and CLTable3 – all of the minimums are the same. For CLTable1 and CLTable3 – the maximums are the same as the minimums. In CLTable3 it's because the records MUST be unique. In CLTable1 it's because the records just happen to be unique right now (meaning there's no overhead unless there are actual dupes).

        record_size_in_bytes
index_id           min          max          avg
1 27 27 27
1 11 11 11
2 13 13 13
2 19 19 19
3 13 13 13
3 15 15 15
4 16 16 16
4 22 22 22
5 16 16 16
5 18 18 18

For CLTable2 the maximums are each 8 bytes larger (highlighted in yellow) EXCEPT where in the btrees of UNIQUE nonclustered indexes where SQL Server does NOT need to duplicate the clustering key (pale pink):

        record_size_in_bytes
index_id         min         max          avg
1 27 35 27.73
1 11 19 11.76
2 13 21 13.73
2 19 27 20.09
3 13 21 13.73
3 15 15 15.00
4 16 24 16.73
4 22 30 22.92
5 16 24 16.73
5 18 18 18.00

NOTE: If you're not familar with index internals and when/why SQL Server duplicates the clustering key in nonclustered indexes, check out my Index Internals chapter from the SQL Server 2008 Internals title (here's the Amazon link: http://www.amazon.com/Microsoft-SQL-Server-2008-Internals/dp/0735626243/ref=sr_1_1?ie=UTF8&qid=1291001371&sr=8-1) and my companion content from Chapter 6 here: Companion content for Chapter 6 (Index Internals) of SQL Server 2008 Internals. And, if you really want to see what's in your indexes, check out my updated versions of sp_helpindex here: http://www.sqlskills.com/blogs/kimberly/category/sp_helpindex-rewrites.aspx.

Finally, CLTable4 has the most interesting results…

       record_size_in_bytes
index_id         min         max          avg
1 27 35 27.73
1 14 22 14.78
2 16 24 16.73
2 22 30 22.92
3 16 24 16.73
3 15 15 15
4 16 24 16.73
4 22 30 22.92
5 16 24 16.73
5 18 18 18

Nonclustered indexes 2 and 3 (and the clustered index's btree) are all 3 bytes larger… why? Because the clustering key allows nulls and each of these structures has only 2-3 columns. As a result, the space needed for the null block is 3 bytes (2 bytes for the column count [NCol] and 1 byte for the actual null bitmap [less than 9 columns]).

The btree for index id 3 only needs the nonclustered key column (col2) which does not allow NULLs and the index is unique so the tree is the same as before.

Nonclustered indexes 4 and 5 already had a nullable column and therefore already had a null block.

My point – it all adds up and if it's not truly necessary, then avoid it!

Thanks for reading!
kt