How are auto-created column statistics names generated?

This is a question that came up at the Cleveland SQL Server User Group last night. An attendee had a wager with his colleagues about how the names of auto-created column-level stats are generated.

Let's find out. I'm going to create a table without any indexes and then do some queries.

USE master;
GO
CREATE DATABASE StatsTest;
GO
USE StatsTest;
GO

EXEC sp_dboption 'StatsTest', 'auto create statistics', 'on';
GO

CREATE TABLE t1 (c1 INT, c2 INT, c3 INT);
GO
INSERT INTO t1 VALUES (1, 1, 1);
INSERT INTO t1 VALUES (1, 1, 2);
INSERT INTO t1 VALUES (1, 1, 3);
GO

Now I'll do a little query. My database has AUTO_CREATE_STATISTICS turned out, so the query optimizer recognizes that it could use a statistic to potentially help choose a more efficient query plan more quickly – so it generates it.

SELECT * FROM t1 WHERE c3 < 2;
GO

SELECT [object_id], [name] FROM sys.stats
WHERE [object_id] = OBJECT_ID ('t1');
GO

object_id   name
———– ————————–
2073058421  _WA_Sys_00000003_7B905C75

I've got an automatically created statistic – but what does the name mean? Well, the 7B905C75 is the hex equivalent of the table's object ID. The Sys is because it was automatically created. The WA is because the SQL team is based in Washington State in the US (I'm absolutely serious!). The 00000003 is because the statistic is on column ID 3 of the table. I can prove this last one by doing a select on column ID 1 and checking the new statistic that is created.

SELECT * FROM t1 WHERE c1 < 2;
GO

SELECT [object_id], [name] FROM sys.stats
WHERE [object_id] = OBJECT_ID ('t1');
GO

object_id   name
———– ————————–
2073058421  _WA_Sys_00000003_7B905C75
2073058421  _WA_Sys_00000001_7B905C75

The new statistic has the exact same name as the previous one, but the first number is now 00000001 – clearly matching the column ID of the column I queried.

So there you go – wager settled!

11 thoughts on “How are auto-created column statistics names generated?

  1. Hi Paul,
    Thank you for this piece of nifty info. It just won bet from my director of development. He will be buying lunch at some nice sushi place.
    Cheers!

  2. Hi Paul,

    Sorry, a bit abrupt of me.

    I meant say — do you happen to know how Sybase also ended up with ‘WA’ stats objects?

    Thanks,

    James

      1. Thanks Paul. Would you say then — and I grant that this is completely unimportant — that rumors that the WA stands for SyBase’s home of Waterloo are false?

  3. I am trying to trace a problem where very simple updates on a fairly big table (225 million rows) is sometimes taking a long time. I see that there are there are three naming convention for the stats on these table – _WA, _IX_ and _dta_index_tablenamehere_7_1531152500__K1_K18_K17_K16_K20_K9_6_7_8_10_12_13_14_15. I suspect the third type with the long names are generated by a previous run of SQL tuninig and applying the recommendations from it. There are lots of these stats and could this be the reason for the slowness? is it safe to remove them as the other types (_WA and _IX) are already there

    1. If the _dta one is marked as hypothetical, you can safely drop it.

      Extra statistics are not something that causes queries to be slow, unless they’re not up to date and so are causing incorrect query plans. This isn’t really something we can diagnose over blog comments I’m afraid. Try updating your statistics and see if that helps.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.