This is a question that came up at the Cleveland SQL Server User Group last night. An attendee had a wager with his colleagues about how the names of auto-created column-level stats are generated.
Let's find out. I'm going to create a table without any indexes and then do some queries.
USE master;
GO
CREATE DATABASE StatsTest;
GO
USE StatsTest;
GOEXEC sp_dboption 'StatsTest', 'auto create statistics', 'on';
GOCREATE TABLE t1 (c1 INT, c2 INT, c3 INT);
GO
INSERT INTO t1 VALUES (1, 1, 1);
INSERT INTO t1 VALUES (1, 1, 2);
INSERT INTO t1 VALUES (1, 1, 3);
GO
Now I'll do a little query. My database has AUTO_CREATE_STATISTICS turned out, so the query optimizer recognizes that it could use a statistic to potentially help choose a more efficient query plan more quickly – so it generates it.
SELECT * FROM t1 WHERE c3 < 2;
GOSELECT [object_id], [name] FROM sys.stats
WHERE [object_id] = OBJECT_ID ('t1');
GOobject_id name
———– ————————–
2073058421 _WA_Sys_00000003_7B905C75
I've got an automatically created statistic – but what does the name mean? Well, the 7B905C75 is the hex equivalent of the table's object ID. The Sys is because it was automatically created. The WA is because the SQL team is based in Washington State in the US (I'm absolutely serious!). The 00000003 is because the statistic is on column ID 3 of the table. I can prove this last one by doing a select on column ID 1 and checking the new statistic that is created.
SELECT * FROM t1 WHERE c1 < 2;
GOSELECT [object_id], [name] FROM sys.stats
WHERE [object_id] = OBJECT_ID ('t1');
GOobject_id name
———– ————————–
2073058421 _WA_Sys_00000003_7B905C75
2073058421 _WA_Sys_00000001_7B905C75
The new statistic has the exact same name as the previous one, but the first number is now 00000001 – clearly matching the column ID of the column I queried.
So there you go – wager settled!
13 thoughts on “How are auto-created column statistics names generated?”
That’s awesome Paul. I’m sorely tempted to start creating stats with names like:
_ON_Me_00000003_7B905C75
Many thanks as I will use this to collect my bet with some fellow workers
Hi Paul,
Thank you for this piece of nifty info. It just won bet from my director of development. He will be buying lunch at some nice sushi place.
Cheers!
A friend posed a question: how did Sybase also end up with ‘WA’ stats?
Hi Paul,
Sorry, a bit abrupt of me.
I meant say — do you happen to know how Sybase also ended up with ‘WA’ stats objects?
Thanks,
James
SyBase and Microsoft shared the codebase for a little while before splitting off.
Thanks Paul. Would you say then — and I grant that this is completely unimportant — that rumors that the WA stands for SyBase’s home of Waterloo are false?
Hello Sir,
Is is possible to get the “creation time” of auto created stats “_WA_Sys__<hex value" ?
Thank You!
Google for STATS_DATE.
I am trying to trace a problem where very simple updates on a fairly big table (225 million rows) is sometimes taking a long time. I see that there are there are three naming convention for the stats on these table – _WA, _IX_ and _dta_index_tablenamehere_7_1531152500__K1_K18_K17_K16_K20_K9_6_7_8_10_12_13_14_15. I suspect the third type with the long names are generated by a previous run of SQL tuninig and applying the recommendations from it. There are lots of these stats and could this be the reason for the slowness? is it safe to remove them as the other types (_WA and _IX) are already there
If the _dta one is marked as hypothetical, you can safely drop it.
Extra statistics are not something that causes queries to be slow, unless they’re not up to date and so are causing incorrect query plans. This isn’t really something we can diagnose over blog comments I’m afraid. Try updating your statistics and see if that helps.
I have a lot approximately 4000 _wa statistics in my database. Is it safe to drop system generated ones and create our own on the same columns
It’ll cause your query plans to recompile so if you’re going to do that, do it off hours if you can.