In this post I’ll just share the results of a simple set of tests regarding automatic-update sampling.  Nothing fancy – I just populated the category table from the Credit sample database with varying row counts and then took note of the associated duration and sampling percent.  The query I used to kick of auto-updates after the row inserts was the following:

    
SELECT  [category_no],
[category_desc],
[category_code]
FROM [dbo].[category]
WHERE [category_desc] = 'Meals'
OPTION (RECOMPILE);

I also show the StatMan MAXDOP just for kicks (once true sampling kicks in, it is set to MAXDOP “1”, so not terribly interesting). I executed a few test iterations beyond what I’m showing here, but I just include two test runs per scenario just for comparison:

Test ID Row Count Data Pages StatMan Duration (ms) TABLESAMPLE PERCENT MAXDOP
1 10 1 0 100.0000 16
2 10 1 0 100.0000 16
1 1,000,000 3,228 1228 34.6964 1
2 1,000,000 3,228 1184 34.6964 1
1 2,000,000 6,455 1297 18.0480 1
2 2,000,000 6,455 1248 18.0480 1
1 10,000,000 32,275 1287 4.2324 1
2 10,000,000 32,275 1171 4.2324 1
1 20,000,000 64,550 1481 2.3625 1
2 20,000,000 64,550 1489 2.3625 1
1 40,000,000 129,103 2071 1.3617 1
2 40,000,000 129,103 1890 1.3617 1
1 80,000,000 258,210 2487 0.8129 1
2 80,000,000 258,210 2301 0.8129 1

A few comments:

  • Notice the sample percent is identical between identical row/page tests (each test started with a restore to the same base state)
  • Notice that as the data pages grow, the duration of the stats update doesn’t grow significantly, and then consider the sample percentage multiplied by the total data page value (for example – ~1757 data pages for 40M vs. ~2098 data pages for 80M)
  • Duration is just based on my laptop and I was more interested in the relative duration across test runs

Now what happens if I rebuild the index with a low fill factor? (Just doing this on the 80M row version):

    

USE [Credit]
GO
ALTER INDEX [category_ident] ON [dbo].[category]
REBUILD PARTITION = ALL
WITH (FILLFACTOR = 10);
GO

I’ll do a non-updating update so that I can get the threshold for the column updates incremented and thus kick off an auto-update of statistics:

  
UPDATE TOP (16000500) [category]
SET [category_desc] = [category_desc];
GO

Here are the results (Test ID 3 in bold):

Test ID Row Count Data Pages StatMan Duration (ms) TABLESAMPLE PERCENT MAXDOP
1 80,000,000 258,210 2487 0.8129 1
2 80,000,000 258,210 2301 0.8129 1
3 80,000,000 2,543,516 1399 0.4568 1

Same row count, significantly more pages, lower duration (interesting) and a reduced table sample percent – but if you do the math, more pages than the previous samplings. Lots of other fun things we could try, but that’s all for now…