Further Testing with Automatic Updates to Statistics

A new question on an older post, Understanding When Statistics Will Automatically Update, came in this week.  If you look in the comments, you see that Rajendra asks:

…I want know, whether schema change of a table (adding new / (deleting an existing) column) also qualifies for UPDATE STATS?

It’s an interesting question that I’d never considered before.  I had an instinct about the answer, but the scientist in me wanted to test and prove it, so here we go.  The scenario and code are similar to my original post, so this should be a quick read for many of you 🙂

The Setup

Start with a copy of the AdventureWorks2012 database, which you can download from CodePlex, on a 2012 SP1 instance, and confirm that the Auto Update Statistics option is enabled:

IF (SELECT COUNT(*) FROM [sys].[databases] WHERE [name] = 'AdventureWorks2012' AND [is_auto_create_stats_on] = 0) = 0
BEGIN
ALTER DATABASE [AdventureWorks2012] SET AUTO_UPDATE_STATISTICS ON
END;

Create a copy of the Sales.SalesOrderDemo table for our testing, and add the clustered index and one nonclustered index:

USE [AdventureWorks2012];
GO

SELECT *
INTO [Sales].[TestSalesOrderDetail]
FROM [Sales].[SalesOrderDetail];
GO

CREATE CLUSTERED INDEX [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] ON [Sales].[TestSalesOrderDetail] ([SalesOrderID], [SalesOrderDetailID]);
GO

CREATE NONCLUSTERED INDEX [IX_TestSalesOrderDetail_ProductID] ON [Sales].[TestSalesOrderDetail] ([ProductID]);
GO

Validate the current statistics and modifications; the statistics should be current since we just created the indexes, and modifications should be 0 since we haven’t run any inserts, updates, or deletes:

/*
Note: I refer to this query throughout the post as "the sys.stats query", rather than including it multiple times!
*/
SELECT
OBJECT_NAME([sp].[object_id]) AS "Table",
[sp].[stats_id] AS "Statistic ID",
[s].[name] AS "Statistic",
[sp].[last_updated] AS "Last Updated",
[sp].[rows],
[sp].[rows_sampled],
[sp].[unfiltered_rows],
[sp].[modification_counter] AS "Modifications"
FROM [sys].[stats] AS [s]
OUTER APPLY sys.dm_db_stats_properties ([s].[object_id],[s].[stats_id]) AS [sp]
WHERE [s].[object_id] = OBJECT_ID(N'Sales.TestSalesOrderDetail');
Current statistics and no modifications
Current statistics and no modifications

Excellent, we’re starting with a clean slate.

The Test

Let’s first add a little bit of data (1000 rows), to up our modification count:

BULK INSERT [AdventureWorks2012].[Sales].[TestSalesOrderDetail]
FROM 'S:\SQLStuff\Dropbox\Statistics\Data\sod_1000.txt'
WITH
(
DATAFILETYPE = 'native',
TABLOCK
);

When we run our sys.stats query again, we see the 1000 modifications for both indexes, because both indexes had rows added:

Statistics and modifications after adding 1000 rows
Statistics and modifications after adding 1000 rows

Great.  Now add a new column to the table…let’s pretend that we have a new ProductID, but we want to keep the existing one for historical reasons, so our new column is, creatively, NewProductID:

ALTER TABLE [Sales].[TestSalesOrderDetail] ADD [NewProductID] INT;

Did adding this column cause statistics to update?  This was our main question, and if we run our sys.stats query we see:

Statistics and modifications after adding new column
Statistics and modifications after adding new column

Nothing changed…  Adding a new column to the table does not invoke an automatic update to statistics.  Nor should it.  We haven’t modified data in any existing statistics.  We would certainly expect a change in query plans, for existing queries that are modified to use this new column, but adding a new column to a table does not cause any existing statistics for the table to update.

More Testing (because it’s good practice to keep proving out what we expect)

Just for fun, let’s see what happens if we modify some more data:

UPDATE Sales.TestSalesOrderDetail SET ProductID = 717 WHERE SalesOrderID = 75123;
GO

If we check sys.stats, we see that our nonclustered index, IX_TestSalesOrderDetail_ProductID, has three more modifications than it had previously, but not enough data has changed overall for statistics to be invalidated.

Statisitics and modifications after updating 3 ProductIDs
Statisitics and modifications after updating 3 ProductIDs

What happens if we update NewProductID?

UPDATE [Sales].[TestSalesOrderDetail]
SET [NewProductID] = [ProductID] + 1000;

Well, we updated 122317 rows, but only the NewProductID column, which isn’t part of either the clustered index or nonclustered index key.  Therefore, no update to statistics:

Statistics and modifications after updating NewProductID
Statistics and modifications after updating NewProductID

To prove this out further, create a nonclustered index on NewProductID:

CREATE NONCLUSTERED INDEX [NCI_TestSalesOrderDetail_NewProductID] ON [Sales].[TestSalesOrderDetail] ([NewProductID]);
GO

And if we verify the statistics and modifications, we see that our new NCI has zero modifications, as expected.

Statistics and updates after adding the nonclustered index on NewProductID
Statistics and updates after adding the nonclustered index on NewProductID

We’ll make another massive update to NewProductID, because the first update wasn’t correct, and then check sys.stats again:

UPDATE [Sales].[TestSalesOrderDetail]
SET [NewProductID] = [ProductID] + 10000;
Statistics and modifications after updating NewProductID a second time
Statistics and modifications after updating NewProductID a second time

Now we see the modifications – all 122317 of them.  If we query the NewProductID column now, we should invoke an automatic update to statistics, because they are invalid:

SELECT NewProductID, SalesOrderID
FROM [Sales].[TestSalesOrderDetail]
WHERE NewProductID = 10718;
Statistics and modifications after an auto update was invoked
Statistics and modifications after an auto update was invoked

And there you have it.  Adding a column to a table will not cause an automatic update to statistics to occur.  Automatic updates occur only when the threshold for modifications has been exceeded for a statistics key.

/*
clean up code
*/
USE [AdventureWorks2012];
GO
DROP TABLE [Sales].[TestSalesOrderDetail];

4 thoughts on “Further Testing with Automatic Updates to Statistics

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

A Fond Farewell

If you haven’t guessed from the title, I’m writing this post because I am leaving SQLskills. This Friday, January 14th, is my last day, and

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.