For a few weeks now, there has been a long running thread on SQL Server Central about UPDATE operations that don't perform any changes because the statement sets a column to the same value as already exists in the row.  While I believe that one of the posters is incredibly snarky and outright rude to the people trying to answer the question and explain the behaviors being noticed, the topic is quite interesting and I finally got some time on my flight to Seattle to take a look at some of the internals of this with Extended Events in SQL Server 2012. 

Setting up the Tests

To setup the tests to be performed I am going to use the example table and row of data that are provided in that thread for the investigation with the exception that I am going to use a user database instead of tempdb for holding the test table so that I can create separate tests for a case-sensitive collation versus case-insensitive collations.

– Create the case-insensitive database
CREATE DATABASE TestUpdate COLLATE SQL_Latin1_General_CP1_CI_AI
GO
USE TestUpdate;
GO
CREATE TABLE Test
(
    ID         INT IDENTITY(1,1) NOT NULL,
    Name    VARCHAR(100) COLLATE SQL_Latin1_General_CP1_CI_AI NOT NULL
);
GO
INSERT dbo.Test
    (Name)
VALUES
    ('Tom');
GO
CHECKPOINT;
GO

– Create the case-sensitive database
CREATE DATABASE [TestUpdateCS] COLLATE SQL_Latin1_General_CP1_CS_AS
GO
USE TestUpdateCS;
GO
CREATE TABLE Test
(
    ID      INT IDENTITY(1,1) NOT NULL,
    Name    VARCHAR(100) COLLATE SQL_Latin1_General_CP1_CS_AS NOT NULL
);
GO
INSERT dbo.Test
    (Name)
VALUES
    ('Tom');
GO
CHECKPOINT;
GO

Then to setup the Extended Events session for collecting information about the operations being performed, we are going to collect a number of events in the engine along with the callstack for each even and the filter on the session_id where the tests are actually being run to prevent collecting data that is not related to the tests being executed.  I originally started with nearly double the events, but removed the events that were not firing for any of the tests from the final event session shown below.

IF EXISTS (SELECT 1 FROM sys.server_event_sessions
            WHERE name = 'SQLskills_InvestigatingUpdates')
BEGIN
    DROP EVENT SESSION [SQLskills_InvestigatingUpdates] ON SERVER;
END
GO
CREATE EVENT SESSION [SQLskills_InvestigatingUpdates]
ON SERVER
– Track sparse file writes for snapshots
ADD EVENT sqlserver.file_written_to_replica(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
– Track memory page allocations
ADD EVENT sqlos.page_allocated(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlos.page_freed(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
– Track latching
ADD EVENT sqlserver.latch_acquire_time(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.latch_suspend_begin(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.latch_suspend_end(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
– Track locking
ADD EVENT sqlserver.lock_acquired(
    SET collect_database_name = 1
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.lock_released(
    SET collect_database_name = 1
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
– Track transaction log activity
ADD EVENT sqlserver.log_flush_requested(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.log_single_record(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.transaction_log(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
– Track batch and statement activity
ADD EVENT sqlserver.sql_batch_completed(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.sql_batch_starting(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.sql_statement_completed(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62))),
ADD EVENT sqlserver.sql_statement_starting(
    ACTION(package0.callstack,sqlserver.session_id)
    WHERE ([sqlserver].[session_id]=(62)))
– Store event data in ring_buffer target
ADD TARGET package0.ring_buffer;
GO

Testing Methodolgy

From the original thread on SQL Server Central, there are a couple of sticking points that keep getting brought up so in the interests of attempting to be thorough I am going to provide the testing methodology being used, and then list all of the tests that are going to be run. 

  1. All tests will be conducted against both of the databases to identify behavior changes between CS and CI collations if they occur. 
  2. At the start of every test, the data in the tables will have 'Tom' for the Name column.  If a previous test has changed the value to something different it will be reset before the test is executed.
  3. The event session will be stopped and started between each test to isolate the data collect to each individual test.
  4. At least one minute of time will be waited for between tests to allow the background thread that performs row version cleanup to execute and remove row versions that may have been created by the tests. (http://msdn.microsoft.com/en-us/library/ms175492(v=sql.105).aspx)
  5. All of the tests will be run once initially to allow the execution plan to be generated and cached in the plan cache, and then the test will be reset and rerun so that the event session doesn't capture the page allocations and locking/latching associated with compiling/caching the plans.

Tests

The tests that will be performed are as follows:

  1. Determine the way that SQL Server handles UPDATE statements that set the column to the same value that previously existed without filtering.

    – Non-updating update
    UPDATE dbo.Test
    SET Name = Name;
    GO

    – Non-updating update
    UPDATE dbo.Test
    SET Name = 'Tom';
    GO

  2. Determine the way that SQL Server handles UPDATE statements that change the value that previously existed without filtering.

    – Updating update
    UPDATE dbo.Test
    SET Name = 'TOM';
    GO

  3. Determine the way that SQL Server handles UPDATE statements that set the column to the same value that previously existed with filtering.

    – Non-Updating with filter
    UPDATE dbo.Test
    SET Name = Name
    WHERE Name != 'Tom';
    GO

    – Non-Updating with filter
    UPDATE dbo.Test
    SET Name = 'Tom'
    WHERE Name != 'Tom';
    GO

  4. Determine the way that SQL Server handles UPDATE statements that change the value that previously existed with filtering.

    – Updating update
    UPDATE dbo.Test
    SET Name = 'TOM'
    WHERE Name != 'TOM';
    GO

All of these tests will be run against the CS and CI database.  Additionally after the first tests are performed against a table without a trigger, the tests will be repeated with the following FOR/AFTER UPDATE trigger on the tables.

CREATE TRIGGER trg_AuditChanges_Test
ON Test
FOR UPDATE
AS
BEGIN
DECLARE @ins INT;
DECLARE @del INT;

SELECT @ins = COUNT(*)
FROM inserted;

SELECT @del = COUNT(*)
FROM deleted;

END
GO

I am not going to include the 16 frame callstack in the results in this post except where it provides useful information for the investigation of the results being collected.  The callstack results in this post will be materialized through the use of Trace Flag 3656 and the public symbols as I explained in my blog post about the callstack action in the XEvent a Day series, An XEvent a Day (24 of 31) – What is the callstack?.

Results No Trigger

For both of the UPDATE statements in test number 1, the following events ( they are identical for both scenarios ) are generated by the event session.  Note that the locks acquired include a conversion of the U lock on the RID to an X lock as a part of the execution,  and even though the UPDATE doesn't actually make any changes, two log records are generated for the BEGIN_XACT and COMMIT_XACT, and these have to also be flushed to the transaction log file, as shown by the log_flush_requested event.

event_name lock_resource_type lock_mode status log_op_name context database_id resource_0
sql_batch_starting NULL NULL NULL NULL NULL NULL NULL
sql_statement_starting NULL NULL NULL NULL NULL NULL NULL
lock_acquired OBJECT IX NULL NULL NULL 9 245575913
latch_acquire_time NULL NULL HASHED, ONLRU NULL NULL NULL NULL
lock_acquired PAGE IU NULL NULL NULL 9 228
lock_acquired RID U NULL NULL NULL 9 228
lock_acquired PAGE IX NULL NULL NULL 9 228
lock_acquired RID X NULL NULL NULL 9 228
log_single_record NULL NULL NULL LOP_BEGIN_XACT NULL 9 NULL
transaction_log NULL NULL NULL LOP_BEGIN_XACT NULL 9 NULL
lock_released RID NL NULL NULL NULL 9 228
lock_released PAGE NL NULL NULL NULL 9 228
log_single_record NULL NULL NULL LOP_COMMIT_XACT NULL 9 NULL
transaction_log NULL NULL NULL LOP_COMMIT_XACT NULL 9 NULL
log_flush_requested NULL NULL NULL NULL NULL 9 NULL
lock_released RID X NULL NULL NULL 9 228
lock_released PAGE IX NULL NULL NULL 9 228
lock_released OBJECT IX NULL NULL NULL 9 245575913
sql_statement_completed NULL NULL NULL NULL NULL NULL NULL
sql_batch_completed NULL NULL NULL NULL NULL NULL NULL

If we compare this to the UPDATE statement that actually performs a change to the data in the row from test number 2, we will see that additional log records are generated for the MODIFY_ROW operation, and then flushed to the log file.

event_name lock_resource_type lock_mode status log_op_name context database_id resource_0
sql_batch_starting NULL NULL NULL NULL NULL NULL NULL
sql_statement_starting NULL NULL NULL NULL NULL NULL NULL
lock_acquired OBJECT IX NULL NULL NULL 9 245575913
lock_acquired PAGE IU NULL NULL NULL 9 228
lock_acquired RID U NULL NULL NULL 9 228
lock_acquired PAGE IX NULL NULL NULL 9 228
lock_acquired RID X NULL NULL NULL 9 228
log_single_record NULL NULL NULL LOP_BEGIN_XACT NULL 9 NULL
transaction_log NULL NULL NULL LOP_BEGIN_XACT NULL 9 NULL
log_single_record NULL NULL NULL LOP_MODIFY_ROW NULL 9 NULL
transaction_log NULL NULL NULL LOP_MODIFY_ROW HEAP 9 NULL
lock_released RID NL NULL NULL NULL 9 228
lock_released PAGE NL NULL NULL NULL 9 228
log_single_record NULL NULL NULL LOP_COMMIT_XACT NULL 9 NULL
transaction_log NULL NULL NULL LOP_COMMIT_XACT NULL 9 NULL
log_flush_requested NULL NULL NULL NULL NULL 9 NULL
lock_released RID X NULL NULL NULL 9 228
lock_released PAGE IX NULL NULL NULL 9 228
lock_released OBJECT IX NULL NULL NULL 9 245575913
sql_statement_completed NULL NULL NULL NULL NULL NULL NULL
sql_batch_completed NULL NULL NULL NULL NULL NULL NULL

For the UPDATE statement in test number 3 where a WHERE clause is applied to check the column value against the value being set to ensure that they are not the same values, the results are different as shown below:

event_name lock_resource_type lock_mode status log_op_name context database_id resource_0
sql_batch_starting NULL NULL NULL NULL NULL NULL NULL
sql_statement_starting NULL NULL NULL NULL NULL NULL NULL
lock_acquired OBJECT IX NULL NULL NULL 9 245575913
lock_acquired PAGE IU NULL NULL NULL 9 228
lock_acquired RID U NULL NULL NULL 9 228
lock_released RID U NULL NULL NULL 9 228
lock_released PAGE IU NULL NULL NULL 9 228
lock_released OBJECT IX NULL NULL NULL 9 245575913
sql_statement_completed NULL NULL NULL NULL NULL NULL NULL
sql_batch_completed NULL NULL NULL NULL NULL NULL NULL

Notice that here we don't have the lock conversion for the RID from U to X, and we don't have the log records being generated and flushed to the transaction log for the BEGIN_XACT and COMMIT_XACT. 

The results for the UPDATE in test number 4 has the same outcome as shown above for test number 3 when run against the case-insensitive collation database.  However, running test number 4 against the case-sensitive collation database results in the same events being fired for the non-filtered UPDATE and data changes in the table with the LOP_MODIFY_ROWS log records being generated. (Feel free to test and validate this using the provided code and CS collation database)

Results FOR/AFTER UPDATE Triggers

The results for the two UPDATE statements in test number 1 when a trigger exists on the table are different than when the table does not have the trigger on it.

event_name lock_resource_type lock_mode status log_op_name context database_id resource_0
sql_batch_starting NULL NULL NULL NULL NULL NULL NULL
sql_statement_starting NULL NULL NULL NULL NULL NULL NULL
lock_acquired OBJECT IX NULL NULL NULL 10 245575913
lock_acquired PAGE IU NULL NULL NULL 10 357
lock_acquired RID U NULL NULL NULL 10 357
lock_acquired PAGE IX NULL NULL NULL 10 357
lock_acquired RID X NULL NULL NULL 10 357
page_allocated NULL NULL NULL NULL NULL NULL NULL
log_single_record NULL NULL NULL LOP_BEGIN_XACT NULL 10 NULL
transaction_log NULL NULL NULL LOP_BEGIN_XACT NULL 10 NULL
lock_acquired EXTENT X NULL NULL NULL 2 1568
lock_released EXTENT X NULL NULL NULL 2 1568
page_allocated NULL NULL NULL NULL NULL NULL NULL
page_allocated NULL NULL NULL NULL NULL NULL NULL
page_allocated NULL NULL NULL NULL NULL NULL NULL
page_allocated NULL NULL NULL NULL NULL NULL NULL
page_allocated NULL NULL NULL NULL NULL NULL NULL
page_allocated NULL NULL NULL NULL NULL NULL NULL
page_allocated NULL NULL NULL NULL NULL NULL NULL
page_allocated NULL NULL NULL NULL NULL NULL NULL
latch_acquire_time NULL NULL CHKGEN, DIRTY, HASHED, ON NULL NULL NULL NULL
lock_acquired EXTENT U NULL NULL NULL 2 312
lock_acquired PAGE X NULL NULL NULL 2 318
lock_released EXTENT U NULL NULL NULL 2 312
lock_released PAGE X NULL NULL NULL 2 318
lock_released RID NL NULL NULL NULL 10 357
lock_released PAGE NL NULL NULL NULL 10 357
latch_acquire_time NULL NULL CHKGEN, DIRTY, HASHED, ON NULL NULL NULL NULL
log_single_record NULL NULL NULL LOP_COMMIT_XACT NULL 10 NULL
transaction_log NULL NULL NULL LOP_COMMIT_XACT NULL 10 NULL
log_flush_requested NULL NULL NULL NULL NULL 10 NULL
lock_released RID X NULL NULL NULL 10 357
lock_released PAGE IX NULL NULL NULL 10 357
lock_released OBJECT IX NULL NULL NULL 10 245575913
sql_statement_completed NULL NULL NULL NULL NULL NULL NULL
sql_batch_completed NULL NULL NULL NULL NULL NULL NULL

As you can see in the table above, we now have exclusive locks on an extent in tempdb, and memory page allocations occurring (shown by the page_allocated events) in the new result set.  If we take a look at the execution callstack for the page_allocated events, we'll see:

MemoryClerkInternal::AllocatePages+26f [ @ 0+0×0
BPool::Steal+35a [ @ 0+0×0
BPool::NewPage+7af [ @ 0+0×0
PageRef::FormatInternal+8e [ @ 0+0×0
AppendOnlyStorageUnit::SetupPageHeaderPreAllocation+af [ @ 0+0×0
AllocationReq::AllocateExtent+285 [ @ 0+0×0
AllocationReq::AllocateFirst+a9a [ @ 0+0×0
AllocationReq::Allocate+e9 [ @ 0+0×0
AppendOnlyStorageUnit::AllocateNewExtent+29a [ @ 0+0×0
AppendOnlyVerStoreMgr::AddRecord+e07 [ @ 0+0×0
VersionMgr::InsertNewImage+13e [ @ 0+0×0
HeapPageRef::Update+b4e [ @ 0+0×0
HeapFragment::Update+3b7 [ @ 0+0×0
HeapRowObject::Update+67d9 [ @ 0+0×0
HeapDataSetSession::SetDataInternal+8dcb [ @ 0+0×0
DatasetSession::SetData+16d [ @ 0+0×0
RowsetNewSS::SetData+6a [ @ 0+0×0
CValRow::SetDataX+63 [ @ 0+0×0
CEsExec::GeneralEval4+e7 [ @ 0+0×0
CQScanUpdateNew::GetRow+23a [ @ 0+0×0
CQueryScan::GetRow+81 [ @ 0+0×0
CXStmtQuery::ErsqExecuteQuery+36a [ @ 0+0×0
CXStmtDML::XretDMLExecute+2b9 [ @ 0+0×0
CXStmtDML::XretExecute+ad [ @ 0+0×0

By adding the trigger, we now have version store records being generated associated with the execution of the FOR/AFTER UPDATE trigger.  The version store records are also generated for test number 2 where the column data is actually changed by the UPDATE statement being executed.  On the case-insensitive collation, test number 3 and number 4 do not generate version store records.

However, for the case-sensitive collation, test number 4 does generate version store records, as shown below:

event_name lock_resource_type lock_mode status log_op_name context database_id resource_0
sql_batch_starting NULL NULL NULL NULL NULL NULL NULL
sql_statement_starting NULL NULL NULL NULL NULL NULL NULL
lock_acquired OBJECT IX NULL NULL NULL 10 325576198
latch_acquire_time NULL NULL HASHED, ONLRU NULL NULL NULL NULL
latch_acquire_time NULL NULL DIRTY, HASHED, ONLRU NULL NULL NULL NULL
lock_acquired PAGE IU NULL NULL NULL 10 73
lock_acquired RID U NULL NULL NULL 10 73
lock_acquired PAGE IX NULL NULL NULL 10 73
lock_acquired RID X NULL NULL NULL 10 73
log_single_record NULL NULL NULL LOP_BEGIN_XACT NULL 10 NULL
transaction_log NULL NULL NULL LOP_BEGIN_XACT NULL 10 NULL
latch_acquire_time NULL NULL HASHED, ONLRU NULL NULL NULL NULL
lock_acquired EXTENT X NULL NULL NULL 2 2944
lock_released EXTENT X NULL NULL NULL 2 2944
lock_acquired EXTENT U NULL NULL NULL 2 264
lock_acquired PAGE X NULL NULL NULL 2 269
lock_released EXTENT U NULL NULL NULL 2 264
lock_released PAGE X NULL NULL NULL 2 269
log_single_record NULL NULL NULL LOP_MODIFY_ROW NULL 10 NULL
transaction_log NULL NULL NULL LOP_MODIFY_ROW HEAP 10 NULL
lock_released RID NL NULL NULL NULL 10 73
lock_released PAGE NL NULL NULL NULL 10 73
log_single_record NULL NULL NULL LOP_COMMIT_XACT NULL 10 NULL
transaction_log NULL NULL NULL LOP_COMMIT_XACT NULL 10 NULL
log_flush_requested NULL NULL NULL NULL NULL 10 NULL
lock_released RID X NULL NULL NULL 10 73
lock_released PAGE IX NULL NULL NULL 10 73
lock_released OBJECT IX NULL NULL NULL 10 325576198
sql_statement_completed NULL NULL NULL NULL NULL NULL NULL
sql_batch_completed NULL NULL NULL NULL NULL NULL NULL

The page_allocated events have the same callstack as the previously listed callstack in this post.

Conclusions

As can be seen by the results of these tests, UPDATE statements that do not actually change the value of a column in a row, can still have overhead associated with them, even when the database engine is recognizes that no change is actually occurring and doesn't actually affect the row on the page.  There is still overhead for the exclusive lock conversion and the implicit transaction that occurs as a part of the UPDATE occurring when no filter is provided to check the value currently on the row against the value being SET, and there is added overhead for row versioning if triggers exist on the table.

Looking at the callstack for the LOP_MODIFY_ROW log operation, we can also see that changes being made affect the specific row on the page.  The changes are NOT made on a separate page allocated by SQL Server to allow a binary comparison of the two pages, as suggested in the original forum thread that lead to this investigation. This is NOT how SQL Server behaves.

SQLServerLogMgr::FireAppendXevent+11b [ @ 0+0×0
SQLServerLogMgr::AppendLogRequest+6d8 [ @ 0+0×0
SQLServerLogMgr::ReserveAndAppend+74 [ @ 0+0×0
XdesRMReadWrite::GenerateLogRec+5ac [ @ 0+0×0

PageRef::ModifyRow+2b4a [ @ 0+0×0


PageRef::ModifyColumnsInternal+162 [ @ 0+0×0

HeapPageRef::Update+208d [ @ 0+0×0
HeapFragment::Update+3b7 [ @ 0+0×0
HeapRowObject::Update+67d9 [ @ 0+0×0
HeapDataSetSession::SetDataInternal+8dcb [ @ 0+0×0
DatasetSession::SetData+16d [ @ 0+0×0
RowsetNewSS::SetData+6a [ @ 0+0×0
CValRow::SetDataX+63 [ @ 0+0×0
CEsExec::GeneralEval4+e7 [ @ 0+0×0

CQScanUpdateNew::GetRow+23a [ @ 0+0×0
CQueryScan::GetRow+81 [ @ 0+0×0

CXStmtQuery::ErsqExecuteQuery+36a [ @ 0+0×0
CXStmtDML::XretDMLExecute+2b9 [ @ 0+0×0
CXStmtDML::XretExecute+ad [ @ 0+0×0
CMsqlExecContext::ExecuteStmts<0,0>+1a71 [ @ 0+0×0
CMsqlExecContext::FExecute+a4f [ @ 0+0×0
CSQLSource::Execute+866 [ @ 0+0×0
CStmtPrepQuery::XretPrepQueryExecute+3e2 [ @ 0+0×0
CMsqlExecContext::ExecuteStmts<1,0>+17c6 [ @ 0+0×0

This is further supported by the fact that no additional page_allocated events fire outside of the version store information being generated by the UPDATE.  Since these tests only focused on the internal operations of UPDATE statements against a single row table, no realistic determination can be made for the actual performance overhead of comparing the column value to the value being SET by the UPDATE statement, specifically for what additional overhead might be incurred by a less ideal execution plan based on the additional filter criteria being optimized.  This is something that I will definitely be looking at in more detail for a possible later post.  However, we can see that comparing the current value to the value being set, can affect row version generation, lock conversions, and the number of log records that are generated where the data would not actually be changed by the UPDATE statement being issued.

Addition:

For further reading on this subject, take a look at Paul Whites blog post from a few years ago.  The outcomes are the same, but he uses a different method of investigation