SQLskills SQL101: Why does repair invalidate replication subscriptions?

SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

Whenever I’m teaching and recovering from corruption, I always stress that if the REPAIR_ALLOW_DATA_LOSS option of DBCC CHECKDB (which I’ll just call ‘repair’ from now on) has to be used, you should do the following:

  • Run another DBCC CHECKDB to make sure that repair fixed everything and no new problems were exposed (by the repair fixing something that was blocking DBCC CHECKDB‘s access to a portion of the database)
  • Run DBCC CHECKCONSTRAINTS on any tables or indexes that were repaired and are involved in constraints (stated in Books Online too)
  • Reinitialize any replication subscriptions that are based on tables that were repaired (stated in Books Online too)
  • Reevaluate your HA/DR strategy so you don’t have to use repair again in future

One question I’m often asked is about why replication can be broken by running repair.

There are two ways that replication can be affected: repairs on replication metadata tables, and repairs on anything else to do with a subscription.

Repairs on replication metadata tables

This is the simplest case to explain. If the repair operation affects any of the replication metadata tables (i.e. deleted some data from them), the entire replication publication will be in an inconsistent state and you should remove replication completely from the database an reinitialize it. This isn’t limited to a single subscription – all replication should be reconfigured.

Repairs on anything else

Transaction replication captures changes to the publication database by analyzing the transaction log, looking for transactions that change data in any of the publications, and converting those operations into logical operations that can be applied to the subscribers. Merge replication captures changes to the publication database using DML triggers and converting those operations into logical operations that can be applied to the subscribers.

Neither of these mechanisms can capture of operations performed by repair. Repair operations are always direct physical changes to the database structures to fix inconsistencies in the structures (e.g. a database page, table record, or a linkage between two pages), as opposed to physical changes because of queries performing inserts, updates, or deletes on tables.

These repair operations cannot translated into logical operations that can be applied to replication subscribers because there are no logical operations than can be expressed using T-SQL for the equivalent of the direct structural changes that repair is performing. Replication does not preserve the exact physical location of a particular record between the publication and subscription databases, so a direct change to record Y on page X in the publication database would not be able to be replayed on the subscription database (remember, replication ships logical changes, not physical changes). This means that if any table that is part of a replication publication is changed by a repair operation, the replication subscription is no longer valid and must be reinitialized.

As an example, imagine a repair operation is forced to remove a data page from a unique clustered index (essentially deleting some table records), and the subscription is NOT reinitialized. Those records would still exist on the replicated copy of the table. If a subsequent insert operation inserted records with cluster key values corresponding to the records deleted by the repair operation, the Distribution Agent would fail when trying to apply the inserts to the replicated copy – because the repair operation was not applied to the subscription database and a duplicate key violation error will occur when attempting to apply the insert to the replicated table.

A replication subscription must always be reinitialized if any table in the publication is affected by a repair operation, or the replication metadata tables are repaired.

Thanks

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.