Upgrading/Migrating Large Replicated Databases Without Reinitializing

This is a topic that comes up a lot for us as consultants and I realized recently that I have never blogged about it. Usually the question is something along the lines of:

  • I have a multi-TB replicated database and want to upgrade to SQL Server XXXX.
  • I have a multi-TB replicated database and want to migrate to newer hardware without upgrading the SQL Server version.
  • How do I do either of the above without having to fully reinitialize the subscribers?

Depending on the environment and what is changing, this is actually a relatively simple process to accomplish with a little bit of planning.

Know What is Being Migrated or Upgraded or Both

The first part of building any plan is knowing what, specifically, is being migrated or upgraded. This is important because the steps will be slightly different if you are only migrating or upgrading one part of the environment vs. the entire environment. It is also important to know where the distributor is in the configuration and whether or not the distribution database is going to be affected by the changes being made. In most cases it is a full environment migration to newer hardware, or an upgrade to a newer release of SQL Server, and the steps below will work. This is not the only way of migrating/upgrading large replicated databases, but it is the one I have used for over 15 years as of the date this post was written, and it has been flawless.

Start with Log Shipping

My general recommendation for any migration or upgrade is typically to start out by log shipping the database(s) to the new server(s) as a means of minimizing downtime when performing the final cutover. I have assisted clients with upgrading SQL Server while also migrating to Availability Group configurations with multi-TB databases with minimal downtime, and been able to initialize and bring up 5+ replicas at go-live in minutes by using log shipping to prime the environment. I love using log shipping because it is super simple; it's just backup, copy, restore. It doesn't matter whether it is the publisher, the subscriber, or both: log shipping can be used to keep the new server updated to the point of the final migration/upgrade.
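The backup/copy/restore loop that log shipping automates looks like this when done by hand (database name and backup share path are placeholders):

```sql
-- On the current server: take a routine log backup (placeholders throughout)
BACKUP LOG [SalesDB]
    TO DISK = N'\\backupshare\SalesDB\SalesDB_log.trn'
    WITH COMPRESSION;

-- On the new server: restore WITH NORECOVERY so subsequent log backups
-- can continue to be applied until the final cutover
RESTORE LOG [SalesDB]
    FROM DISK = N'\\backupshare\SalesDB\SalesDB_log.trn'
    WITH NORECOVERY;
```

At cutover time, the last restore is the tail log backup, applied WITH RECOVERY instead.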

Full Environment Upgrade Steps

  1. Stop all application access to the publisher SQL Server – Windows Firewall rules blocking traffic on port 1433 will do this while still allowing log reader agent access between the publisher, distributor, and subscriber
  2. Open Replication Monitor – verify the log reader agent has processed all log records from the publisher
  3. Take a tail log backup (WITH NORECOVERY) of the publisher database – this will take it offline and put it into Restoring… mode
  4. Copy the tail log backup to the new publisher server and restore WITH RECOVERY – this will bring the database online and will also remove all of the transactional replication configuration from it (DO NOT USE KEEP_REPLICATION here)
  5. Configure the new distributor and configure distribution for the new publisher server
  6. Configure all transactional publications/articles with a script specifying @allow_initialize_from_backup = N'true', @immediate_sync = N'true' – this could take some time to complete, but you can do steps 7-9 concurrently
  7. Check Replication Monitor for the old server's distribution agent – ensure all transactions have been delivered
  8. Take a tail log backup of the subscriber database
  9. Copy the tail log backup to the new subscriber server and restore WITH RECOVERY
  10. When step 6 is complete, query the syspublications table in the publisher database and the MSpublications table in the distribution database to validate that the min_autonosync_lsn values match for each publication
  11. Create the subscriptions with sp_addsubscription, specifying @sync_type = N'initialize from lsn' and @subscriptionlsn = <min_autonosync_lsn> based on the results of step 10
  12. Insert trace tokens in Replication Monitor for each publication and verify end-to-end delivery and timings
  13. Open the application up to access the new publisher server
  14. Process complete.
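The key calls in steps 6, 10, and 11 can be sketched as follows. The publication, subscriber, and database names and the LSN value are placeholders, and a real publication script will carry many more parameters than shown here:

```sql
-- Step 6 sketch: create the publication allowing initialization from backup/LSN
EXEC sp_addpublication
    @publication = N'SalesPub',
    @allow_initialize_from_backup = N'true',
    @immediate_sync = N'true';

-- Step 10 sketch: compare the LSNs on the publisher and the distributor
SELECT name, min_autonosync_lsn
FROM dbo.syspublications;                       -- in the publisher database

SELECT publication, min_autonosync_lsn
FROM distribution.dbo.MSpublications;           -- in the distribution database

-- Step 11 sketch: create the subscription starting at that LSN
EXEC sp_addsubscription
    @publication = N'SalesPub',
    @subscriber = N'NEWSUB01',
    @destination_db = N'SalesDB',
    @sync_type = N'initialize from lsn',
    @subscriptionlsn = 0x0000ABCD000001230001;  -- min_autonosync_lsn from step 10
```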

Publisher Only Migration

If only the publisher database is migrating, essentially the same steps listed above can be followed, with the only difference being that you need to remove the old subscriptions from the subscriber database after step 7. Depending on where the distribution database resides (usually it is on the publisher or the subscriber and not a separate instance entirely), the exact steps required may differ, but typically you can accomplish this using SSMS or by calling sp_subscription_cleanup in the subscriber database.
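The sp_subscription_cleanup call is run in the subscriber database and removes the old subscription metadata; the server, database, and publication names below are placeholders:

```sql
-- Run inside the subscriber database to drop the metadata for the
-- subscription that pointed at the OLD publisher
EXEC sp_subscription_cleanup
    @publisher = N'OLDPUB01',
    @publisher_db = N'SalesDB',
    @publication = N'SalesPub';
```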

Subscriber Only Migration

If only the subscriber database is migrating, things get a little bit trickier, but not by much. First, the publications have to be configured with @allow_initialize_from_backup = N'true' and @immediate_sync = N'true', which can be verified in the MSpublications table of the distribution database. If this is not configured properly, you can't initialize the subscription by LSN. From the subscriber, we need the last sync LSN for the subscription to use for initialize from LSN, which we can get after stopping the distribution agents, from either the distribution database or the subscriber database:

-- Inside Subscriber Database
SELECT s.publisher, s.publisher_db, s.publication, s.transaction_timestamp AS last_delivered_lsn
FROM MSreplication_subscriptions AS s
ORDER BY s.publication;

-- Inside Distribution Database
SELECT a.publisher_db, a.publication, a.subscriber_db, t.xact_seqno, t.[time], a.name AS AgentName, a.job_id
FROM distribution.dbo.MSdistribution_agents AS a
JOIN (
    SELECT agent_id, xact_seqno, runstatus, [time],
           ROW_NUMBER() OVER (PARTITION BY agent_id ORDER BY [time] DESC) AS rowid
    FROM distribution.dbo.MSdistribution_history WITH(NOLOCK)
) AS t ON a.id = t.agent_id
WHERE t.rowid = 1
  AND t.runstatus <> 0
ORDER BY a.publication;

Take the tail log backup of the subscriber, restore it WITH RECOVERY on the new machine, and then run the script to create the new subscriptions using initialize from LSN just as for a full migration, only using the LSN at which the old distribution agent for the subscription was when it was stopped/disabled.
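Stopping the old distribution agent before capturing its LSN can be done by disabling its SQL Agent job; the job name below is a hypothetical example of the auto-generated naming pattern, and the job_id from the distribution database query above can be used to find the real one:

```sql
-- Disable the old distribution agent's job so no further transactions are
-- delivered while we capture its last delivered xact_seqno
EXEC msdb.dbo.sp_update_job
    @job_name = N'OLDPUB01-SalesDB-SalesPub-OLDSUB01-1',  -- hypothetical agent job name
    @enabled = 0;
```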

Why Not Move the Distribution Database and Use KEEP_REPLICATION?

This is a really good question, and I figured someone would ask it in the comments, so I'll just address it as a part of the post. Why aren't we keeping the existing distribution database for a publisher migration and using KEEP_REPLICATION for the database during the restore? First, I have never seen this actually work for someone the way they might expect it to. Second, the server names are often being changed for the publisher/subscriber, so the distribution database information is not going to be correct for the new servers, and renaming the new servers during the cutover adds steps and creates more potential failure points that could create other issues. It is, however, supported for a log shipped secondary of the replicated database per Books Online:

“SQL Server does not support renaming computers that are involved in replication, except when you use log shipping with replication. The secondary computer in log shipping can be renamed if the primary computer is permanently lost. For more information, see Log Shipping and Replication (SQL Server).”

5 thoughts on "Upgrading/Migrating Large Replicated Databases Without Reinitializing"

  1. I've tried this method (full migration) on a couple of large replication environments so far, and I run into the issue that using 'initialize from lsn' sets the subscription sync_type = 'replication support only' going forward. This means no new articles can be added to the publication, and indeed you cannot re-initialize the subscription at all, ever, without dropping the whole subscription and adding it back with sync_type = 'automatic'.

    Even if you create a new snapshot, the subscription ignores it. Even if you tell the subscription to re-initialize from a snapshot, it is ignored.

    Do you encounter this problem when you do these migrations? Or am I doing something wrong?

    1. I haven't gone back to look at it since your first question, to be honest. I don't typically add articles to an existing publication or put all of the articles in a single large publication. I generally don't recommend more than 5-10 articles per publication, depending on the total number of articles, with logical grouping of the articles together based on business knowledge of what would be reused by multiple subscriptions.

      1. Wow! 5 – 10 articles per publication would really kill a lot of what I see. I have publications with hundreds of articles in them, and some over 1,000. Looks like if this is the only option for proceeding, we will need to just use an additional publication if we need to add new articles. Thanks for responding, I always learn something new when I read your posts.

        1. It's a matter of limiting the risk of impact when a problem happens to occur. If you have hundreds to thousands of articles in a single publication and an error occurs for just one article, that essentially halts all distribution for the remaining articles until that issue is resolved. It could be something really stupid like a delete that got @@ROWCOUNT = 0, but all progress is stopped by that one command raising an error. (This is why, for a lot of replication implementations, I will comment out the @@ROWCOUNT check for a delete: the end state is that the row doesn't exist, so unless you have triggers or other logic that needs to fire to archive that row or do something specific on the subscriber side, I don't need that to stop replication, and you could still do better handling of the error with custom coding in the delete procedure than making it stop replication.) If you break the articles up into different publications, your risk of impact is much smaller and constrained to the articles in the publication where an issue occurs. If you have to reinitialize a subscription, it is also a smaller amount being reinitialized.
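As an illustration of the @@ROWCOUNT change described above, here is a sketch of a replication delete procedure with the row-count check commented out. The procedure, table, and parameter names are hypothetical; the real auto-generated procedure name and signature depend on the article:

```sql
-- Hypothetical version of an auto-generated replication delete procedure
-- with the row-count error check commented out, per the approach above
CREATE PROCEDURE dbo.sp_MSdel_dboOrders
    @pkc1 int
AS
BEGIN
    DELETE FROM dbo.Orders
    WHERE OrderID = @pkc1;

    -- Commented out: without this check, a delete that affects no rows
    -- no longer raises an error and halts the distribution agent.
    -- IF @@ROWCOUNT = 0
    --     EXEC sp_MSreplraiserror @errorid = 20598;
END
```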
