This post idea was prompted by a discussion I had this week with Jonathan Kehayias about an environment that had multiple transactional replication publications defined with overlapping table articles.
In other words, a table was defined as an article in more than one publication.
While I can think of some cases where you would want to leverage different article options or filters, in this particular case the articles had no differences in how they were defined. I’ve seen this in other environments in the past – and as I recall it wasn’t a conscious decision, but rather a lack of coordination across application teams and projects.
For small databases with lower volumes of modifications, this overlap could likely go unnoticed. For larger tables with high amounts of data modifications, well, consider the following scenario:
· You have two transactional replication publications that each reference the same table as an article, with no other article properties differing between the two publications.
· Each publication maps to a single subscriber.
· Your table article settings for this scenario use the defaults: propagating INSERTs, UPDATEs, and DELETEs via the default stored procedure delivery method (sp_MSins_, sp_MSupd_, sp_MSdel_, and so on). (And while we are propagating changes made directly to the table, we’re not using stored procedure execution articles.)
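As a rough sketch of how that overlapping setup comes about (publication and table names here are hypothetical, and defaults are assumed for all other article properties), the two publications might have been created along these lines:

```sql
-- Hypothetical: two publications in the same publisher database,
-- each carrying the same table (dbo.charge) as an article.
EXEC sp_addpublication
    @publication = N'Pub_AppTeamA',
    @status      = N'active';
EXEC sp_addarticle
    @publication   = N'Pub_AppTeamA',
    @article       = N'charge',
    @source_owner  = N'dbo',
    @source_object = N'charge';

EXEC sp_addpublication
    @publication = N'Pub_AppTeamB',
    @status      = N'active';
EXEC sp_addarticle
    @publication   = N'Pub_AppTeamB',
    @article       = N'charge',
    @source_owner  = N'dbo',
    @source_object = N'charge';
```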
So let’s say we execute the following single-statement batch update against the redundantly published table (dbo.charge is used here as a stand-in name for that table). This is one statement that updates 3,120 rows:
UPDATE dbo.charge
SET charge_amt = charge_amt * .97
WHERE provider_no = 386;
If we ran sp_replcmds in the publisher database (I had the Log Reader Agent job stopped in order to step through the scenario), how many commands would you expect to see marked for replication?
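If you want to peek at the pending commands yourself, a minimal sketch (the publisher database name is hypothetical; the Log Reader Agent must be stopped, since only one session at a time can act as the log reader):

```sql
USE PublisherDB;  -- hypothetical publisher database name
GO
-- Return commands in the log that are marked for replication
-- but not yet processed by the log reader.
EXEC sp_replcmds @maxtrans = 10000;
-- Release the connection's hold on the article cache when done.
EXEC sp_replflush;
```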
The answer is 6,240: one command per row updated, multiplied by two for the two separate publications (and we’re still only in the publication database).
And as you may expect, those 6,240 commands move on to the distribution database (you can validate via sp_browsereplcmds or by querying MSrepl_commands).
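One way to see the multiplier at the distributor is to count pending commands per published table. A sketch, assuming the default distribution database name:

```sql
USE distribution;
GO
-- Count pending replication commands per publisher database and
-- source table; a redundantly published table shows roughly double
-- the command count of an equivalent single-publication article.
SELECT a.publisher_db,
       a.source_object,
       COUNT(*) AS pending_commands
FROM dbo.MSrepl_commands AS c
JOIN dbo.MSarticles AS a
    ON a.article_id = c.article_id
GROUP BY a.publisher_db, a.source_object;
```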
Now had you instead created just ONE publication with that article, subscribed to by the two different subscribers, you would see just 3,120 commands in the publication database for the original update, and 3,120 at the distributor as well, prior to multicasting the update to the two subscribers.
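That single-publication alternative is just the same article in one publication with two subscriptions added to it. A sketch with hypothetical publication, subscriber, and database names:

```sql
-- Hypothetical: one publication carrying the charge article,
-- delivered to two subscribers. Each command is stored once at the
-- distributor and sent to both subscribers from there.
EXEC sp_addsubscription
    @publication    = N'Pub_Shared',
    @subscriber     = N'SubscriberA',
    @destination_db = N'ReplTargetA';
EXEC sp_addsubscription
    @publication    = N'Pub_Shared',
    @subscriber     = N'SubscriberB',
    @destination_db = N'ReplTargetB';
```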
Coupled with the already “chatty” nature of transactional replication – you can imagine scenarios where performance rapidly degrades for large batch updates, particularly on already-constrained topologies.
One thought on “The Transactional Replication Multiplier Effect”
Adding a query to check for duplicate articles here (run in the distribution database):
SELECT publisher_db, source_object, COUNT(*) AS article_count
FROM dbo.MSarticles
GROUP BY publisher_db, source_object
HAVING COUNT(*) > 1;