It’s been a while since I wrote one of these, but I’m picking up the series again.
This is a question that came in through email from a prior student, which I’ll summarize as: with a distributed availability group, what are the semantics of log truncation?
As I’m not an AG expert (I just know enough to be dangerous ha-ha), I asked Jonathan to jump in, as he is most definitely an AG expert! I’ll use AG for availability group and DAG for distributed availability group in the explanation below. Whether the AGs in question have synchronous or asynchronous replicas is irrelevant for this discussion.
Log truncation (the act of marking zero or more VLFs as able to be reused) can only mark a VLF as reusable if nothing might need to use any of the log records in the VLF. An example of the many things that might need to use a log record is a transaction that hasn’t yet committed (because it might roll back). If the log record were overwritten because its VLF had been reused, the transaction would not be able to roll back.
With a simple AG, with a primary replica and secondary replica, a VLF on the primary replica can only be marked for reuse once that VLF is hardened (written to the log drive) on the primary replica and hardened on the secondary replica, and then backed up on one of the replicas. (Note that hardening on the secondary replica does NOT mean that the log records have to be replayed, only that they’ve been written to disk – this is a common misconception with both AGs and database mirroring.)
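To make that rule concrete, here’s a tiny Python sketch. It’s a toy model and not how SQL Server implements anything: the `Replica` class, the `reusable_up_to` function, and the use of plain integers as LSNs are all my own assumptions for illustration, and it ignores every other reason a VLF might be held (such as active transactions). The point is just that the slowest of the three conditions decides how much log can be released.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    hardened_lsn: int  # highest LSN written to this replica's log drive (hypothetical integer LSN)


def reusable_up_to(primary: Replica, secondary: Replica, backed_up_lsn: int) -> int:
    """Highest LSN that log truncation could release on the primary in this toy model.

    A log record is only releasable once it is hardened on both replicas
    and covered by a log backup, so the slowest of the three wins.
    Replay (redo) progress on the secondary is deliberately not an input.
    """
    return min(primary.hardened_lsn, secondary.hardened_lsn, backed_up_lsn)


primary = Replica("primary", hardened_lsn=500)
secondary = Replica("secondary", hardened_lsn=420)

# The secondary has only hardened up to LSN 420, so nothing past 420 can be
# released, even though the log backup covers up to LSN 480.
limit = reusable_up_to(primary, secondary, backed_up_lsn=480)
print(limit)
```

Notice that there’s no “redo LSN” anywhere in the model, which matches the note above: only hardening matters, not replay.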
With a DAG, it’s a bit more complicated.
Imagine we have a DAG from AG1 to AG2. AG1 has a primary replica and secondary replica. AG2 has a primary replica and secondary replica. The primary replica of AG2 is essentially another secondary replica of AG1, and functions as a log forwarder to its own secondary replica. You can think of a DAG as an AG of AGs.
So log flows like this:
- AG1 primary replica -> AG1 secondary replica
- AG1 primary replica -> AG2 primary replica -> AG2 secondary replica
From my simple AG example above, you would then think that a VLF on the AG1 primary replica can be marked for reuse once that VLF is hardened on the AG1 primary replica, hardened on the AG1 secondary replica, hardened on the AG2 primary replica, and then backed up on any AG1 replica.
But that is not the case. There’s an extra twist.
A VLF on the AG2 primary replica cannot be marked for reuse until that VLF has been hardened on the AG2 secondary replica. If that VLF cannot be marked for reuse on the AG2 primary replica, then it cannot be marked for reuse on the AG1 primary replica either. And so the log on the AG1 primary replica may have to grow.
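Here’s the same idea as a small Python sketch of the AG1-to-AG2 example. Again, this is a toy model under my own assumptions: the `Replica` class, the function names, and the integer LSNs are hypothetical and not anything SQL Server exposes. It compares the “naive” expectation with the rule described above, where the AG2 primary replica can only report progress as far as its own secondary replica has hardened.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    hardened_lsn: int  # highest LSN written to this replica's log drive (hypothetical integer LSN)


def forwarder_limit(ag2_primary: Replica, ag2_secondary: Replica) -> int:
    """How far the AG2 primary (the log forwarder) can release its own log.

    It cannot release anything its own secondary has not yet hardened.
    """
    return min(ag2_primary.hardened_lsn, ag2_secondary.hardened_lsn)


def naive_limit(ag1_primary, ag1_secondary, ag2_primary, backed_up_lsn):
    """The incorrect expectation: treat the AG2 primary as just another secondary."""
    return min(ag1_primary.hardened_lsn, ag1_secondary.hardened_lsn,
               ag2_primary.hardened_lsn, backed_up_lsn)


def dag_limit(ag1_primary, ag1_secondary, ag2_primary, ag2_secondary, backed_up_lsn):
    """The rule from the text: the AG1 primary is also held by the forwarder's limit."""
    return min(ag1_primary.hardened_lsn, ag1_secondary.hardened_lsn,
               forwarder_limit(ag2_primary, ag2_secondary), backed_up_lsn)


ag1_p = Replica("AG1 primary", 1000)
ag1_s = Replica("AG1 secondary", 990)
ag2_p = Replica("AG2 primary", 980)
ag2_s = Replica("AG2 secondary", 600)  # lagging, e.g. a slow or disconnected replica
backed_up = 970                        # log backups in AG1 cover up to this LSN

naive = naive_limit(ag1_p, ag1_s, ag2_p, backed_up)
actual = dag_limit(ag1_p, ag1_s, ag2_p, ag2_s, backed_up)
print(naive, actual)
```

In this made-up scenario the naive rule would let the AG1 primary release log up to LSN 970, but the lagging AG2 secondary holds it back at LSN 600, so everything between those two points has to stay in the AG1 primary’s log.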
Summary: with a DAG, a VLF in the AG1 primary replica cannot be marked for reuse until it has been hardened on its own secondary replicas and all other secondary replicas in the DAG topology (and then backed up somewhere in AG1). It’s just an extension of the regular AG log-clearing semantics.