
Monday, March 17, 2008


Standby Continuous Replication (SCR): Replay and Truncation Lag

Posted by Bharat Suneja at 7:33 AM
Standby Continuous Replication (SCR) is a new High Availability feature in Exchange Server 2007 SP1. It uses Continuous Replication (also used by LCR and CCR) to replicate Storage Groups from a clustered or non-clustered mailbox server, known as a SCR source, to a clustered or non-clustered mailbox server, known as a SCR target.

SCR is managed using the Exchange Management Shell - the Exchange Management Console (EMC) has no features to configure or manage it.

Unlike LCR and CCR, which are designed to have a single copy of a Storage Group (consisting of an Exchange Store EDB plus its transaction logs and system files), SCR supports many-to-one and one-to-many "replication relationships". (A SCR relationship or partnership is not a formally defined term; it is used here simply to mean SCR replication of a particular Storage Group from a particular SCR source server to a particular SCR target server.)

A Storage Group from one SCR source can be replicated to multiple SCR target servers, and Storage Groups from one or more SCR source mailbox servers can be replicated to a single SCR target mailbox server.

By default, the Replication Service delays replaying 50 transaction logs to the SCR replica Database. Additionally, you can configure the following parameters to control how SCR replicas behave:
ReplayLagTime: specifies how long the Replication Service waits before replaying replicated transaction logs to the replica Database (EDB) on the target. Default: 1 day.
TruncationLagTime: specifies a lag time for truncating log files on that replica. Provided the other requirements for log file truncation on the SCR replica are met, log files are not truncated until ReplayLagTime + TruncationLagTime has elapsed. Default: 0.
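To make the ReplayLagTime + TruncationLagTime arithmetic concrete, here is a minimal sketch in plain PowerShell. The function name Get-EarliestTruncationTime is hypothetical (it is not an Exchange cmdlet), and it assumes the lag is measured from the time the log arrives on the SCR target, which this post does not state. It also ignores the other truncation requirements mentioned above.

```powershell
# Hypothetical helper, not an Exchange cmdlet: earliest time a log on the
# SCR target could be truncated, based only on the two lag settings.
function Get-EarliestTruncationTime {
    param(
        [DateTime]$LogCopiedAt,        # assumed reference point: when the log reached the target
        [TimeSpan]$ReplayLagTime,      # e.g. 1.00:00:00 (the default)
        [TimeSpan]$TruncationLagTime   # e.g. 00:00:00 (the default)
    )
    # Truncation waits until ReplayLagTime + TruncationLagTime has elapsed.
    return $LogCopiedAt + $ReplayLagTime + $TruncationLagTime
}

# A log copied on March 17, 2008 at 7:33 AM, with a 1-day replay lag
# and a 2-day truncation lag.
$copied = [DateTime]"2008-03-17 07:33:00"
Get-EarliestTruncationTime $copied ([TimeSpan]"1.00:00:00") ([TimeSpan]"2.00:00:00")
```

With these values the result is three days after the copy time. With the defaults (1 day and 0), it would be one day after.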

Why do I need the delay?

Replay lag gives you the protection of having a copy of your database from back in time. This back-in-time copy can be used to recover from logical corruption, pilot errors etc.

Additionally, if there is no delay, in the case of a lossy failover of the SCR source to a LCR or CCR replica, the (new source) Database will be behind its SCR target(s), requiring reseeding. Not something one would want to do for large Databases over WAN links (or even locally within the same datacenter). Delaying the last 50 transaction logs from being replayed to the SCR target avoids the need to reseed.

However, a large number of transaction logs not replayed to the Database means increased storage requirements for the SCR target, and also an increase in the time it takes to activate it in case of failure of the SCR source. Before it can be brought online, all the logs will need to be replayed.

To avoid this, you can set the ReplayLagTime to 0 (from the default of 1 day). Note, the replay will still lag behind by 50 transaction logs - a hard-coded limit enforced by SCR that cannot be changed. The TruncationLagTime can be set higher, so logs are replayed but not truncated. You can then take VSS snapshots of the target for the point-in-time copies.
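The interaction between the two replay limits can be sketched as a small model in plain PowerShell. This is an illustration only, not how the Replication Service is implemented: the function name Test-LogReplayable, the use of log generation numbers, and the reading of "50 logs behind" as "at least 50 newer logs exist on the target" are all assumptions.

```powershell
# Hypothetical model, not Exchange code: may a copied log be replayed yet?
function Test-LogReplayable {
    param(
        [int]$LogGeneration,       # generation number of the log in question
        [int]$NewestGeneration,    # newest log generation copied to the target
        [TimeSpan]$LogAge,         # time since the log was copied (assumed reference point)
        [TimeSpan]$ReplayLagTime,  # configured replay lag
        [int]$LogLag = 50          # fixed 50-log replay lag described in the post
    )
    $logsBehind   = $NewestGeneration - $LogGeneration
    $enoughLogs   = $logsBehind -ge $LogLag
    $enoughTime   = $LogAge -ge $ReplayLagTime
    return ($enoughLogs -and $enoughTime)
}

# With ReplayLagTime set to 0, only the 50-log rule holds replay back.
Test-LogReplayable 100 149 ([TimeSpan]"00:05:00") ([TimeSpan]"00:00:00")   # False: 49 newer logs
Test-LogReplayable 100 150 ([TimeSpan]"00:05:00") ([TimeSpan]"00:00:00")   # True: 50 newer logs
```

In this model, setting ReplayLagTime to 0 removes the time condition but leaves the log-count condition in place, which matches the behavior described above.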

Once set up using the Enable-StorageGroupCopy cmdlet, the ReplayLagTime and TruncationLagTime cannot be changed without disabling and re-enabling that SCR relationship for the Storage Group.

How can I see ReplayLagTime and TruncationLagTime?

The following command shows the SCR targets a Storage Group is being replicated to:

Get-StorageGroup "SG Name" | fl

However, neither the above command nor Get-StorageGroupCopyStatus shows the lag times.

With the former (Get-StorageGroup), the SCR targets are returned as an array in the StandbyMachines property, and the default output displays only the name of each SCR target.

To see the lag times:

$sg = Get-StorageGroup "MyServer\MyStorageGroupName"
$sg.StandbyMachines

Here's what it looks like:

Figure 1: Displaying the Replay and Truncation lag time

Can I change ReplayLagTime and TruncationLagTime without reseeding the Database?

Yes. You need to disable replication and re-enable it to add or modify the lag times:

Disable-StorageGroupCopy "Storage Group Name" -StandbyMachine "SCR Target Server"

When you disable SCR, a warning tells you to manually delete all files in the replica folder on the SCR target. Skip that step. Reseeding is not required if you do not delete the files:

WARNING: Storage group "DFMAILMAN.e12labs.com\dfmailman-sg1" has standby continuous replication (SCR) disabled. Manually delete all SCR target files from "C:\Exchange Server\Mailbox\First Storage Group" and "C:\Exchange Server\Mailbox\First Storage Group\Mailbox Database.edb" on server "mirror".

Now, let's enable SCR with the replay and truncation lag times:

Enable-StorageGroupCopy "Storage Group Name" -StandbyMachine "SCR Target Server" -ReplayLagTime 1.00:00:00 -TruncationLagTime 2.00:00:00
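Because the lag times cannot be changed later without disabling and re-enabling SCR, it may be worth sanity-checking the values before running the command above. The following is a minimal sketch in plain PowerShell. The function name ConvertTo-ScrLagTime is hypothetical, and the 7-day upper limit is an assumption based on my recollection of the cmdlet documentation, so verify it against the Enable-StorageGroupCopy help for your build.

```powershell
# Hypothetical helper, not an Exchange cmdlet: parse a d.hh:mm:ss string and
# reject values outside 0 to 7 days (the 7-day ceiling is an assumption).
function ConvertTo-ScrLagTime {
    param([string]$Value)
    $lag = [TimeSpan]::Parse($Value)
    $max = New-TimeSpan -Days 7
    if (($lag -lt [TimeSpan]::Zero) -or ($lag -gt $max)) {
        throw "Lag time '$Value' is outside the range 0 to 7 days."
    }
    return $lag
}

$replayLag     = ConvertTo-ScrLagTime "1.00:00:00"   # 1 day
$truncationLag = ConvertTo-ScrLagTime "2.00:00:00"   # 2 days

# In the Exchange Management Shell, the checked values could then be passed on:
# Enable-StorageGroupCopy "Storage Group Name" -StandbyMachine "SCR Target Server" -ReplayLagTime $replayLag -TruncationLagTime $truncationLag
```

A value such as "8.00:00:00" makes the helper throw before anything is sent to Exchange.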

Once replication is enabled again, make sure to check the replication status using:

Get-StorageGroupCopyStatus "SG Name" -StandbyMachine "SCR Target Server"



October 21, 2008 5:14 PM
Anonymous Anonymous said...

So here's what I don't get about 50 log files being delayed to prevent the need to reseed in case of a lossy failover. Even if you do have the 50 log file delay, the log files are still there to be played and will still eventually be played. So I don't quite get why delaying 50 log files prevents the need to reseed.

October 28, 2008 4:30 AM
Anonymous Anonymous said...

We currently have 2 Exchange 2007 servers, both running hub transport, client access and mailbox roles. Half the user accounts are on one server, the other half on the other, and we would like to configure redundancy so that the log files from 1 server get copied to a shared space on another server, so they can be replayed if necessary. Does SCR allow me to do this??

October 28, 2008 6:47 AM
Blogger Bharat Suneja said...

@Anonymous Oct. 21: Generally your CCR replica will not be behind your SCR replica. Failovers will not occur if there are 50 log files missing from the CCR replica. This is controlled by the AutoDatabaseMountDial setting in the Mailbox server's properties. When set to BestAvailability, it tolerates a maximum of 6 missing log files for a failover to occur.

Take a look at the following doc for more info about AutoDatabaseMountDial:
How to Tune Failover and Mount Settings for Cluster Continuous Replication

@Anonymous Oct.28: Yes, with SCR you can use 2 Mailbox servers to provide redundancy to each other as you described. However, unlike clustering (Cluster Continuous Replication and Single Copy Clusters), there are no automated failovers from the SCR source to the SCR target in event of a failure. It will help to be familiar with (and test) the recovery paths available.

More info about activating a SCR target:
Activating Standby Continuous Replication Targets

April 7, 2009 7:21 PM
Blogger Chad O. said...

Is there a general rule of thumb for how long it will take to playback the logs on the SCR target server in the event that the source server failed?

October 22, 2009 12:27 PM
Blogger Chris said...

Excellent post on how to change a setting that MS made harder to change.

I made a post of the full SCR seeding and failover on my blog:
Part One
Part Two

