CCR Over WAN: Failover and FSW questions answered

by Bharat Suneja

Exchange Server 2007’s Cluster Continuous Replication (CCR) feature provides a way to set up geographically dispersed clusters to protect against data center failure (aka “site failure”). The documentation provides plenty of detail on setting up CCR clusters in a single data center, where both cluster nodes and the computer hosting the File Share Witness (FSW) are in the same location. Documentation on setting this up across data centers, however, has been skimpy or even non-existent.

Matt Richoux’ post on the Exchange team blog provides more detail on such topologies, placement of the File Share Witness, failover scenarios, and related issues. Read the post, titled “Placement of the File Share Witness (FSW) on a Geographically Dispersed CCR Cluster”.

In a nutshell:
1) To facilitate CCR nodes across data centers, use a CNAME record to refer to the FSW when configuring it.
2) Failover to the CCR node in the remote data center is not automatic. However, as Matt points out, the FSW can be placed in a third data center to achieve automatic failover.
3) Be aware of the split-brain syndrome that may occur if the first (primary) data center comes back up with the formerly active node and the File Share Witness set to start up automatically.
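Why FSW placement decides whether failover can be automatic comes down to vote counting: the two nodes and the FSW each hold one vote, and the cluster keeps running only where a majority of the three votes is reachable. The sketch below is a simplified model of that rule, not Exchange or cluster code; the names `node_a`, `node_b`, `fsw`, `dc1`, `dc2`, and `dc3` are hypothetical, and real clusters involve network and timing details this ignores.

```python
# Simplified majority-quorum model for a two-node cluster plus a File Share Witness.
# All voter and site names below are hypothetical placeholders.

VOTERS = {"node_a", "node_b", "fsw"}


def has_quorum(reachable_voters, all_voters):
    """Return True if the reachable voters form a strict majority of all voters."""
    reachable = set(reachable_voters) & set(all_voters)
    return len(reachable) * 2 > len(set(all_voters))


def surviving_voters(placement, failed_site):
    """Return the voters that are not located in the failed site."""
    return {voter for voter, site in placement.items() if site != failed_site}


# FSW in the same data center as the active node (dc1).
fsw_in_primary = {"node_a": "dc1", "node_b": "dc2", "fsw": "dc1"}

# FSW in a third data center (dc3).
fsw_in_third = {"node_a": "dc1", "node_b": "dc2", "fsw": "dc3"}

# Primary data center (dc1) fails in both layouts.
print(has_quorum(surviving_voters(fsw_in_primary, "dc1"), VOTERS))  # False
print(has_quorum(surviving_voters(fsw_in_third, "dc1"), VOTERS))    # True
```

With the FSW in the primary data center, the remote node is left with one vote out of three and cannot take over without manual intervention. With the FSW in a third site, the remote node and the FSW together hold two of three votes, which is what makes automatic failover possible.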

A frequent question: is CCR a good solution for geo-dispersed clusters, particularly given the manual steps required to fail over? It’s too early to say, given that Service Pack 1 is bringing us Standby Continuous Replication (SCR), which is designed to work across data centers. However, in a lot of cases, automatic failover between data centers, generally located at opposite ends of a WAN link, is not desirable. You need an administrator to judge whether an entire data center or site has failed, and whether a failover to another data center should be performed.

As Matt notes, the manual failover steps outlined in his post can easily be scripted.

Another limitation of such deployments, at least until SCR arrives in SP1, is that CCR clusters are limited to two nodes. It’s not possible to have two nodes in the primary data center and replicate to a third node in a remote data center. Such a topology would provide the ability to fail over locally first in case of a single node failure, and to fail over to the remote data center in case of a data center failure.

Exchange Server 2007’s continuous replication features answer some of the questions users ask most frequently in different forums: can I replicate an Exchange server or its Stores to another server (CCR does this), to another location (CCR and SCR), or to another disk (volume) on the same server (LCR)? These features are some of the more important reasons to consider upgrading.
