Discussions specific to the VMware vSphere hypervisor
mistersparky
Novice
Posts: 3
Liked: never
Joined: Dec 25, 2013 7:17 pm
Full Name: Mark Champness
Contact:

Details on testing and performing failover

Post by mistersparky » Dec 25, 2013 7:23 pm

Hello everyone!

We are currently running a production network of around a dozen VMs on ESXi 5.1 hosts in one data centre. VMs include DCs, Exchange and SQL. I have just got a replication job going for all these VMs to a secondary/DR data centre, also on ESXi 5.1 hosts, using Veeam 7.0 (which is a great product, btw!).

Now, I have been asked to do two things: First of all, test the replication process (without affecting production), and second of all, switch over to these replicas.

On the testing process, there seem to be a number of ways of doing this. Ideally, I would like to test the replicas myself first, then make them available to more technical colleagues for further testing. What would be the best approach for this?

On the switchover front, I am even less clear. I understand the steps required for failing over, making the failover permanent, failing back, undoing the failover and so on. However, I don’t understand the actual mechanics of this. Is the replication process even designed for this, or is it more suited to situations where one has to fail over, such as power/connectivity issues, viruses etc.? I also understand that I can choose to fail over to a non-latest replica. However, if I fail over to the most recent replica, will that mean one further replication pass before failover, or does the VM just fail over to the state it was in during the last replication run?

My main concern is for the SQL and Exchange servers - traffic to and from such servers is constant, and I don’t want to be in a situation where data is lost on servers during either the failover or failback process.

Thanks in advance for any help with this!

Vitaliy S.
Product Manager
Posts: 22697
Liked: 1498 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Details on testing and performing failover

Post by Vitaliy S. » Dec 26, 2013 12:21 pm

Hello,
mistersparky wrote: On the testing process, there seem to be a number of ways of doing this. Ideally, I would like to test the replicas myself first, then make them available to more technical colleagues for further testing. What would be the best approach for this?
From your description, it sounds like you want to verify your replicas rather than perform an actual failover. If the main goal is to check that the replicated VMs can be used in the failover process, then I would suggest using SureBackup jobs/On-Demand Sandbox. Here is an existing topic on this > Replication Design VLANS
mistersparky wrote: However, I don’t understand the actual mechanics of this. Is the replication process even designed for this, or is it more suited to situations where one has to fail over, such as power/connectivity issues, viruses etc.? I also understand that I can choose to fail over to a non-latest replica.
Yes, it is. We have SureBackup jobs and the failover engine to test DR VMs and use them in production.
mistersparky wrote: However, if I fail over to the most recent replica, will that mean one further replication pass before failover, or does the VM just fail over to the state it was in during the last replication run?
The VM fails over to the selected snapshot; no additional replication job pass will be performed.
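To make the consequence concrete: since no extra replication pass runs, anything written on production after the chosen snapshot is not present on the replica. The following is a minimal Python sketch of that arithmetic only. The function names, timestamps, and replication schedule are hypothetical illustrations, not part of Veeam or its API.

```python
from datetime import datetime, timedelta

def pick_latest_restore_point(restore_points, failover_time):
    """Return the most recent restore point taken at or before failover_time."""
    candidates = sorted(p for p in restore_points if p <= failover_time)
    if not candidates:
        raise ValueError("no restore point exists before the failover time")
    return candidates[-1]

def data_loss_window(restore_point, failover_time):
    """Production changes made in this interval are absent from the replica."""
    return failover_time - restore_point

# Hypothetical schedule: replication ran at 06:00, 09:00 and 12:00.
restore_points = [
    datetime(2013, 12, 27, 6, 0),
    datetime(2013, 12, 27, 9, 0),
    datetime(2013, 12, 27, 12, 0),
]
failover_time = datetime(2013, 12, 27, 14, 30)

chosen = pick_latest_restore_point(restore_points, failover_time)
window = data_loss_window(chosen, failover_time)
print(chosen, window)
```

With this assumed schedule, failing over at 14:30 to the latest restore point leaves two and a half hours of production changes that are not on the replica.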
mistersparky wrote: My main concern is for the SQL and Exchange servers - traffic to and from such servers is constant, and I don’t want to be in a situation where data is lost on servers during either the failover or failback process.
There will still be downtime during the failback process, because once a failover has been performed, all changes need to be transferred back to the production VMs.

Thank you!


Re: Details on testing and performing failover

Post by mistersparky » Dec 27, 2013 9:03 am

Many thanks for the reply.
Vitaliy S. wrote: Hello,
From your description, it sounds like you want to verify your replicas rather than perform an actual failover. If the main goal is to check that the replicated VMs can be used in the failover process, then I would suggest using SureBackup jobs/On-Demand Sandbox. Here is an existing topic on this > Replication Design VLANS
I would agree that SureBackup sounds like a much more logical choice. However, the powers that be are very specific in their requirements for a full switchover/failover.

Would I be right in assuming I have one of three options:

- Attempt to make the argument that we should only attempt a failover in a DR situation and that testing of replications should be done via other means (SureBackup etc).
- Gracefully power down production VMs and perform an extra replication before failover to ensure that DR matches production before failover.
- Propose going down a different route and using Quick Migration (http://helpcenter.veeam.com/backup/70/v ... ation.html) to just move the VMs to DR datastores.
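The second option above (power down, run one more replication, then fail over) can be sketched as an ordering of steps. This is a toy Python model under stated assumptions: the `VM` class and `planned_switchover` function are hypothetical and do not represent Veeam's implementation or API. It only shows why shutting down before the final replication pass leaves no gap between production and DR.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    """Toy VM: a power state plus an ordered log of writes."""
    powered_on: bool = True
    writes: list = field(default_factory=list)

    def write(self, item):
        if not self.powered_on:
            raise RuntimeError("VM is powered off")
        self.writes.append(item)

def planned_switchover(production):
    """Shut down production, replicate once more, then start the replica."""
    production.powered_on = False                 # 1. graceful shutdown: no further writes
    replica = VM(powered_on=False,
                 writes=list(production.writes))  # 2. final replication pass
    replica.powered_on = True                     # 3. fail over to the replica
    return replica

production = VM()
production.write("mail-1")
production.write("sql-tx-42")

replica = planned_switchover(production)
print(replica.writes == production.writes)
```

Because production can no longer accept writes after step 1, the copy taken in step 2 is complete, at the cost of downtime for the duration of steps 1 to 3.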


Re: Details on testing and performing failover

Post by Vitaliy S. » Dec 27, 2013 10:03 am

Hi Mark,

Based on your initial post, I don't think number 3 is an option: you will have downtime, and the VMs will be migrated to the DR host, which you would never do in a real disaster (there will be NO source VMs to migrate).

I suggest doing two things:
1. Test the VM failover operation to verify that all traffic routing is configured properly.
2. Run SureBackup jobs/On-Demand Sandbox to verify application consistency/recoverability.

Hope this helps!


Re: Details on testing and performing failover

Post by mistersparky » Dec 27, 2013 3:56 pm

That sounds very logical, thanks!

One final question: I notice the replication failback process synchronizes any changes that have occurred on the replica back to the production host. If production and the DR replica were in a similar state before failover was performed, the end result is pretty clear. However, what would happen on a granular application level if failover is performed to a replica 2 or 3 hours old, with failback occurring a few hours after that? Are the changes that occurred on the DR replica “merged” with the changes that occurred on production between the last replication cycle and the failover? Or is the state of the DR replica just copied back to the production VM, losing any changes that occurred on production between the last replication cycle and the failover?


Re: Details on testing and performing failover

Post by Vitaliy S. » Dec 27, 2013 4:08 pm

The second one. The DR VM image (virtual disk blocks) is copied back to production, overwriting the data on the original VM.
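A toy Python model of that outcome, treating each virtual disk as a list of blocks. The block labels and the `failback` helper are hypothetical and only illustrate the "overwrite, no merge" behaviour described here; they say nothing about how Veeam actually transfers the blocks.

```python
def replicate(source_disk):
    """Replication run: the replica gets a copy of the source blocks."""
    return list(source_disk)

def failback(production_disk, replica_disk):
    """Overwrite production blocks with the replica's blocks (no merge)."""
    production_disk[:] = replica_disk

production = ["A0", "B0", "C0"]
replica = replicate(production)   # last replication cycle

production[1] = "B1-prod"         # written on production AFTER that cycle
# Failover happens here: the replica becomes the active VM.
replica[2] = "C1-dr"              # written on the replica while failed over

failback(production, replica)
print(production)
```

After failback, production holds the replica's state: the change made on DR (`C1-dr`) survives, while the production-only change made between the last replication cycle and the failover (`B1-prod`) is gone.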
