Failover on DR site

Post by **yan972** » Jun 26, 2013 4:36 pm

Hi,

We plan to fail over to our DR site because a major maintenance operation is scheduled on the PROD site, i.e. a full erase of the SAN and a reinstall of the ESX hosts. The failover/failback tests look good, but can I let the users work on the failover replicas for a week?
Is there a rule of thumb for how much space the VMs will use on the DR site (.vvram file, snapshot file, ...)?
Tia,
Yan

Post by **Vitaliy S.** » Jun 26, 2013 10:26 pm

Hello Yan,

Yes, you can let your users work on the failed-over replicas for a week; just make sure you have enough space on the destination datastores to hold the changes that will be written to the snapshots. How many VMs are you going to fail over, and what is the current free space on the datastores holding the VM replicas?

Thank you!

Post by **yan972** » Jun 27, 2013 3:54 pm

Hello Vitaliy,

I'll have 13 VMs to fail over. On the DR site, 2 hosts have 149GB free space out of 681GB, and the 3rd one has 169GB free out of 556GB.
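
As a rough aid to the space question, the figures above can be turned into an average write budget per VM. This is only a back-of-the-envelope sketch: it assumes each of the two hosts has 149GB free, pools the free space of all three hosts, spreads the 13 VMs evenly, and keeps a hypothetical 20% reserve. Real snapshot growth depends on each VM's change rate and on which datastore it lives on.

```python
# Figures taken from the post above; the even-split model and the
# reserve fraction are illustrative assumptions, not Veeam guidance.
FREE_GB_PER_HOST = [149, 149, 169]  # assumes 149GB free on EACH of the two hosts
VM_COUNT = 13
DAYS_ON_REPLICAS = 7


def daily_change_budget_gb(free_gb, vm_count, days, reserve_fraction=0.2):
    """Average GB per day each VM may write before the snapshots
    would consume the usable free space.

    reserve_fraction is a hypothetical safety margin left untouched.
    """
    usable = sum(free_gb) * (1.0 - reserve_fraction)
    return usable / (vm_count * days)


if __name__ == "__main__":
    total = sum(FREE_GB_PER_HOST)
    budget = daily_change_budget_gb(FREE_GB_PER_HOST, VM_COUNT, DAYS_ON_REPLICAS)
    print(f"Total free space: {total} GB")
    print(f"Budget per VM per day (20% reserve): {budget:.2f} GB")
```

If the measured daily change of any VM is well above this average, that datastore deserves a closer look before the failover.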

Jun 27, 2013 7:33 pm

Hi.

I have some tips you can consider, depending on your exact situation, needs and preferences:

* The failover process is quite quick, because the DR replica is ready and you need to transfer only a few changes.

Since the planned failover will last about 1 week, you can consider the following:
1. Fail over to the DR site.
2. Complete the failover (commit snapshots at the DR site and stop replication).
3. Do whatever you need at the primary site.
4. When the primary site comes back online, configure a reverse replication from the DR site (now still acting as production) to the primary site, using the former production VMs as targets.
5. Fail over again in the reverse direction, from the DR site to the primary site.
6. Reconfigure replication again from primary to DR.

The above seems (and really is) a bit more complicated than a regular failover/failback process, but it can reduce downtime during failback compared to the standard procedure.
I had a similar situation a few weeks ago, but on a smaller scale: only 2 VMs that needed to fail over and fail back for planned maintenance.
The failover process took less than 1 hour.
However, the failback 2 days later took much longer, about 3 hours, because of the reverse sync, and the VMs were down during that time.

* If the DR site is remote and connected by a slow WAN link, you can consider relocating the DR storage and hosts to the primary site beforehand, so that the DR equipment is connected to the internal LAN during the planned maintenance.
This might streamline both the failover and failback procedures, and also avoid slow WAN traffic from clients to servers, if applicable.
So if planned downtime is not a big issue and you have a fast link between sites, you can follow the regular process of temporary failover and then failback.
If several hours of downtime are a problem (at the time you plan to return from DR to primary), you can consider the alternative path of:
Complete failover + maintenance + reverse replication + another complete failover in the reverse direction.
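
The choice between the two paths can be written down as a simple rule. The sketch below is illustrative only: the function name and both inputs are hypothetical, and the reverse sync duration is an estimate you would have to supply yourself (about 3 hours for 2 VMs in the case described above).

```python
def choose_failback_path(reverse_sync_hours, max_downtime_hours):
    """Pick one of the two procedures described in this post.

    reverse_sync_hours: your own estimate of how long the failback
        reverse sync keeps the VMs down.
    max_downtime_hours: the downtime you can accept when returning
        from DR to primary.
    """
    if reverse_sync_hours <= max_downtime_hours:
        # Downtime is tolerable: the regular process is simpler.
        return "temporary failover, then failback"
    # Downtime is a problem: make the DR site production for the week.
    return "complete failover, reverse replication, second complete failover"


if __name__ == "__main__":
    # Example: a 3-hour reverse sync against a 1-hour downtime window.
    print(choose_failback_path(reverse_sync_hours=3, max_downtime_hours=1))
```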

Yizhar

Post by **veremin** » Jun 28, 2013 8:40 am

For a full week, it might indeed be better to fail over permanently to the DR site and switch back to production later, tuning the replication job accordingly. Not only does this scenario reduce downtime when the final switch takes place, it is also the better approach performance-wise, since the performance of a VM running on a week's worth of snapshot data might degrade.

Thanks.

Post by **yan972** » Jul 02, 2013 4:42 pm

Hello,

Thanks for your answers and tips. I'll follow this scenario, which also avoids running the VMs on snapshots that grow too big.

Yan

Post by **yan972** » Jul 13, 2013 8:09 pm

Hi guys,

The failover to the DR site took place last weekend and worked well, thanks to your roadmap: the VMs on the DR site were "failed over permanently" and so became production VMs, and I set up replication jobs to prepare the way back to the PROD site... but in the end I left Veeam B&R aside and did a "manual" VM migration from DR to PROD because of the VM names.
Let me explain: on the PROD site I have, say, a VM called "SAPBUR1P1", and the replication job names its replica on the DR site "SAPBUR1P1_replica" by default. This VM became the "production" VM on the DR site. No problem. But if I replicate this VM again from DR to PROD, the replica would by default be "SAPBUR1P1_replica_replica", or, if I drop the suffix in the replication job, "SAPBUR1P1_replica". I wanted to go back to my original "SAPBUR1P1" name but didn't find a way to do it in the replication job.

So, in the next replication jobs, which I have to recreate from the PROD to the DR site, I plan to specify no suffix for the replicas to avoid this problem. How did you solve this situation?

TIA
Yan

Jul 14, 2013 4:29 pm

Hi.

When you reconfigure the reverse replication from DR to Prod site, you can (and should) use the existing VMs and use the feature:
low bandwidth - allow replica seeding.
Then you map the target replica to existing VM.

No need to rename them - they already have the right names as before.
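
The naming behavior discussed in this exchange can be modeled in a few lines. This is only a toy model of the names as described in the thread, not Veeam's API: both function names and the `mapped_vm` parameter are hypothetical.

```python
def default_replica_name(source_name, suffix="_replica"):
    """Name a job would give a new replica, per the thread's description."""
    return source_name + suffix


def reverse_target_name(dr_name, mapped_vm=None, suffix="_replica"):
    """Name of the target VM when replicating back from DR to PROD.

    mapped_vm models mapping the replica to an existing VM: the target
    keeps the name it already has, so no suffix is applied.
    """
    if mapped_vm is not None:
        return mapped_vm
    return default_replica_name(dr_name, suffix)


if __name__ == "__main__":
    dr_vm = default_replica_name("SAPBUR1P1")
    print(dr_vm)                                            # new replica on DR
    print(reverse_target_name(dr_vm))                       # default suffix again
    print(reverse_target_name(dr_vm, suffix=""))            # suffix left empty
    print(reverse_target_name(dr_vm, mapped_vm="SAPBUR1P1"))  # mapped to old VM
```

Only the mapped case returns to the original name, which is why mapping to the existing VM removes the need to rename anything.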

Yizhar

Post by **yan972** » Jul 15, 2013 3:02 am

Hi Yizhar,

Thanks for the fast answer. Sure, I saw the checkboxes in the replication jobs but never looked into what they do. I took a look at the documentation, and since I also do VM backups, it could have helped.
It seems I should take the time to read the 333 pages of the B&R user's guide and then test, test, test...

Thanks again,

Yan
