Host-based backup of VMware vSphere VMs.
chad156
Enthusiast
Posts: 30
Liked: 1 time
Joined: Mar 22, 2012 3:35 pm
Full Name: Chad Gibson
Contact:

Replication Failover and Failback

Post by chad156 »

I'm trying to test a failover and failback to my DR site. Failover completes in about 3 minutes, but my failback takes 10+ hours and seems to be copying the entire VM back.

Can someone point me toward what I'm doing wrong? All the demos show the failback taking minutes as well...
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Replication Failover and Failback

Post by foggy »

Failover is just a switch from the original VM to its replica, whereas during replica failback to the original VM, Veeam B&R needs to calculate and synchronize the differences between the original VM and the replica VM. The time required to scan the VM image depends on its size and is comparable to the full backup time of this VM.

Post by chad156 »

It's only 25 GB. This is a test server. The initial seed to my DR site took 3 hours.

I failed over, put a test file on the desktop, and am now failing back... it's going to take another 10 hours... I'll update when done.

Something is not right...

Post by foggy »

Chad, have you selected proxy servers in both the source and target sites in the failback wizard?

Post by chad156 »

Hmm, now we are getting somewhere. No, I only have one Veeam server at my main site.

So I need a Veeam proxy at my DR site?

Just over 46% in 3 hours...

Post by foggy »

Yes, as with replication, not having proxy servers on both ends results in poor data transfer performance. In your case, the source proxy pulls the data in network mode across the WAN, instead of using hotadd mode and transferring data between proxies.

Post by chad156 »

Alrighty, I installed a proxy at the DR site. Once this failback finishes in a few more hours, I'll try it again.

Post by chad156 »

Failback took just over 8 hours...

I've added the proxy to my DR site. Going to test again.

Post by chad156 »

Foggy,

Thanks!!!! I set up my proxies, and when I ran the failback with the proxies manually selected, it took 11 minutes. This is what I expected.

I'm assuming that the manual selection of the proxies is not needed either?
dellock6
Veeam Software
Posts: 6137
Liked: 1928 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Replication Failover and Failback

Post by dellock6 »

No, Veeam Backup is usually able to determine which one is the source and which is the target.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
Daveyd
Veteran
Posts: 283
Liked: 11 times
Joined: May 20, 2010 4:17 pm
Full Name: Dave DeLollis
Contact:

[MERGED] Slow failback for replica

Post by Daveyd »

I am doing a test replica failover and failback. I created a small 30 GB VM running Server 2008 R2 and a replica job that runs every 2 hours. The VM has stagnant data since it's just for testing. I am replicating to a datastore in our DR site over a 1 Gbit link. Failover works like a charm and only takes a minute or so. However, when I choose failback to the original location, it takes 20 minutes. Looking at the session logs, it spends about 16 minutes on "Calculating original signature Hard Disk 1 (30.0 GB)". No changes were made on the VM after I failed it over. Is this typical?
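For a sense of scale, the figures quoted above (a 30 GB disk scanned in about 16 minutes) imply an effective read rate that can be worked out as below. This is only back-of-the-envelope arithmetic: the function name is made up for illustration, and it assumes 1 GB = 1024 MB and that the whole disk was read during that stage.

```python
def scan_rate_mb_per_s(disk_gb: float, minutes: float) -> float:
    """Effective scan rate in MB/s, assuming 1 GB = 1024 MB."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return disk_gb * 1024 / (minutes * 60)


if __name__ == "__main__":
    # Figures from the post: 30 GB disk, ~16 minutes of signature calculation.
    print(f"{scan_rate_mb_per_s(30, 16):.1f} MB/s")
```

Comparing that rate with what the storage normally delivers during a full backup of the same VM is one way to judge whether the signature calculation is running at a reasonable speed.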

Post by foggy »

Dave, yes, this is expected. See my first post above.

Post by Daveyd »

foggy wrote: Dave, yes, this is expected. See my first post above.
Thanks.

Here's my scenario. My setup is this:

5 ESX hosts in Primary DataCenter
5 ESX Hosts in DR DataCenter
All 10 Hosts are in 1 cluster
DataCore servers sitting between ESX hosts and storage, serving up iSCSI synchronous mirrored virtual disks to all ESX hosts. So all 10 ESX hosts have active VMs running. It's basically an active stretch cluster over a 1 Gbit link.
Veeam Server is physical with an iSCSI connection into the iSCSI network and has all mirrored DataCore ESX virtual disks presented to it in Read Only mode.

I want to replicate VMs to separate dedicated mirrored virtual volumes within the same stretch cluster. Since I am using the physical Veeam server as the only backup proxy, would it be beneficial for replication performance to add another proxy, say on a VM in the cluster?

Post by foggy »

Adding a virtual proxy will allow you to use hotadd for writing data to the target.

Post by Daveyd »

What happens if a backup job kicks off at the same time as a replication job, or during a replication job, on the same VM?
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Replication Failover and Failback

Post by Vitaliy S. »

Please take a look at the existing discussion for the answer: v6 backing up and replicating a VM simultaneous

Post by chad156 »

I did a test failover of my SharePoint VM, and the failback took 9 hours... this still seems wrong to me.
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Replication Failover and Failback

Post by Gostev »

Chad,

It would be helpful to know which operation specifically takes 9 hours in the corresponding failback session.

Sounds like you simply do not have your replication infrastructure deployed per our guidelines. Generally, timings like these can only be caused by not having a backup proxy available locally at both the production and DR sites, resulting in the digest calculation being performed over the WAN, which obviously takes hours.

If you do have backup proxies at both sites and they are up, try selecting them manually in the replication job (instead of leaving the job's source and target proxy settings on "autodetect") to ensure they are actually leveraged.

Thanks!

Post by chad156 »

I do have proxies set up at both sites, but I am thinking this is part of the problem. I left it at auto-detect when I did that failback, as opposed to setting them manually, which seems to work better.

log:

5/5/2012 9:03:28 PM Failback started at 5/5/2012 9:03:28 PM
5/5/2012 9:03:51 PM Queued for processing at 5/5/2012 9:03:51 PM
5/5/2012 9:03:51 PM Preparing next VM for processing
5/5/2012 9:03:51 PM Using source proxy '10.59.0.53' [hotadd;nbd]
5/5/2012 9:03:53 PM Using target proxy 'VMware Backup Proxy' [hotadd;nbd]
5/5/2012 9:03:56 PM Preparing original VM
5/5/2012 9:04:03 PM Creating working snapshot on original VM
5/5/2012 10:10:50 PM Calculating original signature Hard disk 1 (750.0 GB)
5/5/2012 10:45:34 PM Replicating RP Hard disk 1 (750.0 GB) 431.6 GB processed
5/5/2012 10:46:15 PM VM 'FLSPVM_replica' was shut down successfully
5/5/2012 10:46:22 PM Creating replica restore point
5/6/2012 8:22:55 AM Replicating changes Hard disk 1 (750.0 GB) 32.8 GB processed
5/6/2012 8:23:25 AM Removing working snapshot from original VM
5/6/2012 8:23:28 AM Powering on original VM
5/6/2012 8:23:30 AM Failback completed at 5/6/2012 8:23:30 AM
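To see where the time goes, the intervals between consecutive timestamps in the log above can be computed with a short script. This is just a sketch: the entries are copied from the log, the helper names are made up, and the log does not say whether each timestamp marks the start or the end of a step, so each interval is only attributed to "the time between this entry and the next".

```python
from datetime import datetime

# Timestamps and step names copied from the failback log above.
LOG = [
    ("5/5/2012 9:03:28 PM", "Failback started"),
    ("5/5/2012 9:03:51 PM", "Queued for processing"),
    ("5/5/2012 9:03:51 PM", "Preparing next VM for processing"),
    ("5/5/2012 9:03:51 PM", "Using source proxy"),
    ("5/5/2012 9:03:53 PM", "Using target proxy"),
    ("5/5/2012 9:03:56 PM", "Preparing original VM"),
    ("5/5/2012 9:04:03 PM", "Creating working snapshot on original VM"),
    ("5/5/2012 10:10:50 PM", "Calculating original signature Hard disk 1"),
    ("5/5/2012 10:45:34 PM", "Replicating RP Hard disk 1"),
    ("5/5/2012 10:46:15 PM", "VM 'FLSPVM_replica' was shut down successfully"),
    ("5/5/2012 10:46:22 PM", "Creating replica restore point"),
    ("5/6/2012 8:22:55 AM", "Replicating changes Hard disk 1"),
    ("5/6/2012 8:23:25 AM", "Removing working snapshot from original VM"),
    ("5/6/2012 8:23:28 AM", "Powering on original VM"),
    ("5/6/2012 8:23:30 AM", "Failback completed"),
]

FMT = "%m/%d/%Y %I:%M:%S %p"


def gaps(entries):
    """Return (step, time until the next entry) for each consecutive pair."""
    times = [datetime.strptime(ts, FMT) for ts, _ in entries]
    return [(entries[i][1], times[i + 1] - times[i]) for i in range(len(entries) - 1)]


def longest_gap(entries):
    """Return the (step, interval) pair with the largest interval."""
    return max(gaps(entries), key=lambda item: item[1])


if __name__ == "__main__":
    for step, delta in gaps(LOG):
        print(f"{str(delta):>10}  after: {step}")
    print("longest:", longest_gap(LOG))
```

On this log, the largest interval (about 9.6 hours) falls between the "Creating replica restore point" and "Replicating changes" entries, and the next largest (about 1.1 hours) between "Creating working snapshot" and "Calculating original signature".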

Post by Gostev »

Proxies are fine. Clearly, the signature calculation is done locally, and the time is very reasonable for such a large disk. What's not clear from this log is why VMware snapshot creation takes 10 hours. That's definitely unexpected; this operation should take seconds. There may be some unrelated VMware environment issue there, but I would start by opening a case with Veeam support and having them look at the detailed log first. Thanks!

Post by chad156 »

Will do, thanks for taking a look.
xefil
Enthusiast
Posts: 39
Liked: never
Joined: Jan 13, 2012 8:08 am
Contact:

[MERGED] Who and how is Calculating original signature Hard

Post by xefil »

Hi to all,

I've opened case #5204793 and would like to post the same question here. More experience, more answers?

I'm currently testing a failback operation. Here is our scenario:

3 VMware infrastructures

Infrastructure 1 (our own):
- VM Backup & Replication Server
This infrastructure holds only this server.

Infrastructure 2 (customer's Production Site):
- All source VMs
- VM Source Proxy

Infrastructure 3 (DR-Site on a different location):
- All destination VMs
- VM Target Proxy

Connection link: WAN 30MBit with layer2 managed by us.

Failover went well and fast.

To reduce failback time, I've upgraded our customer's WAN link from 30 Mbit/s to 100 Mbit/s. Now I'm noticing another step that takes a long time to complete:

"Calculating original signature Hard disk..."

Who is calculating this signature? Source proxy, destination proxy, B&R server, ESX?
What does the network flow look like during this operation?
If I knew which component performs these steps, for example the proxies, I could increase its vCPU count and reduce this time.

Any help to speed up this task would be appreciated.

An additional question: what does the following mean?
[ProxyDetector] Proxy [veeam-proxy-<snip>] lies in different subnet with host [VMware ESX 4.1.0 build-502767]
[ProxyDetector] Detecting hotadd access level
[ProxyDetector] Testing proxy ip [10.0.2.26], netmask [255.0.0.0]
[ProxyDetector] Testing host ip [7.32.10.3]
[ProxyDetector] Proxy [veeam-prox-<snip>] lies in different subnet with host [VMware ESX 4.1.0 build-502767]
[ProxyDetector] Wasn't able to find proxy vm but can failover to network

And it does fall back to network mode...
The proxy can reach the ESXi host without problems. The host is seen at IP 7.32.10.3 only from the central B&R server. From the proxy, it is contacted/resolved on the same subnet. This is because the hostnames resolve differently depending on NAT settings and network topology.

Thanks a lot!
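The subnet comparison that the log lines appear to describe can be reproduced with Python's standard `ipaddress` module. Whether ProxyDetector performs exactly this check is an assumption; the sketch only shows that, with the addresses and netmask from the log, the host IP as seen by the B&R server does not fall inside the proxy's network. The function name is made up for illustration.

```python
import ipaddress


def same_subnet(proxy_ip: str, netmask: str, host_ip: str) -> bool:
    """Return True if host_ip lies in the network defined by proxy_ip and netmask."""
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.ip_network(f"{proxy_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(host_ip) in network


if __name__ == "__main__":
    # Values taken from the ProxyDetector log lines above.
    print(same_subnet("10.0.2.26", "255.0.0.0", "7.32.10.3"))
```

If the detection is indeed based on the address the B&R server resolves for the host, the NAT-specific name resolution described above would explain why the proxy is reported as being in a different subnet even though it reaches the host locally.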

Simon

Post by Vitaliy S. »

If you need help making sense of debug logs, please continue working with our support. As to your initial question, proxy servers are in charge of disk signature calculation, so to make the failback process as quick as possible, you should have proxy servers at both ends. Thanks!
tuscani
Enthusiast
Posts: 62
Liked: 3 times
Joined: Dec 28, 2012 8:00 pm
Full Name: Justin Durrant
Contact:

[MERGED] Testing Replica Failover\Failback

Post by tuscani »

I am testing replica failover and failback between our main DC and our DR location. Failover went great, Re-IP is really cool, and even DNS was auto-updated, which I figured I would have to do manually. However, failback is sitting at "Replicating RP Hard Disk". Are the full VMDKs being copied back to our main site? I would have assumed only changes via CBT would need to be sent back?

Post by foggy »

Justin, please look through the topic you've been merged into for an explanation. Basically, yes, only differences between the original VM and the replica VM are sent to the main site; however, those need to be calculated first, which takes considerable time.

Post by tuscani »

Yeah... it looks like only the changes were replicated back. The compare took so long (30 mins, and the only change I made on the replica VM was a new folder on the desktop) that I thought it was replicating the entire VMDK (40 GB). Also, as already noted, having a proxy at the DR site made a HUGE difference. Thanks!

PS.. According to the log the amount of data replicated back was 894MB. I found this surprising as again the only change I made to the replica VM was the desktop folder.

Post by foggy »

tuscani wrote: PS.. According to the log the amount of data replicated back was 894MB. I found this surprising as again the only change I made to the replica VM was the desktop folder.
It depends on other system activity (the OS is not static) and on the number of blocks affected by the changes (Veeam B&R operates with 1MB blocks, so even a 1KB change will cause the whole 1MB block to be copied).
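The block-granularity effect is easy to illustrate. The sketch below is not Veeam code; it simply assumes fixed 1 MB blocks, as stated above, and counts how many whole blocks a set of byte-range changes would touch. The function names and the sample change lists are hypothetical.

```python
BLOCK_SIZE = 1024 * 1024  # 1 MB blocks, per the explanation above


def blocks_touched(changes, block_size=BLOCK_SIZE):
    """changes: iterable of (offset, length) byte ranges. Returns the set of block indices hit."""
    touched = set()
    for offset, length in changes:
        if length <= 0:
            continue
        first = offset // block_size
        last = (offset + length - 1) // block_size
        touched.update(range(first, last + 1))
    return touched


def bytes_to_copy(changes, block_size=BLOCK_SIZE):
    """Total bytes transferred if every touched block is copied in full."""
    return len(blocks_touched(changes, block_size)) * block_size


if __name__ == "__main__":
    one_small_write = [(5 * BLOCK_SIZE + 100, 1024)]                # 1 KB inside a single block
    straddling_write = [(BLOCK_SIZE - 512, 1024)]                   # 1 KB across a block boundary
    scattered_writes = [(i * BLOCK_SIZE, 1024) for i in range(894)]  # 894 writes of 1 KB each
    for name, changes in [("one small write", one_small_write),
                          ("straddling write", straddling_write),
                          ("scattered writes", scattered_writes)]:
        print(name, bytes_to_copy(changes) // BLOCK_SIZE, "MB")
```

The last case is purely illustrative: under 1 MB of actual writes, if scattered across 894 different blocks, would still require 894 MB to be copied. Normal OS background activity (logs, registry, pagefile, updates) tends to scatter small writes in this way.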
DrWhy
Enthusiast
Posts: 38
Liked: 2 times
Joined: May 12, 2015 7:05 pm
Full Name: Caleb
Contact:

[MERGED] Planned Failback is Slow and Inefficient

Post by DrWhy »

Consider this scenario:
I perform a planned failover of a large, 60TB file server to a secondary vSphere host on the same network. The planned failover goes great and finishes in ~5min. Several hours later, I go to perform a planned failback. Only the changes made to the replica VM will need to get synced back, so this shouldn't take that long... right? Wrong... This process gets to the "Calculating original signature Hard disk" stage, where it remains for the next week because the entire source VM has to be read from disk. Why does the entire source VM have to be read from disk when the integrity of that disk was known to be accurate just hours before? To add insult to injury, once the failback does complete and I want to resume the daily replication job for this VM, it has to re-read the entire disk once more the first time this job runs, which will take another week. This is extremely inefficient and in the end makes a failover/failback task extremely limiting and less useful. Is there something that can be done to speed up this failback process? There has to be a better way to do this.

Post by foggy »

Caleb, please review this thread for the answers to your questions and some considerations on how to improve failback performance. Thanks.

Post by DrWhy »

Hey Foggy, I've read this post over and your answer appears to be that this is normal behavior. I realize that. I'm questioning the design and asking why Veeam has to calculate the hard disk signature for a VM that was known to be the exact same. Veeam should track the fact that both of these data sources are exactly the same during a "Planned" failover to save this time. It also seems that this shouldn't be too difficult to do. Can you please comment on this?