arthurp
Influencer
Posts: 23
Liked: never
Joined: Jan 11, 2010 9:18 pm
Full Name: Arthur Pizyo
Contact:

How to perform replica failover commit in v5

Post by arthurp » Jul 29, 2010 2:10 pm

Hi Anton,

On a related issue, failover reversal is well documented, but for some scenarios we need complete documentation on how to perform a "failover commit". More specifically, we have added a new host to our environment and are moving some VMs to it. One option is simply to power on the target replica through vCenter, as we do not expect to fail back to the source. However, if (for added protection) we perform the failover through VEEAM, what is the correct sequence of steps to commit the failover once we are satisfied that everything runs fine?

We are expecting something like this:
- delete VEEAM snapshot using vCenter
- rename VM in vCenter from "vmname_replica" to "vmname"
- remove replica from the list of replicas in VEEAM
- delete/reconfigure replication job to account for new source and intended target
- delete vrb files from underlying storage as we no longer use these restore points
- delete replica.vbk and running.rbk files (please confirm)

Such a "failover commit" procedure is important to keep things clean and organized. Also, vCenter will provision almost twice the required storage on failover until it is committed.

Two additional considerations for this process: until VEEAM supports "thin-to-thin" replication, we will be stuck with the "thick" format. Also, the underlying storage location will now be in the VeeamBackup folder (not a big deal, though).

Thank you

Arthur

Gostev
SVP, Product Management
Posts: 23844
Liked: 3206 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: How to perform replica failover commit in v5

Post by Gostev » Jul 29, 2010 2:31 pm

Hello Arthur, the procedure you have listed is perfectly correct. I am impressed with your knowledge of our product! Thin-to-thin replication is already implemented in v5, and automation of the above-mentioned process will follow in a future release. We probably need to get rid of the VeeamBackup folder as well...

Thank you for this perfect step-by-step guide ;)

tfleener
Influencer
Posts: 21
Liked: never
Joined: Jun 08, 2010 2:59 pm
Full Name: Tom Fleener
Contact:

Re: How to perform replica failover commit in v5

Post by tfleener » Jan 11, 2011 5:52 pm

An important note to add to this thread: if you are using VBR 5, you will want to delete the replication job BEFORE you remove the snapshot. I followed the above process but did not delete the replication job first. Veeam support said you need to delete the job before removing the snapshot.

After I followed this process, when I tried to remove the replica I encountered the error:
This backup file is locked by “…” job, and cannot be deleted

The only way to get it unlocked was to undo the failover. The snapshot had already been removed, so it did not roll anything back. The undo failover did power down the VM, however. Fortunately it was an SMTP gateway server, so it was not an issue.

Vitaliy S.
Product Manager
Posts: 22127
Liked: 1380 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: How to perform replica failover commit in v5

Post by Vitaliy S. » Jan 12, 2011 9:42 am

Tom, thanks for posting such a valuable addition, much appreciated.

RumataRus
Enthusiast
Posts: 78
Liked: never
Joined: Jun 21, 2010 5:30 am
Full Name: Dmitry Prokudin

Should I delete replica's snapshot after failing over?

Post by RumataRus » Feb 01, 2011 9:27 pm

[merged]

Let us suppose I restore a VM from replica. This VM has a snapshot called "VEEAM BACKUP RUNNING SNAPSHOT".
Should I delete this snapshot if I do not plan to restore the original VM that failed?

RumataRus
Enthusiast
Posts: 78
Liked: never
Joined: Jun 21, 2010 5:30 am
Full Name: Dmitry Prokudin

Re: How to perform replica failover commit in v5

Post by RumataRus » Feb 04, 2011 2:33 pm

Yes, it works!
I can offer the following sequence of actions:

- failover to replica :)
- delete replication job
- delete VEEAM snapshot using vCenter
- rename VM in vCenter from "vmname_replica" to "vmname"
- undo the failover
- remove replica from the list of replicas in VEEAM
- migrate VM to some production datastore (in order to have the VM's folder NOT in the "VeeamBackup" folder)
- delete "VeeamBackup" subfolder with replica files
- create new replication job if needed

P.S.: If there is a backup job (in addition to the replication job), you need to add the VM to that job again.
Also, old backups may be deleted.

Arthur, Anton, Tom, Vitaliy, thank you very much for the tips and hints!

DAXQ
Influencer
Posts: 24
Liked: never
Joined: Jan 31, 2011 3:43 pm
Full Name: David Anderson
Contact:

Re: How to perform replica failover commit in v5

Post by DAXQ » Apr 15, 2011 7:38 pm

I have a very small setup and have been trying to test the failover commit. Every time I test it, I end up with a very buggy, crashing VM. The first time I tried it, I moved the VM out of the VeeamBackup folder and thought that was the problem, so the last time I tested I followed the steps above, and I ended up with the same crashy, buggy, BSOD-throwing VM that I have to delete and start over with. I am not using vCenter, and I am finding this very frustrating.

For starters - I cannot for the life of me see any possible reason I would ever want a non-committal failover of a server. And secondly - not being able to commit the VM that I have failed over, with some level of simplicity, so that the users' data is not lost is driving me kinda batty.

Would anyone have the steps to perform a Veeam B&R commit of data without using vCenter (all I have is the VIC and my two very small VM hosts)?

Thanks for any help with these matters.

Gostev
SVP, Product Management
Posts: 23844
Liked: 3206 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: How to perform replica failover commit in v5

Post by Gostev » Apr 15, 2011 8:20 pm

The commit procedure described above cannot cause anything like what you are experiencing. The cause must be something else...
We have added automated replica commit to the next release (v6), which will be released later this year. Thanks.

tsightler
VP, Product Management
Posts: 5279
Liked: 2140 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: How to perform replica failover commit in v5

Post by tsightler » Apr 15, 2011 8:30 pm

DAXQ wrote: For starters - I cannot for the life of me see any possible reason I would ever want a non-committal failover of a server.


Well, this one's easy. Probably 99% of "failovers" are actually just tests that the replicated VM will boot and work as expected. If you didn't take a snapshot before doing this, every time you performed a "failover" test your remote side would be out of sync with the source side and you'd have to perform a resync. With the snapshot, you can test the failover server and, once you're done, Veeam just deletes the snapshot and continues the replication.

DAXQ wrote: And secondly - not being able to commit the VM that I have failed over, with some level of simplicity, so that the users' data is not lost is driving me kinda batty.
Would anyone have the steps to perform a Veeam B&R commit of data without using vCenter (all I have is the VIC and my two very small VM hosts)?


I can see why it would drive you "batty", but the honest truth is that this is basically a very simple operation, and it doesn't really matter whether vCenter is involved or not. You can perform a commit of the failover server without even using Veeam: just delete the snapshot and you should be good.

My concern is that perhaps you are "testing" the commit and then trying to let Veeam continue the replication for your next test. This is not possible: once you "commit", that new host has diverged from the source host and can no longer be used as the replica target. If you want to start over, you have to fully reseed the initial replica.

Now, I don't know for a fact that this is what's happening in your case; it's just a guess. It is probably best to work with Veeam support to make sure everything you're doing is correct. However, if you post your exact procedures here, we can eyeball them for you and see if it's something obvious.

DAXQ
Influencer
Posts: 24
Liked: never
Joined: Jan 31, 2011 3:43 pm
Full Name: David Anderson
Contact:

Re: How to perform replica failover commit in v5

Post by DAXQ » Apr 18, 2011 2:55 pm

(Sorry this got kind of long with the steps) I must admit that I am not very well versed in this stuff and am learning it as I go along and truly do appreciate your responses and help! Based on your description above, I would have to say I do now get the "Failover" being used to only Test your replications. Maybe rather than calling it FailOver (which I would associate with committing to fail over), calling it Test-Replication might be less confusing to the novice such as myself =).

As for my setup and steps,
tsightler: Basically, you can perform a commit of the failover server without even using Veeam, just delete the snapshot and you should be good.
I would love to know how to perform these steps manually (did not find anything even remotely close to describing this process in the Admin Guide) and also I am not "testing" the commit then continuing replication after committing - that part of it I think I actually get. Once the server has been Failed over and Committed you are basically done and the server is now on a new host - if you are wanting to continue replication in the opposite direction then new jobs and database lists will need to be created reversing everything in Veeam B&R.

The steps I am performing are consistent so far, and the end results are consistent as well. Once I test-failover, everything runs fine while Veeam thinks things are in a "not undone" state. As soon as I undo things in Veeam, my VM begins throwing registry errors at boot and will BSOD just after login. This is what I have done step by step so far (as I was trying to document the process for when I need it):

Non-Committal Failover Required
( Or - you need to fail over to a date later than the last successful replica )

Using the VIC, Verify that the Source and Target VM are both OFF

Open Veeam B&R and click Restore
Select Failover to Replica - Next
Select the Virtual Machine to fail over - Next
Select date and time you want to fail over to - Next
Enter your Restore Reason - Next
Press Finish to start the failover - Finish
A window should open as the failover process runs - this process will be done when you see Failover completed successfully.

The failover took from 9:21:43 -> 9:29:20 to complete
The computer was up, accessible and verified ready for user to connect by 9:40.

Close the open window, and verify that your users can access the uncommitted failed-over server, and that it failed over to your correct date and time.

To commit the failed-over server (I tested this all with the failed-over server running):

In Veeam B&R - delete the replication job and modify the daisy chain

In the VIC, delete the snapshot for the failed-over server - in the VIC it will be called "VEEAM BACKUP RUNNING Snapshot please do NOT delete".
Wait until the Task finishes removing the snapshot.

Rename the VM in the VIC from VM_replica to VM

Power off VM (or it will shut off during the next steps [undo])
In Veeam B&R - undo the fail over
Click Restore
Select Undo previously performed failover - next
Select the VM you should have listed in the failed over VMs - next
Enter the reason - next
Click finish to start the process
When Undo Failover completes successfully - this step is done

Remove the Replica List from Veeam for the moved/committed server
Start VM on failed over host server.
Answer question to Keep
note - Registry Error
note - Check Disk Ran on computer and rebooted again
note - BSOD

I also tried the above process without renaming the VM, and made other attempts with and without moving the folder to a different spot on the datastore. Each time I get registry errors and a BSOD, and end up with a VM that is not very usable.

All my testing is being done with a Windows XP Pro SP3 VM. I create the replicas (at least 5 or 6), and after each replica I edit a desktop file on the VM to name the replica number so I can keep track of which replica I am failing over to. If I just bring down the source and start the target from the very last replica created, I have zero issues - it just seems to work. I only experience this registry error and BSOD if I try to roll back to an earlier date and time (which might be needed some day, so I am trying to learn and document the process before I need it). The XP Pro SP3 VM was created on a VMware ESX 3.5 server and has that version's VMware Tools installed. I am replicating back and forth between it and my new server, which is running ESX 4 (and yes, is 64-bit rather than 32-bit like my ESX 3.5 server) - so I always leave the VMware Tools alone in case I need to run this VM on either host. I am hoping to eventually have all 64-bit servers, but cannot afford the luxury just yet.

I will test today to see if leaving the Undo FailOver in Veeam B&R does anything for me, but am afraid it will leave my Veeam B&R in a bad state with an Undone fail over hanging out in its database and not allow me to remove the listed replicas or something.

tsightler
VP, Product Management
Posts: 5279
Liked: 2140 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: How to perform replica failover commit in v5

Post by tsightler » Apr 18, 2011 4:00 pm

DAXQ wrote: To commit the failed-over server (I tested this all with the failed-over server running):

In Veeam B&R - delete the replication job and modify the daisy chain

In the VIC, delete the snapshot for the failed-over server - in the VIC it will be called "VEEAM BACKUP RUNNING Snapshot please do NOT delete".
Wait until the Task finishes removing the snapshot.

Rename the VM in the VIC from VM_replica to VM

Power off VM (or it will shut off during the next steps [undo])
In Veeam B&R - undo the fail over
Click Restore
Select Undo previously performed failover - next
Select the VM you should have listed in the failed over VMs - next
Enter the reason - next
Click finish to start the process
When Undo Failover completes successfully - this step is done

Remove the Replica List from Veeam for the moved/committed server
Start VM on failed over host server.
Answer question to Keep
note - Registry Error
note - Check Disk Ran on computer and rebooted again
note - BSOD

I believe that you are making this process needlessly complicated. Once you have powered on your remote replica and you decide to "commit" to that replica, all you have to do is remove the information from Veeam. Once the "replica" is powered up, it's a normal VM just like any other. Simply open the Veeam console, go to Jobs, and remove the VM from the replication job (or delete the job if it's the only VM).

You definitely don't want to remove the snapshot and then perform an "Undo Failover" as that will lead to a corrupt VM if you are reverting to a previous restore point. Why? Well, follow the process that happens when you decide to revert to a previous restore point:

1. Veeam has to revert your VMDKs to the "restore point", so it applies the changes from a VRB file to the VMDK
2. Since you might want to revert the failover, Veeam creates a "running.rbk" file to be able to "revert" the rollback
3. Veeam takes a snapshot of the system and powers it on

Now normally, if you select "undo failover", Veeam reverts the system to the "pre-snapshot" state, deletes the snapshot, and reapplies the changes from the "running.rbk" file. This leaves your replica in the exact state it was in before you started the failover process, so that replication can continue from that point.

However, you're interfering in the mix. You're manually deleting the snapshot, but then still attempting to perform an "undo failover". This is a mess, since Veeam will now attempt to apply the changes from "running.rbk" to the current state of the disk rather than to the "pre-snapshot" state. It has no choice, because you've already deleted the snapshot. This will lead to a corrupt VMDK file and thus the issues that you are seeing. The issue will not happen if you choose the most recent restore point, because there's no need for the "running.rbk" file when you're not applying changes to the VMDK files before powering on the replica.
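
The failure mode described above can be illustrated with a toy block-level model. This is only a sketch of the reasoning, not Veeam's actual implementation: the disk is modelled as a Python dict of block number to contents, and the function names (`failover`, `undo_failover`) and sample block values are invented for illustration.

```python
def apply_blocks(disk, changes):
    """Return a copy of `disk` with the blocks in `changes` overwritten."""
    merged = dict(disk)
    merged.update(changes)
    return merged


def failover(latest_disk, vrb_changes):
    """Roll the replica back to an older restore point (toy model).

    Returns the rolled-back base disk plus a stand-in for running.rbk:
    the overwritten blocks, saved so the rollback can be reverted later.
    """
    running_rbk = {block: latest_disk[block] for block in vrb_changes}
    return apply_blocks(latest_disk, vrb_changes), running_rbk


def undo_failover(disk_as_found, running_rbk):
    """Reapply the saved blocks on top of whatever disk state is found."""
    return apply_blocks(disk_as_found, running_rbk)


# Hypothetical replica contents after the latest replication pass.
latest = {0: "boot", 1: "registry@rp5", 2: "userdata"}
# Rolling back to an older restore point only touches block 1 here.
base, rbk = failover(latest, {1: "registry@rp4"})

# Writes made while the replica runs land in the snapshot delta.
snapshot_delta = {1: "registry@rp4+edits", 2: "userdata+edits"}

# Normal undo: the delta is discarded and running.rbk is applied to the
# pre-snapshot base, so the replica returns to its latest state.
clean_undo = undo_failover(base, rbk)

# Deleting the snapshot by hand merges the delta into the base disk...
committed = apply_blocks(base, snapshot_delta)
# ...so a later undo mixes blocks from two different points in time.
broken_undo = undo_failover(committed, rbk)

# Failing over to the most recent restore point needs no rollback, so
# running.rbk is empty and the same sequence leaves the disk untouched.
base_latest, rbk_latest = failover(latest, {})
committed_latest = apply_blocks(base_latest, snapshot_delta)

print(clean_undo == latest)
print(broken_undo)
print(undo_failover(committed_latest, rbk_latest) == committed_latest)
```

In this model `broken_undo` ends up with the registry block from the latest replica next to user data written on top of the older restore point, which is the kind of inconsistency that would plausibly show up as registry errors in the guest.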

Here's my process for "committing" a replica, I guess it might not work for you, but it's what I do:

1. Failover to whatever restore point I need
2. Rename/Relocate replica target, either manually, or by using sVmotion to migrate the replica to a new location
3. Delete replica job and sessions from Veeam (we only have one VM per replica job, makes this easier)
4. Manually cleanup old VeeamBackup replica folder on target

DAXQ
Influencer
Posts: 24
Liked: never
Joined: Jan 31, 2011 3:43 pm
Full Name: David Anderson
Contact:

Re: How to perform replica failover commit in v5

Post by DAXQ » Apr 18, 2011 4:23 pm

Hey, I'm all about KISS :D as I am the idiot at the end of that KISS :D

But I thought I was following the steps originally given, and I don't see how I am overcomplicating it by just following those steps. So given my scenario, I have a live running VM with 5 replica points. I choose to fail over to point 4 - the only way I can fail over to point 4 is by turning off my live VM and using Veeam B&R to fail over to that restore point (which in turn creates an entry in Veeam to undo this failover).

So now I have a failed-over server (running) and everything is great - I am at the time/date I want to be, but it is not committed, so the changes my users are making will be lost as soon as I undo this failover. From that point, my commit process (based on the original steps given) would be to:

- delete replication job
- delete VEEAM snapshot using vCenter
- rename VM in vCenter from "vmname_replica" to "vmname"
- undo the failover
- remove replica from the list of replicas in VEEAM
- migrate VM to some production datastore (in order to have the VM's folder NOT in the "VeeamBackup" folder)
- delete "VeeamBackup" subfolder with replica files
- create new replication job if needed

which was what I lined out (with a bit more detail as I was including the Next, Next button etc. for documenting but basically the same)

So I have now entered more confusion land - I smell what you're steppin' in (about undoing the failed-over VM that you have deleted the snapshot for), but that was in the original steps, so I don't really see how to accomplish what you're saying.

tsightler
VP, Product Management
Posts: 5279
Liked: 2140 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: How to perform replica failover commit in v5

Post by tsightler » Apr 18, 2011 6:10 pm

The "steps" outlined in this post likely assume that you are "failing over" to the most recent snapshot, and they would work fine in that case.

I guess what I don't understand is why you keep attempting to "undo" the failover. This is not a requirement. In Veeam, simply delete the replica job, then go to "Sessions", right-click on the job name, and select "Remove from replicas". At that point Veeam no longer knows about the replica VM, and you can do whatever you want with this VM using vCenter; it's just a standalone VM. I usually use sVmotion, or simply power it off and manually "clean up" the VM files at some point, because I don't like them sitting in a folder called "VeeamBackup" and the .vrb files aren't any good anymore since they're Veeam-specific. But you don't even have to do that if you don't want to. The VM is standalone at that point; just delete the Veeam snapshot and go on.

Now, if you have a replica job that has multiple VM's, and you only failover one VM, I can see this as being an issue as I don't know of any way to remove just a single VM from a replica job during failover, but you can still svMotion the VM and then tell Veeam to "undo" the failover, which won't actually work since the VM's been moved out from under Veeam.

DAXQ
Influencer
Posts: 24
Liked: never
Joined: Jan 31, 2011 3:43 pm
Full Name: David Anderson
Contact:

Re: How to perform replica failover commit in v5

Post by DAXQ » Apr 18, 2011 6:32 pm

I did not think the multiple replicas things at one time was going to help simplify things so I have one job for each replica - I just ended up Daisy Chaining them in the end so one job calls the next.

I guess I just assumed (and I suppose all this assuming is not good for anybody =) that when you fail over using the Veeam -> Restore -> Failover button, Veeam goes through all its stuff of using those restore points and patching things to get us to the correct date and time wanted (as I have said, to fail over to the latest replica none of this is necessary - it is only needed if I need to go back further than the most recent). And once you do this, if you go back into Veeam and click Restore again, the files are locked by Veeam because it is waiting to undo this last failover.

From our ongoing conversation, I am beginning to feel like what I need to do is:

Fail Over With Veeam (which should bring up working running VM that I want).
Shut down that VM - and copy its folder contents to another place on the storage (so I have a working, failed-over, uncommitted copy).
Start the VM again
Go back into Veeam and undo the fail over (to get Veeam B&R straight)
For clean up Remove the source, target, and any entries in Veeam regarding this VM
And finally bring up my Copy, delete the snapshot and also any unneeded rbk files that Veeam placed in there to commit it as a production running VM.

or something similar to that.

tsightler
VP, Product Management
Posts: 5279
Liked: 2140 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: How to perform replica failover commit in v5

Post by tsightler » Apr 18, 2011 6:57 pm

If you're only using one job per replicated VM, then you definitely do NOT need to undo the failover, nor do you need to power down the replicated VM to "commit" it. As stated above, the task is actually quite simple:

1. Failover the VM to the replica
2. In Veeam Console select "Backup and Replication"..."Jobs" and delete the replica job
3. In Veeam Console select "Backup and Replication"..."Replicas" right click on the Job name and select "Remove from replicas"
4. Delete "VEEAM RUNNING" snapshot on target VM

Step #3 removes Veeam's knowledge of the replica, whether it's running or not. It will NOT shut down your target VM, and it will not complain about the files being "locked".
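
To compare the different sequences posted in this thread, here is a small ordering check. It is only a sketch: the step labels are hypothetical shorthand for the manual actions discussed above, and the two rules encode nothing more than the two warnings given in this thread (delete the job before the snapshot, and never undo the failover after deleting the snapshot).

```python
def check_commit_order(steps):
    """Return warnings for a proposed commit sequence (toy check only)."""
    position = {step: index for index, step in enumerate(steps)}

    def happens_before(first, second):
        return (
            first in position
            and second in position
            and position[first] < position[second]
        )

    warnings = []
    if happens_before("delete_snapshot", "delete_job"):
        warnings.append(
            "snapshot deleted before the replication job: "
            "replica files may stay locked by the job"
        )
    if happens_before("delete_snapshot", "undo_failover"):
        warnings.append(
            "undo failover after snapshot deletion: "
            "can corrupt the VM when failing over to an older restore point"
        )
    return warnings


# Sequence from the first post: snapshot removed before the job.
first_post = ["failover", "delete_snapshot", "rename_vm",
              "remove_from_replicas", "delete_job"]
# Sequence with "undo the failover" after the snapshot deletion.
with_undo = ["failover", "delete_job", "delete_snapshot", "rename_vm",
             "undo_failover", "remove_from_replicas"]
# The four steps listed directly above.
four_steps = ["failover", "delete_job", "remove_from_replicas",
              "delete_snapshot"]

for name, steps in [("first_post", first_post),
                    ("with_undo", with_undo),
                    ("four_steps", four_steps)]:
    print(name, check_commit_order(steps))
```

Only the four-step sequence passes both rules, which matches the advice given in the replies.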

After these steps there is no actual requirement to do anything, but you can choose to "clean up" the target side if you wish, as there will be leftover "Veeam" files on the target host that are no longer of any value. I usually just "svMotion" the running host and remove the old directory, since that lets me leave the VM running. But if I can get a few minutes of downtime on the replica, I sometimes just shut down the VM, unregister it, remove the old Veeam files, rename/move the directory, and reregister the VM. You can do all of this from the command line in about 2 minutes if you know the commands and you're not moving between datastores (I replicate my systems to the stores where they'll fail over to).
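
The post does not say which commands are meant. As one possible reading, the sketch below only builds (and does not run) a command list for the offline cleanup, assuming an ESXi-style shell with `vim-cmd`; older ESX service consoles may use different tools. The VM id, directory paths, and `.vmx` name are hypothetical placeholders, and the file names to delete are taken from the first post (`.vrb` files, `replica.vbk`, `running.rbk`).

```python
import posixpath


def plan_offline_cleanup(vm_id, old_dir, new_dir, vmx_name):
    """Build the shell commands for an offline cleanup, without running them.

    Assumes paths contain no spaces and both directories are on the
    same datastore, so the move is a plain rename.
    """
    new_vmx = posixpath.join(new_dir, vmx_name)
    return [
        # Guest shutdown (needs VMware Tools); wait until the VM is off
        # before running the remaining commands.
        f"vim-cmd vmsvc/power.shutdown {vm_id}",
        f"vim-cmd vmsvc/unregister {vm_id}",
        # Leftover Veeam restore-point files are no longer usable.
        f"rm -f {old_dir}/*.vrb {old_dir}/replica.vbk {old_dir}/running.rbk",
        f"mv {old_dir} {new_dir}",
        f"vim-cmd solo/registervm {new_vmx}",
    ]


# Hypothetical example values; look up the real id with
# `vim-cmd vmsvc/getallvms` and check the real folder layout first.
commands = plan_offline_cleanup(
    vm_id=42,
    old_dir="/vmfs/volumes/datastore1/VeeamBackup/vm1_replica",
    new_dir="/vmfs/volumes/datastore1/vm1",
    vmx_name="vm1.vmx",
)
for command in commands:
    print(command)
```

Printing the plan first makes it easy to review each path by hand before anything is deleted.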
