-
- Enthusiast
- Posts: 39
- Liked: never
- Joined: Jun 26, 2010 2:27 pm
- Full Name: chris h
- Contact:
recovering from network outage during a replication job
I'll be performing the first full replica locally, then changing the IP address on the ESXi host and placing the host offsite. I already have a site-to-site VPN tunnel established between the two sites. The local site has an IP range of 192.168.1.0 and the remote side has an IP range of 192.168.3.0. Replicas from then on will be done over a 2.25 Mbps connection (I actually get 2.25 Mbps). Now, what happens if the remote side drops during a replica job? I've seen this happen in the past, and the VM disks show as using snapshots. They keep pointing to snapshot files until I run a batch file to stop the hung job. I get nervous when my VM disks are running on snapshot files during production hours. How can we simplify recovery from a network crash like a DSL connection dropping? What steps can I take to get my VMs back to a normal state? As I understand it, having the disks point to snapshot files is not a good thing, because if I reboot the VM it will run off the snapshot, right? My remote site will be using a DSL connection, which will have issues every once in a while.
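For a rough feel of what a 2.25 Mbps link can carry, here is a minimal back-of-the-envelope sketch. The function names, the `efficiency` factor, and the 8-hour window are illustrative assumptions, not figures from this thread; real throughput depends on VPN overhead and on any compression the replication software applies.

```python
def gb_per_window(window_hours: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Decimal gigabytes that fit through the link in the given window.

    efficiency is an assumed fudge factor (0 < efficiency <= 1) for protocol overhead.
    """
    if window_hours < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("window must be >= 0, link > 0, efficiency in (0, 1]")
    megabits = window_hours * 3600 * link_mbps * efficiency
    return megabits / 8000  # 8 bits per byte, 1000 MB per GB


def hours_to_transfer(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move size_gb decimal gigabytes over the link."""
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link > 0, efficiency in (0, 1]")
    seconds = size_gb * 8000 / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Assumed 8-hour overnight window on the 2.25 Mbps link mentioned above.
    print(f"{gb_per_window(8, 2.25):.1f} GB fit in an 8-hour window")
```

At the full 2.25 Mbps this works out to roughly 1 GB per hour, so the nightly changed data has to stay within a few gigabytes for an incremental replica to finish inside an overnight window.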
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: recovering from network outage during a replication job
Hello Chris, I am not 100% sure what you are asking, so I will try to answer to my best understanding of your question.
If you terminate the running job in any way, you do not give Veeam Backup a chance to delete its snapshot. In that case, the snapshot must be deleted manually. To do that, open vSphere Client and delete the Veeam snapshot. All data from the snapshot will be committed back to the VMDK.
Alternatively, you can wait for the next incremental job run - Veeam Backup will detect that the snapshot is still present, and will clean it up before proceeding further. Does this answer your question?
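If you would rather script the manual cleanup than click through vSphere Client, the first step is finding the leftover snapshot in the VM's snapshot tree. The sketch below only does that search, on plain objects. It assumes each node has `name` and `childSnapshotList` attributes, which is how pyVmomi exposes a VM's `snapshot.rootSnapshotList`. The marker string is also an assumption, so check the actual snapshot name in vSphere Client before relying on it.

```python
from types import SimpleNamespace

# Assumption: the leftover snapshot's name contains this text. Verify in vSphere Client.
ASSUMED_MARKER = "VEEAM"


def find_leftover_snapshots(nodes, marker):
    """Return snapshot-tree nodes whose name contains marker (case-insensitive).

    Walks the tree depth-first, parents before children.
    """
    found = []
    for node in nodes or []:
        if marker.lower() in node.name.lower():
            found.append(node)
        children = getattr(node, "childSnapshotList", None)
        found.extend(find_leftover_snapshots(children, marker))
    return found


if __name__ == "__main__":
    # Stand-in data for illustration; a real tree would come from the vSphere API.
    tree = [
        SimpleNamespace(
            name="Before patching",
            childSnapshotList=[
                SimpleNamespace(name="VEEAM BACKUP TEMPORARY SNAPSHOT", childSnapshotList=[]),
            ],
        )
    ]
    for node in find_leftover_snapshots(tree, ASSUMED_MARKER):
        print("leftover snapshot:", node.name)
```

With a real pyVmomi tree, each matching node carries a `snapshot` reference, and removal would be something like `node.snapshot.RemoveSnapshot_Task(removeChildren=False)`. That step is deliberately left out here, since it changes a production VM and should only run once you have confirmed that no job is still using the snapshot.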
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: recovering from network outage during a replication job
Chris,
I've just thought of an idea: why not use Veeam Monitor 5.0 to alert on the growth of open/orphaned snapshots? Besides, you can trigger a custom script that removes/consolidates a VM's existing snapshots automatically as soon as the alert is raised. This post-alert script option can also be configured in the Veeam Monitor application.
Just my two cents.
-
- Enthusiast
- Posts: 39
- Liked: never
- Joined: Jun 26, 2010 2:27 pm
- Full Name: chris h
- Contact:
Re: recovering from network outage during a replication job
Both of your replies help me a lot. What is the impact on a VM if its virtual disks have snapshots assigned to them? I'll be replicating across a DSL connection, and it will fail sometimes. When it fails, the VM being replicated will have snapshots assigned to its virtual disks. How does this affect the operation of the VM? Also, what if the VM reboots? Will the VM mount the snapshot?
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: recovering from network outage during a replication job
No impact whatsoever - it is completely transparent. Just watch the disk space on the VM datastore. A snapshot can potentially grow to the same size as the original VMDK (that is, when every single VMDK block is changed by the running VM, which is of course not realistic).
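To put numbers on "watch the disk space", here is a small sketch of the worst case described above, where each snapshot grows to the full size of its VMDK. The function names and the sample figures are made up for illustration.

```python
def worst_case_headroom_gb(vmdk_sizes_gb, datastore_free_gb):
    """Free space left if every disk's snapshot grew to the size of its base VMDK.

    A negative result means the datastore could fill up in the worst case.
    """
    return datastore_free_gb - sum(vmdk_sizes_gb)


def hours_until_full(datastore_free_gb, change_rate_gb_per_hour):
    """Hours before free space runs out at a steady (assumed) snapshot growth rate."""
    if change_rate_gb_per_hour <= 0:
        raise ValueError("change rate must be positive")
    return datastore_free_gb / change_rate_gb_per_hour


if __name__ == "__main__":
    # Hypothetical VM with 40 GB and 80 GB disks on a datastore with 100 GB free.
    print("worst-case headroom (GB):", worst_case_headroom_gb([40, 80], 100))
    # Hypothetical growth of 2.5 GB per hour while the snapshot stays open.
    print("hours until full:", hours_until_full(100, 2.5))
```

The second figure is usually the more useful one in practice: it tells you how long a forgotten snapshot can sit before it becomes a problem.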
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: recovering from network outage during a replication job
Well, it's not quite fair to say there is "no impact whatsoever". VMware snapshots do have a measurable impact on the performance of I/O in a VM. If you're using SAN storage, they can actually impact the performance of the entire cluster, because of the additional SCSI reservations and the significant increase in IOPS that are required when there is an active snapshot on a VM. That said, this is probably not a major issue except for VMs with moderate to high I/O requirements. See http://www.vmdamentals.com/?p=332 for a more detailed description than I can provide. My real world observations are pretty much identical to their test results.
-
- Enthusiast
- Posts: 39
- Liked: never
- Joined: Jun 26, 2010 2:27 pm
- Full Name: chris h
- Contact:
Re: recovering from network outage during a replication job
So, if my replica job starts at 11:59 PM on Monday and the DSL connection drops at 2:00 AM on Tuesday, at that moment my virtual disks are set to run on snapshots. The .vmx file shows each virtual disk pointing to a snapshot file, and in the vSphere Client each virtual disk shows the snapshot file associated with it. Even so, I have all the time in the world to simply resume the replica job OR cancel the job and then consolidate/delete the remaining snapshots, correct?
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: recovering from network outage during a replication job
Chris,
If your replication job fails due to a DSL connection drop and you have an open snapshot, you will need to remove it manually or wait until the next job run removes it. There is no resume option for backup/replication jobs.
-
- Enthusiast
- Posts: 74
- Liked: never
- Joined: Mar 26, 2011 4:02 am
- Full Name: Conrad Gotzmann
- Contact:
[MERGED] Replication should continue after failed replicatio
Feature request !!!!
I find the recovery of replication jobs very annoying. First, the replication job fails after 13.5 hours. Second, it cannot pick up from where it left off. Third, I need to clean up the mess it left behind. I have added a new disk to a VM and I am replicating to the DR site. It's an additional 80 GB. After 3 attempts to add this disk, I give up. The error is a simple client timeout message. Please recover and continue where you left off. Don't give me an "annoying"
Cannot complete the operation because the file or folder
2011_6.vmdk already exists. Of course it does; if the program checked the last replication, it would know it was from a failed attempt. Continue on! Or at least check the logs and clean up.
Is this asking too much?
Are there any settings that make Veeam better at recovering from errors? I would like a set-it-and-forget-it config.
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: recovering from network outage during a replication job
There is no such setting, but thanks for the feedback!
-
- Influencer
- Posts: 22
- Liked: 3 times
- Joined: Dec 21, 2010 10:31 pm
- Full Name: brad clements
Re: recovering from network outage during a replication job
We have a similar issue: a temporary outage on the internet drops the VPN, and the replication job gets killed.
On the next retry, the job fails with:
Preparing replica VM Error: Detected an invalid snapshot configuration.
Error: Detected an invalid snapshot configuration.
So far, manually removing a snapshot on the replica VM hasn't fixed this problem.
I am opening a support case for this issue; it's a pain to clean up after replication failures.
Also, I would like replication jobs to have a 'network retries' option. I realize it's not as simple as that, as some VMware commands are atomic (e.g. create snapshot), but the software needs to be a lot more resilient to 1-minute network outages in a 15-hour replica job.
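To make the 'network retries' idea concrete, here is a generic sketch of retrying a network step with exponential backoff. This is not a Veeam option or API; `run_with_retries` and its parameters are hypothetical, and it only makes sense for steps that are safe to repeat, which is exactly the caveat about atomic commands above.

```python
import time


def run_with_retries(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    Waits base_delay, then 2x, 4x, ... seconds between tries and re-raises
    the last error once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries, surface the failure to the caller
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_transfer():
        # Simulated step that fails twice (short outage) and then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("link dropped")
        return "block transferred"

    print(run_with_retries(flaky_transfer, base_delay=0.1))
```

With five attempts and a 15-second base delay, a step like this would ride out an outage of a few minutes instead of failing the whole job.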