cby
Enthusiast

Veeam backup error and impact on VM performance

Post by cby »

We have recently been encountering the following error during backup:

VCB error: Error: Other error encountered: Snapshot remove failed: Operation failed since another task is in progress. An error occurred, cleaning up...

The problem is occurring more and more frequently. Veeam retries and usually succeeds. However, today it failed after 3 retries (the configured number of retries per job). There are no other tasks in progress as far as we can tell. The 4 attempts to back up appeared to have a huge performance impact on the VM being backed up.

Our biggest concern is that around the times it was attempting to remove the snapshot, performance of the VM (RHEL5) was hammered: our users had to wait several minutes at the Linux login prompt before the Password: prompt appeared. Performance and response continued to be appalling until Veeam reported the above error, at which point normal performance resumed on the VM.

The entire VM infrastructure was inaccessible during these periods -- no VMware stats, no monitoring stats, and nothing in /var/log or dmesg to indicate an OS failure. Our other VMs were unaffected -- but then the backups were successful on those machines.

We are running Veeam Backup 3.0.1, ESX 3.5 with VC 2.5.

I'd be grateful for any pointers.

Thanks.
Gostev
Chief Product Officer

Re: Veeam backup error and impact on VM performance

Post by Gostev »

Snapshot removal is arguably one of the most common issues with VMware overall, and it can be affected by multiple factors. As a start, I recommend investigating the content of the VM log file (the one stored next to the VM files). Snapshot operations are covered well there (at least in ESX4 - I recently did some heavy stress testing on snapshot removal).
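As a rough illustration of that first step, here is a minimal Python sketch that pulls the snapshot-related lines out of a copy of the VM log. It assumes the log is the usual vmware.log from the VM's directory and that a plain case-insensitive match on "snapshot" is good enough; the function names are made up for this example and it does not rely on any particular log line format.

```python
import re
import sys
from typing import Iterable, List

# Case-insensitive match; no assumption is made about the log line layout.
SNAPSHOT_RE = re.compile(r"snapshot", re.IGNORECASE)


def snapshot_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention snapshots, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if SNAPSHOT_RE.search(line)]


def scan_log(path: str) -> List[str]:
    """Read a log file (hypothetical helper) and keep only snapshot lines."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, "r", errors="replace") as fh:
        return snapshot_lines(fh)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_snapshots.py /path/to/copy/of/vmware.log
    for hit in scan_log(sys.argv[1]):
        print(hit)
```

Comparing the timestamps on the matching lines with the times the job reported the error should show how long each snapshot removal actually ran.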

There could be multiple technical reasons for this behavior, so really the best option would be to let VMware troubleshoot this, especially when using VCB, since in this case snapshot management is performed by VCB directly and Veeam Backup is not even involved. Even with other job modes, Veeam Backup only calls a single VMware API function, while the rest is handled by VMware.
tsightler
VP, Product Management
Full Name: Tom Sightler

Re: Veeam backup error and impact on VM performance

Post by tsightler »

We've actually started seeing this since our upgrade to ESX/vCenter4. Because of the performance issues with ESX4 and network backups, we had to switch back to VCB backups, which take a good bit longer. Around 12:30-1AM we have an issue where a few Veeam jobs finish while Oracle RMAN jobs are also running (they start at 12AM). The Veeam jobs finish and attempt to remove the snapshot, but because of the other Veeam jobs that are still running, and the Oracle RMAN jobs, the storage array is under significant stress and the snapshot removal takes a VERY long time. I'm really not sure exactly what happens then, but it appears that Veeam eventually times out, and then starts retrying these jobs and getting the "Operation failed since another task is in progress" messages.

It's only happened about 3 times that a job didn't run at all (after retries), so we've been living with it, anxiously waiting for Veeam Backup 4, which will hopefully fix the issue thanks to block change tracking. Since we're so close to that release, we didn't really investigate the exact cause or whether there's anything we might be able to do about it, save for maybe rescheduling the jobs.
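If rescheduling does turn out to be the workaround, a quick way to see which jobs are at risk is to check whose estimated finish time (and therefore snapshot removal) lands inside the busy storage window. This is only a sketch: the job names, start times, and durations below are invented for illustration, and the two-hour RMAN window is an assumption (only the 12AM start comes from the description above).

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Window = Tuple[datetime, datetime]
# Each job maps to (start time, estimated duration); values are hypothetical.
Jobs = Dict[str, Tuple[datetime, timedelta]]


def jobs_ending_in_window(jobs: Jobs, busy: Window) -> List[str]:
    """Names of jobs whose estimated end time falls inside the busy window."""
    return sorted(
        name
        for name, (start, duration) in jobs.items()
        if busy[0] <= start + duration < busy[1]
    )


if __name__ == "__main__":
    midnight = datetime(2009, 1, 2, 0, 0)
    # Assumed window: RMAN starts at 12AM and keeps the array busy for 2 hours.
    rman_window = (midnight, midnight + timedelta(hours=2))
    jobs = {
        "job-a": (midnight - timedelta(hours=3), timedelta(hours=3, minutes=30)),
        "job-b": (midnight - timedelta(hours=5), timedelta(hours=2)),
    }
    print(jobs_ending_in_window(jobs, rman_window))
```

Jobs flagged this way are the candidates to start earlier or later so that their snapshot removal does not coincide with the heaviest storage load.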
cby
Enthusiast

Re: Veeam backup error and impact on VM performance

Post by cby »

Thanks -- I've been looking through the log but nothing obvious yet.