Felix
Enthusiast
Posts: 37
Liked: 2 times
Joined: Oct 30, 2009 2:43 am
Full Name: Felix Buenemann
Contact:

Helper snapshot not removed when vStorage API job is stopped

Post by Felix » Oct 30, 2009 1:43 pm

If a backup of a VM in Veeam Backup 4.0 fails, the helper snapshots (consolidate helper-0) are not removed and need to be manually flushed. This happens both on normal failure and on backup abort.

The VI client logs the following error "Name: Remove snapshot, Target: VMNAME, Task: Unable to access file <unspecified filename> since it is locked".

Environment: vSphere 4.0 + recent updates, vCenter Server is W2K3 SP3 32-bit, all VMs HW version 7 using SAN/NBD failover vStorage API backup. Backup host is a physical 8-core Xeon server, also running vCenter and connected to the SAN via MS iSCSI initiator w/2x 2GBps trunks.

- Felix

Gostev
SVP, Product Management
Posts: 24450
Liked: 3410 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Post by Gostev » Oct 30, 2009 3:37 pm

Felix, this is actually not a Veeam Backup snapshot. The snapshot Veeam Backup creates has the "VEEAM BACKUP TEMPORARY SNAPSHOT" name. I believe that (consolidate helper-X) snapshots are created automatically by VMware during snapshot management procedures.

Post by Felix » Oct 31, 2009 7:43 am

I must correct myself: these snapshots only get stuck on the VM if you manually abort a backup job, not if a scheduled backup fails for another reason.

Anyway, this reproducibly happens every time I manually abort a Veeam Backup job running through vStorage API.

So you are saying this is a bug in vSphere not in Veeam?

tjestr
Enthusiast
Posts: 44
Liked: never
Joined: Mar 05, 2009 9:33 am
Full Name: Falko Dohse
Location: Hamburg
Contact:

Post by tjestr » Oct 31, 2009 3:16 pm

I see the same behaviour too. When stopping a backup job (through vStorage API) in Veeam, the "consolidate helper-x" snapshot remains and is not removed automatically. There weren't any snapshots before the backup job started.

Environment: vSphere 4.0 + recent updates, vCenter Server is Win2k3R2 SP3 32-bit. All VMs are HW version 7. Backup mode: SAN/NBD failover vStorage API backup. Backup host: physical 4-core Opteron server.

Post by Gostev » Oct 31, 2009 6:35 pm

Felix wrote: So you are saying this is a bug in vSphere not in Veeam?
Yes, our product does not create this snapshot - it is automatically created by the ESX host during the snapshot management process. Whether the job fails due to an error or is manually stopped, Veeam Backup goes through the same (and only) cleanup procedure, which removes the snapshot we created during backup ("VEEAM BACKUP TEMPORARY SNAPSHOT") - you can easily confirm this by watching this snapshot appear first, and then disappear.

Now, when the last snapshot is being removed from the VM, the ESX host creates a "consolidate helper" snapshot to hold the data writes while the actual snapshot is being removed. Once that is done, ESX commits the "consolidate helper" snapshot itself into the main VMDK. Because VM I/O must be completely frozen in order to commit the last helper snapshot (for obvious reasons), the commit can only take place if both of these conditions are true:
- Helper snapshot size is less than 16MB (which is the minimum snapshot size in VMware)
- There is very little write I/O going on in the VM at the given moment

If either of these is not true, ESX will wait, iteratively creating new helper snapshots to hold writes while committing old ones (remember, it needs to have the smallest possible snapshot before the final commit) until there is a "good moment" to freeze the VM and commit the last snapshot. Now, this process may obviously take quite a long time, depending on the initial "consolidate helper" snapshot size (in turn, mostly defined by the VEEAM snapshot size - if the VEEAM snapshot is large, the "consolidate helper" will also grow large while the VEEAM snapshot is being committed), as well as on datastore and VM I/O load.
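To make the iteration described above more concrete, here is a minimal toy model in Python. It is only a sketch of the reasoning, not VMware's actual algorithm: the function name, the write and commit rates, and the round limit are all hypothetical, and only the 16MB threshold comes from the explanation above. Each round commits the current helper while a new helper absorbs the writes that arrive during that commit.

```python
def simulate_consolidation(helper_mb, write_rate_mb_s, commit_rate_mb_s,
                           threshold_mb=16.0, max_rounds=50):
    """Toy model of iterative "consolidate helper" commits (illustrative only).

    helper_mb         -- size of the first helper snapshot, in MB
    write_rate_mb_s   -- assumed VM write rate while a commit is running
    commit_rate_mb_s  -- assumed speed at which a helper is committed
    Returns (sizes, converged): the helper size at each round, and whether
    a helper smaller than threshold_mb was reached within max_rounds.
    """
    sizes = [helper_mb]
    for _ in range(max_rounds):
        if sizes[-1] < threshold_mb:
            # Small enough: the VM could be frozen briefly for the final commit.
            return sizes, True
        # Committing the current helper takes time; writes arriving in that
        # window land in a new helper, which becomes the next one to commit.
        commit_seconds = sizes[-1] / commit_rate_mb_s
        sizes.append(write_rate_mb_s * commit_seconds)
    return sizes, sizes[-1] < threshold_mb


if __name__ == "__main__":
    # Light write load: the helper shrinks every round.
    print(simulate_consolidation(1024, write_rate_mb_s=5, commit_rate_mb_s=50))
    # Writes outpace the commit: the helper never gets small enough.
    print(simulate_consolidation(1024, write_rate_mb_s=60, commit_rate_mb_s=50,
                                 max_rounds=10)[1])
```

In this model the helper shrinks by the ratio of write rate to commit rate on every round, which is why a large initial helper or a busy VM stretches the process out, and why it never finishes if writes arrive faster than they can be committed.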

And this actually explains why you observe the snapshot only when stopping the job manually. While you are at the console, stopping the job interactively and immediately going to investigate the Snapshot Manager - you will almost always see the consolidate helper snapshots present.

On the other hand, jobs almost never fail during the actual backup (there is simply no reason for a job to fail in the midst of data copying, unless the network goes down or something), so in most cases jobs fail before our snapshot is even created (or because that snapshot could not be created). Thus, there are no snapshots to commit in the first place, and so "consolidate helper" snapshots would simply never appear.

Now, there were also quite a few bugs around snapshot commit functionality in VMware. If you search VMware Communities for "consolidate helper", you will see about 100 threads about this problem of "consolidate helper" snapshots left behind. Most of those issues were bugs in older versions of ESX, and they are all fixed now. There are some scenarios where this could still happen under vSphere, for instance lack of free disk space on the datastore, and maybe some due to other new bugs - although I am not aware of any.

Assuming that you are not facing some new snapshot management bug or issue, all you have to do is simply wait until the "consolidate helper" removes itself. Remember that on VMs with heavy I/O, or when the actual datastore is loaded, this process may take quite a long time, up to an hour or more (although more typically, under 20 minutes even for heavily loaded Exchange servers). Also, keep in mind that while the actual vCenter task for snapshot removal times out in 10-15 minutes (can't remember the default setting), snapshot removal will still be processed by ESX in the background, and eventually the "consolidate helper" snapshot will be gone.

Phew, long post, hope this helps :)

Post by Felix » Oct 31, 2009 10:23 pm

You are forgetting the error I'm seeing in vCenter:

"Name: Remove snapshot, Target: VMNAME, Task: Unable to access file <unspecified filename> since it is locked"

This seems to indicate that vCenter is trying to remove the helper snapshot, but for some reason it can't determine the filename.

And the snapshots are stuck until I remove them manually. Sometimes I need to add a snapshot on top of the helper to make it appear in Snapshot Manager.

Post by Gostev » Oct 31, 2009 11:10 pm

Not forgetting it - I simply don't know what this error means... our code works solely with "VEEAM BACKUP TEMPORARY SNAPSHOT", which is removed successfully after you stop the job, since it no longer appears in the Snapshot Manager. Our code does not work with any other snapshots and thus cannot really "lock" anything else. Anyhow, I will ask QC to double check and try to reproduce something like this by stopping running jobs.

Post by tjestr » Nov 01, 2009 8:46 am

I've manually stopped a Veeam backup job of a VM that is not actively used. Even though the consolidate helper should not have grown (the VM was idle), it had not been removed after 6 hours, so I deleted the snapshot myself.
So there must be some kind of bug, either in VMware (most likely) or in Veeam.

Post by Gostev » Nov 01, 2009 8:09 pm

Falko, thanks for your testing. We will be trying to reproduce this behavior this week. Maybe in this scenario vStorage API does not release some resources correctly, or something like that.

Post by Felix » Nov 01, 2009 10:07 pm

Well, it seems to take some time for the lock to be released. I tried manually committing the snapshot right after the automatic removal failed with the error, and got the same error again. Then I waited a minute or two and tried again, and this time the snapshot was successfully removed, so it seems to be a timing issue.
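For anyone scripting this workaround, the "wait a bit and try again" approach could be sketched roughly as below. This is a generic retry loop in Python, not a Veeam or VMware API: `remove_snapshot` is a hypothetical callable standing in for whatever tool you use to delete the snapshot, `FileLockedError` is a made-up exception representing the "file is locked" error, and the 60-second delay is only a guess based on the minute or two mentioned above.

```python
import time


class FileLockedError(Exception):
    """Hypothetical stand-in for the 'Unable to access file ... locked' error."""


def remove_with_retry(remove_snapshot, attempts=5, delay_s=60.0, sleep=time.sleep):
    """Call remove_snapshot() until it succeeds or attempts run out.

    remove_snapshot -- hypothetical callable that raises FileLockedError
                       while the snapshot file is still locked
    Returns the number of the attempt that succeeded; re-raises the last
    FileLockedError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            remove_snapshot()
            return attempt
        except FileLockedError:
            if attempt == attempts:
                raise
            # Give the lock time to be released before trying again.
            sleep(delay_s)


if __name__ == "__main__":
    state = {"calls": 0}

    def fake_remove():
        # Simulated behaviour: locked on the first two tries, then free.
        state["calls"] += 1
        if state["calls"] < 3:
            raise FileLockedError("file is locked")

    print(remove_with_retry(fake_remove, delay_s=0))
```

The `sleep` parameter is injectable only so the loop can be exercised without actually waiting.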

Post by Gostev » Nov 02, 2009 9:05 pm

Just wanted to post an update that our QC was unable to confirm this issue in our lab today. Of course, this does not yet guarantee that the issue is not there. If this is a "chasing" situation between a number of processes, then the issue may appear in some environments but not in others (depending on storage speed, timeouts, etc). I am guessing that this "chasing" issue can potentially appear in some cases due to vStorage API having some synchronization issues in its cleanup procedures when terminated.

We will try to use some "heavy artillery" now and involve our system devs to look behind the scenes on what happens. If anyone who can easily confirm this issue is available for webex, please let me know.

Post by Felix » Nov 03, 2009 1:12 am

Shouldn't be a problem, so far I've been able to reproduce this at each abort.

bgbGsy
Influencer
Posts: 18
Liked: 2 times
Joined: Sep 10, 2009 7:40 pm
Full Name: Brendan Bougourd
Contact:

Post by bgbGsy » Nov 11, 2009 7:47 am

Hi Gostev.

I too have seen this behaviour using Veeam v4. I never encountered it on v3.x. The first time was when I had a VSS error in Veeam (due to trying to back up the SQL Server hosting the vCenter database, which I now realize is not possible if it is included in a job with other machines in a vCenter folder). The snapshot was created, but the job did not clean up after the failure. Upon trying to manually delete the Veeam Backup Temp Snapshot, I got the error others have reported about a file being locked. The way around it for me was to create a temporary snapshot (which appears as consolidated helper in the Snapshot Manager) and then to 'delete all'. After this, the directory listing of the VM shows no snapshots present.

I am using the latest install package of 32bit v4.

I would be interested if you have been able to replicate this behaviour at all.

Many thanks.

Brendan.

Post by Gostev » Nov 18, 2009 1:25 pm

Update. We have changed the way vStorage API jobs are stopped when a manual STOP command is issued, so the issue above should no longer happen. The hotfix has been built and is currently being verified (significant code changes there, so testing will take 1-2 days). After that, the hotfix will be available through support (or you can wait for the next minor release in December, where this will be incorporated).

Post by Gostev » Nov 18, 2009 2:22 pm

Sorry: slight correction! The fix for this specific issue will not be available in the form of a hotfix for version 4.0; it is implemented in the version 4.1 code branch only. The changes around this are major, and are not easily ported to version 4.0 because of additional dependencies. The version 4.1 release is expected in December 2009.

Post by tjestr » Nov 18, 2009 3:16 pm

I know this does not belong in this thread, but will you add the possibility to select datastores as a backup source in 4.1? (We would really like this feature!)

Post by Gostev » Nov 18, 2009 3:23 pm

Falko, there are no new features planned for 4.1 - just bug fixes and a few features which were implemented in 4.0, but disabled/hidden before the actual release due to lack of time for testing. Datastore selection is not there, unfortunately.

Post by Gostev » Dec 18, 2009 1:31 pm

Fixed in version 4.1
