Snapshot removal issues of a large VM

by curruscanis » Tue Jan 19, 2010 4:42 pm

I have an issue in my environment where a VM that is 500+ GB takes a very long time to remove its snapshot. After the Veeam backup process calls for snapshot removal, the VM goes offline from the network's perspective and stays in a snapshot removal state for almost an hour, sometimes more. Is there anything I can do to keep the VM online during this process, or to make the process take less time?

Thanks.
curruscanis
Novice
 
Posts: 4
Liked: never
Joined: Wed Jul 29, 2009 6:17 pm
Full Name: Jack

Re: Snapshot removal issues of a large VM

by tsightler » Tue Jan 19, 2010 5:48 pm

What version and patch level of VMware are you running? VMs should generally stay online during snapshot removal except for a few seconds as the final commits are made. We back up several VMs that are 500+ GB, including one that's 1.2 TB, and I've never seen this issue.
tsightler
Veeam Software
 
Posts: 4768
Liked: 1737 times
Joined: Fri Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Snapshot removal issues of a large VM

by jgremillion » Tue Jan 19, 2010 7:43 pm

We have this problem occasionally when performing backups of large GroupWise VMs. It seems to happen only when a Post Office has been pretty busy during the backup window. I asked VMware about this, and they said that if a VM was being used heavily while the snapshot existed and the backup was running, it can take quite a while for the snapshot to consolidate before it's removed. This can and will affect the VM's performance.

Their solution was not to perform a backup of a busy VM during heavy use periods.
jgremillion
Enthusiast
 
Posts: 87
Liked: never
Joined: Tue Oct 20, 2009 2:49 pm
Full Name: Joe Gremillion

Re: Snapshot removal issues of a large VM

by tsightler » Tue Jan 19, 2010 8:09 pm

Well, I can certainly understand an effect on the VM's performance, but he's saying the VM goes offline from the network's perspective for "almost an hour, sometimes more." That's a little more than a performance issue. We back up some very busy VMs, including our Exchange VM. It's almost 400 GB now and it's pretty busy almost all the time. It's not unusual for it to grow a multi-gigabyte snapshot that takes 30-40 minutes to remove, even when backing it up during a "quiet" time. Still, I've never seen a system go completely offline for an hour. That sounds like a serious problem to me.

Re: Snapshot removal issues of a large VM

by jgremillion » Tue Jan 19, 2010 8:35 pm

Well, I hope it doesn't happen to you either, because it ain't pretty when everyone starts yelling and calling. :D Users' mailboxes on the affected POs are pretty much inaccessible until the snapshot is consolidated and removed.

I've had two VMs become pretty much unusable during the snapshot removal time. The first time it happened I was skeptical, but around the third time I pretty much decided that I needed to start the backup earlier.

One thing that may be the culprit is that all of our large GroupWise (on Windows) VMs use virtual RDMs. I wonder if it's an issue with consolidating the snapshot of the RDM to the raw disk?

Re: Snapshot removal issues of a large VM

by jgremillion » Tue Jan 19, 2010 8:37 pm

And yes, I've had this happen for almost that long. I had one that was stuck for 45 minutes. Talk about a major panic around here.

Re: Snapshot removal issues of a large VM

by tsightler » Tue Jan 19, 2010 8:51 pm

I wasn't trying to claim that it couldn't happen, or that you didn't have it happen, only that I believe it shouldn't happen. To be fair, I've seen similar problems back in the ESX 3.5 days and earlier. There were some known issues with snapshot removal that could cause this. But since 3.5 U2 (I think U2, though it might have been U3) snapshot removal has been overhauled completely: it now uses helper snapshots in a loop until the final snapshot is small, and thus the "stun" time should be short. Veeam 4.x also has "safe snapshot removal" that lets you get similar behavior from older versions of ESX.

I guess what I'm saying is, if you're seeing this with current VMware versions, well, that still seems like a problem, perhaps something unique to your environment (slow storage, or the virtual RDMs you mention, whereas we use VMDKs, etc.). In other words, I'm fully buying that it can happen, but if it were happening to me, I don't think I'd let VMware off the hook with the "don't perform a backup of a busy VM" excuse. What if I were using snapshots for other purposes?
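
To make the helper-snapshot loop mentioned above a bit more concrete, here is a toy model in Python. It is only a sketch of the general idea, not VMware's actual implementation: the function name, the rates, and the stun threshold are all made-up assumptions. Each pass commits the current delta while a helper snapshot absorbs new writes, and the helper then becomes the next (smaller) delta.

```python
def simulate_consolidation(delta_gb, commit_gb_per_min, write_gb_per_min,
                           stun_threshold_gb=0.05, max_passes=50):
    """Toy model of looped snapshot removal (all parameters are hypothetical).

    Each pass commits the current delta while a helper snapshot absorbs
    new guest writes; the helper then becomes the next delta to commit.
    Returns (passes, online_minutes, final_stun_minutes).
    """
    if write_gb_per_min >= commit_gb_per_min:
        # In this model the helper would grow as fast as the delta shrinks.
        raise ValueError("writes outpace commits; the delta never shrinks")
    passes = 0
    online_minutes = 0.0
    while delta_gb > stun_threshold_gb and passes < max_passes:
        pass_minutes = delta_gb / commit_gb_per_min   # VM stays online here
        online_minutes += pass_minutes
        delta_gb = write_gb_per_min * pass_minutes    # helper's accumulated writes
        passes += 1
    # Only the last, small delta is committed while the VM is stunned.
    final_stun_minutes = delta_gb / commit_gb_per_min
    return passes, online_minutes, final_stun_minutes


if __name__ == "__main__":
    print(simulate_consolidation(20, 1.0, 0.1))
```

In this model a 20 GB delta with writes at a tenth of the commit rate needs three passes and ends with a tiny final commit, which is why a large snapshot by itself should mean a long removal but not a long outage. If the storage commits barely faster than the guest writes, the passes shrink very slowly, which is one way slow storage could show up.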

Re: Snapshot removal issues of a large VM

by Gostev » Tue Jan 19, 2010 11:19 pm

ESX4 does indeed have an issue where removing a snapshot causes long VM freezes, but this only happens if another snapshot already exists on the VM before you create, and then try to delete, an additional snapshot. The VM freeze is proportional to the size of the first (existing) snapshot, and does not depend on how big the second snapshot has grown. So please check whether you have other snapshots on your VM.

This is the only issue I am aware of that may cause significant downtime on a production VM during snapshot removal with ESX4. If you do not have an additional snapshot, then your VM definitely should not become inaccessible for more than a few seconds during snapshot removal, no matter what the snapshot size is. I've personally done stress testing on this (snapshot removal while copying large files to the VM). The way snapshot removal is implemented in ESX4 ensures that large snapshots do not result in longer VM freezes (except for the issue/bug with extra snapshots present, described above).
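
If you want to script the "do I have other snapshots?" check rather than eyeball Snapshot Manager, the logic is just a walk over the snapshot tree. The sketch below is plain Python over a hypothetical nested structure (a list of dicts with "name" and "children" keys); it does not call any VMware API, and the snapshot name "backup-temp" is a placeholder for whatever name your backup job gives its own snapshot.

```python
def find_preexisting_snapshots(tree, backup_snapshot_name):
    """Return names of every snapshot in the tree except the backup's own.

    `tree` is a hypothetical nested structure: a list of dicts with
    "name" and "children" keys, mirroring what Snapshot Manager displays.
    """
    found = []
    stack = list(tree)
    while stack:
        node = stack.pop()
        if node["name"] != backup_snapshot_name:
            found.append(node["name"])
        # Descend into child snapshots, if any.
        stack.extend(node.get("children", []))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical tree: two leftover helper snapshots above the backup's snapshot.
    tree = [{
        "name": "Consolidate Helper-0",
        "children": [{
            "name": "Consolidate Helper-0",
            "children": [{"name": "backup-temp", "children": []}],
        }],
    }]
    print(find_preexisting_snapshots(tree, "backup-temp"))
```

Anything this returns is a snapshot that existed before the backup's own, which is the condition for the long freeze described above.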
Gostev
Veeam Software
 
Posts: 21390
Liked: 2349 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Snapshot removal issues of a large VM

by jgremillion » Wed Jan 20, 2010 2:45 am

I do not have any other snapshots when this happens. It only happens when the snapshot from VBR is being removed. No other snapshots.

Re: Snapshot removal issues of a large VM

by curruscanis » Wed Jan 20, 2010 5:12 pm

First, my version of ESX: 4.0.0 Build 164009
vCenter version: 4.0.0 Build 162856

I do have some items in Snapshot Manager for my large VM: two levels of Consolidate Helper-0.

Is this a remnant of failed backups?

Thank you all for your help and suggestions.

Re: Snapshot removal issues of a large VM

by tsightler » Wed Jan 20, 2010 6:19 pm

I have never seen the problem you describe with ESX 4, but of course that doesn't mean it doesn't exist. Are you running the latest VMware Tools? Do you have "VMware Tools Quiesce" disabled? You might want to make sure that the VMware Tools sync driver is either not installed or disabled; having this legacy service enabled has been known to cause hangs during snapshot removal. Just a few thoughts.

I'd also suggest that you remove the snapshots that are currently on the VM. It's likely that those are leftovers from failed backups, and I would remove them via the Snapshot Manager GUI.

Re: Snapshot removal issues of a large VM

by Gostev » Thu Jan 21, 2010 12:41 pm

Jack, also make sure you are using the latest Veeam Backup version (4.1), as with the previous release a Consolidate Helper-0 snapshot could be left behind if you stopped the backup job manually.

Tom is correct that you should remove the snapshot manually. If this option is not available in the Snapshot Manager GUI, create an extra (new) snapshot first; you will then be able to remove the helper snapshot.

Thanks!

Re: Snapshot removal issues of a large VM

by Ace T » Mon Mar 22, 2010 11:42 am

We have this issue on 4.1 on ESX4: a large VM is sitting there removing snapshots and is not available on the network. What can I do to resolve this?
Ace T
Enthusiast
 
Posts: 35
Liked: never
Joined: Wed Dec 02, 2009 8:32 am
Full Name: Amit Panchal

Re: Snapshot removal issues of a large VM

by Gostev » Mon Mar 22, 2010 11:56 am

Amit, the only possible cause we know about is described in my post above (20 Jan 2010). If this does not apply to you, it would be better to open a support case with VMware to investigate why snapshot removal causes such long VM lockups. Veeam Backup merely issues the command to remove the snapshot, so this is similar to removing the snapshot manually with VMware Infrastructure Client. The actual process is handled entirely by the ESX host.

This is definitely not "normal" behavior. No matter how large the VM or snapshot is, this should not be happening.

Re: Snapshot removal issues of a large VM

by Ace T » Mon Mar 22, 2010 2:06 pm

Hi Gostev,

VMware could not see where the problem was and advised me to wait until the operation completes, then make sure all snapshots are removed from the VM. They looked through all the logs and said it was just a very slow snapshot removal process, but they could not explain why the VM cannot be pinged. This is a large VM, and the snapshot removal is still running now and has been going for over 4 hours. I can see the Consolidate Helper snapshot, but there are 3 snapshots in total, so it is taking a while to clear them all.
