benwaynet
Enthusiast
Posts: 30
Liked: never
Joined: Apr 15, 2009 5:27 pm
Full Name: Jason Benway
Location: West Michigan
Contact:

Re: Snapshot removal issues of a large VM

Post by benwaynet »

Does anyone have an open post on the VMware forums that we can follow?

We just started experiencing this issue this week, after months of the same backup schedule.

jb

zak2011
Expert
Posts: 367
Liked: 41 times
Joined: May 15, 2012 2:21 pm
Full Name: Arun
Contact:

Re: Snapshot removal issues of a large VM

Post by zak2011 » 1 person likes this post

Hello Jason,
I was experiencing the issue for a few VMs on a particular ESX host. After the Veeam patch was applied, I didn't experience the problem for the same VM; however, I started seeing leftover snapshots for another two VMs on the same ESX host.
After a complete investigation, it turned out that we were running into a pure VMware issue. The builds of my vCenter and ESX hosts were quite old: vCenter was 4.0.0.193498 and the ESX host was 4.0.0.164009.
Several issues with snapshot removal have been reported with these builds of vCenter, so I was advised to update vCenter and the hosts. I have upgraded them to the latest build and there haven't been any issues since.
So applying the Veeam patch along with the vCenter upgrade helped.

Hope this helps,
Arun

tgodon
Novice
Posts: 6
Liked: never
Joined: May 06, 2013 12:13 pm
Full Name: Tom Godon
Contact:

[MERGED] Exchange Backup Snapshot Removal

Post by tgodon »

After an Exchange backup runs, it takes more than 2 days for the snapshot to be removed. For example, a snapshot removal started last night at 8:31pm, and at 8:15 this morning it is only 20% complete. For all other backups the snapshot is removed rather quickly. When the snapshot removal finally does complete, the mailbox databases fail over to the other Exchange server. Our Exchange is 2010 in a DAG environment. We are running VMware ESXi 5.1 and Veeam v6.5 on a physical 32-bit Windows 2003 R2 server. I plan to move Veeam to 64-bit 2008 hardware, but this will not be for a couple of weeks. FYI, I'm not sure that anything really changed, but when first installed and configured the Exchange backups were not an issue.

Any help would be appreciated.

Tom
Tom Godon
AVP & Network Engineer
Bollinger, Inc.

JWester
Service Provider
Posts: 58
Liked: 7 times
Joined: Apr 04, 2011 8:56 am
Full Name: Joern Westermann
Contact:

Re: [MERGED] Exchange Backup Snapshot Removal

Post by JWester »

tgodon wrote:After an Exchange backup runs, it takes more than 2 days for the snapshot to be removed. For example, a snapshot removal started last night at 8:31pm, and at 8:15 this morning it is only 20% complete. For all other backups the snapshot is removed rather quickly. When the snapshot removal finally does complete, the mailbox databases fail over to the other Exchange server. Our Exchange is 2010 in a DAG environment. We are running VMware ESXi 5.1 and Veeam v6.5 on a physical 32-bit Windows 2003 R2 server. I plan to move Veeam to 64-bit 2008 hardware, but this will not be for a couple of weeks. FYI, I'm not sure that anything really changed, but when first installed and configured the Exchange backups were not an issue.

Any help would be appreciated.

Tom
Hi Tom,

1st: Take a snapshot in vCenter, wait as long as a backup takes, then remove the snapshot. If you see the same behaviour as with Veeam Backup, the problem is in your vSphere installation.
2nd: Have a look at your datastore I/O. Is there enough capacity left? What's the disk usage (not space!) during snapshot removal? If it's > 80%, you have probably found the bottleneck.
3rd: During snapshot deletion, have a look at the snapshot files on the datastore. Is it switching between snapshot 01 and 02, with each snapshot file growing larger than the one before? Then your VM is writing more data than the snapshot deletion task is able to integrate into the vmdk.
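
The third point can be pictured with a rough back-of-the-envelope model. This is only a sketch: it assumes the commit drains the delta at a constant rate while the guest refills it at a constant rate, which is a simplification and not VMware's actual consolidation algorithm. The function name and the example rates are made up for illustration.

```python
def estimate_commit_seconds(delta_bytes, commit_rate, guest_write_rate):
    """Estimate how long a snapshot commit takes while the guest keeps writing.

    Simplified model (an assumption, not VMware's algorithm): the commit
    drains the delta at commit_rate bytes/s while the guest adds new data at
    guest_write_rate bytes/s. Returns None when the delta never shrinks.
    """
    net_drain = commit_rate - guest_write_rate
    if net_drain <= 0:
        return None  # the VM writes at least as fast as the commit can merge
    return delta_bytes / net_drain


MiB = 1024 ** 2
GiB = 1024 ** 3

# Hypothetical numbers: a 30 GiB delta, 40 MiB/s commit throughput.
print(estimate_commit_seconds(30 * GiB, 40 * MiB, 10 * MiB))  # commit converges
print(estimate_commit_seconds(30 * GiB, 40 * MiB, 45 * MiB))  # commit never catches up
```

If the second case matches what you see on the datastore (each new snapshot file larger than the last), faster storage or a quieter backup window is the only thing that changes the outcome in this model.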

We also have one VM with an MS SQL Server that averages about 700 write IOPS, with peaks up to 5000. It's impossible to back it up with VBR because the snapshot cannot be deleted. Yes, after some hours it will be deleted: the VM is frozen for 30 minutes, the snapshot is integrated and the VM is released. But that is nothing I can use in a production setup. So in this case we use a "classic" backup with an agent installed inside the VM.
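
To get a feeling for what such write rates mean for the delta file, here is a small calculation. The 8 KiB write size and the 2-hour snapshot lifetime are assumptions chosen for illustration, not measured values, and the result is only an upper bound because rewrites of the same block do not grow the delta again.

```python
def delta_growth_upper_bound(write_iops, io_size_bytes, snapshot_open_seconds):
    """Upper bound on delta growth, assuming every write touches a new block."""
    return write_iops * io_size_bytes * snapshot_open_seconds


GiB = 1024 ** 3
io_size = 8 * 1024    # assumed write size in bytes
open_for = 2 * 3600   # assumed snapshot lifetime in seconds

average = delta_growth_upper_bound(700, io_size, open_for)
peak = delta_growth_upper_bound(5000, io_size, open_for)
print(f"average load: up to {average / GiB:.1f} GiB of delta")
print(f"peak load:    up to {peak / GiB:.1f} GiB of delta")
```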

Joern

dinycarpenter
Lurker
Posts: 1
Liked: never
Joined: Jun 03, 2013 5:58 am
Full Name: Charpentier Denis
Contact:

[MERGED] Removing VM Snapshot

Post by dinycarpenter »

Hi.

I have Veeam Backup & Replication v6.5.

I have one replication job with 6 VMs.

I have a problem with one VM (Windows 2008 R2): the replication job has been stuck at the "Removing VM Snapshot" stage for 9 hours.

How can I solve this problem?

Thx

foggy
Veeam Software
Posts: 19444
Liked: 1765 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Snapshot removal issues of a large VM

Post by foggy »

Hello! Please look through this thread for possible causes of long snapshot removal. What kind of VM is it? Does it have a highly transactional application installed on it (e.g. Exchange Server, SQL Server)?

mfmesa
Novice
Posts: 5
Liked: never
Joined: May 13, 2013 8:41 am
Full Name: Manuel R. Fdez Mesa
Contact:

[MERGED] : Loss of communication with the server

Post by mfmesa »

Hi, good morning. I have found a serious loss of service on a server at the end of the replica and backup jobs run with Veeam B&R 6.5: programs that access that server during the night shift fail for some minutes while these operations are performed (perhaps because of the snapshots?).

I wanted to know if it can be related to VMware KB 2003638.

Consolidating snapshots in vSphere 5.x

http://kb.vmware.com/selfservice/micros ... Id=2003638

To make sure that this server's replica was the cause, I created a new replica job just for that server and scheduled it at a different time (3:00 to 3:15 pm). We confirmed that the loss of access occurs again just when the replica finishes, at 3:28 pm, and the same happens with the backup job for that server.

What could be the reason that the server becomes inaccessible to the programs?

Until several weeks ago, everything worked normally.

Can anyone tell us the right direction to solve this serious problem?

Thanks in advance and a friendly greeting. Manuel.

veremin
Product Manager
Posts: 17834
Liked: 1662 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Snapshot removal issues of a large VM

Post by veremin »

You've been merged into an existing discussion, so kindly take a look at the information given above.

In a nutshell, this behaviour is usually related to the VMware side rather than the Veeam side. So, as a first step of the investigation, I would recommend reproducing the situation in order to understand who should address the issue (VMware or Veeam). Take a snapshot of your production VM manually, keep it open for about as long as it takes to back up the VM, then delete it and see whether you experience the same issue.
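
The manual test can also be scripted. The sketch below assumes `vm` is a `vim.VirtualMachine` object obtained through pyVmomi (connecting to vCenter and looking up the VM are not shown). `CreateSnapshot_Task` and `RemoveSnapshot_Task` are vSphere API methods; the helper names, the snapshot name and the polling interval are illustrative choices. The snapshot is taken without memory and without quiescing, which may differ from what your backup job does.

```python
import time


def wait_for_task(task, poll_seconds=5.0):
    """Poll a vSphere task object until it leaves the queued/running states."""
    while task.info.state in ("queued", "running"):
        time.sleep(poll_seconds)
    if task.info.state != "success":
        raise RuntimeError(f"task failed: {task.info.error}")
    return task.info.result


def time_manual_snapshot_cycle(vm, hold_seconds, poll_seconds=5.0):
    """Create a snapshot, hold it open, remove it; return removal time in seconds."""
    snapshot = wait_for_task(
        vm.CreateSnapshot_Task(
            name="manual-commit-test",
            description="Reproduce slow snapshot removal without the backup job",
            memory=False,
            quiesce=False,
        ),
        poll_seconds,
    )
    time.sleep(hold_seconds)  # roughly the duration of the backup job
    started = time.monotonic()
    wait_for_task(snapshot.RemoveSnapshot_Task(removeChildren=False), poll_seconds)
    return time.monotonic() - started
```

If the removal time measured this way is about as long as what you see after a backup job, the backup software is unlikely to be the cause.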

Thanks.

mfmesa
Novice
Posts: 5
Liked: never
Joined: May 13, 2013 8:41 am
Full Name: Manuel R. Fdez Mesa
Contact:

Re: Snapshot removal issues of a large VM

Post by mfmesa »

Thank you, v.Eremin.

I reproduced the situation by changing the time at which the replica runs, and verified that at the end of the job the programs lose their connection with the server.

I can't experiment with VMware and Veeam because it affects production, so that is impossible.

I am reading the thread and thinking about a solution.

Best regards, Manuel.

m.wonschik
Lurker
Posts: 1
Liked: never
Joined: Jul 01, 2013 9:37 am
Contact:

[MERGED] Replication every 30 minutes / timeouts / rds disco

Post by m.wonschik »

hello users,

I have a problem with Veeam. A replication job runs on my SQL server periodically, every 30 minutes.
At the end of the job, my SQL server times out for 4 - 5 seconds, and during that time all users are disconnected.
So no user can work during that time. Does anyone have an idea, or maybe a solution?

Kind regards

Magnus Wonschik

foggy
Veeam Software
Posts: 19444
Liked: 1765 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Snapshot removal issues of a large VM

Post by foggy »

Magnus, most likely this timeout is caused by the snapshot commit operation. Please take a look at the topic you've been merged into for possible tips and tricks that might help and feel free to ask any additional questions in case you have them. Thanks!

Daveyd
Expert
Posts: 283
Liked: 11 times
Joined: May 20, 2010 4:17 pm
Full Name: Dave DeLollis
Contact:

Re: Snapshot removal issues of a large VM

Post by Daveyd »

Vitaliy S. wrote:

Has anyone had any success with snapshot commit time decreasing (Especially for Exchange) when moving the snapshot location to a different datastore on a different set of disks (like SSD or a different RAID10 LUN on a different RAID group) as stated above?

dellock6
Veeam Software
Posts: 5927
Liked: 1743 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Snapshot removal issues of a large VM

Post by dellock6 »

To be honest, I'm not sure the increase in commit performance would be that good, because regardless of where you store the temporary snapshot, in the end the commit is going to write the changes and merge them into the original vmdk disk, which still lives on the main (and slower) datastore.

Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2020
Veeam VMCE #1

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

What I'm finding is that the snapshot DOES indeed commit, but Veeam has stalled, almost as if it's taken longer than a specified timeout threshold and it hasn't re-checked to see if the snapshot is gone. This behaviour is somewhat random but is affecting at least one job per week (both backup and replication).

tsightler
VP, Product Management
Posts: 5678
Liked: 2490 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Snapshot removal issues of a large VM

Post by tsightler »

dellock6 wrote:To be honest, I'm not sure the increase in commit performance would be that good, because regardless of where you store the temporary snapshot, in the end the commit is going to write the changes and merge them into the original vmdk disk, which still lives on the main (and slower) datastore.
True, but the big benefit here is that relocating the -delta files to a different datastore means that, instead of performing a bunch of what is effectively mixed R/W I/O on the same datastore, you get read I/O from one datastore and write I/O to another. This significantly reduces the IOPS load on any single datastore, since it's now split across two datastores, and typically the latency to commit each block goes down noticeably. That being said, one mistake I've seen made is people redirecting snapshots to a new datastore that is still backed by the same physical disk pool. In this case there's not going to be much improvement at all, since you're still performing a mix of R/W on the backend disks.
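
A toy illustration of that split, assuming each changed block costs one read from the delta and one write to the base disk (metadata and guest I/O are ignored, and the pool names are made up):

```python
def commit_io_per_pool(changed_blocks, delta_pool, base_pool):
    """Return {physical pool: I/O operations} for committing changed_blocks.

    Assumes one read from the delta and one write to the base disk per block.
    """
    load = {}
    load[delta_pool] = load.get(delta_pool, 0) + changed_blocks  # reads of the delta
    load[base_pool] = load.get(base_pool, 0) + changed_blocks    # writes to the base vmdk
    return load


# Delta and base disk on the same physical pool: one pool takes everything.
print(commit_io_per_pool(1000, "sas-pool", "sas-pool"))
# Delta redirected to a datastore on a different pool: the load is split.
print(commit_io_per_pool(1000, "ssd-pool", "sas-pool"))
```

The first call also covers the mistake mentioned above: two datastores backed by the same pool still land in the same bucket.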

I did some extensive testing on this with one customer back in the 4.x days and the difference was significant: using a dedicated datastore for storing the deltas reduced snapshot commit time by roughly 50% in their case. 5.x made significant changes in snapshot handling and I don't typically see as many problems, but I still suspect it would make a difference. Unfortunately, doing so also has some significant administrative challenges (you can't do this with Storage DRS, and Storage vMotions wipe out the manual changes required), so I'm not aware of a lot of customers doing this at this point. I still know of a few large ones that do; in some cases they redirect snapshots to SSD or hybrid SSD arrays, which helps keep performance decent while the snapshot is open and also improves the commit.

It's certainly very easy to test to see if it will help in your environment.

zak2011
Expert
Posts: 367
Liked: 41 times
Joined: May 15, 2012 2:21 pm
Full Name: Arun
Contact:

Re: Snapshot removal issues of a large VM

Post by zak2011 »

Are there any problems for Veeam if a manually created snapshot is kept for a long period of time?

Thanks

foggy
Veeam Software
Posts: 19444
Liked: 1765 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Snapshot removal issues of a large VM

Post by foggy » 1 person likes this post

Aside from the fact that long snapshots affect VM operation performance per se, they also can impact the process of getting the actual state of each virtual disk performed during backup/replication (although, probably not as hard as the actual consolidation of such snapshot).

zak2011
Expert
Posts: 367
Liked: 41 times
Joined: May 15, 2012 2:21 pm
Full Name: Arun
Contact:

Re: Snapshot removal issues of a large VM

Post by zak2011 »

Thank you, Alex.

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

Is there a way to increase the timeouts that Veeam has for snapshot operations? I often find (and this is loosely related to the guest's VMware Tools and virtual hardware version) that snapshots do actually commit quickly enough, but Veeam hangs waiting for it. Is this a Veeam timeout or more of an ESX timeout? (i.e. ESX is telling Veeam the operation has timed out, even though it does eventually complete in 15-20 mins or so)?

dellock6
Veeam Software
Posts: 5927
Liked: 1743 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Snapshot removal issues of a large VM

Post by dellock6 »

Honestly, I've never seen Veeam "hanging" while waiting for a snapshot commit, even with commits of 10 hours or more.
What usually happens is vSphere crashing if the commit operation lasts too long, but this is a VMware-related problem.

Luca.

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

That's what I'm wondering, Luca. I believe vCenter has a 15-min timeout. I'm wondering whether the operation times out after 15 mins, but the snapshot actually does complete after say 20 mins. Thus Veeam never actually gets a 'completed' message.
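
To make the hypothesis concrete: a waiter that relies only on the task notification would be stuck in the scenario I describe, while one that falls back to checking the VM itself would notice the snapshot is gone. This is purely an illustration of the idea; it says nothing about how Veeam or vCenter are actually implemented, and `task_done` and `snapshot_exists` are hypothetical callables.

```python
import time


def wait_for_snapshot_removal(task_done, snapshot_exists,
                              task_timeout_s, recheck_every_s, give_up_after_s):
    """Wait for a removal task; after the task timeout, check the VM directly.

    Returns "task-completed", "completed-after-timeout" or "still-present".
    """
    start = time.monotonic()
    # Phase 1: trust the task notification until the task timeout expires.
    while time.monotonic() - start < task_timeout_s:
        if task_done():
            return "task-completed"
        time.sleep(recheck_every_s)
    # Phase 2: the notification never arrived, so look at the snapshot itself.
    while time.monotonic() - start < give_up_after_s:
        if not snapshot_exists():
            return "completed-after-timeout"
        time.sleep(recheck_every_s)
    return "still-present"
```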

dellock6
Veeam Software
Posts: 5927
Liked: 1743 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Snapshot removal issues of a large VM

Post by dellock6 »

Oh, the 15-minute timeout dates from the days of ESX 3.5 and the first versions of 4.0:
http://kb.vmware.com/selfservice/micros ... Id=1004932
but it was fixed many years ago. What version of ESXi are you running?

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

5.1 Update 1. Upgrading from 4.1 to 5.1 made a big difference, but we still get the occasional issue.

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

I believe I have narrowed down this issue a little further. It is indeed larger VMs that have an issue. In my situation, these VMs tend to have one or more VMDK files located on a SATA LUN on the SAN (IBM V7000).

Would it be worth relocating snapshot files to a faster LUN? Or an easier option would be to relocate the VMDK files to faster disk as a test, say to allow 2 weeks of clear running.

foggy
Veeam Software
Posts: 19444
Liked: 1765 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Snapshot removal issues of a large VM

Post by foggy »

DavidReimers wrote:Would it be worth relocating snapshot files to a faster LUN?
Yes, moving snapshot location to a different (preferably faster) datastore is among the basic recommendations in case of such issues.

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

Well, here's an interesting one. I thought I had it narrowed down to VMs with 1 or more VMDKs on a SATA LUN (the rest of the SAN is SAS). However, I just had one fail overnight and it only has SAS storage. AND - this is the interesting bit - if I go into vCenter and manually remove the snapshot, it deletes within a few seconds.

This is where I'm stuck. It's like Veeam is requesting the snapshot be removed, but vCenter isn't responding or actually doing the removal. We're running ESX / vCenter 5.1 Update 1.

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

Of note, I can find some errors in the job log.

So it appears Veeam is requesting removal of the snapshot, but VMware is saying it doesn't have a snapshot. In this case, the snapshot was still present when I checked the VM the next day (the job was still running but had been sitting at 'removing snapshot' for 8 hours).

Vitaliy S.
Product Manager
Posts: 24246
Liked: 1859 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Snapshot removal issues of a large VM

Post by Vitaliy S. »

Hi David, did you show these logs to VMware support team? It would be good to know the reason why vCenter Server reports that there is no snapshot on the VM...

DavidReimers
Enthusiast
Posts: 50
Liked: 2 times
Joined: Sep 20, 2010 4:39 am
Full Name: David Reimers
Contact:

Re: Snapshot removal issues of a large VM

Post by DavidReimers »

Not yet - I wasn't sure if this was something that had been noticed before by other Veeam users. I will look at logging a support call with Veeam.
I'm running the same build of Veeam at a site with vCenter 5.1 Update 1 but ESX 5.1, and that site doesn't have any issues.

veremin
Product Manager
Posts: 17834
Liked: 1662 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Snapshot removal issues of a large VM

Post by veremin »

Hi, David.

I believe there must have been a slight misunderstanding between you and Vitaliy. He was actually asking you to provide this information to the VMware support team, not the Veeam one, since the reason why vCenter didn't notice the snapshot is best investigated by VMware itself.

Thanks.
