Host-based backup of VMware vSphere VMs.
ziceman
Novice
Posts: 4
Liked: never
Joined: Jun 14, 2021 12:20 am
Full Name: Stefan Zauchenberger

Out-of-control VMWare snapshot on 6.0 U3

Post by ziceman »

I have a specific VM (Windows Server 2016) running on ESXi 6.0 U3 that was unfortunately allowed to run on a snapshot for 2 years. I never manually created any snapshots, so it was either done by one of my colleagues or by a backup program that did not clean up afterwards. Either way, it never should have gone undetected. It was a huge miss, and now we have a huge (literally) mess, so I am greatly in need of some insight and recommendations on how best to avoid a worst-case scenario.

I know the snapshot needs to be deleted, but I am afraid this would lock up the VM for hours or even days. This is a production server that cannot be down that long. I have seen horror stories about "remove snapshot stuck at 99%" when the snapshots are very large. This would be a disaster.

It is backed up by Veeam, but I am thinking that restoring to another host would carry the big Snapshot over as well. Is that true?

Also, this is a small site with two hosts running VMware Essentials, so there is no vCenter.

When the backup program last completed overnight, removing the temporary snapshot took nearly an hour, and the VM was not accessible during that time. The other VM was OK, but this main one seemed dead in the water. The host disk I/O did not seem to go crazy and there was not much CPU or RAM utilization, but removal of the backup's temporary snapshot hung the VM for pretty much the whole hour.

The storage is an 8-disk SATA array:
FREE: 1.47 TB
USED: 2.16 TB (60%)
CAPACITY: 3.64 TB

Here are the other hardware specs:
Dell PowerEdge R620
12 CPUs x Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
47.96 GB RAM
The VM:

CPU 12 vCPUs
Memory 16 GB
Hard disk 1 250 GB
Hard disk 2 750 GB

I am very concerned (and, it seems, rightly so) that removing this one could lock up the VM for days. I am not sure whether a restored backup would fix it.

Feeling kind of trapped. Any ideas..?
Andreas Neufert
VP, Product Management
Posts: 6748
Liked: 1408 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany

Re: Out-of-control VMWare snapshot on 6.0 U3

Post by Andreas Neufert »

Veeam Backup & Replication has Snapshot Hunter functionality baked into the processing that detects stuck snapshots and tries to remove them with some advanced processing. We basically do not trust what ESXi/vCenter tells us about snapshot removal success and go back to the filesystem to verify it. If the snapshot cannot be removed for whatever reason, we report this in the job statistics and alerts. The snapshot name may also be an indicator of where it came from.
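
To illustrate the idea of checking the filesystem rather than trusting the reported snapshot state, here is a minimal Python sketch. It is not how Snapshot Hunter is implemented. It only assumes the usual VMFS naming convention for snapshot files ("<disk>-000001.vmdk" descriptors with "-delta.vmdk" or "-sesparse.vmdk" data files) and a hypothetical VM folder path that you would replace with your own.

```python
import re
from pathlib import Path

# Assumption: snapshot redo logs follow the common VMFS naming scheme,
# e.g. "srv-000001.vmdk", "srv-000001-delta.vmdk", "srv-000001-sesparse.vmdk".
# Base disks ("srv.vmdk", "srv-flat.vmdk") do not match this pattern.
SNAPSHOT_FILE = re.compile(r"-\d{6}(-delta|-sesparse)?\.vmdk$")


def find_snapshot_files(vm_dir):
    """Return (file name, size in bytes) for each snapshot-related VMDK in vm_dir."""
    hits = []
    for path in sorted(Path(vm_dir).iterdir()):
        if path.is_file() and SNAPSHOT_FILE.search(path.name):
            hits.append((path.name, path.stat().st_size))
    return hits


def total_snapshot_bytes(hits):
    """Sum the sizes of all snapshot-related files found."""
    return sum(size for _, size in hits)


if __name__ == "__main__":
    # Hypothetical path; point this at the VM's folder on the datastore.
    vm_folder = "/vmfs/volumes/datastore1/my-vm"
    if Path(vm_folder).is_dir():
        found = find_snapshot_files(vm_folder)
        for name, size in found:
            print(f"{name}: {size / 1024**3:.1f} GiB")
        print(f"total: {total_snapshot_bytes(found) / 1024**3:.1f} GiB")
```

If the folder still contains such files after a snapshot removal was reported as successful, the snapshot chain is probably still in place.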

Anyway, it looks like you have additional space for a copy of the VM. So if you do not want to perform the snapshot removal, one way out may be to replicate the VM with Veeam and then fail over (and make the failover permanent once it has succeeded). Then just remove the old version from disk. When you replicate the VM, make sure to choose a target name that can stay that way after failover (using the same name as the source is not possible).
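
As a rough sanity check on the space question, the arithmetic can be sketched in Python using the numbers from the first post (disks of 250 GB and 750 GB, 1.47 TB free). The 10% headroom and the assumptions of thick-provisioned disks and binary units (1 TB = 1024 GB) are mine, not something Veeam or VMware prescribes.

```python
def required_tb(disk_sizes_gb, headroom=0.10):
    """Worst-case space for a full copy of the VM, in TB.

    Assumes every disk is fully allocated (thick) and adds a
    headroom fraction for replica restore points and metadata.
    """
    base_tb = sum(disk_sizes_gb) / 1024  # GB -> TB, binary units assumed
    return base_tb * (1 + headroom)


def replica_fits(disk_sizes_gb, free_tb, headroom=0.10):
    """True if the worst-case copy fits into the reported free space."""
    return required_tb(disk_sizes_gb, headroom) <= free_tb


if __name__ == "__main__":
    disks = [250, 750]  # Hard disk 1 and Hard disk 2 from the post
    free = 1.47         # FREE value reported for the array
    print(f"needed: {required_tb(disks):.2f} TB, free: {free:.2f} TB")
    print("fits" if replica_fits(disks, free) else "does not fit")
```

With these inputs the copy needs a little over 1 TB, so it fits in the 1.47 TB reported as free, provided nothing else on that datastore grows in the meantime.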

"Old" snapshots are not part of the backup processing and are not replicated; only the data needed for the specific restore point is.

Depending on what license you have from Veeam, you could run Veeam ONE (or maybe even the free/community edition?). It offers snapshot lifetime and size monitoring and would alert you about this issue.