Host-based backup of VMware vSphere VMs.
ferrus
Veeam ProPartner
Posts: 299
Liked: 43 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Ten minute backup delay, with VM snapshots

Post by ferrus »

I've been testing out IBM Storage Integration in 9.5 u3, and so far everything is working well first time.

One issue I've noticed, however, is that any VM with an existing VMware snapshot experiences an 8-13 minute delay during the initial backup stage.
This only happens with Storage Integration enabled, and only on VMs with snapshots. The delay happens at the "Collecting disk files location data" stage - and holds up all other VMs in the same storage snapshot from progressing.

Before I submit a support ticket - is this normal, or does anyone have a solution for it?
Gostev
Chief Product Officer
Posts: 31540
Liked: 6712 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Ten minute backup delay, with VM snapshots

Post by Gostev »

I don't think it's normal and would suspect either a vCenter API performance issue, or some suboptimal code on our side not handling this scenario very well - we'll need to see the debug logs to understand what is taking the time. Thanks!

Re: Ten minute backup delay, with VM snapshots

Post by ferrus »

Thanks for the reply. It's still a new system - so the issue could be our side.
Will open a call and send the debug logs now.
chalkynz
Influencer
Posts: 23
Liked: 3 times
Joined: Aug 06, 2019 2:02 am
Full Name: Nathan Shaw
Contact:

Re: Ten minute backup delay, with VM snapshots

Post by chalkynz »

Hi, did you get to the bottom of this? I have seen the same thing - only when using storage integration and when SAN snapshots are present. Up to a 45 minute delay for a single VM backup job. The test environment consists of one VM, one VMDK, one datastore, one LUN. I can add a 2nd VM to the same datastore and back it up in 2 minutes.
Vitaliy S.
VP, Product Management
Posts: 27116
Liked: 2720 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Ten minute backup delay, with VM snapshots

Post by Vitaliy S. » 1 person likes this post

Nathan, I believe this is a slightly different issue, since you're seeing the delay when SAN snapshots are present while the OP was referring to vSphere snapshots. In this situation, it's better to contact our support team to validate the setup and review the debug log files.

Re: Ten minute backup delay, with VM snapshots

Post by ferrus » 2 people like this post

I had to check the support calls for this, as we experienced two similar issues.
One caused a 20-45 minute per disk delay on Backup Copy Jobs (fixed with a reg key), and one caused a 10 minute delay with VM snapshots (this one).

Your issue sounds different to the one we experienced, but for others reading this thread with this issue - I believe it was diagnosed as fragmentation on the VM snapshot file.
A few fixes were tried, but ultimately the best resolution was a policy change of not allowing/keeping snapshots >24 hours.
Veeam Quick Backups had already replaced the vast majority of our manual VM snapshot usage, and the rest was just educating support staff about best practice usage.
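
For readers who want to check a limit like the 24-hour policy above, here is a minimal Python sketch. It assumes you already have a snapshot inventory exported from whatever tooling you use; the `SnapshotRecord` type, the function name and the sample VMs are all hypothetical and are not part of any Veeam or VMware API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SnapshotRecord:
    """Hypothetical inventory row; not a Veeam or VMware type."""
    vm_name: str
    snapshot_name: str
    created: datetime  # timezone-aware creation time


def snapshots_over_limit(records, now, max_age=timedelta(hours=24)):
    """Return the records whose age exceeds max_age, oldest first."""
    stale = [r for r in records if now - r.created > max_age]
    return sorted(stale, key=lambda r: r.created)


# Made-up sample data, purely for illustration.
now = datetime(2020, 8, 17, 9, 0, tzinfo=timezone.utc)
inventory = [
    SnapshotRecord("sql01", "pre-patch", now - timedelta(hours=30)),
    SnapshotRecord("web02", "config-change", now - timedelta(hours=3)),
    SnapshotRecord("app03", "upgrade", now - timedelta(days=5)),
]

for record in snapshots_over_limit(inventory, now):
    age_hours = (now - record.created).total_seconds() / 3600
    print(f"{record.vm_name}: '{record.snapshot_name}' is {age_hours:.0f} h old")
```

Passing `max_age=timedelta(hours=72)` would apply the 72-hour figure discussed later in the thread instead.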

Re: Ten minute backup delay, with VM snapshots

Post by chalkynz »

Vitaliy S. wrote: Aug 16, 2020 9:20 am Nathan, I believe this is a bit different issue since you're seeing the delay when SAN snapshots are present while OP was referencing the vSphere snapshot. In this situation, it's better to contact our support team to validate the setup and review the debug log files.
Yes, that was a coincidence for us; it was indeed happening only when a VMware snapshot was present.

Re: Ten minute backup delay, with VM snapshots

Post by chalkynz »

I’ve also logged various support cases over this issue, as it can cause delays so long that your backups don’t even start within a daily window. I would love Veeam to work around this by switching to non-SAN backup for a given VM if VMware snapshots are present for that VM. Unfortunately, the support team’s feedback is always ‘not our fault, it’s the VMware API, too bad’ and then we close the case :-(
Regnor
VeeaMVP
Posts: 938
Liked: 290 times
Joined: Jan 31, 2011 11:17 am
Full Name: Max
Contact:

Re: Ten minute backup delay, with VM snapshots

Post by Regnor »

Snapshots should only be active temporarily. If you experience this problem, your snapshots probably grew too big or were active too long? Why can't you just clean them up?

Re: Ten minute backup delay, with VM snapshots

Post by chalkynz »

They are active temporarily. VMware say don’t keep them longer than 72 hours. However, Veeam backups can be crippled by the existence of a single snapshot that is only hours old. Veeam docs say don’t use SAN backups if it breaks. Not a very premium feature as it stands.

Re: Ten minute backup delay, with VM snapshots

Post by Gostev »

chalkynz wrote: Jan 21, 2022 7:53 pm VMware say don’t keep them longer than 72 hours.
That is a really weird recommendation. Do you happen to have a link to this statement? It just feels like a totally random and unjustified number, so it might be a good topic to discuss with VMware. The very reason we added backup from storage snapshots is because there were so many customers with I/O intensive VMs for which having a snapshot present for longer than a few minutes caused huge issues with the commit phase... but 72 hours? Where is this number even coming from, when one VM will write at 1KB/s and another at 1GB/s, resulting in dramatically different snapshot sizes?

This also raises a question, though: if you ARE fine keeping snapshots for days, then what is the reason to do backup from storage snapshots to start with, when clearly you don't have the issue this functionality is designed to solve in the first place?
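
To put rough numbers on the 1KB/s versus 1GB/s comparison, the Python sketch below multiplies a constant write rate by 72 hours. It assumes decimal units (1 KB = 1,000 bytes), a perfectly steady rate, and that every write lands on a block not yet in the snapshot, so it is an upper bound on snapshot growth and not a prediction for any real VM.

```python
HOURS = 72
SECONDS = HOURS * 3600  # 72 h expressed in seconds


def written_bytes(rate_bytes_per_s, seconds=SECONDS):
    """Bytes written at a constant rate over the given period."""
    return rate_bytes_per_s * seconds


slow = written_bytes(1_000)          # 1 KB/s, decimal units (assumption)
fast = written_bytes(1_000_000_000)  # 1 GB/s, decimal units (assumption)

print(f"1 KB/s for {HOURS} h: {slow / 1e6:.1f} MB")
print(f"1 GB/s for {HOURS} h: {fast / 1e12:.1f} TB")
```

Under these assumptions the two results differ by a factor of a million for the same 72-hour age, which is the point being made about a fixed time limit.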

Re: Ten minute backup delay, with VM snapshots

Post by Gostev »

@Andreas Neufert btw should we add an option to automatically skip VMs with snapshots from BfSS processing? What do you think?
Andreas Neufert
VP, Product Management
Posts: 6748
Liked: 1408 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: Ten minute backup delay, with VM snapshots

Post by Andreas Neufert »

Let me first research the root cause a bit myself.

Re: Ten minute backup delay, with VM snapshots

Post by Andreas Neufert »

Likely root cause identified. Nathan and Max, is there a way you could open a support case to share logs? (Please add a note saying which job and VM were delayed, and at what time.) Please share the case ID here.

This delay issue has not come up for quite some time, and maybe we have optimized things in later versions. Are you on a newer release (v11 or v11a)?

Re: Ten minute backup delay, with VM snapshots

Post by Regnor »

@Andreas: We've had an open case, but didn't work on it as the customer discovered the snapshots and resolved the problem by cleaning them up. VBR build was my first assumption, but the problem persisted on the most current build (P20211211). If you want to check the logs, here's the case ID: #01975070

I have also never seen this be that extreme before. It could be a combination of Nimble Storage (Flash+Dedup?), Storage Snapshots and existing (old) VM snapshots.

Re: Ten minute backup delay, with VM snapshots

Post by chalkynz »

Gostev wrote: Jan 21, 2022 8:31 pm That is a really weird recommendation. Do you happen to have a link to this statement? It just feels like a totally random and unjustified number, so it might be a good topic to discuss with VMware. The very reason we added backup from storage snapshots is because there were so many customers with I/O intensive VMs for which having a snapshot present for longer than a few minutes caused huge issues with the commit phase... but 72 hours? Where is this number even coming from, when one VM will write at 1KB/s and another at 1GB/s, resulting in dramatically different snapshot sizes?

This also raises a question, though: if you ARE fine keeping snapshots for days, then what is the reason to do backup from storage snapshots to start with, when clearly you don't have the issue this functionality is designed to solve in the first place?
Yeah I agree 72 hours is quite arbitrary & not useful for high-change VMs, but source here anyway: https://kb.vmware.com/s/article/1025279

For us, no, we don’t want multi-day snapshots at all, and any snapshots are usually a surprise. We always aim to have no snapshots at all, but the reality is that sometimes they do get created by admins and not purged before the backup run.

My feeling is this is worse with thin-provisioned disks, but I don’t have any hard stats to back that up.

Re: Ten minute backup delay, with VM snapshots

Post by chalkynz » 1 person likes this post

Also case 05185300.

Re: Ten minute backup delay, with VM snapshots

Post by Andreas Neufert » 1 person likes this post

Just an update here. We are working with VMware on how the API we use for this can be improved. We also have some additional ideas on our side.
I will keep you all updated here, but as it is working as designed for now, there is no short-term fix.

I suggest using Veeam ONE to monitor snapshot usage and lifetime.
I also suggest implementing Veeam Enterprise Manager and the Veeam vCenter plug-in.
Instead of letting anyone create snapshots, let them create Quick Backups from vCenter. This is also very fast (an incremental backup to the last restore point), and it is a real backup that does not affect the performance of the datastore, VM or backup, or put the environment at risk because of growing snapshots consuming all the space on the datastore. This feature was introduced some years ago to address exactly this, after a customer automated it and we found it was a brilliant idea and implemented it in the product.
More info:
https://helpcenter.veeam.com/docs/backu ... ml?ver=110
Spex
Enthusiast
Posts: 55
Liked: 2 times
Joined: May 09, 2012 12:52 pm
Full Name: Stefan Holzwarth
Contact:

Re: Ten minute backup delay, with VM snapshots

Post by Spex »

Two years later the problem still exists. (We use VBR 12.1.)

For security reasons we switched from SAN mode to SAN mode with hardware-assisted snapshots. This way we don't need to present all production LUNs to our hardware proxies. All works great as long as there are no snapshots involved. Even if the snapshot is only minutes old, the "Collecting disk files location data" step can last for many hours - and during that time none of the VMs in the job are processed and, even worse, they have a VMware snapshot open the whole time. (We have used thin-provisioned disks for years.)

Quick backup is sometimes an option for our admins but also sometimes not - especially when you think about recovery time.

I had a case open (#07168825) and tried to find countermeasures, but could not even find a way to measure disk fragmentation (root cause?) or a way to reduce it (storage migration from thin to thick and back didn't help much).

In the end I made a feature request to fall back to the hotadd or nbd transport mode for VMs with snapshots ...
Gostev already mentioned this change as a possible solution in this thread.

Please support this request, as I do not want to wait some more years to solve this.
popjls
Enthusiast
Posts: 55
Liked: 5 times
Joined: Jun 25, 2018 3:41 am
Contact:

Re: Ten minute backup delay, with VM snapshots

Post by popjls »

I was under the impression this was "normal" :D - Due to the snapshot, it takes an extended period to collect this info. If this isn't "normal" then pretty sure this is still an issue using SAN snapshots. We just remove them before the backup if possible and that fixes it.

Re: Ten minute backup delay, with VM snapshots

Post by Andreas Neufert »

The API used is a VMware one, and there have been no changes to this API in the latest VMware version. We opened some API support cases again a few weeks ago and will discuss this with VMware again.

Yes, anything that splits the data within the VMware backend (like existing VM snapshots) leads to the API taking significantly longer, as it is far more complex to assemble the specific information for us. Good vCenter performance and low latency between vCenter, the ESXi hosts and their storage backends help to reduce the overhead.

The mentioned workaround to do storage vMotion back and forth only helps if you do not have existing VM snapshots.

Re: Ten minute backup delay, with VM snapshots

Post by Spex »

popjls wrote: Apr 04, 2024 12:47 pm I was under the impression this was "normal" :D - Due to the snapshot, it takes an extended period to collect this info. If this isn't "normal" then pretty sure this is still an issue using SAN snapshots. We just remove them before the backup if possible and that fixes it.
Normal means in our case 12h+
Since snapshot usage is "normal" operation for server admins in a VMware environment, I cannot remove them without asking, and sometimes they are needed... (and quick backup isn't an option)

Re: Ten minute backup delay, with VM snapshots

Post by Spex »

Andreas Neufert wrote: Apr 04, 2024 12:55 pm The mentioned workaround to do storage vMotion back and forth only helps if you do not have existing VM snapshots.
Storage migration does not really help. Even after that, times are only reduced a little.
My idea was to measure fragmentation (still unknown how to do that) and to schedule a defrag job for these machines in advance, before we see problems with them during backup.

We need a switch in each backup job to allow the transport mode to fall back to hotadd/nbd for a single VM when a snapshot exists.
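
The requested per-VM switch could behave roughly like the Python sketch below. This is only an illustration of the decision logic under stated assumptions: `TransportMode`, `choose_transport` and the `allow_fallback` flag are hypothetical names, and no such option exists in Veeam Backup & Replication according to this thread.

```python
from enum import Enum


class TransportMode(Enum):
    """Hypothetical labels for the modes mentioned in the thread."""
    STORAGE_SNAPSHOT = "storage snapshot"
    HOTADD = "hotadd"
    NBD = "nbd"


def choose_transport(has_vm_snapshot, hotadd_available, allow_fallback=True):
    """Pick a mode for a single VM; other VMs in the job are unaffected."""
    if not has_vm_snapshot or not allow_fallback:
        return TransportMode.STORAGE_SNAPSHOT
    # A VMware snapshot exists: avoid the slow storage-snapshot path.
    return TransportMode.HOTADD if hotadd_available else TransportMode.NBD


# Made-up job contents: VM name -> whether a VMware snapshot exists.
job_vms = {"sql01": False, "web02": True, "app03": False}
plan = {
    vm: choose_transport(has_snap, hotadd_available=True)
    for vm, has_snap in job_vms.items()
}
for vm, mode in plan.items():
    print(f"{vm}: {mode.value}")
```

The point of deciding per VM is that one VM with a snapshot no longer holds up every other VM in the same job.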

Re: Ten minute backup delay, with VM snapshots

Post by Andreas Neufert » 1 person likes this post

Are you aware of the vCenter plug-in that allows your vCenter users with rights on the VM to start quick backups instead of VM snapshots?

In the end, everyone tries to avoid using VM snapshots because of the side effects, and the vCenter plug-in was created for exactly this purpose: to remove the snapshot right from the regular vCenter user and let them use quick backup.
It takes a bit more time, but in total the admins are in a much better spot, as it is a real backup and even if they mess up the VM completely they can (instant) restore.

Re: Ten minute backup delay, with VM snapshots

Post by popjls »

Spex wrote: Apr 04, 2024 1:56 pm Normal means in our case 12h+
Ours would "pause" for this process for two hours and that annoyed me. I actually figured it out myself in the end but this thread caught my eye as I thought it was "normal". Easy fix but I'd be frustrated with 12 hours.

Re: Ten minute backup delay, with VM snapshots

Post by Spex »

Andreas Neufert wrote: Apr 04, 2024 2:39 pm Are you aware of the vcenter plug-in that allows your vcenter users with rights on the VM to start the quick backups instead of VM snapshots?
Yes, but we do not use the plugin, to keep things simple (security, software versions, responsibility for this plugin ....). Our admins know how to access and use Enterprise Portal and do that frequently. But the use case for our admins during maintenance windows is to have easy/simple/reliable/uncomplicated/FAST restores they can do without additional support.

Restoring VMs is always a question of how to do it right - which proxy to use (we have proxies within each ESX cluster for hotadd and restores)/which transport mode (we use SAN for backup)/which backup chain ... and it needs much more time than recovery from snapshots.
Instant recoveries are only an option if you know and can accept the reduced performance of running from backup storage, and if you know how to transfer the VM to production storage.
VMware snapshots are so easy to use, and therefore I cannot urge our admins to use only my backup environment.

Re: Ten minute backup delay, with VM snapshots

Post by chalkynz »

Gostev wrote: Jan 21, 2022 8:40 pm @Andreas Neufert btw should we add an option to automatically skip VMs with snapshots from BfSS processing? What do you think?
Please? Am about to need storage-integrated transport backups again…

Re: Ten minute backup delay, with VM snapshots

Post by Regnor »

Snapshots are one cause of this issue, fragmentation is another one. Maybe we could just timeout the 'Collecting disk files' process after a certain period and then fail over to a different processing mode, or skip the VM?
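
The timeout-then-fail-over idea can be sketched generically in Python as follows. This is not how Veeam works internally; `with_timeout_fallback`, `slow_collect` and `fallback_mode` are hypothetical stand-ins, and the sub-second sleep only simulates a long "Collecting disk files" step.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def with_timeout_fallback(primary, fallback, timeout_s):
    """Run primary(); if it has not finished after timeout_s, use fallback()."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(primary)
    try:
        return "primary", future.result(timeout=timeout_s)
    except FutureTimeout:
        return "fallback", fallback()
    finally:
        # Do not block on the abandoned call. A thread cannot be killed, so a
        # real implementation would also have to cancel the underlying request.
        pool.shutdown(wait=False)


def slow_collect():
    time.sleep(0.5)  # simulated long-running collection step
    return "disk map via storage snapshot"


def fallback_mode():
    return "process VM via hotadd/nbd"


print(with_timeout_fallback(slow_collect, fallback_mode, timeout_s=0.1))
```

One design caveat: the abandoned call keeps running in the background until it finishes, so the timeout only frees the job to move on; it does not by itself release whatever the slow call is holding.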

Re: Ten minute backup delay, with VM snapshots

Post by Andreas Neufert »

We will discuss this feature request internally.