Host-based backup of VMware vSphere VMs.
adapterer
Expert
Posts: 227
Liked: 46 times
Joined: Oct 12, 2015 11:24 pm

Applying Retention Policy Slow

Post by adapterer »

Hi,

I'm having a problem with a replication job not meeting its RPO because of long snapshot deletion/merge times, i.e. the "Applying retention policy" segment of the job. Maybe this is more of a question for VMware, but perhaps someone has been down this path.

I have a client VM of about 3 TB with 34 VM disks attached. The snapshot removal time ranges from 40 minutes to 8 hours, and the job is supposed to run every 2 hours, so we are not meeting RPO. I have never seen the disk subsystem above about 30% busy, and it is capable of much higher IOPS/throughput, which makes me suspect the bottleneck is elsewhere. The storage is iSCSI over 10GbE networking, and again I can't see a bottleneck on the network. So, I have some questions:

1. Will VMware try to 'merge' all 34 VM disks at once?
2. If so, is there a way to limit the number of disks merged at once?
3. Is it possible this is causing odd performance limitations with iSCSI storage and 'iSCSI disk reservations', i.e. the storage being locked for writes while one operation waits for another to complete? (For reference, this VM is on its own datastore by itself.)

Finding this bottleneck is driving me nuts. Thanks in advance!
adapterer
Expert
Posts: 227
Liked: 46 times
Joined: Oct 12, 2015 11:24 pm

Re: Applying Retention Policy Slow

Post by adapterer »

Has anyone dealt with VMs with lots of VM disks attached?
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy

Re: Applying Retention Policy Slow

Post by dellock6 »

Since vSphere 6.0, disks are managed in parallel:
https://www.virtualtothecore.com/en/vsp ... hing-past/

I don't know how many disks are consolidated at once, but in the tests that Tom did, 15 disks were indeed all processed at the same time. I have no idea whether this can be limited; as you said, it is something you may want to ask VMware, unless someone here knows.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
adapterer
Expert
Posts: 227
Liked: 46 times
Joined: Oct 12, 2015 11:24 pm

Re: Applying Retention Policy Slow

Post by adapterer » 1 person likes this post

Thanks Luca, that really helps.

I have engaged VMware support; I'm scratching my head on this one. I can see the merge happening and have reviewed the ESXTOP stats: I'm only getting 10 MB/s read/write from the host. Yet if I run the VMware I/O Analyzer appliance simultaneously (same host, same datastore, no other VMs), I can get 300+ MB/s for 512k random reads/writes. It doesn't appear that the host, network or storage are running out of headroom :(
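
For anyone following along, here is a rough back-of-the-envelope sketch in Python of what those two throughput figures mean against a 2-hour job interval. The 10 MB/s and 300 MB/s rates and the 2-hour interval come from the posts above; the 100 GB snapshot delta is purely an assumed figure for illustration, since the thread doesn't say how much data accumulates between runs.

```python
# Rough estimate only: ignores metadata overhead and assumes a steady rate.

def merge_hours(delta_gb: float, rate_mb_s: float) -> float:
    """Hours needed to merge delta_gb of snapshot data at rate_mb_s (1 GB = 1024 MB)."""
    return delta_gb * 1024 / rate_mb_s / 3600


def max_delta_gb(window_hours: float, rate_mb_s: float) -> float:
    """Largest delta (in GB) that can be merged within window_hours at rate_mb_s."""
    return window_hours * 3600 * rate_mb_s / 1024


ASSUMED_DELTA_GB = 100   # hypothetical delta size, not taken from the thread
JOB_INTERVAL_HOURS = 2   # the job is scheduled every 2 hours

for rate in (10, 300):   # observed merge rate vs. I/O Analyzer result, in MB/s
    hours = merge_hours(ASSUMED_DELTA_GB, rate)
    limit = max_delta_gb(JOB_INTERVAL_HOURS, rate)
    print(f"{rate:>3} MB/s: {hours:.2f} h for {ASSUMED_DELTA_GB} GB, "
          f"at most {limit:.0f} GB per {JOB_INTERVAL_HOURS} h interval")
```

At the observed 10 MB/s, only about 70 GB can be merged inside one 2-hour interval, so a larger delta alone would be enough to overrun the schedule even though the storage can clearly go much faster.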
hexadecimal
Influencer
Posts: 16
Liked: 8 times
Joined: Apr 26, 2021 3:18 pm

Re: Applying Retention Policy Slow

Post by hexadecimal »

I'm curious whether you ever found a solution to this, as we're facing the same issue. Our storage array is capable of much higher IOPS, yet applying the retention policy takes an incredibly long time. In our case, however, this is occurring on a VM that is no more than 200 GB in size (90 GB used). We're replicating to a DR site that has 2 small network monitoring VMs. Truly at a loss, and we will get VMware support involved shortly too.
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: Applying Retention Policy Slow

Post by foggy »

Have you tried performing the same process (i.e. creating the snapshot and consolidating it) manually? That would let you rule Veeam out of the equation completely. ;)
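
If you would rather script that manual test than click through the client, a minimal sketch in Python could look like the one below. It assumes `vm` behaves like a pyVmomi `vim.VirtualMachine` and `wait_for_task` like `pyVim.task.WaitForTask`; connecting to vCenter and looking up the VM are left out. The function name, the snapshot name and the `hold_s` parameter are made up for this example.

```python
import time


def time_snapshot_cycle(vm, wait_for_task, name="manual-merge-test", hold_s=0):
    """Create a snapshot on vm, delete it again, and time both steps.

    Assumptions: vm looks like a pyVmomi vim.VirtualMachine and
    wait_for_task blocks until the given task finishes (for example
    pyVim.task.WaitForTask). hold_s is a hypothetical knob that keeps the
    snapshot open for a while so some delta data can accumulate.
    """
    start = time.monotonic()
    wait_for_task(vm.CreateSnapshot_Task(
        name=name, description="timing test", memory=False, quiesce=False))
    created = time.monotonic()

    time.sleep(hold_s)

    # Deleting the snapshot makes the hypervisor merge the delta disks,
    # which is the step the first post describes as slow.
    remove_start = time.monotonic()
    snapshot = vm.snapshot.currentSnapshot
    wait_for_task(snapshot.RemoveSnapshot_Task(removeChildren=False))
    removed = time.monotonic()

    return {"create_s": created - start, "remove_s": removed - remove_start}
```

Run it against a test VM rather than a production replica, and compare `remove_s` with the "Applying retention policy" duration shown in the job log.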
wsmery
Novice
Posts: 8
Liked: never
Joined: Sep 24, 2019 3:56 pm
Full Name: Wayne Mery

Re: Applying Retention Policy Slow

Post by wsmery »

Seeing a similar problem with Veeam 11 after putting replicas on a new SAN.
PetrM
Veeam Software
Posts: 3626
Liked: 608 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic

Re: Applying Retention Policy Slow

Post by PetrM »

Hi Wayne,

In fact, Veeam just sends a request to create, delete or revert a snapshot, while the process itself is fully managed by the hypervisor. I suppose you would see the same slow snapshot deletion if you performed the operation directly in the vSphere Client, as Foggy said above. I don't recommend carrying out such a test on your working replicas, but it definitely makes sense to ask our support engineers to have a look at the issue and probably involve the VMware or storage vendor support team in the investigation.

Thanks!