jmeyer_cm
Novice
Posts: 3
Liked: never
Joined: Oct 09, 2012 1:45 pm
Full Name: Joel Meyer
Contact:

Feature Request: Proxy Prioritization (for NFS Customers)

Post by jmeyer_cm » Oct 22, 2012 4:58 pm

A job has two settings for the proxy: 1. Automatically select a proxy using ProxyDetector, and 2. Manually specify a proxy.

Based on information I've gathered on how ProxyDetector works, a job set to 'Automatic' chooses the appropriate proxy to use based on:

1. Network Location (Based off IP and subnet)
2. Current Load (Within the settings of the GUI you can specify how many concurrent tasks a proxy can do. Default: 1 per 2 cores)
3. Backup Mode Capabilities (In order of priority: SAN, HOTADD, or NBD)

I would like to request a fourth option for ProxyDetector:

4. Prioritize any proxies that exist on the same host as the target VM.

This would benefit all customers using NFS storage and Veeam's 'Appliance Transport Mode', which utilizes VMware's VDDK API. This API has a known problem with NFS locking, requiring the backup appliance either to run on the same host as the target VM or to use Network Mode (not recommended by Veeam due to slower performance).

http://kb.vmware.com/selfservice/micros ... Id=2033540
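
To make the request concrete, here is a minimal sketch of how the three criteria listed above plus the requested same-host rule could be combined into one ranking. It is not Veeam's ProxyDetector code: the `Proxy` fields, the `rank_proxies` name, and the order in which the criteria are applied are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Transport modes in the priority order given in the post (lower is better).
MODE_PRIORITY = {"san": 0, "hotadd": 1, "nbd": 2}


@dataclass(frozen=True)
class Proxy:
    name: str
    host: str            # ESXi host the proxy VM runs on (hypothetical field)
    same_subnet: bool    # stand-in for the network location check
    running_tasks: int
    max_tasks: int       # concurrent task limit configured for the proxy
    best_mode: str       # best transport mode this proxy can use for the VM


def rank_proxies(proxies, vm_host):
    """Return proxies that still have a free task slot, best candidate first."""
    available = [p for p in proxies if p.running_tasks < p.max_tasks]
    return sorted(
        available,
        key=lambda p: (
            p.host != vm_host,               # requested rule: same host first
            not p.same_subnet,               # 1. network location
            p.running_tasks / p.max_tasks,   # 2. current load
            MODE_PRIORITY[p.best_mode],      # 3. backup mode capability
        ),
    )
```

With this ordering a proxy on the VM's own host always wins when it has a free slot, and the existing criteria only break ties among the remaining proxies.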

dellock6
Veeam Software
Posts: 5552
Liked: 1538 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by dellock6 » Oct 23, 2012 8:59 pm

This is only a partial solution, however, since you almost certainly have many VMs and they are not all on the same ESXi host. If the job contains more than one VM, chances are the proxy will be local to a given VM but remote to many others. I have the same problem even though I am using iSCSI, and I solved it by creating a new dedicated proxy running in network mode only.

Luca.
Luca Dell'Oca
EMEA Cloud Architect @ Veeam Software

@dellock6
http://www.virtualtothecore.com/en/
vExpert 2011-2012-2013-2014-2015-2016-2017-2018
Veeam VMCE #1

Gostev
Veeam Software
Posts: 23215
Liked: 2977 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by Gostev » Oct 23, 2012 9:22 pm

We have just reported this VDDK bug to VMware (the article you are referencing was posted just a few weeks ago), and I am sure VMware will fix it soon, since this is a serious, production-down type of issue. There's simply no need to build a massive workaround as if this issue were going to be around forever.

What would help is if all affected people opened support cases with VMware regarding this issue, as the number of affected people directly affects the priority of each fix.

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by jmeyer_cm » Oct 26, 2012 8:50 pm

I had opened a case with VMware for this issue, and they explained that the NFS locking behavior (documented in the link above) will exist until [at least] the next full release (v.6) of ESXi, which they said would not be released until at least Q4 2013. They recommended using the workarounds provided until then. Currently I use the network mode workaround, but Appliance/hotadd mode should provide better performance if we can just figure out how to ensure the backup proxy is on the same host as the target VM.
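
The workaround described here can be sketched per VM rather than per job: use hotadd only when a proxy sits on the VM's own host, and fall back to network mode otherwise. This is an illustrative sketch, not a Veeam feature; `plan_job`, its arguments, and the mode labels are hypothetical, and the host mappings would have to be looked up at job run time since VMs can move between hosts.

```python
def plan_job(vm_hosts, proxy_hosts, fallback_proxy):
    """Map each VM to a (proxy, transport mode) pair.

    vm_hosts:       {vm name: ESXi host it currently runs on}
    proxy_hosts:    {proxy name: ESXi host it runs on}
    fallback_proxy: proxy used in network mode when no proxy is co-located
    """
    by_host = {}
    for proxy, host in proxy_hosts.items():
        by_host.setdefault(host, proxy)   # keep the first proxy seen per host

    plan = {}
    for vm, host in vm_hosts.items():
        if host in by_host:
            # Proxy and VM share a host, so hotadd avoids the NFS locking issue.
            plan[vm] = (by_host[host], "hotadd")
        else:
            # No co-located proxy: use the network mode workaround instead.
            plan[vm] = (fallback_proxy, "nbd")
    return plan
```

For a job whose VMs are spread over several hosts, only the VMs with a co-located proxy get hotadd; the rest are handled by the network mode proxy.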

tsightler
Veeam Software
Posts: 5216
Liked: 2089 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by tsightler » Oct 26, 2012 9:01 pm

Has anyone tried this "fix" for the NFS locking problem:

http://kb.vmware.com/selfservice/micros ... Id=2010953

This link/KB references the 2033540 KB article so I've been wondering if it might help.

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by jmeyer_cm » Oct 30, 2012 7:26 pm

Yes, we did set this parameter on our hosts (it was previously set to 3), but it did not resolve the behavior where the VM is 'stunned' during snapshot removal. This KB also mentions 'Alternately, you can use the NFC (Network File Copy) transport method in your backup solution' which is the solution I've had to settle on up to this point.

hennys
Lurker
Posts: 2
Liked: 1 time
Joined: Nov 02, 2012 2:14 pm
Full Name: Hen Savelkoul
Contact:

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by hennys » Nov 02, 2012 3:41 pm

jmeyer_cm wrote: I had opened a case with VMware for this issue, and they explained that the NFS locking behavior (documented in the link above) will exist until [at least] the next full release (v.6) of ESXi, which they said would not be released until at least Q4 2013. They recommended using the workarounds provided until then. Currently I use the network mode workaround, but Appliance/hotadd mode should provide better performance if we can just figure out how to ensure the backup proxy is on the same host as the target VM.
We also had a case open with VMware. They told us that this would be fixed in 5.0 U2, scheduled for Q1 2013.
Strange that they communicate different fix releases. Or perhaps the need for the fix has increased, so they moved the fix from ESXi 6 to ESXi 5.0 U2?

Kind Regards,

Hen

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by Gostev » Nov 03, 2012 8:13 pm

hennys wrote: perhaps the need for the fix has increased, so they moved the fix from ESXi 6 to ESXi 5.0 U2?
Exactly right!

At least, this is how it works in most software companies (Veeam is no exception): the fix priority is solely defined by the number of open support cases. I am sure that if they get 10,000 support cases opened for this issue, they will move it even sooner ;) or even make a standalone fix! That's why I always encourage everyone to open support cases for every issue.

EdQ
Novice
Posts: 4
Liked: never
Joined: Aug 17, 2012 7:16 pm
Full Name: Edward Quinn
Contact:

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by EdQ » Aug 28, 2013 4:33 pm

Concerning the VMware 2010953 syndrome (Virtual machines residing on NFS storage become unresponsive during a snapshot removal operation), have any of you seen a case where 100+ VMs on a proxy were fine for many months (5 second snapshot commits) then one day suddenly fell ill (60 second snapshot commits)? Running ESXi 5.1.0 on > 10 nodes, NetApp 8.1.2. Explaining to management how results could have changed so drastically overnight is a challenge. My VMware/NetApp/NFS background is very light, so all leads will be appreciated.

foggy
Veeam Software
Posts: 17125
Liked: 1399 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by foggy » Aug 28, 2013 4:44 pm

Edward, were any changes made to the infrastructure that could result in longer snapshot commits, or was a large amount of changes made inside the VM(s)? Do you see this behavior on all VMs or only on a group of VMs? Is it a one-time occurrence, or is it observed during each job run?

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by EdQ » Aug 28, 2013 6:20 pm

All VMs and B&R jobs on this VMware cluster were affected (except those sharing the same ESX host as the backup proxy), with little or no load dependency. Manually created snapshots kept open for several minutes deleted OK in several seconds. Minute-long snapshot-delete hangs were observed for two consecutive nights before the VMware sysadmin ordered all Veeam backups disabled. After substituting network mode for hot-add on one proxy, the problem went away, so I think I understand the workaround, but concerns remain over how such a severe issue could have developed so suddenly.
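
One way to check whether measurements fit this pattern is to split snapshot removal times by whether the VM shared a host with the hot-add proxy. The sketch below assumes the durations have already been collected by some other means; the `compare_commit_times` name and the `(vm_host, commit_seconds)` sample format are hypothetical.

```python
from statistics import median


def compare_commit_times(samples, proxy_host):
    """Summarize snapshot removal times by proxy co-location.

    samples:    iterable of (vm_host, commit_seconds) pairs
    proxy_host: ESXi host the hot-add proxy runs on
    """
    samples = list(samples)
    local = [secs for host, secs in samples if host == proxy_host]
    remote = [secs for host, secs in samples if host != proxy_host]
    return {
        "same_host_median": median(local) if local else None,
        "other_host_median": median(remote) if remote else None,
    }
```

A large gap between the two medians would be consistent with the behavior described in the KB article, while similar values would point toward a different cause.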

v.Eremin
Veeam Software
Posts: 15427
Liked: 1165 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by v.Eremin » Aug 29, 2013 11:17 am

EdQ wrote: Manually created snapshots kept open for several minutes deleted OK in several seconds
Additionally, I'm wondering whether you kept the snapshot open long enough before deleting it. In other words, was that time similar to the time it takes to back this VM up? Thanks.

Re: Feature Request: Proxy Prioritization (for NFS Customers

Post by EdQ » Aug 29, 2013 3:26 pm

Hi Vladimir, yes, the snapshot was left open longer than the backup took. How to duplicate the hang effect by running manual snapshots, however, is not the main issue. The greater concern is how ninety percent of a proxy's hot-adds could develop lengthy VMware NFS snapshot delete freezes on the same day, after many months of trouble-free operation. If there are any such known cases, the next question, of course, would be what particular infrastructure factors were involved. Thanks for reading.
