Host-based backup of VMware vSphere VMs.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Gostev » 1 person likes this post

davwalbfs wrote: Feb 22, 2023 2:10 pm @Milenco: You are talking about a new VDDK library, could you please explain what you mean by that? Thanks!
He's talking about the latest version of VMware VDDK included with V12.
mitchberlin
Novice
Posts: 3
Liked: 1 time
Joined: Jun 29, 2022 3:10 pm
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by mitchberlin »

I posted in this forum back in June, and I've been waiting for Veeam 12 to test this again. Today was the day, and here's what I found - overall, slight speed improvement, but not much and not near the speeds we are seeing with vSphere 6.7 hosts. My tests:
1. I installed Veeam 12 on a new Windows VM (fresh install to hopefully prevent any legacy settings/limitations). All Veeam components are on this single VM. It's my backup repository, proxy, console, etc. I am attempting to replicate a single VM - our vCenter Server Appliance.
2. The source ESXi host is 7.0 U2e (build 19290878) with 2 VMs on it (both are vCenter Server Appliances) and the destination is a freshly installed ESXi 7.0 U3f (build 20036589) with 0 VMs. Both hosts use a local SAS array (RAID10, SSDs).

First replication attempt - Replicating our test vcsa server (~160 GB of used disk space), it took 2 hrs, 12 mins at a processing rate of 21 MB/s. Load: Source 0% > Proxy 25% > Network 0% > Target 99%

Second attempt - same test vcsa server (deleted all files from the datastore), but I changed 2 ESXi settings I found online:
esxcfg-advcfg -s 32768 /BufferCache/MaxCapacity
and
esxcfg-advcfg -s 20000 /BufferCache/FlushInterval
(rebooted the ESXi host after changing them)

Results: 2 hours, 5 mins at a processing rate of 22 MB/s -- very little improvement

Third attempt - Wanting to make sure I didn't have any resource constraints on my Veeam server, I increased the CPUs from 4 to 8, increased RAM from 8G to 16G, converted my SCSI controller from LSI Logic to VMware paravirtual SCSI. Rebooted Windows a few times, verified pvscsi driver was in use. Then, I added this value and rebooted again:
Path: HKLM\SOFTWARE\Veeam\Veeam Backup and Replication
Key: ViHostConcurrentNfcConnections
Type: REG_DWORD
Value: 14 (Decimal)
(mentioned in https://bp.veeam.com/vbr/3_Build_struct ... phere.html)
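For anyone who wants to apply the same value on more than one proxy, here is a minimal sketch that renders it as .reg text. The key path and value name are the ones quoted in this post; the helper name `render_reg_dword` is made up for illustration, and whether this value helps NBD throughput in a given environment is an assumption to test, not a given.

```python
def render_reg_dword(key_path: str, name: str, value: int) -> str:
    """Render one REG_DWORD value as .reg file text (hypothetical helper)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("REG_DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        # .reg files store DWORDs as 8 hex digits, so decimal 14 becomes 0000000e
        f'"{name}"=dword:{value:08x}',
        "",
    ]
    return "\r\n".join(lines)


reg_text = render_reg_dword(
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication",
    "ViHostConcurrentNfcConnections",
    14,
)
print(reg_text)
```

Review the printed text before importing it anywhere, and keep a note of the previous value so the change can be reverted.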

Results: 1 hour, 28 mins at a processing rate of 32 MB/s (Load Source 0% > Proxy 11% > Network 0% > Target 96%)

For reference, we were getting 50-80 MB/s on ESXi 6.7 hosts.
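As a rough cross-check of the three results above, the sketch below recomputes the average rate from the ~160 GB size and the reported durations. It assumes 1 GB = 1024 MB and that the whole 160 GB was transferred each time, so small deviations from the rate shown by Veeam are expected.

```python
def rate_mb_per_s(size_gb: float, hours: int, minutes: int) -> float:
    """Average throughput in MB/s for a transfer of size_gb over hours:minutes."""
    seconds = hours * 3600 + minutes * 60
    return size_gb * 1024 / seconds


# Durations as reported for the three replication attempts.
attempts = {"first": (2, 12), "second": (2, 5), "third": (1, 28)}
for name, (h, m) in attempts.items():
    print(f"{name}: {rate_mb_per_s(160, h, m):.1f} MB/s")
```

This gives roughly 21, 22 and 31 MB/s, which is in line with the reported 21, 22 and 32 MB/s.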

So, as others have indicated, it seems like there are still issues in Veeam 12 with vSphere 7.0 hosts. Based on the file details of:
C:\Program Files (x86)\Veeam\Backup Transport\x64\vddk_7_0\vmxDiskLib.dll, Veeam 12 is likely using VMware's VDDK 7.0.3.2. Per https://vdc-repo.vmware.com/vmwb-reposi ... Notes.html, there's mention of tweaks that can be done:

"Customers do not need to change the defaults, but if they encounter NBD performance issues, they can change four buffer settings under vixDiskLib.nfcAio.SocketOption in the VDDK configuration file. On Linux, various socket buffer sizes have minimal performance impact, but on Windows changing them might yield good benefits. Software providers can assist their customers doing so."

Also - https://core.vmware.com/blog/improving- ... -options-0

Question for Veeam - can we modify these 4 settings that VMware mentions??

It seems other vendors have documented steps to do it (reference - https://www.veritas.com/support/en_US/article.100053281)
domsi
Novice
Posts: 6
Liked: never
Joined: Aug 08, 2022 6:42 pm
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by domsi »

Same issue... it just "improved" from 4 MB/s to 10MB/s... We have waited now so long for v12 and nothing really has improved, as you can see from the answer. Please provide a fix for v12 - we don't want to wait another 3 years for v13...

Also, please answer mitchberlin's question:
Question for Veeam - can we modify these 4 settings that VMware mentions??
mitchberlin
Novice
Posts: 3
Liked: 1 time
Joined: Jun 29, 2022 3:10 pm
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by mitchberlin »

Two updates from my post on Feb 22.

1. I have opened a case with Veeam on this issue - Case # 05895182. One of my main questions is about modifying the 4 VDDK settings. So far, they have not responded on that inquiry, and the case has been open since Feb 23rd.

2. I found a way to improve my NBD performance! My all-in-one Veeam config was tested on two different VMs running Windows Server 2016. I installed a new VM with Windows Server 2019 and installed a fresh version of Veeam 12. My throughput doubled (from ~21-30 MB/s to 50-65 MB/s)! I asked Veeam support in the case above if there are known issues with throughput between Win2016 and 2019, but they have not confirmed any known issues.

Anyone else on a Win2016 VM and have you tried Win2019/2022?
domsi
Novice
Posts: 6
Liked: never
Joined: Aug 08, 2022 6:42 pm
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by domsi »

My Windows servers are all on W2019. I tried it with Linux VMs too:

W2019 => 10MB/s
Debian Bullseye => 27MB/s
ThierryF
Expert
Posts: 129
Liked: 33 times
Joined: Mar 31, 2018 10:20 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by ThierryF »

Also hitting the same poor recovery performance with:
vCenter 7.0.3j Build 20990077
VMware ESX farm with 16 nodes, 7.0.3 Build 20328353 (VMware ESXi 7.0 U3g)
IBM FlashSystem V9000 NVMe storage on 16Gb SAN switches
10Gbit LAN
Veeam VBR 11.0.1.1261_20230227 (KB4424 - CVE-2023-27532)

On same HW with VC6.7/ESX6.7/VBR10, last recover was 65-80MB/Sec.

Now, recovering 1.4TB VM at 5MB/Sec rate :-(
ThierryF
Expert
Posts: 129
Liked: 33 times
Joined: Mar 31, 2018 10:20 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by ThierryF »

Update to my last post : Using same infra recovering same VM to an ESX 6.7 host at 153MB/Sec ...
Andreas Neufert
VP, Product Management
Posts: 6707
Liked: 1401 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Andreas Neufert »

Hi, we are working with VMware on this one. Veeam v12 uses a newer VDDK kit for processing that could help to speed up things. Does anyone have feedback after v12 update ?
domsi
Novice
Posts: 6
Liked: never
Joined: Aug 08, 2022 6:42 pm
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by domsi »

Andreas Neufert wrote: Mar 17, 2023 1:13 pm Does anyone have feedback after v12 update ?
Haven't you read the postings of davwalbfs, mitchberlin and me? We have upgraded to v12 and it has not improved much - just a few MB/s improvement, but still incredibly slow. mitchberlin has also asked about the suggested settings from VMware???

Veeam has always assured that the issues will be fixed with v12 and nothing really has changed. Please fix it and don't put us off to v13 again, where we can wait another 2 years :x
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Gostev »

From the immediate posts of ThierryF above it does not appear anything in Veeam needs fixing? Veeam can go as fast as the host can accept the data... but this speed apparently varies between ESXi versions on the same exact hardware.
ThierryF
Expert
Posts: 129
Liked: 33 times
Joined: Mar 31, 2018 10:20 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by ThierryF »

I completely disagree with your latest post :

VMware 6.7 is not supported anymore, that's the first point.
Next, explain to me such a difference in recovery time when recovering the SAME VM restore point using the SAME infrastructure
(Veeam Proxy, Veeam File repository, 10Gb LAN, networking switches, Forti Firewall, Cisco hardware powering ESX ....):

To 7.0U3 Start 16/03/2023 13:15:22 End 19/03/2023 06:15:47 Duration 65:00:25 Speed 6,27 MB/Sec
To 6.7 Start 17/03/2023 21:49:40 End 18/03/2023 06:39:04 Duration 08:49:24 Speed 46,22 MB/Sec

65 HOURS FOR 4TB. A call as P1 at veeam support without even a call back from support !!!!

Veritas is experiencing same problem with NBD but gave a fix for their
customers, being configuring NFC (Network File Copy) AIO (Asynchronous I/O)
https://www.veritas.com/support/en_US/article.100053281

Vmware pointed VDDK, ok, but they also advised to use VDDK 7.0U3 or 8.0.
If you are NOT using patched libraries, the problem is not the library provider ...
Does Veeam 11 or 12 use that patched version ?
The problem is more impacting Veeam Windows Proxy than Linux one. Will you tell me to ask Microsoft ?

Please FIX ASAP, speak between engineering groups, seems known prob at Veeam since May 2021 !
ThierryF
Expert
Posts: 129
Liked: 33 times
Joined: Mar 31, 2018 10:20 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by ThierryF »

Correcting a little typo in my post :
Please read "65 HOURS FOR 1.4TB" instead of "65 HOURS FOR 4TB" ...
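For anyone checking the figures: a minimal sketch that recomputes the durations and speeds of the two restores from the start/end timestamps and the corrected 1.4 TB size. It assumes 1 TB = 1024 * 1024 MB and takes the end of the 6.7 restore as 18/03/2023, which is the only date consistent with the stated duration of 08:49:24. The function name `restore_stats` is made up for illustration.

```python
from datetime import datetime

FMT = "%d/%m/%Y %H:%M:%S"


def restore_stats(start: str, end: str, size_tb: float):
    """Return (duration as HH:MM:SS, average rate in MB/s) for one restore."""
    elapsed = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    seconds = int(elapsed.total_seconds())
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    rate = size_tb * 1024 * 1024 / seconds
    return f"{hours:02d}:{minutes:02d}:{secs:02d}", rate


for label, start, end in [
    ("to 7.0U3", "16/03/2023 13:15:22", "19/03/2023 06:15:47"),
    ("to 6.7", "17/03/2023 21:49:40", "18/03/2023 06:39:04"),
]:
    duration, rate = restore_stats(start, end, 1.4)
    print(f"{label}: duration {duration}, {rate:.2f} MB/s")
```

Both results match the durations and speeds quoted in the earlier post (65:00:25 at 6.27 MB/s and 08:49:24 at 46.22 MB/s), so the reported numbers are internally consistent with a 1.4 TB restore.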
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Gostev »

ThierryF wrote: Mar 19, 2023 8:24 am Vmware pointed VDDK, ok, but they also advised to use VDDK 7.0U3 or 8.0.
If you are NOT using patched libraries, the problem is not the library provider ...
Does Veeam 11 or 12 use that patched version ?
Yes, V12 uses the latest VDDK versions available. According to VMware, they were supposed to fix the issue in question, thus the comment from Andreas above (that made everyone upset).
ThierryF
Expert
Posts: 129
Liked: 33 times
Joined: Mar 31, 2018 10:20 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by ThierryF » 1 person likes this post

For the community ...

Using VBR 11a, NBD-mode 1.4TB SCCM VM recover completed in 65 hours.
Adding a VMware proxy and recovering the same point in HotAdd mode completed in ... 1h 40 ...
Nearly 40 Times faster using SAME infrastructure ...
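The arithmetic behind that figure, as a small sketch. It uses only the two durations given above plus the 1.4 TB size, and assumes 1 TB = 1024 * 1024 MB; the implied HotAdd rate is a derived average, not a number reported by Veeam.

```python
nbd_minutes = 65 * 60          # 65 hours in NBD mode
hotadd_minutes = 1 * 60 + 40   # 1h 40 in HotAdd mode

speedup = nbd_minutes / hotadd_minutes
hotadd_rate = 1.4 * 1024 * 1024 / (hotadd_minutes * 60)  # average MB/s

print(f"HotAdd was about {speedup:.0f}x faster")
print(f"Implied HotAdd average: {hotadd_rate:.0f} MB/s")
```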

You will definitely have to review your environments if you are still using NBD recoveries.
I opted for a Red Hat 9 Linux VM with 16GB RAM and 1 vCPU per allowed stream (6 in my case).
The proxy VM needs to run on an ESX node that has access to all used datastores; if,
like us, you have restricted ESX nodes to certain datastores, additional proxies will be needed.

BTW, on our test env (VBR 12 and ESX 8), extremely poor NBD perf is still present ...

Th
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Gostev » 1 person likes this post

So our QA did the testing and they get 200-250 MB/s NBD backup and 100-125 MB/s NBD restore performance on 10Gb Ethernet with v12 GA and ESXi 7.0.3 without any of the above mentioned VDDK tweaks. This is for Windows-based proxy, Linux-based still to be rechecked.

As such, for those getting 10-20x slower performance than that (as shared by a few folks above), it is very likely to be some sort of network and/or ESXi configuration issue. So the best course of action would be to open a support case, and if T1 support engineers can't find some obvious issue, ask them to escalate to the performance testing QA team for further investigation.
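To put the "10-20x slower" comparison in numbers, here is a small sketch that compares an observed restore rate against the 100-125 MB/s QA range quoted above. The function name `slowdown_range` is made up, and the sample inputs (5 and 10 MB/s) are rates reported earlier in this thread.

```python
def slowdown_range(observed_mb_s: float,
                   baseline_low: float = 100.0,
                   baseline_high: float = 125.0):
    """Return (min, max) slowdown factor versus the QA restore baseline."""
    if observed_mb_s <= 0:
        raise ValueError("observed rate must be positive")
    return baseline_low / observed_mb_s, baseline_high / observed_mb_s


for observed in (5, 10):
    low, high = slowdown_range(observed)
    print(f"{observed} MB/s is {low:.1f}x to {high:.1f}x slower than QA")
```

A factor in this range is a reasonable threshold for opening a support case with logs, since it points away from normal variation between environments.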
mbru
Lurker
Posts: 1
Liked: never
Joined: Apr 12, 2023 8:59 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by mbru »

Just another confirmation that V12 speeds up neither replication nor restore over NBD to ESXi-7.0U3 targets (same problem as in V11). Both speeds (replication / restore) remain at 1 - 2 MB/s for vSphere ESXi-7.0U3g targets, backup speeds remain fast ( > 50 MB/s) both from vSphere 6.0U2 and 7.0U3.

Windows 2016 server is used, V12 was reinstalled including a fresh PostgreSQL database, but the issue remains. Will go for the hot-add backup proxy solution.
Andreas Neufert
VP, Product Management
Posts: 6707
Liked: 1401 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Andreas Neufert »

[Edited to address the typo mentioned below] We finished testing in our QA labs and we could not reproduce any reduced restore speed. 600MB/s for single VMs. We tested with ESXi-7.0u3f, ESXi-7.0u3g and ESXi-8.0

mbru can you please create a support ticket and attach current logs. (or upload new logs to an open support ticket) and please share the ticket number here with us. We would like to find the root cause in your environment.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Gostev »

@Andreas Neufert he does not have an issue with reduced backup speed, but with restore speed.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Gostev »

Checked with the QA engineer behind the test and he confirmed the numbers from Andreas were for restore, so it's just a typo from Andreas.
Andreas Neufert
VP, Product Management
Posts: 6707
Liked: 1401 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Andreas Neufert »

Please allow me to apologize for the typo.
@mbru can you please share additional details here or send me a message here with the following:
Source Storage System and used protocol (NFS/iSCSI/FC/...)
VM disk type (thin or thick disk)
Network interface speed
Backup target storage and how attached.

Antivirus best practices implemented on Repository Server and Proxy Server? https://www.veeam.com/kb1999
Andreas Neufert
VP, Product Management
Posts: 6707
Liked: 1401 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Andreas Neufert »

@ThierryF can you please share as well here or by direct message with us the storage and network configuration used (see one message above for the questions). Thanks in advance.
ThierryF
Expert
Posts: 129
Liked: 33 times
Joined: Mar 31, 2018 10:20 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by ThierryF »

Hello,

I do confirm that the NBD restore performance issues I experienced come with Windows proxies and NBD.

About our config, Veeam VBR Server, Proxy and Filestore are in same vlan.
Traffic from that vlan goes via intersite link 10Gbits to another DC.
In our case, Vmware VC 7.0.3 and ESX7.0.3 are in same vlan.

VMware-VCSA-all-7.0.3-20990077, Release Date: 2022-12-22, Build Number: 20990077
Product: VMware ESXi Version: 7.0.3 Build: Releasebuild-20328353 Update: 3 Patch: 55

About Veeam HW :
FileStore, UCSC-C240-M5L, Microsoft Windows Server 2016 Standard, Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz CPU, 8 Core, 16 Threads (2 #),64 GB, ReFS for 2*(6*10TB Raid-5) volumes
Proxy1, UCSC-C220-M5SX,Microsoft Windows Server 2016 Standard, Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz CPU, 10 Core, 20 Threads (2 #),64 GB
Proxy2,UCSC-C220-M5SX,Microsoft Windows Server 2016 Standard, Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz CPU, 10 Core, 20 Threads (2 #),64 GB
VBRSRV, UCSC-C220-M5SX,Microsoft Windows Server 2016 Standard, Intel(R) Xeon(R) Gold 5115 CPU @ 2.40GHz CPU, 10 Core, 20 Threads (2 #),128 GB
10Gbit NICs in teaming over Nexus 5000 and Nexus FEX 10Gb Copper (not FC ports).

About Vmware HW:
Dual Cisco UCSB-5108-AC2 chassis, 16 nodes Cisco UCSB-B200-M4, 1024 GB, Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz, 12 Core, 24 Threads CPU (2 #) each.
About our datastore storage, it is IBM Flash Storage V9000 NVMe Micro Latency 200TB Storage over Brocade 16Gbit SAN links and no, there are no storage IO Issues.

If you would like a live view of our infra, drop me a PM. We could organize a Teams/Webex meeting if you are interested.
A colleague of mine who has a VBR12 and ESX8 env also experiences the same poor perf during recovery without a hot-add VMware proxy.
Sure we can organize both if you want.

Cheers

TH
HW_HW
Novice
Posts: 3
Liked: 1 time
Joined: Sep 20, 2018 9:23 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by HW_HW »

I can confirm the issue with Vsphere 7 + Veeam 12. In my opinion the best workaround is setting up Linux proxies - good enough for full speed NBD restores. Those without virtual proxies would get another advantage: High speed hot-add restores.
Andreas Neufert
VP, Product Management
Posts: 6707
Liked: 1401 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Andreas Neufert »

@ThierryF Thanks so much for sharing the details. If I remember right then the Cisco Nodes have Virtual Interface Cards where network and FC goes over the same card. Do you know the configuration in case of speed and committed throughput for FC/Network there?
ThierryF
Expert
Posts: 129
Liked: 33 times
Joined: Mar 31, 2018 10:20 am
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by ThierryF »

Hello Andreas,

The B200 Nodes are Blade servers. Both 8-nodes chassis are connected thru twinax cables to "Cisco Fabric Interconnects (FI)" that act as bridge between SAN and Network infrastructure from a side and Hardware server from the other.

At FI level, "Service profiles" are created and assigned to each Blade Server node, to describe node configuration. The profile includes vnics and vhbas. At vnic level, you present/authorize the vlan(s) that may transit to that interface, the VNIC MAC, Duplexing, ....
Same at vhba level, you define WWNN/WWPN node addresses.

Thru FI SAN or Network uplinks, vnic/vhba traffic will be routed to either Cisco Nexus 9000 (in my case) or Brocade SAN Switches.
At Cisco Nexus level, VPC (Virtual Port Channel) will be created and authorized VLAN(s) allowed,
At Brocade level, vhba will be zoned with FC Storage controllers. When detected by Storage arrays, "SAN Nodes" will be configured and Storage LUN(s) will be masked to be presented to the storage node.

All of that config is working fine, like a charm.
About vNICS, each node has 6 vnics. 2 for Server MGMT as dedicated v-switch, 2 for VM data as v-switch#2, 2 for VMotion as v-switch #3.
Each ESX has VMKernel NIC vmk0 connected on the mgmt v-switch.

Traffic from one vnic of each pair goes through FI #1, the second through FI #2.
FI#1 has uplink to Nexus 9000 #1 and Brocade Switch #1,
FI#2 has uplink to Nexus 9000 #2 and Brocade Switch #2 for full redundancy.

No problem at all for production nor normal backups.
The problem showed up during NBD recovery of a big VM. With small VMs, as multiple disks are recovered at the same time, recovery speed may seem higher.
When recovering a big disk (in my case, it was a 1 TB VHD), it takes a while before the poor performance becomes visible in the Veeam VBR recovery stats.

If, at repository level, you monitor the IO Read rate to the VIB/VBK file(s) by the veeam processes you will see the poor performance during NBD.
Repeat the same test from the same recover point (to use same VBK/VIB chain) using HotAdd mode and monitor IORate like above, you will directly see the difference.
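A minimal sketch of how such a comparison could be turned into numbers: given periodic readings of a cumulative "bytes read" counter for the Veeam process or the VBK/VIB files, it computes the read rate per interval. Where the counter comes from is an assumption (Windows Performance Monitor or psutil's `Process.io_counters()` are possible sources), and the sample readings below are hypothetical, not measurements from this thread.

```python
def interval_rates_mb_s(samples):
    """Read rate in MB/s for each interval between consecutive samples.

    samples: list of (seconds, cumulative_bytes_read) tuples in time order.
    """
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must increase")
        rates.append((b1 - b0) / (t1 - t0) / (1024 * 1024))
    return rates


# Hypothetical counter readings taken every 10 seconds during an NBD restore.
nbd_samples = [(0, 0), (10, 60 * 1024**2), (20, 125 * 1024**2)]
print(interval_rates_mb_s(nbd_samples))
```

Running the same sampling during a HotAdd restore of the same restore point gives a like-for-like comparison that does not depend on the job statistics shown in the console.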

If you just rely on Veeam VBR Stats during recover or if recovered disks are not filled with data, you will just recover zero-filled spaces.

My issue popped up when recovering our SCCM env. A colleague of mine did the same test (SCCM VM recovery), using VBR 12 and ESX 8, and experienced the same poor perf in NBD and not in HotAdd.
My colleague has also a completely different Cisco HW Config, being traditional ESX physical server nodes, with physical nic.

Should you need more info or have a sight on our configs, I could arrange a session. Just let me know.

Cheers

Th
Sturniolo
Veeam Software
Posts: 59
Liked: 37 times
Joined: Feb 19, 2019 3:08 pm
Full Name: Andy Sturniolo
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Sturniolo » 1 person likes this post

Hello,

We identified a workaround for issues with some affected environments. We'll update this thread when it becomes available; customers who are experiencing this will then be able to request the workaround via our support channels. We expect these changes to also be included in our upcoming v12a release.
Sturniolo
Veeam Software
Posts: 59
Liked: 37 times
Joined: Feb 19, 2019 3:08 pm
Full Name: Andy Sturniolo
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Sturniolo » 1 person likes this post

Customers that are experiencing this can now request the workaround via our support channels. Please reference Case ID - 05925557
Kwa-GJ
Novice
Posts: 9
Liked: 5 times
Joined: Oct 02, 2013 9:47 am
Full Name: KWA-GJ
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Kwa-GJ » 2 people like this post

Workaround implemented, and achieved a speed increase from 16 MB/s to 129 MB/s.
Andreas Neufert
VP, Product Management
Posts: 6707
Liked: 1401 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Andreas Neufert »

Thanks for sharing. Do you know what it was before with older ESXi version?
Kwa-GJ
Novice
Posts: 9
Liked: 5 times
Joined: Oct 02, 2013 9:47 am
Full Name: KWA-GJ
Contact:

Re: V11 + ESXi 7.0 U2: extremely slow replication and restore over NBD

Post by Kwa-GJ »

that was around 60 MB/s
But is not 100% comparable because we did with the upgrade to 7 (from the latest 6.7) a complete hardware refresh at the vmware destination side (hosts + storage)