Host-based backup of VMware vSphere VMs.
widmerkarl
Expert
Posts: 122
Liked: 29 times
Joined: Jan 06, 2015 10:03 am
Full Name: Karl Widmer
Location: Switzerland
Contact:

Slow Cross Host Replication

Post by widmerkarl »

Hello

For a few weeks now we have been testing and trying to figure out why cross-host replication at a customer site is so extremely slow.

Support Case ID: 02262033

Hardware list:
- 2x HPE ProLiant DL380 dual-socket servers (2 active CPUs per server)
- each server with 256 GB memory (128 GB per CPU)
- each server with 12x 800 GB SAS SSD (RAID 5)

Networking:
We have two 10Gbit switches and two 10Gbit network cards per server for redundancy.

Veeam:
The Veeam backup server runs as a virtual machine on one of these servers.

Issue:
When I do a cross-host replication (replicate a VM from one host to another), I get only about 50 MB/s throughput. I'd expect far more than that, probably around 500 MB/s or more, but we're far away from this value.

What we discovered so far:
- MTU size doesn't matter
- Blocksize in Veeam (Storage Optimization) doesn't matter
- Compression setting in Veeam (Storage Optimization) doesn't matter
- esxtop confirms the replication throughput we see in Veeam
- iperf from VM to VM (different host) shows 8.7 Gbit/s
- iperf from VM to VM (same host) shows 1.8 Gbit/s
- esxtop confirms iperf throughput
- Again MTU size doesn't matter
- Backup to a NAS (1 Gbit/s) works fine
(with much higher read speed than replication)
- Powered-off VMs are migrated through the ESXi management port / interface
(Traffic flow through management interface seems to be by design when doing cold migration)
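
To put the figures above on one scale, here is a minimal sketch that converts the iperf results from Gbit/s to MB/s and compares them with the roughly 50 MB/s seen during replication. It assumes decimal units (1 Gbit = 1000 Mbit, 8 bits per byte) and ignores protocol overhead; the function and variable names are only illustrative.

```python
def gbit_to_mbyte_per_s(gbit_per_s: float) -> float:
    """Convert a rate in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


# Rates taken from the list above; the replication value is the observed ~50 MB/s.
measurements_mb_s = {
    "iperf, VM to VM, different hosts": gbit_to_mbyte_per_s(8.7),
    "iperf, VM to VM, same host": gbit_to_mbyte_per_s(1.8),
    "Veeam replication (observed)": 50.0,
}

# Use the cross-host iperf result as the reference for "what the network can do".
reference = measurements_mb_s["iperf, VM to VM, different hosts"]

for label, rate in measurements_mb_s.items():
    share = rate / reference * 100
    print(f"{label}: {rate:7.1f} MB/s ({share:5.1f}% of cross-host iperf)")
```

By this rough estimate the replication uses only a few percent of what iperf shows the cross-host network path can carry, which suggests the raw network between the hosts is not the limiting factor.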

The next step is to open a VMware support case. But I want to make sure we've covered everything that could be Veeam related.

So any ideas what I've missed?

Thank you very much!
Karl Widmer
IT System Engineer

vExpert 2017-2024
VMware VCP-DCV 2023 / VCA6-DCV / VCA5-DCV / VCA5-Cloud / VMUG Leader
Former Veeam Vanguard / VMCE v9 / VMTSP v9 / VMSP v9
Personal blog: https://www.driftar.ch
Twitter: @widmerkarl
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Slow Cross Host Replication

Post by foggy »

Hi Karl, what are the bottleneck stats for the replication jobs? What transport mode is used to read the source VM data and populate replica VM data to the target datastore?
NightBird
Expert
Posts: 245
Liked: 58 times
Joined: Apr 28, 2009 8:33 am
Location: Strasbourg, FRANCE
Contact:

Re: Slow Cross Host Replication

Post by NightBird »

Probably ESX6.5 with nbd (ssl) ;)

Re: Slow Cross Host Replication

Post by widmerkarl »

Hello,

All proxies involved are running in Virtual Appliance mode. At least in the recent tests, the bottleneck was always Target at about 99%.

At the moment I'm still testing and I'll keep you posted.

Thank you!

Re: Slow Cross Host Replication

Post by widmerkarl »

I've been thinking about using preferred networks, but apparently that doesn't work. Well, I'm not 100% sure how to set this up.

We've added a separate "vMotion" network; we just gave it that name and enabled vMotion on the VMkernel adapter for this network. But Veeam doesn't recognize it, probably because the backup server doesn't have a connection to this network. It's just between the ESXi hosts.

I'm still at about 40-50 MB/s. No improvement so far.

Cheers,
Karl

Re: Slow Cross Host Replication

Post by foggy »

Bottleneck "Target" means the issue is the speed at which the target datastore is populated. Please check whether the correct proxy server (the one running on the target host) is actually being selected in the job and that it uses hotadd. To do that, open the job session, select the particular VM in the list on the left and look for the proxy name (and the transport mode tag after it) in the right pane.

Re: Slow Cross Host Replication

Post by widmerkarl »

Hello

I did some testing as requested by Veeam support. Support asked me to run upload and download tests with a VMDK file.

The backup server runs as a VM on ESX1. The VMDK file was about 10.7 GB. I connected through the vSphere client directly to the ESXi servers, not through vCenter.

1.) Download from ESX1 to backup server: 54sec / ca. 1553 Mbit/s
2.) Upload from backup server to ESX1: 5min 50sec / ca. 239 Mbit/s
3.) Upload from backup server to ESX2: 5min 55sec / ca. 236 Mbit/s
4.) Download from ESX2 to backup server: 1min 44sec / ca. 806 Mbit/s
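
As a sanity check on the units, the rates can be recomputed from the file size and the elapsed times. This is a minimal sketch, assuming the file is exactly 10.7 GB in decimal units (the post only says "about 10.7 GB"); the helper name `rates` is just illustrative. Under that assumption, the four reported figures come out within a few percent of the computed values in megabits per second, and the equivalent MB/s values are one eighth of that.

```python
FILE_SIZE_GB = 10.7  # assumption: the post says "about 10.7 GB"


def rates(size_gb: float, seconds: float) -> tuple[float, float]:
    """Return (MB/s, Mbit/s) for a transfer of size_gb gigabytes in the given time."""
    megabytes = size_gb * 1000          # decimal units: 1 GB = 1000 MB
    mb_per_s = megabytes / seconds
    return mb_per_s, mb_per_s * 8       # 8 bits per byte


# Elapsed times taken from the four tests listed above.
tests = [
    ("Download ESX1 -> backup server", 54),
    ("Upload backup server -> ESX1", 5 * 60 + 50),
    ("Upload backup server -> ESX2", 5 * 60 + 55),
    ("Download ESX2 -> backup server", 1 * 60 + 44),
]

for label, seconds in tests:
    mb_s, mbit_s = rates(FILE_SIZE_GB, seconds)
    print(f"{label}: {seconds:4d} s -> {mb_s:6.1f} MB/s = {mbit_s:7.1f} Mbit/s")
```

On this estimate the uploads to either host run at only about 30 MB/s, noticeably slower than the downloads.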

Previously I ran the same tests with the vSphere client connected through vCenter. There the upload and download speeds were much slower than when connected directly to the ESXi servers.

I'm now waiting for an answer from support. It's probably not a Veeam-related issue, but rather something to look for in the VMware world.

Re: Slow Cross Host Replication

Post by widmerkarl » 1 person likes this post

Hello guys,

I'm sorry for the delay. I was on holiday for the last two weeks, so the support ticket was closed. But we were finally able to solve the problem.

We created a separate network (different IP range than the LAN and the vSphere management network) and installed two new VMs to use as Veeam proxies (Win 7 64 Bit / 4 cores / 8 GB RAM). These two proxies are running in "Virtual Appliance" mode (hotadd) with failover to network mode if necessary.

Now the replication runs fast, with processing rates of about 700-800 MB/s and disk read speeds within the tasks of up to 920 MB/s.

So in my eyes that's fine and I'm happy now. Next step: scheduling ;-)