Host-based backup of VMware vSphere VMs.
DavidAno
Influencer
Posts: 10
Liked: never
Joined: Apr 28, 2022 2:20 pm
Full Name: David Ano
Contact:

Veeam Backup Job Performance Issues

Post by DavidAno »

Hello Everyone,

I am having an issue with Veeam backup performance. It lists our bottleneck as Target for every backup job, and our backup jobs are taking way longer than expected. We just built out an entirely new Veeam/backup infrastructure a few weeks ago and want to see if anyone has any ideas on how we can improve our speed. Here is an overview of our setup:

We have 4 separate physical locations across the US. Sites 1-3 are production sites, and site 4 is a dedicated DR site. All sites are connected with IPSec VPN Tunnels.

Each site has a Dell R740XD2 server with the following specs:
- Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
- 4 x 32GB DDR4 ECC Ram
- 24 x 18TB storage drives
- 2 x Samsung 256GB SSD boot drives
- Dual 10G NIC cards

Each of these 4 servers is running TrueNAS Core with all of the storage drives in a single pool (2 x RAIDZ2 VDEVs) totaling about 300TB of useable space.

Our Veeam server is running as a VM on a separate host at the DR (4th) site. Each of the 3 production sites has a Veeam Proxy / WAN Accelerator VM set up so that the backups can process locally (and then subsequently be copied offsite).

Here is an example of the issue I am facing. A job that processes about 15 VMs took 23 hours and 43 minutes to run an INCREMENTAL backup. The job had a processing rate of 17MB/s, which I feel is really slow for an environment that is all SSD storage and 10G network. It lists the Target as the bottleneck; however, when I look at the TrueNAS metrics I barely see any utilization.
- CPU usage in TrueNAS is always below 10% (most of the time sitting at 0%)
- RAM Usage is below 70%
- The 10G NIC is hovering between 1-7MiB/s inbound traffic when processing backups
- Disk Latency is below 10ms during backup
- Disk Busy is below 1% on average during backup
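
As a rough sanity check on the figures above, the sketch below multiplies the reported processing rate by the job duration and compares that rate with the nominal speed of a 10G link. It assumes 1 MB = 10^6 bytes and that the 17MB/s rate held steady for the whole run; the job report confirms neither, so treat the result as an order-of-magnitude estimate only.

```python
# Rough arithmetic on the figures reported in the post above.
# Assumptions (not confirmed by the job report): 1 MB = 10**6 bytes,
# and the 17 MB/s rate held steady for the whole run.

RATE_MB_S = 17                      # processing rate shown by the job
DURATION_S = 23 * 3600 + 43 * 60    # 23 h 43 min
LINK_GBIT_S = 10                    # nominal NIC speed


def data_processed_tb(rate_mb_s: float, seconds: int) -> float:
    """Data processed at a constant rate, in TB (10**12 bytes)."""
    return rate_mb_s * seconds / 1_000_000


def link_capacity_mb_s(gbit_s: float) -> float:
    """Nominal line rate in MB/s, ignoring protocol overhead."""
    return gbit_s * 1000 / 8


processed = data_processed_tb(RATE_MB_S, DURATION_S)
share = RATE_MB_S / link_capacity_mb_s(LINK_GBIT_S)
print(f"~{processed:.2f} TB processed, {share:.1%} of nominal line rate")
```

Under those assumptions the job handled roughly 1.45 TB while using well under 2% of what the link could nominally carry, which is consistent with the near-idle NIC graphs.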

A few notes about my setup
- The repositories are set up using an NFS connection
- My backups are in reverse incremental mode
- Compression level is set at optimal
- Backup proxy is on a different VLAN than the hosts being backed up, but if that was causing an issue I would assume it would list Proxy or Network as the bottleneck


I am beating my head against the wall trying to figure out why TrueNAS shows practically no utilization, but Veeam is listing it as 99% of the bottleneck of the job.

Maybe I am just over-estimating the speed at which these backups are able to process. I know reverse incremental is slower, but almost 24 hours to run an incremental backup seems insane. I don’t know if I need to change the repository to SMB instead of NFS as a test? I have seen people saying to NEVER use SMB because it can cause corrupt backups, but other people say NFS is not developed well and can lead to performance issues. Any input or troubleshooting steps would be greatly appreciated.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Veeam Backup Job Performance Issues

Post by Gostev »

Try backing up one small VM to a repository created on a Windows or a Linux server located in the same site with TrueNAS. If it is also as slow, then it would indicate an issue with networking. Otherwise, you will confirm it's with TrueNAS and will need to open a support case with the vendor.
DavidAno
Influencer
Posts: 10
Liked: never
Joined: Apr 28, 2022 2:20 pm
Full Name: David Ano
Contact:

Re: Veeam Backup Job Performance Issues

Post by DavidAno »

Hello Gostev,

I just set up a repository on the Proxy server for that site, and the backup processed a 16GB VM in 3 minutes (rate of 504MB/s). Since the proxy server is on the same host as the VM that was being backed up, I'm not sure that rules out networking issues, but I will keep trying.
tyler.jurgens
Veeam Legend
Posts: 290
Liked: 128 times
Joined: Apr 11, 2023 1:18 pm
Full Name: Tyler Jurgens
Contact:

Re: Veeam Backup Job Performance Issues

Post by tyler.jurgens »

Out of curiosity, why TrueNAS instead of building that repo as a Veeam Hardened Repository? You could use mdadm to create your raid 60, format it appropriately with XFS and use that as repository directly rather than over any kind of file share (NFS or SMB).
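
For anyone who wants to see what that suggestion could look like in practice, here is a dry-run sketch that only prints the commands and never executes them. Everything specific in it is an assumption: the device names (/dev/sdb through /dev/sdy, /dev/md0 to /dev/md2) are hypothetical placeholders, the 2 x 12-disk split is just one way to lay out 24 drives, and the mkfs.xfs options are the reflink-enabled ones commonly used for Veeam fast clone. Check each command against your own hardware before running anything.

```python
# Dry-run sketch: builds the commands but never executes them.
# All device names are hypothetical placeholders.
import shlex


def raid6(md: str, members: list[str]) -> list[str]:
    """mdadm command for one RAID 6 leg."""
    return ["mdadm", "--create", md, "--level=6",
            f"--raid-devices={len(members)}", *members]


def stripe(md: str, legs: list[str]) -> list[str]:
    """mdadm command that stripes (RAID 0) across the RAID 6 legs."""
    return ["mdadm", "--create", md, "--level=0",
            f"--raid-devices={len(legs)}", *legs]


def mkfs_xfs(device: str) -> list[str]:
    """XFS with reflink enabled, which fast clone relies on."""
    return ["mkfs.xfs", "-b", "size=4096", "-m", "reflink=1,crc=1", device]


# 24 hypothetical data disks: /dev/sdb ... /dev/sdy
disks = [f"/dev/sd{c}" for c in "bcdefghijklmnopqrstuvwxy"]

plan = [
    raid6("/dev/md0", disks[:12]),
    raid6("/dev/md1", disks[12:]),
    stripe("/dev/md2", ["/dev/md0", "/dev/md1"]),   # RAID 6 + 0 = RAID 60
    mkfs_xfs("/dev/md2"),
]

for cmd in plan:
    print(shlex.join(cmd))
```

Printing the plan first makes it easy to review the exact member disks of each leg before anything touches the drives.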
Tyler Jurgens
Veeam Legend x2 | vExpert ** | VMCE | VCP 2020 | Tanzu Vanguard | VUG Canada Leader | VMUG Calgary Leader
Blog: https://explosive.cloud
Twitter: @Tyler_Jurgens BlueSky: @tylerjurgens.bsky.social
rennerstefan
Veeam Software
Posts: 628
Liked: 146 times
Joined: Jan 22, 2015 2:39 pm
Full Name: Stefan Renner
Location: Germany
Contact:

Re: Veeam Backup Job Performance Issues

Post by rennerstefan »

If in your test the VM and the proxy were on the same host, you most likely used hot-add mode, whereas I’m not sure what was used in your initial backup. Hot-add means that there is no network traffic from source VM to proxy but only from proxy to repo, which usually is the fastest of the VMware backup options outside SAN-based backup.
In your initial setup, can you explain what machine runs the proxy role and where it sits? I guess it’s either network (like what Gostev said) or some kind of proxy/repo misconfig.
It’s very hard to “troubleshoot” via forum without a full picture. So the more details you provide the better and you can always have a ticket with Veeam support to have a look.
Stefan Renner

Veeam PMA
DavidAno
Influencer
Posts: 10
Liked: never
Joined: Apr 28, 2022 2:20 pm
Full Name: David Ano
Contact:

Re: Veeam Backup Job Performance Issues

Post by DavidAno »

tyler.jurgens wrote: Feb 15, 2024 7:29 pm Out of curiosity, why TrueNAS instead of building that repo as a Veeam Hardened Repository? You could use mdadm to create your raid 60, format it appropriately with XFS and use that as repository directly rather than over any kind of file share (NFS or SMB).
I would gladly switch to a Linux repository if it would resolve these performance issues. I just went with TrueNAS because it is what I am familiar with for network storage. That being said I manage several Linux web servers and am comfortable working with them.

Any guides or pointers on how to set up a Linux OS to serve as the repo? Does Veeam have an OS built for this? If not, is there any particular distro that I should be looking for?
DavidAno
Influencer
Posts: 10
Liked: never
Joined: Apr 28, 2022 2:20 pm
Full Name: David Ano
Contact:

Re: Veeam Backup Job Performance Issues

Post by DavidAno »

rennerstefan wrote: Feb 15, 2024 7:35 pm If in your test the VM and the proxy were on the same host, you most likely used hot-add mode, whereas I’m not sure what was used in your initial backup. Hot-add means that there is no network traffic from source VM to proxy but only from proxy to repo, which usually is the fastest of the VMware backup options outside SAN-based backup.
In your initial setup, can you explain what machine runs the proxy role and where it sits? I guess it’s either network (like what Gostev said) or some kind of proxy/repo misconfig.
It’s very hard to “troubleshoot” via forum without a full picture. So the more details you provide the better and you can always have a ticket with Veeam support to have a look.
I didn't want to provide TOO much detail in the initial post, but glad to elaborate.

VBR server is located at Site 4 (DR site).

Backup example used takes place at site 2
Production server is running VMware and all SSD storage W/ 10G networking. VMware management is on VLAN 7
The repository details are covered above, and is on VLAN 8
VBR Proxy server is set up at site 2 running as a VM on the Production server. The proxy server is on VLAN 8 along with the repository. The backup did use Hot-add mode
The 10G switch that all of these devices are connected to serves as the gateway and router between VLANs, so even though in theory no traffic should be going over it as long as hot-add mode is used, I figured I would mention that it should not be a firewall issue.
tyler.jurgens
Veeam Legend
Posts: 290
Liked: 128 times
Joined: Apr 11, 2023 1:18 pm
Full Name: Tyler Jurgens
Contact:

Re: Veeam Backup Job Performance Issues

Post by tyler.jurgens » 1 person likes this post

DavidAno wrote: Feb 15, 2024 7:49 pm I would gladly switch to a Linux repository if it would resolve these performance issues. I just went with TrueNAS because it is what I am familiar with for network storage. That being said I manage several Linux web servers and am comfortable working with them.

Any guides or pointers on how to set up a Linux OS to serve as the repo? Does Veeam have an OS built for this? If not, is there any particular distro that I should be looking for?
Ubuntu 22.04 works great. It's not the only OS you can use, but I've not had issues with it.
https://www.veeam.com/blog/installing-u ... itory.html
Tyler Jurgens
Veeam Legend x2 | vExpert ** | VMCE | VCP 2020 | Tanzu Vanguard | VUG Canada Leader | VMUG Calgary Leader
Blog: https://explosive.cloud
Twitter: @Tyler_Jurgens BlueSky: @tylerjurgens.bsky.social
karsten123
Service Provider
Posts: 370
Liked: 82 times
Joined: Apr 03, 2019 6:53 am
Full Name: Karsten Meja
Contact:

Re: Veeam Backup Job Performance Issues

Post by karsten123 »

and please use a decent RAID controller. There is a sizing guide from Hannes in the Veeam blogs as well.
a-tome
Influencer
Posts: 10
Liked: never
Joined: Jul 18, 2020 8:13 pm
Full Name: Jan Levin
Contact:

Re: Veeam Backup Job Performance Issues

Post by a-tome »

Nice specs on your backup server. Are you using standard packet sizes (1500-byte MTU) or jumbo frames? I realize the bottleneck is identified as the target; I am just ruling out retransmits and packet resizing before digging into the server itself.

You have a 420TB drive in a single drive pool, so you are not taking advantage of splitting the entire pool across multiple processors. While your storage server may be able to handle 18 or 19 threads of backup data, your disk pool may be running on only one. If you look at queuing on your disks, what do you see? What is the commit time? I think both are available through Windows Task Manager (advanced).

Someone remarked that you could use Linux rather than an NFS share through TrueNAS. The question that really needs to be addressed is what you require for a "mount" server. If all of your local VMs are Linux, then switching to a Linux backup repository makes sense. However, if they are all Windows, then you need a Windows host as a mount server. If you have a Windows VM that is a proxy on the VMware host, then it can be your mount server, and switching to Linux will yield performance benefits compared with sharing NFS through TrueNAS.

You mentioned that the CPU utilization was really low. That is not necessarily surprising. If you were performing an initial backup, then all of the stats are disconcerting. If you are performing an incremental, then you get low bandwidth use, low CPU and low bytes/sec in your stats, because Veeam computes stats based on the bytes moved rather than the total bytes of the source size.

Are you scanning the backup data with antivirus? Are you scanning the destination repository with antivirus? Are the scans configured for whenever the file is accessed, or just read and write? The first two will scan every block you write, and the whole file every time a block is written. Security of your backups is a real-world concern, so I will not recommend a "fix". I will just point to the behavior as a potential cause.

The last thought / question is how long ago did you create your storage pool? If it has been less than two weeks, then check the stats on the state of the pool. You may be writing data to a pool that is still in the process of formatting. Good luck.
Servior
Novice
Posts: 6
Liked: 1 time
Joined: May 19, 2021 7:11 am
Contact:

Re: Veeam Backup Job Performance Issues

Post by Servior »

Veeam bottleneck detection is not always accurate. Showing target as the bottleneck can be misleading.

What backup mode is used?
Are Hosts, VMs, Proxies and repos in the same subnet or different ones?
Are you sure Veeam uses the correct proxies?

I had such issues before: the traffic was routed through a firewall with 1) only 1 Gbps bandwidth and 2) all security features enabled.

At other customers, we had several sites as well, with Veeam configured in auto-detect mode for proxies. Veeam used the proxy from site B to process VMs from site A and store the data on site A, which is a major bottleneck. Setting the correct proxy as fixed solved it.
Aniz Daher
Lurker
Posts: 1
Liked: never
Joined: Nov 14, 2022 10:51 am
Full Name: Aniz Daher
Contact:

Re: Veeam Backup Job Performance Issues

Post by Aniz Daher »

I would change this configuration:
- The repositories are set up using an NFS connection
- My backups are in reverse incremental mode
NFS over spinning disks and reverse incremental backups. Change it to forward incremental and compare.
Entropy
Novice
Posts: 7
Liked: 4 times
Joined: Nov 03, 2020 1:29 pm
Full Name: Ryan
Contact:

Re: Veeam Backup Job Performance Issues

Post by Entropy » 1 person likes this post

With RAIDZ you get roughly 1 disk's worth of IOPS per vdev.

Run FIO local and over the network against your array and see if the results pass the sniff test:
https://www.veeam.com/kb2014
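
A minimal sketch of how the two runs could be set up follows. It only builds and prints the fio command lines. The 33/67 read/write mix and the 512k block size are my stand-ins for reverse incremental I/O and may not match the exact parameters in the KB article, and both test file paths are hypothetical placeholders. The ioengine and direct options are left at fio defaults so the same line works on different platforms.

```python
# Builds (but does not run) an fio command line for a mixed random workload.
# Assumptions: the 33/67 read/write mix and 512k block size are stand-ins for
# reverse incremental I/O; the test file paths are hypothetical placeholders.
import shlex


def fio_reverse_incremental(test_file: str, size: str = "10g",
                            runtime_s: int = 60) -> list[str]:
    """Return the argv list for one fio run against test_file."""
    return [
        "fio",
        "--name=reverse-incremental",
        f"--filename={test_file}",
        f"--size={size}",
        "--rw=randrw",          # random mixed reads and writes
        "--rwmixread=33",       # about one read for every two writes
        "--bs=512k",
        f"--runtime={runtime_s}",
        "--time_based",
        "--group_reporting",
    ]


# Same workload against a local path on the NAS and against the NFS mount.
for path in ("/mnt/tank/fio.test", "/mnt/nfs-repo/fio.test"):
    print(shlex.join(fio_reverse_incremental(path)))
```

Running the identical workload locally and over NFS separates what the pool can do from what the network path and NFS add on top.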

Remember to divide by 3 for reverse incremental. With a single mirror in your pool, after dividing by 3, you should net about 2/3 of 1 disk of IOPS.

If you must stick with reverse incremental, you may need to go with a RAID 10 equivalent, because I think you need to more than double or triple your performance, and striping across more RAIDZ vdevs will only take you so far (e.g. you could do six 4-drive RAIDZ1 vdevs in a pool to triple your IOPS).
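
The rule of thumb above can be sketched as a back-of-the-envelope model. It assumes each RAIDZ vdev delivers about one disk of random IOPS, that vdevs in a pool add up linearly, and that reverse incremental costs 3 I/Os per backup block. DISK_IOPS is a hypothetical figure for one spinning disk, not a measurement from this setup.

```python
# Back-of-the-envelope model of the rule of thumb above.
# Assumptions: each RAIDZ vdev delivers about one disk of random IOPS,
# vdevs in a pool add up, and reverse incremental costs 3 I/Os per block.
# DISK_IOPS is a hypothetical figure for one spinning disk, not a measurement.

DISK_IOPS = 150
IOS_PER_BLOCK_REVERSE = 3


def effective_block_rate(vdevs: int, disk_iops: float = DISK_IOPS,
                         ios_per_block: int = IOS_PER_BLOCK_REVERSE) -> float:
    """Backup blocks per second the pool can absorb under this model."""
    return vdevs * disk_iops / ios_per_block


for label, vdevs in [("2 x 12-disk RAIDZ2", 2),
                     ("4 x 6-disk RAIDZ1", 4),
                     ("6 x 4-disk RAIDZ1", 6)]:
    print(f"{label}: ~{effective_block_rate(vdevs):.0f} blocks/s")
```

The absolute numbers depend entirely on the assumed per-disk figure; the useful part is the ratio, where going from 2 vdevs to 6 triples the estimate.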

XFS gets you fast clone, we use forever forward with GFS restore points for retention and weekly synthetic fulls. Without fast clone, the synthetic fulls would take too long. This is on 18 drives in a RAID 60.
DavidAno
Influencer
Posts: 10
Liked: never
Joined: Apr 28, 2022 2:20 pm
Full Name: David Ano
Contact:

Re: Veeam Backup Job Performance Issues

Post by DavidAno »

I was able to resolve this by changing the configuration of my TrueNAS Repository. Specifically I changed the following:

- Changed the pool to consist of 4 x vdevs of 6 disks in RAIDZ1 (previously it was 2 x vdevs of 12 disks in RAIDZ2)

- Turned off Compression on the Storage Pool

- Turned off Sync option on the Storage Pool



Now instead of getting sub-20MB/s processing rates I am getting between 80-120MB/s on average.
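
To put the change in perspective, here is a small sketch comparing the two layouts and the before/after rates. It assumes capacity is raw data-disk capacity before any ZFS overhead, and that the earlier 23 h 43 min job would move the same amount of data at the new rates; both are simplifications.

```python
# Compares the two pool layouts and the reported before/after rates.
# Assumptions: capacity is raw data-disk capacity before ZFS overhead,
# and the old 23 h 43 min job would move the same data at the new rates.

DISK_TB = 18


def layout(vdevs: int, disks_per_vdev: int, parity: int) -> dict:
    """Summarise a pool built from identical RAIDZ vdevs."""
    data_disks = vdevs * (disks_per_vdev - parity)
    return {"vdevs": vdevs, "data_disks": data_disks,
            "raw_data_tb": data_disks * DISK_TB}


old = layout(vdevs=2, disks_per_vdev=12, parity=2)   # 2 x RAIDZ2
new = layout(vdevs=4, disks_per_vdev=6, parity=1)    # 4 x RAIDZ1

OLD_RATE = 17                 # MB/s from the original job
OLD_HOURS = 23 + 43 / 60      # 23 h 43 min

for new_rate in (80, 120):
    hours = OLD_HOURS * OLD_RATE / new_rate
    print(f"{new_rate} MB/s: {new_rate / OLD_RATE:.1f}x faster, ~{hours:.1f} h")
print("old:", old)
print("new:", new)
```

Both layouts end up with 20 data disks, so the raw data capacity is unchanged; what changes is twice the number of vdevs, at the cost of one parity disk per vdev instead of two.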