-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 11, 2022 11:50 pm
- Full Name: ARYA IT Support
- Contact:
Packet Loss
Hi,
We frequently see packet loss during Veeam backups. What could be the cause, and how can we fix it? Ticket ID: #02657022
Veeam Version: 11.0.1.1261
Bottleneck: Load: Source 88% > Proxy 68% > Network 41% > Target 32%
Proxy transport mode: Automatic
Proxy concurrent tasks: 8
Proxy count: 1 physical bare-metal server (10G network)
Backup repository concurrent tasks: 1
Job objects: linux servers (vmxnet3)
Thanks.
-
- Veeam Legend
- Posts: 198
- Liked: 55 times
- Joined: Mar 22, 2017 11:10 am
- Full Name: Mark Boothman
- Location: Darlington, United Kingdom
- Contact:
Re: Packet Loss
I assume this is VBR with VMware, rather than agent-based backups?
When does the packet loss occur? Is it when the snapshots are taken and removed?
Is there a reason your repository only has 1 concurrent task?
-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 11, 2022 11:50 pm
- Full Name: ARYA IT Support
- Contact:
Re: Packet Loss
Hi,
Yes, VMware and VBR are used. The packet loss occurs while snapshots are being created and deleted.
Edit: Latency and packet losses increase when working with more concurrent tasks.
Thanks.
-
- VeeaMVP
- Posts: 1007
- Liked: 314 times
- Joined: Jan 31, 2011 11:17 am
- Full Name: Max
- Contact:
Re: Packet Loss
By packet loss, do you mean that some VMs aren't reachable during snapshot creation/consolidation?
If so, it's probably either the load on the VM or a hardware performance limitation. At some point the ESXi host needs to stun the VM, and depending on those factors such a stun can take multiple seconds.
I would check the vmware.log of affected VMs and search for the stun time.
Also to rule out Veeam, try to manually create a snapshot, leave it active for the same time the backup job takes and then delete it.
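To make the log check concrete, here is a minimal Python sketch that pulls stun durations out of one or more log files. It assumes the stun entries contain the text `Checkpoint_Unstun: vm stopped for <N> us`, with the duration in microseconds; the function names and the 1-second threshold are made up for illustration, so adjust the pattern if your logs look different.

```python
import re
import sys
from typing import Iterable, List

# Assumed entry format: "... Checkpoint_Unstun: vm stopped for 4511435 us"
STUN_RE = re.compile(r"Checkpoint_Unstun: vm stopped for (\d+) us")


def stun_seconds(lines: Iterable[str]) -> List[float]:
    """Return every stun duration found in the given lines, in seconds."""
    durations = []
    for line in lines:
        match = STUN_RE.search(line)
        if match:
            # The log reports microseconds; convert to seconds.
            durations.append(int(match.group(1)) / 1_000_000)
    return durations


def long_stuns(lines: Iterable[str], threshold_s: float = 1.0) -> List[float]:
    """Return only the stuns at or above threshold_s (an arbitrary cutoff)."""
    return [d for d in stun_seconds(lines) if d >= threshold_s]


if __name__ == "__main__":
    # Usage: python stun_times.py vmware-0.log vmware-1.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as log:
            for seconds in stun_seconds(log):
                print(f"{path}: stun of {seconds:.3f} s")
```

Running it against each rotated log of an affected VM and comparing the longest stuns with the times the VM was reported unreachable should show whether the two line up.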
-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 11, 2022 11:50 pm
- Full Name: ARYA IT Support
- Contact:
Re: Packet Loss
Sometimes the virtual server cannot be reached during snapshot creation and deletion. When we create and delete snapshots manually to test this, we only rarely see the latency increase.
The servers are detected as down and services are affected because they do not respond to the load balancer's health check.
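As a rough way to reason about this, the sketch below estimates how many consecutive health-check probes a stun of a given length could cause to fail. It is a simplified model, not a description of any particular load balancer: it assumes probes are sent at a fixed interval, that a probe sent during the stun is answered only once the VM resumes, and that the server is marked down after a fixed number of consecutive failures. All parameter names and example values are hypothetical.

```python
import math


def worst_case_failed_probes(stun_s: float, interval_s: float, timeout_s: float) -> int:
    """Upper bound on consecutive probes that time out during one stun.

    Simplified model: a probe sent t seconds into the stun is answered when
    the VM resumes, so it fails only if (stun_s - t) exceeds timeout_s.
    """
    failing_window = stun_s - timeout_s
    if failing_window <= 0:
        return 0
    # Worst case: the first probe is sent right as the stun begins.
    return math.ceil(failing_window / interval_s)


def would_mark_down(stun_s: float, interval_s: float, timeout_s: float, fall: int) -> bool:
    """True if the stun alone could reach `fall` consecutive failures."""
    return worst_case_failed_probes(stun_s, interval_s, timeout_s) >= fall


if __name__ == "__main__":
    # Hypothetical settings: probe every 2 s, 1 s timeout, down after 3 failures.
    for stun in (0.3, 4.5):
        failed = worst_case_failed_probes(stun, interval_s=2.0, timeout_s=1.0)
        down = would_mark_down(stun, interval_s=2.0, timeout_s=1.0, fall=3)
        print(f"stun {stun} s: up to {failed} failed probes, marked down: {down}")
```

If the real health check has a short interval and a low failure threshold, a stun of a few seconds can be enough to take the server out of rotation, so checking those values against the measured stun times is worthwhile.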
When I examine the older vmware-0.log and vmware-1.log files, the stuns appear to last between 0.3 and 4.5 seconds. (Checkpoint_Unstun: vm stopped for 4511435 us)
I could not open the current vmware.log file. (can't open 'vmware.log': Device or resource busy)
What conditions affect the stun duration, and how can it be improved? Is it related to the RAM and CPU of the virtual server, to the I/O performance of the physical disks, or to some other capacity shortfall on the physical servers?
The physical servers have 2 x 16G multipath connections to the SAN switches, so I don't think there is a bottleneck on the 3PAR storage side.
As I said, these problems do not occur every time; they vary a lot. For example, with 3 virtual servers in 1 job, only 1 virtual server is affected while the job runs, and the next day there is no problem.
Thanks.
-
- Veeam Software
- Posts: 1494
- Liked: 655 times
- Joined: Jul 17, 2015 6:54 pm
- Full Name: Jorge de la Cruz
- Contact:
Re: Packet Loss
Hello Arya,
I can see that you have a physical proxy and 3PAR as well, which is all great hardware. Are you by any chance using the Storage Integration, or at least Direct SAN backup? I can see that you left the transport mode on Automatic. The first thing I would try is enabling the Storage Snapshot integration to see if that helps reduce the stun. Are the VMware Tools versions up to date on all VMs?
Jorge de la Cruz
Senior Product Manager | Veeam ONE @ Veeam Software
@jorgedlcruz
https://www.jorgedelacruz.es / https://jorgedelacruz.uk
vExpert 2014-2024 / InfluxAce / Grafana Champion
-
- Novice
- Posts: 6
- Liked: never
- Joined: Oct 11, 2022 11:50 pm
- Full Name: ARYA IT Support
- Contact:
Re: Packet Loss
Hi jorgedlcruz,
You can think of the topology as shown below. The 3PAR is not used for backup storage. Backups are written to disks on the backup server; the repository is direct-attached. (Avg. 1000-1200 MB/s)
https://freeimage.host/i/bJsuou (Link health: https://www.virustotal.com/gui/url/e020 ... ?nocache=1)
VMtools versions seem to be up to date. (2:11.3.0-2ubuntu0~ubuntu20.04.3)
-
- VeeaMVP
- Posts: 1007
- Liked: 314 times
- Joined: Jan 31, 2011 11:17 am
- Full Name: Max
- Contact:
Re: Packet Loss
The stun time is often affected by the load of the VM. If too much is happening there, at some point it needs to be stopped so that the consolidation can occur.
How current is the version of your ESXi hosts? There have been improvements in newer releases.
What Jorge is referring to is the backup via storage snapshots. If you have an Enterprise Plus license or Veeam Universal license, you can use those and reduce the active snapshot time on the VM. If the stuns only occur when the snapshot is growing, this could solve your problem.
https://helpcenter.veeam.com/docs/backu ... ml?ver=110