Host-based backup of VMware vSphere VMs.
lavicky
Influencer
Posts: 17
Liked: 6 times
Joined: Sep 03, 2020 11:06 am
Full Name: Tomáš Lavický
Contact:

Linux Backup Proxies problems

Post by lavicky »

Alongside our Windows backup proxies, we have begun to use Linux backup proxies for replication and backup jobs. We deployed 3 VMs (4 vCPU, 4 GB vRAM, 16 GB disk, VMXNET 3 network adapters) running CentOS 8, Ubuntu 20.04, and Debian 10. We occasionally see two types of problems on all three proxies, across different jobs and VMs. A retry is usually successful.

1. Either "Getting VM info from vSphere" or "Preparing source proxy X for disk Hard disk Y [hotadd]" task fails with "Error: Client not connected."

The VeeamAgentxxxxxxxx-yyyy-zzzz-aaaa-bbbbbbbbbbbb executable file remains in the /tmp directory of the proxy.
Sometimes other tasks in the job are affected and fail as well (despite using a different proxy).

2. Copying of a specific disk hangs with some of these errors:

Error: Unstable connection: unable to transmit data.
Failed to upload disk.
Agent failed to process method {DataTransfer.SyncDisk}.
Exception from server: End of file
Unable to retrieve next block transmission command. Number of already processed blocks: [2664].
Failed to download disk 'Device '\\.\PhysicalDrive2''.

Error: Transmission pipeline hanged, aborting process.

Error: Exception of type 'Veeam.Backup.AgentProvider.AgentClosedException' was thrown.

Error: Connection timed out
read: End of file
Agent failed to process method {Signature.StartReversedSignatureUpdateSession}.
As long as the task hangs, there are three /tmp/VeeamAgent*data directories containing libraries on the proxy.
The copy usually hangs for either about 47 minutes or about 4.5 hours. Several times, though, I had to remove the disk and reboot the proxy via the vSphere console after many hours.
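
To correlate the leftovers with failed tasks, the /tmp/VeeamAgent* entries mentioned above can be listed with their age. This is only a minimal sketch: the function name and the default directory are illustrative assumptions, and nothing here is a Veeam tool.

```python
import time
from pathlib import Path


def find_agent_leftovers(tmp_dir="/tmp", now=None):
    """List VeeamAgent* entries as (name, kind, size_bytes, age_minutes)."""
    now = time.time() if now is None else now
    leftovers = []
    for path in sorted(Path(tmp_dir).glob("VeeamAgent*")):
        try:
            info = path.stat()
        except OSError:
            continue  # entry vanished between listing and stat
        kind = "dir" if path.is_dir() else "file"
        age_minutes = round((now - info.st_mtime) / 60)
        leftovers.append((path.name, kind, info.st_size, age_minutes))
    return leftovers


if __name__ == "__main__":
    for name, kind, size, age in find_agent_leftovers():
        print(f"{kind:4} {size:>12} bytes  {age:>6} min old  {name}")
```

Comparing the modification times with the start times of the failed tasks may show whether each leftover belongs to one specific failure.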

Any ideas, please?

Support Case #04367970
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Linux Backup Proxies problems

Post by foggy »

Hi Tomáš, thanks for posting the case ID for this issue. This is the sort of issue that requires thorough log analysis, so let's see what our engineers come up with after doing that.

Re: Linux Backup Proxies problems

Post by lavicky » 1 person likes this post

Hi Alexander, thanks for your reply. They have already contacted me and I've just added more information, so I'll wait for a solution.
soncscy
Veteran
Posts: 643
Liked: 312 times
Joined: Aug 04, 2019 2:57 pm
Full Name: Harvey
Contact:

Re: Linux Backup Proxies problems

Post by soncscy » 3 people like this post

Heya Tomáš,

>Unstable connection:

I have seen this on __many__ client environments, and it's your firewall. I bet you a pizza.

Do a pcap dump on your proxy, look for the RST packets you'll start getting and you'll see it coming from one of your servers/firewalls (check the TTLs on the packets).

Veeam has a great article on this: https://www.veeam.com/kb2140 But trust me, it's the firewall. I've dealt with this many many many times.
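
The RST check described above could be scripted roughly as follows. This is a sketch under stated assumptions: the capture was written with tcpdump -w in classic pcap format (not pcapng), on an Ethernet link, with IPv4 traffic; the function names are illustrative. It tallies TCP RST packets by source address and TTL, so a reset whose TTL differs from the endpoint's other packets stands out.

```python
import struct
from collections import Counter

RST_FLAG = 0x04  # RST bit in the TCP flags byte


def count_rst_by_source(pcap_path):
    """Tally TCP RST packets in a classic pcap file by (source IP, TTL)."""
    counts = Counter()
    with open(pcap_path, "rb") as capture:
        header = capture.read(24)
        if len(header) < 24:
            raise ValueError("file too short to be a pcap capture")
        magic = header[:4]
        if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
            endian = "<"
        elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
            endian = ">"
        else:
            raise ValueError("not a classic pcap file (pcapng is not handled)")
        (link_type,) = struct.unpack(endian + "I", header[20:24])
        if link_type != 1:
            raise ValueError("only Ethernet captures (link type 1) are handled")
        while True:
            record = capture.read(16)
            if len(record) < 16:
                break
            _, _, captured_len, _ = struct.unpack(endian + "IIII", record)
            frame = capture.read(captured_len)
            if len(frame) < captured_len:
                break  # truncated final record
            key = _rst_key(frame)
            if key is not None:
                counts[key] += 1
    return counts


def _rst_key(frame):
    """Return (source IP, TTL) if the frame is an IPv4 TCP RST, else None."""
    if len(frame) < 14:
        return None
    offset = 14
    (ether_type,) = struct.unpack("!H", frame[12:14])
    if ether_type == 0x8100 and len(frame) >= 18:  # one 802.1Q VLAN tag
        (ether_type,) = struct.unpack("!H", frame[16:18])
        offset = 18
    if ether_type != 0x0800:
        return None
    ip = frame[offset:]
    if len(ip) < 20 or ip[0] >> 4 != 4:
        return None
    header_len = (ip[0] & 0x0F) * 4
    (fragment,) = struct.unpack("!H", ip[6:8])
    # Skip non-TCP packets and non-first fragments (no TCP header there).
    if header_len < 20 or ip[9] != 6 or fragment & 0x1FFF:
        return None
    if len(ip) < header_len + 14:
        return None
    if not ip[header_len + 13] & RST_FLAG:
        return None
    source = ".".join(str(octet) for octet in ip[12:16])
    return source, ip[8]
```

Calling count_rst_by_source("proxy.pcap").most_common() (the file name is hypothetical) lists the busiest RST senders first. Two different TTLs for the same source address would be worth a closer look.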

Re: Linux Backup Proxies problems

Post by lavicky »

Hi Harvey,
thanks for your advice.
But we have a dedicated VLAN for VBR traffic, and vCenter, the vSphere hosts, and the proxies are in the same subnet with no firewall between them.
Sometimes a similar problem occurs on the Windows proxies too, so it could be an intermittent network problem. Still, copying hangs several times a day on the Linux proxies for some tasks, while other tasks running at the same time finish successfully.
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Linux Backup Proxies problems

Post by PetrM »

Hi Tomáš,

Anyway, I would follow Harvey's recommendation. It's a good idea to collect a traffic dump, and I'm pretty sure it will give you a lot to think about.

By the way, you may ask our support engineers to help you collect and examine a network traffic dump.

Thanks!
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Linux Backup Proxies problems

Post by DonZoomik » 1 person likes this post

I have a similar thread here: vmware-vsphere-f24/transient-errors-wit ... 68281.html Most of the problems you describe were fixed by modifying the SSH configuration and changing vNIC buffers. However, some unexplained timeouts are still under investigation.

Re: Linux Backup Proxies problems

Post by lavicky »

PetrM wrote: Sep 07, 2020 2:13 pm
> Hi Tomáš,
>
> Anyway, I would follow Harvey's recommendation. It's a good idea to collect a traffic dump, and I'm pretty sure it will give you a lot to think about.
>
> By the way, you may ask our support engineers to help you collect and examine a network traffic dump.
>
> Thanks!
Hi Petr,
I tried to collect a traffic dump between the VBR server and the proxy, but the tcpdump capture file grew to 5 GB before the first occurrence of the problem. I asked support what exactly I should monitor but have had no answer yet (support case ID #01734676).
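
One way to keep such a capture bounded is to record only packet headers into a rotating set of files, so the disk usage stays fixed until the problem reappears. The sketch below only builds the tcpdump command line. The interface name, peer address, output path, and sizes are placeholder assumptions, and this is not advice from the support case.

```python
import shlex


def build_capture_command(interface, peer_ip, out_file,
                          file_mb=100, file_count=20, snaplen=128):
    """Build a tcpdump command for a size-bounded, headers-only capture."""
    return [
        "tcpdump",
        "-i", interface,         # capture interface on the proxy
        "-n",                    # do not resolve addresses to names
        "-s", str(snaplen),      # keep only the first bytes of each packet
        "-C", str(file_mb),      # start a new file after this many million bytes
        "-W", str(file_count),   # with -C: rotate through this many files
        "-w", out_file,          # write raw packets instead of printing them
        f"host {peer_ip} and tcp",  # only TCP traffic to or from the peer
    ]


if __name__ == "__main__":
    # Hypothetical values; replace with the proxy's NIC and the VBR server address.
    command = build_capture_command("ens192", "192.0.2.10", "/var/tmp/proxy.pcap")
    print(shlex.join(command))
```

With these defaults the capture occupies roughly 2 GB at most, and the oldest file is overwritten first. The returned list can be passed to subprocess.run on the proxy (tcpdump normally needs root), and the capture stopped soon after a task hangs.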

1. The first type of problem is an interrupted upload of C:\Program Files\Veeam\Backup and Replication\Backup\VeeamAgent64 from the VBR server to the Linux proxy. Part of this binary remains in the /tmp directory on the proxy as a VeeamAgent{ID} file. I can see this type of error in the log file on the VBR server: https://docs.google.com/document/d/1L2I ... sp=sharing

The main DC (proxies and production VMs) and the DR DC (VBR server, replicas, and users) are connected via dual leased 10 Gbps circuits. We occasionally experience network problems, but relatively rarely. Our file shares are located in the main DC and users access them from the DR site, and we haven't noticed problems with copying files and so on (including a lot of fairly heavy rsync over SSH).

2. The second type of problem is "frozen disk" with two kinds of behaviour.
a) The task starts and I can see that part of the disk has been read before the task fails.
VBR server log https://docs.google.com/document/d/1FrV ... sp=sharing
Proxy log https://docs.google.com/document/d/1Wuy ... sp=sharing

b) The task starts and nothing is read.
VBR server log https://docs.google.com/document/d/1x8A ... sp=sharing
Proxy log https://docs.google.com/document/d/1Dqq ... sp=sharing

I have no idea why it takes so long (about 47 minutes or about four and a half hours) for a job to fail. Until then, the job cannot be run again.

Re: Linux Backup Proxies problems

Post by lavicky »

DonZoomik wrote: Sep 09, 2020 9:14 pm
> I have a similar thread here: vmware-vsphere-f24/transient-errors-wit ... 68281.html Most of the problems you describe were fixed by modifying the SSH configuration and changing vNIC buffers. However, some unexplained timeouts are still under investigation.
Hi,
thanks for your reply.
Following support's advice, I changed the /etc/ssh/sshd_config settings yesterday as follows:

ClientAliveInterval 300
TCPKeepAlive yes
ClientAliveCountMax 99999
MaxSessions 200
MaxStartups 1000:30:2000

But there have been more failed jobs since then.

Following your suggestion, I've now also modified the network card buffer settings. I'll see if it helps.
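
Since sshd uses the first value it finds for each keyword, a line added near the end of the file can be shadowed by an earlier one. A minimal sketch for checking which value comes first is shown below. The function name is illustrative, and the parser deliberately stops at the first Match block and ignores Include files, so it is only an approximation of what sshd does.

```python
def effective_sshd_settings(config_text, wanted):
    """Return the first value found for each wanted keyword (case-insensitive)."""
    wanted_by_lower = {name.lower(): name for name in wanted}
    found = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        name = wanted_by_lower.get(keyword)
        if name is not None and name not in found:
            found[name] = parts[1].strip() if len(parts) > 1 else ""
    return found


if __name__ == "__main__":
    keywords = ["ClientAliveInterval", "TCPKeepAlive", "ClientAliveCountMax",
                "MaxSessions", "MaxStartups"]
    with open("/etc/ssh/sshd_config") as handle:
        for name, value in effective_sshd_settings(handle.read(), keywords).items():
            print(f"{name} = {value}")
```

On the proxy itself, running sshd -T as root prints the configuration sshd actually uses. The new values only apply after the SSH service has been reloaded or restarted.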
nitramd
Veteran
Posts: 297
Liked: 85 times
Joined: Feb 16, 2017 8:05 pm
Contact:

Re: Linux Backup Proxies problems

Post by nitramd »

Depending on your server hardware you might be able to adjust NIC buffers in a server's BIOS instead of the OS - this way the buffer changes will survive reboots.

Re: Linux Backup Proxies problems

Post by PetrM »

Hello,

@lavicky Thanks for sharing the support case ID with us! Please don't hesitate to direct all questions about collecting or uploading debug information to our support team. We cannot troubleshoot technical issues over forum posts.

Thanks!

Re: Linux Backup Proxies problems

Post by lavicky »

nitramd wrote: Sep 10, 2020 1:42 pm
> Depending on your server hardware you might be able to adjust NIC buffers in a server's BIOS instead of the OS - this way the buffer changes will survive reboots.
Thanks for your advice. However, the server is an ESXi host running other production VMs, so I prefer to change only the proxy settings. So far it has not had much effect.
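
To confirm that a buffer change inside the proxy VM is actually in effect (for example after a reboot), the current ring sizes can be compared with the maximums the driver reports. This sketch assumes the buffers in question are the NIC ring buffers shown by ethtool -g and that the output uses the usual "Pre-set maximums" and "Current hardware settings" layout. The function names are illustrative.

```python
def parse_ring_parameters(ethtool_output):
    """Split `ethtool -g` text into {"maximum": {...}, "current": {...}}."""
    sections = {"maximum": {}, "current": {}}
    section = None
    for line in ethtool_output.splitlines():
        text = line.strip()
        lowered = text.lower()
        if lowered.startswith("pre-set maximums"):
            section = "maximum"
        elif lowered.startswith("current hardware settings"):
            section = "current"
        elif section and ":" in text:
            name, _, value = text.partition(":")
            value = value.strip()
            if value.isdigit():  # skips entries such as "n/a"
                sections[section][name.strip()] = int(value)
    return sections


def rings_below_maximum(ethtool_output):
    """Return {ring: (current, maximum)} for rings smaller than their maximum."""
    parsed = parse_ring_parameters(ethtool_output)
    return {
        name: (current, parsed["maximum"][name])
        for name, current in parsed["current"].items()
        if name in parsed["maximum"] and current < parsed["maximum"][name]
    }
```

The text to feed in is the output of ethtool -g for the proxy's interface. An empty result means every ring is already at its reported maximum, which says nothing about whether larger rings help with the hangs.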

Re: Linux Backup Proxies problems

Post by lavicky » 1 person likes this post

PetrM wrote: Sep 10, 2020 5:00 pm
> Hello,
>
> @lavicky Thanks for sharing the support case ID with us! Please don't hesitate to direct all questions about collecting or uploading debug information to our support team. We cannot troubleshoot technical issues over forum posts.
>
> Thanks!
Hello Petr,
logs and other information are being uploaded to support continuously via https://my.veeam.com/my-cases. We have agreed on a Webex session for tomorrow.
I consider the forum important for finding out whether anyone else has had the same problem.