Host-based backup of VMware vSphere VMs.
DonZoomik
Service Provider
Posts: 372
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Transient errors with Linux hot-add proxy

Post by DonZoomik »

I've noticed that using Linux hot-add proxies results in transient, seemingly random backup/replication failures.

Code: Select all

Debug logs removed by moderator
No support ticket yet, as I'm not quite sure that our network is blameless, but I'm scratching my head a bit about where to look further to provide some hints to support. However, it mostly happens to the proxy that is closest to the backup server (same datacenter, same subnet/VLAN). Remote proxies (another country, over IPsec) have a much lower error rate (but not zero), and SAN proxies see almost no errors. Nothing dramatic (it always recovers on retry), but it's quite annoying as it consistently triggers monitoring alarms.
So far I've applied the SSH changes from https://www.veeam.com/kb2985 and they seem to have reduced at least some of the errors, but not all of them. The KB article is about repositories, but proxies have a lot of SSH connections as well, and it has possibly not been updated yet, as Linux proxies are quite new functionality.

Has anyone noticed anything similar?
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Transient errors with Linux hot-add proxy

Post by HannesK »

Hello,
probably yes... but please open a support case, because us guessing over forum posts will not give you the solution any time soon.

veeam-backup-replication-f2/rules-of-po ... -t755.html

Best regards,
Hannes
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

Yes, I'll create a case.

IMHO redacting any errors or log snippets is not a good policy. For many people, the first instinct is to Google the error messages, and finding some results can sometimes help (at least to know that it's a known issue, or to hint at a possible problem spot). But that's just my opinion.
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

Support case #04296609
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Transient errors with Linux hot-add proxy

Post by tsightler »

Can you share a few more details about your proxy configuration, i.e. what distro, RAM/vCPU, how many tasks you are running per proxy, etc.? Since these are obviously virtual, are you also using VMXNET3 or some other virtual NIC?
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

Debian 10 with UEFI boot, 7 vCPU, 4GB RAM, pvscsi, vmxnet3, 32GB disk (I ran out of disk space for logs with just 16GB). 7 task limit. The doc says that you need number of tasks + 2 vCPUs, but that sounds like a bit of overkill.
Also, the proxy is set to low CPU shares in VMware, so under contention it should get less CPU time. It's usually moot, but occasionally there are CPU peaks on production VMs, so it's there just in case, as we don't have DRS and our load balancing script runs only once a day.
tsightler

Re: Transient errors with Linux hot-add proxy

Post by tsightler »

My first gut feeling would be memory. While 4GB is technically enough for the minimum requirements, it's well below best practice. I saw behavior very similar to what you describe in my early testing with Linux proxies, when the Linux OOM killer would kill the process due to lack of memory. I increased swap but eventually had to increase memory to get the errors to go away. Have you monitored peak memory/swap usage, or checked the logs to see if the OOM killer is getting triggered and killing a veeam process? I'd at least try with 8GB and see if it gets better/goes away.
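For anyone wanting to run the OOM check described above, a minimal sketch follows. It assumes the kernel log still contains the relevant period and that the kernel uses its usual "Out of memory" / "invoked oom-killer" wording; the helper name oom_events is hypothetical.

```shell
# oom_events: reads kernel log text on stdin and prints only the lines
# that look like OOM-killer activity. Exit status is non-zero when
# nothing matched, so it can also be used in an `if`.
oom_events() {
    grep -Ei 'out of memory|oom-killer|oom_reaper'
}

# Typical use on the proxy (as root), shown as comments so the snippet
# itself has no side effects:
#   dmesg -T | oom_events
#   journalctl -k | oom_events
```

An empty result only means that no OOM kill was logged in the retained log window, not that memory pressure can be ruled out.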
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

We don't monitor the proxies, as they're considered almost ephemeral and serve only as support for the SAN proxies. No sign of OOM in the logs. Will increase RAM for testing overnight.
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

Checked midnight job, still seeing the same errors after increasing RAM to 8GB.
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

I also tried reducing tasks to 5 in accordance with minimum requirements (2vCPU base + one per task) but this has not cleared the problem.
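The arithmetic behind that sizing rule, as quoted in this thread (2 base vCPUs plus one per task), can be sketched as below. The function names are hypothetical helpers, not part of any Veeam tooling, and the rule itself is taken from the posts here rather than checked against the current documentation.

```shell
# vCPUs needed for a given number of concurrent tasks:
# 2 base vCPUs + 1 vCPU per task.
vcpus_for_tasks() {
    echo $(( $1 + 2 ))
}

# Largest task limit that fits a given vCPU count under the same rule
# (clamped so it never goes below 0).
tasks_for_vcpus() {
    tasks=$(( $1 - 2 ))
    if [ "$tasks" -lt 0 ]; then tasks=0; fi
    echo "$tasks"
}

vcpus_for_tasks 7   # prints 9: a 7-task limit would want 9 vCPUs
tasks_for_vcpus 7   # prints 5: a 7-vCPU proxy fits 5 tasks
```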
tsightler

Re: Transient errors with Linux hot-add proxy

Post by tsightler »

Unfortunately I don't have any other generic suggestions and I haven't seen this issue in other client deployments. I'd suggest continuing to work with support and hopefully they will find something.
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

While responding to support, I was reminded of an ancient problem with NIC buffer exhaustion. I checked statistics and indeed proxies seem to be running out of vmxnet3 ring buffers. I increased them to max and... let's wait and see.
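A rough way to look for the same symptom is to filter the NIC statistics for buffer-related counters that are non-zero. This is only a sketch: the counter names in the pattern (ring full, pkts rx OOB, rx buf alloc fail, drv dropped) are assumed from typical vmxnet3 output of ethtool -S and may differ between driver versions, and the helper name is hypothetical.

```shell
# nonzero_ring_counters: reads `ethtool -S <iface>` output on stdin and
# prints only buffer/ring related counters whose value is above zero.
# The counter name pattern is an assumption about the vmxnet3 driver.
nonzero_ring_counters() {
    awk 'tolower($0) ~ /ring full|rx oob|buf alloc fail|drv dropped/ && $NF + 0 > 0'
}

# Typical use on the proxy, shown as a comment (ens192 is the interface
# name used elsewhere in this thread):
#   ethtool -S ens192 | nonzero_ring_counters
```

Counters are cumulative since the interface came up, so comparing two runs around a backup window says more than a single reading.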
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik » 1 person likes this post

Still working with support, but the resolution seems to be a combination of modified SSH configuration and buffer changes (results are improving; we're checking other proxies for comparison).
Following https://www.veeam.com/kb2985, we've raised the SSH session count and startups to 250.
I've also modified the interface config:

Code: Select all

iface ens192 inet dhcp
        pre-up /sbin/ethtool -G $IFACE rx 4096 tx 4096 rx-jumbo 4096 rx-mini 1024
Setting rx-mini to max seems to make it not work at all, similar to https://access.redhat.com/solutions/4772921 (paywalled; in short, there's a driver bug if rx-mini is set to max).
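To confirm that the SSH limits actually took effect, something like the following can help. It assumes the "session count and startups" mentioned above correspond to the OpenSSH directives MaxSessions and MaxStartups (the thread does not name them); the helper name ssh_limits is hypothetical.

```shell
# ssh_limits: reads sshd configuration text on stdin (either the output
# of `sshd -T` or the contents of sshd_config) and prints only the
# MaxSessions / MaxStartups lines, matched case-insensitively.
ssh_limits() {
    grep -Ei '^[[:space:]]*(maxsessions|maxstartups)[[:space:]]'
}

# Typical use on the proxy, shown as comments:
#   sshd -T | ssh_limits                 # effective settings (needs root)
#   ssh_limits < /etc/ssh/sshd_config    # what the file says
```

Note that sshd has to be reloaded before changes to the file show up in the effective settings.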
nitramd
Veteran
Posts: 298
Liked: 85 times
Joined: Feb 16, 2017 8:05 pm
Contact:

Re: Transient errors with Linux hot-add proxy

Post by nitramd »

Here are the default buffer settings on my CentOS 7.8 test server VM; NIC is VMXNET3:

Current hardware settings:
RX: 1024
RX Mini: 128
RX Jumbo: 256
TX: 512

In case anyone finds this useful.
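To compare these current values against what the NIC allows, the two sections of ethtool -g output can be lined up side by side. A minimal sketch, assuming the usual "Pre-set maximums" / "Current hardware settings" layout; the helper name ring_headroom is hypothetical.

```shell
# ring_headroom: reads `ethtool -g <iface>` output on stdin and prints
# one line per ring parameter with its current value and pre-set maximum.
ring_headroom() {
    awk '
        /^Pre-set maximums/          { section = "max"; next }
        /^Current hardware settings/ { section = "cur"; next }
        section != "" && index($0, ":") > 0 {
            name  = substr($0, 1, index($0, ":") - 1)
            value = substr($0, index($0, ":") + 1)
            gsub(/[ \t]/, "", value)          # strip padding around the number
            if (section == "max") { max[name] = value; order[++n] = name }
            else                  { cur[name] = value }
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s current=%s max=%s\n", order[i], cur[order[i]], max[order[i]]
        }'
}

# Typical use, shown as a comment:
#   ethtool -g ens192 | ring_headroom
```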
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

The defaults are the same for Debian, and I presume for pretty much anything else, as they are likely the defaults from the upstream kernel.
nitramd

Re: Transient errors with Linux hot-add proxy

Post by nitramd »

I checked an Ubuntu 18.04.4 LTS server and saw the same settings.

RHEL 8 is different though: values of 0 for RX Mini and RX Jumbo, with RX set to 512 and TX to 256. Another RHEL 8 box showed 0 for RX Mini and RX Jumbo and a value of 256 for both RX and TX.
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik » 1 person likes this post

A small update.
The issue has not been fully resolved (though it is much reduced compared to before). We are still experiencing occasional timeouts despite the SSH and buffer changes. I also tried the kernel parameter vsyscall=emulate, but that was specific to v9.5U4.
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik »

A small update again.
The issue has not been resolved and is still under investigation. According to the support engineer, there are a few other cases out there...
DonZoomik

Re: Transient errors with Linux hot-add proxy

Post by DonZoomik » 1 person likes this post

AFAIK this was an issue with sudo elevation, fixed in v11.