The tuned utility is a simple package which allows you to configure default "tunes" using various pre-defined profiles (or you can make a custom profile). Most of these tunes are pretty minor, but some can make a little bit of a difference, depending on the exact profile. For example, the "throughput-performance" profile does things like set the default readahead for block devices (disks) to 4MB, up from the Ubuntu default of 128K. Interestingly, Red Hat seems to default to 4MB readahead. This can improve backup throughput of modes that read from attached block devices, like hotadd or Direct SAN mode, but is unlikely to be a major increase. In my lab I've seen maybe a 15-20% increase in single stream backups with this tune, but not much if you are reading multiple VMs/disks.
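If you just want to check or experiment with the readahead value by itself, blockdev is the quickest way (the device name below is only an example, substitute your own):

Code: Select all
# readahead is reported and set in 512-byte sectors: 256 = 128K, 8192 = 4MB
blockdev --getra /dev/sdb
blockdev --setra 8192 /dev/sdb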
The other thing the throughput-performance tune does is set the CPU governor to performance mode and lock the clocks at maximum frequency (basically the equivalent of the BIOS "performance" settings for the CPU), so be aware that it can use more energy and generate more heat.
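You can check which governor is currently active with something like:

Code: Select all
# show the scaling governor for every core (exact path can vary a bit with the cpufreq driver)
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor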
It also tunes the kernel scheduler to make it less likely to preempt already running tasks, which helps throughput while increasing latency (constantly switching between tasks is less efficient than giving each task a slightly longer time slice and switching less often). It also tunes the memory and swappiness settings toward more typical "server" workloads. In general this tune is likely good for overall throughput, but, other than the readahead tweak, the rest is of pretty minimal benefit for the typical Veeam workload.
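If you're curious exactly what the profile changes, the built-in profiles are just small config files you can read, for example:

Code: Select all
# built-in tuned profiles normally live under /usr/lib/tuned/<profile>/
cat /usr/lib/tuned/throughput-performance/tuned.conf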
Tuned also has a network-throughput profile which mostly just increases the maximum rmem/wmem values, and, honestly, this seems not very useful unless you are at >25Gb speeds, or perhaps in some odd case where the proxy and repo are connected via high speed links with higher latency. I think this tune is a little outdated at this point because modern Linux systems set the rmem/wmem values based on available memory, and, even on my fairly small lab systems, this tune actually sets the "default" values slightly smaller than the out-of-box defaults, which is crazy, although admittedly it does set the maximum size to the absolute maximum value. Personally I've been unable to measure any benefit from this tune.
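Before bothering with this profile it's worth checking what your system already uses out of the box (the tcp values are min/default/max in bytes):

Code: Select all
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem net.core.rmem_max net.core.wmem_max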
Being designed as shared use systems, Linux systems are, for the most part, tuned for fairness between processes, not for absolute maximum performance of any single process or network stream, and that usually does a pretty good job of balancing the load across all of the different Veeam processes. The only tuned profiles I recommend are throughput-performance, or virtual-guest (which includes throughput-performance) if running on a VM, but other than the readahead values I've found very little difference overall.
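Applying a profile is just a couple of commands:

Code: Select all
tuned-adm list
tuned-adm profile throughput-performance
tuned-adm active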
Regarding the rx/tx buffers, my guess is this is referring to the rx/tx buffers allocated to the device driver, which can be read and configured with ethtool:
Code: Select all
# ethtool -g ens192
Ring parameters for ens192:
Pre-set maximums:
RX: 4096
RX Mini: 2048
RX Jumbo: 4096
TX: 4096
Current hardware settings:
RX: 1024
RX Mini: 128
RX Jumbo: 256
TX: 512
You can see in this case that both my RX and TX buffers can be set to a maximum of 4096, but the defaults are TX 512 and RX 1024. Indeed, increasing these can improve throughput on 10Gb and faster networks. Tuning these is easy, just use the following command as an example (this sets both rings to their pre-set maximums; swap in your own interface name):
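Code: Select all
# set both ring buffers to their pre-set maximums (ens192 is my lab interface)
ethtool -G ens192 rx 4096 tx 4096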
If things improve, you can make the changes persistent by just adding these commands to rc.local. Well, there are probably better ways, but that's the "easy" way and should work for all distros, while exactly how to persist ethtool options otherwise varies by distro. You can always Google "make ethtool settings persistent" for your distro to find the preferred distro-specific way.
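For example, a minimal /etc/rc.local along these lines should do it (assuming the ens192 interface from above; just remember the file has to be executable, and on systemd distros the rc-local service may need to be enabled):

Code: Select all
#!/bin/sh
# /etc/rc.local -- re-apply the ring buffer sizes at every boot
ethtool -G ens192 rx 4096 tx 4096
exit 0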
Also, if at all possible, always use jumbo frames. This really increases throughput for speeds of 10Gb or greater, both by giving the system a lot less to do and by increasing the maximum possible efficiency of the TCP streams from 94% to 99%. So many environments ask me about performance but then don't use jumbo frames, even though it's the single biggest thing you can do to improve throughput on >=10Gb networks, as well as for improving NBD mode performance with VMware. Admittedly, do it right: don't mix standard and jumbo frame devices on the same layer-2 network/VLAN, and make sure your equipment supports it, but it's 2021, hopefully everything out there supports it well enough at this point.
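If you want to test it by hand, something like this works (ens192 and the target address are just placeholders, and you'd persist the MTU through your distro's normal network config afterwards):

Code: Select all
# temporarily raise the MTU on the interface
ip link set ens192 mtu 9000
# verify jumbo frames pass end-to-end: 8972 payload + 20 IP header + 8 ICMP header = 9000, -M do forbids fragmentation
ping -M do -s 8972 <repo-or-proxy-IP>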
The only other tuning example I've seen be useful with Linux proxies is to increase the default NFS readahead when using Direct NFS. Linux defaults to a pretty conservative value of 128K, which is probably great for most workloads but not so great for Veeam, which typically reads in 1MB chunks. I've seen performance improve by 30-40% for a single stream VMDK backup just by increasing the readahead to 2MB instead. Since Veeam automatically mounts and dismounts NFS volumes when using Direct NFS, and there doesn't appear to be any way to globally force NFS readahead (maybe somebody can tell me I'm wrong on that), the best way I've found is to use a udev rule to set the readahead on the virtual block device that is automatically created whenever any NFS share is mounted. I just drop the following line into /etc/udev/rules.d/99-nfs-readahead.rules
Code: Select all
SUBSYSTEM=="bdi", ACTION=="add", PROGRAM="/bin/awk -v bdi=$kernel 'BEGIN{ret=1} {if ($4 == bdi) {ret=0}} END{exit ret}' /proc/fs/nfsfs/volumes", ATTR{read_ahead_kb}="2048"
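To sanity check that the rule actually kicks in, reload udev and look at the backing bdi device after a share gets mounted (the /mnt/nfs path is just an example; mountpoint -d prints the device number, which is also the bdi name for NFS mounts):

Code: Select all
# pick up the new rule without rebooting
udevadm control --reload
# after the next NFS mount, this should report 2048
cat /sys/class/bdi/$(mountpoint -d /mnt/nfs)/read_ahead_kb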
Good luck and feel free to share any of your own tips on Linux proxies.