I opened this on Thursday but haven't heard back yet, so I'm posting the details here in the hope that someone has a clue. This literally changed overnight. Anyway:
I have a 3-host 6.7 cluster (7.0 VCSA) with two 1 TB NVMe cards participating in the vSAN datastore. Performance for guests is extremely good, for both reads and writes. Until a couple of days ago, I was getting 400+ MB/sec backing up to my OmniOS ZFS NAS (eight 1 TB spinners in RAID 10). Then backup performance suddenly dropped by a factor of 10. It seems to be related to backing up VMs on the vSAN datastore. I've been trying to isolate the cause, using a CentOS 8 guest with 60GB+ as a test case. I've done the following:
Migrated the guest to a JBOD datastore on the same NAS: the backup runs at 400 MB/sec or so. Migrated it back to vSAN: 20-40 MB/sec. Veeam B&R reports the target as the bottleneck.
While the backup was crawling along, I logged into the Linux backup proxy (I have three, but I hardcoded the test backup job to use one specific proxy). I saw this:
iostat:
Device       tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda         0.90         0.00         8.55          0         85
sda2        0.00         0.00         0.00          0          0
sda3        0.90         0.00         8.55          0         85
sda1        0.00         0.00         0.00          0          0
sds        19.60     20070.40         0.00     200704          0
(sds is the hot-plugged guest drive)
192.168.3.44:/jbod/veeam 3.5T 890G 2.6T 26% /mnt/Veeam/{ade978d1-fae0-49b1-88ce-e883d964b241}
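As a sanity check on the iostat sample above, the sds line can be converted into the units Veeam reports. This is only a minimal sketch, assuming 1 MB = 1024 kB and that awk is available on the proxy; the function names are made up for illustration:

```shell
# Hypothetical helpers for reading an iostat line; not part of sysstat or Veeam.

# Convert a kB/s figure to MB/s (assumption: 1 MB = 1024 kB).
kb_per_s_to_mb_per_s() {
    LC_ALL=C awk -v kb="$1" 'BEGIN { printf "%.1f\n", kb / 1024 }'
}

# Average request size in kB: throughput divided by transfers per second.
avg_request_kb() {
    LC_ALL=C awk -v kb="$1" -v tps="$2" 'BEGIN { printf "%.0f\n", kb / tps }'
}

# Values taken from the sds line: kB_read/s = 20070.40, tps = 19.60
kb_per_s_to_mb_per_s 20070.40
avg_request_kb 20070.40 19.60
```

For the sds sample this works out to about 19.6 MB/s at roughly 1024 kB per request, so the proxy appears to be reading the hot-added disk in 1 MiB requests at about 20 requests per second. That matches the low end of the 20-40 MB/sec job rate.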
I then tried writing a huge block of data to a file there:
[root@veeam-proxy3 ~]# time dd if=/dev/zero bs=1M count=8K of=/mnt/Veeam/{ade978d1-fae0-49b1-88ce-e883d964b241}/FOO bs=1M count=4K conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.08942 s, 705 MB/s
[root@veeam-proxy3 ~]# time dd bs=1M count=8K of=/dev/null if=/dev/sda
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 6.22564 s, 1.4 GB/s
[root@veeam-proxy3 ~]# time dd if=/dev/sds bs=1M count=8K of=/mnt/Veeam/{ade978d1-fae0-49b1-88ce-e883d964b241}/FOO bs=1M count=4K conv=sync
4096+0 records in
4096+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 6.76519 s, 635 MB/s
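One caveat on the dd write numbers above: in GNU dd, conv=sync only pads short input blocks with zeros and does not flush the output, so part of the measured rate may come from the page cache on the proxy. A variant that includes the final flush in the timing could look like the sketch below. It assumes GNU dd; the function name and the scratch path are hypothetical, and the target would need to be pointed at the NFS mount to actually exercise the NAS:

```shell
# Hypothetical helper: write N MiB of zeros to a target file and make dd
# flush the file data before it reports its timing (conv=fdatasync).
flushed_write_test() {
    target="$1"      # file to create on the mount under test
    size_mib="$2"    # number of 1 MiB blocks to write
    dd if=/dev/zero of="$target" bs=1M count="$size_mib" conv=fdatasync
}

# Small example against a scratch file (assumed path); delete it afterwards.
flushed_write_test "${TMPDIR:-/tmp}/dd_flush_test.bin" 16
```

Zeros also compress very well, so if compression is enabled on the ZFS dataset a /dev/zero test may still overstate what real backup data would achieve.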