Discussions specific to the Microsoft Hyper-V hypervisor
JRRW
Enthusiast
Posts: 39
Liked: 17 times
Joined: Dec 10, 2019 3:59 pm
Full Name: Ryan Walker
Contact:

Bottleneck: Source = Storage or Hyper-V?

Post by JRRW »

This is a general question on Hyper-V and on-host bottlenecks.

We have a Server 2019 cluster using CSVs connected via 8Gb FC with MPIO to a Pure Storage all-NVMe ("DirectFlash") array, which makes the read bottleneck basically 8 Gbps (2x 4 Gbps with MPIO). However, our jobs report 'Source' as the bottleneck and don't even hit 1 GB/s.
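
For context, a back-of-the-envelope ceiling for that fabric (a rough sketch in decimal units, ignoring FC encoding overhead, which would lower the real number further) lines up with the ~1 GB/s we never quite reach:

Code: Select all

```python
# Rough theoretical read ceiling for 2x 4 Gb/s FC paths aggregated with MPIO.
# Decimal units; real FC throughput is lower due to 8b/10b encoding overhead.

def gbit_to_mb_per_s(gbit_per_s):
    """Convert a link rate in Gbit/s to MB/s (decimal units)."""
    return gbit_per_s * 1000 / 8

paths = 2
per_path_gbps = 4
fc_ceiling = paths * gbit_to_mb_per_s(per_path_gbps)
print(f"MPIO read ceiling: {fc_ceiling:.0f} MB/s")  # 1000 MB/s, i.e. ~1 GB/s
```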

I'm positive the Pure Storage array and the cluster can read and transfer data at a faster rate, so I started to think it might be something about how Veeam interacts with Hyper-V in general that 'slows' things down - but if it were things like compression, dedupe, etc., the bottleneck would show as 'Proxy' rather than 'Source', correct?

Additionally, we've started to deploy ROBO locations with SAS-SSD RAID arrays, which deliver absolutely stunning on-disk performance.

On every one of our hosts, CPU and memory are underutilized, with no egregious CPU wait times, and all run at a relative memory bandwidth of at least 97% (meaning we populate our DIMMs for maximum memory throughput depending on whether the CPU generation has 4 or 6 channels).

Our repositories never show as the bottleneck, and all of them either have at least an SSD tier or are all-SSD, so they can sustain sequential write speeds of over 900 MB/s - the limiting factor at that point being that most have 2x 10 Gbps networking between them.

Where this hits us hardest is in things like replication jobs, or small backups that shouldn't take long but still run for 5-10 minutes, simply because it takes Veeam literally 2+ minutes just to 'start' the process on the hosts. Reading the CBT data itself seems to go fairly fast - but even then, I feel it 'should' be faster given the hardware in question.
As an example - a RAID-5 SAS-SSD (mixed-use) host that I know can sustain sequential read speeds of over 3 GB/s:

Code: Select all

Hard disk 1 (450 GB) 118.9 GB read at 135 MB/s [CBT]


On the cluster here are two examples that really make me just scratch my head:
PureStorage SAN volume:

Code: Select all

Hard disk 4 (10 TB) 1.3 TB read at 391 MB/s [CBT]

Same cluster but a different SAN - a 3PAR with data on both SAS-SSD and 10k spindles:

Code: Select all

Hard disk 4 (17 TB) 14 TB read at 444 MB/s [CBT]


So in theory the Pure Storage has the fastest disk backend, followed by the 3PAR, with the SAS-SSD RAID roughly on par - yet they show drastically different speeds.
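
Plugging the three excerpts above into a quick sanity check (assuming the log reports sizes in binary GiB/TiB and rates in MiB/s; decimal units would shift the results by under 10%, and the comparison holds either way) shows how long each read phase actually took:

Code: Select all

```python
# Rough elapsed-read-time estimates from the three job excerpts above.
# Assumes sizes are binary (GiB/TiB) and rates are MiB/s, which is an assumption
# about the log's units, not something the log itself states.

MIB_PER_GIB = 1024
MIB_PER_TIB = 1024 * 1024

def read_minutes(size_mib, rate_mib_s):
    """Elapsed read time in minutes for a given data size and average rate."""
    return size_mib / rate_mib_s / 60

sas_ssd  = read_minutes(118.9 * MIB_PER_GIB, 135)  # ROBO host, RAID-5 SAS-SSD
pure     = read_minutes(1.3 * MIB_PER_TIB, 391)    # Pure Storage SAN volume
threepar = read_minutes(14 * MIB_PER_TIB, 444)     # 3PAR, SAS-SSD + 10k spindles

print(f"SAS-SSD: {sas_ssd:.0f} min, Pure: {pure:.0f} min, 3PAR: {threepar:.0f} min")
```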


So is it just that Veeam isn't as 'fast' with Hyper-V? VMware didn't seem to have these sorts of delays in getting things going.

wishr
Veeam Software
Posts: 2345
Liked: 301 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: Bottleneck: Source = Storage or Hyper-V?

Post by wishr »

Hi Ryan,

Could you please take a look at this article on bottleneck analysis and let us know whether it answers your question?

Thanks

JRRW
Enthusiast
Posts: 39
Liked: 17 times
Joined: Dec 10, 2019 3:59 pm
Full Name: Ryan Walker
Contact:

Re: Bottleneck: Source = Storage or Hyper-V?

Post by JRRW »

Not really. Or rather, I understand what it means by 'Source', but it doesn't make sense in terms of real-world implementation.

Again, these source disks are all-flash - they can read sequential data faster than that. That's why I'm wondering whether it behaves differently with an on-host proxy, with the proxy component somehow influencing the overall 'Source' figure.

We mitigate the proxy's impact on the hosts - for example, limiting the number of threads it can use so as not to overload the host CPUs - but that seems (when watching a resource monitor) to be the highest actual impact.

I just feel that 'Source' being purely disk can't be accurate, unless there's substantial disk overhead with Veeam and Hyper-V. When I compare, for example, how well native Hyper-V Replica performs versus Veeam's replication of Hyper-V, I know Hyper-V itself doesn't have much overhead for tasks like that, so it must be the Veeam interaction itself.

PetrM
Veeam Software
Posts: 1202
Liked: 189 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Bottleneck: Source = Storage or Hyper-V?

Post by PetrM »

Hi Ryan,

'Source' means the job spends most of its time reading data blocks. However, it's not easy to say exactly what causes our source data mover to become the "slowest" worker in the data transmission conveyor: it might be something specific to the infrastructure, or simply that all the other components of the conveyor are faster. Also, I wouldn't expect the backup processing rate to equal the hardware's sequential read speed, because backup is a complicated process that involves additional "logical" operations (WinAPI reads, etc.). By the way, I'd expect more random I/O on CBT runs, because only changed blocks are processed.
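
The "slowest worker in the conveyor" idea can be illustrated with a toy sketch (this is not Veeam's actual code, and the percentages below are invented purely for illustration): each component reports how busy it was during the job, and the busiest one is flagged as the bottleneck:

Code: Select all

```python
# Illustrative sketch of bottleneck detection in a backup pipeline:
# the stage with the highest busy percentage is the limiting component.
# The numbers here are made up for the example, not real job statistics.

busy_percent = {
    "source": 83,   # share of job time the source data mover spent reading
    "proxy": 31,    # share spent compressing / deduplicating
    "network": 22,  # share spent transmitting
    "target": 9,    # share the repository spent writing
}

bottleneck = max(busy_percent, key=busy_percent.get)
print(f"Bottleneck: {bottleneck}")  # here: source
```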

I'm wondering what problem you are trying to solve - do the jobs not fit into the backup window?

Thanks!

wishr
Veeam Software
Posts: 2345
Liked: 301 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: Bottleneck: Source = Storage or Hyper-V?

Post by wishr » 1 person likes this post

Hi Ryan,

To add to what Petr said.

Additional network and CPU overhead is expected when using an on-host Hyper-V proxy; this is stated in our user guide. If this is a concern, you may switch to an off-host proxy, especially since you are using a SAN.

In the case of replication, if the issue is the source proxy, the bottleneck would be "proxy" and not "source", regardless of whether it's an on-host or off-host proxy.

Personally, I would recommend raising a support case for a deeper analysis of the situation. I'm sure our tech support engineers can confirm where the bottleneck comes from.

Thanks

JRRW
Enthusiast
Posts: 39
Liked: 17 times
Joined: Dec 10, 2019 3:59 pm
Full Name: Ryan Walker
Contact:

Re: Bottleneck: Source = Storage or Hyper-V?

Post by JRRW » 1 person likes this post

Thanks @wishr - an off-host proxy has been discussed, and we might put one in place when we refresh our compute this summer.

Appreciate the insight, and open dialog.
