Hi all,
One of our clients who uses Veeam on a 1Gbit network between 2 Dell R720 servers is seeing "target" as the bottleneck during replication. Both systems are Dell R720s with plenty of CPU and RAM, a PERC with 1GB cache, and 12x2TB drives in RAID5. (I don't want to get into a discussion on the dangers of RAID5; we are aware, but the client made certain decisions.) It runs vSphere 5.1 with a Windows 2012 guest VM and a local Veeam 6.5 server/proxy VM. Storage is all local and dedicated to just this one server VM and the Veeam proxy. The target system is purely a replication target and no other systems write to it. Veeam has exclusive access and runs 1 job at a time. This particular VM/job processes a 4TB volume.
We get an aggregate processing rate of about 20-80MB/s, with individual VMDKs (2TB each) being replicated at 37-50MB/s with CBT (not the first run, but recurring daily runs) - see image below.
This seems like a pretty low transfer/processing rate. We've contacted Dell, and their storage/server rep is telling us that this configuration definitely can't be the bottleneck. If we do a simple sequential write, we can saturate the 1Gb link and the bottleneck becomes the network.
How is the bottleneck checked/calculated? Is this a realistic speed for 12 spindles on a hardware RAID controller? (I would expect more.)
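For context on the first question: Veeam reports the bottleneck as whichever stage of the data pipeline (source, proxy, network, target) was busy for the largest share of the job. The sketch below is a hypothetical illustration of that idea, not Veeam's actual internals; the function and stat names are assumptions.

```python
# Hypothetical sketch of a stage-based bottleneck metric: each pipeline
# stage reports what percentage of the job time it spent busy, and the
# stage with the highest busy percentage is reported as the bottleneck.

def bottleneck(busy_pct: dict) -> str:
    """Return the stage that was busy for the largest share of the job."""
    return max(busy_pct, key=busy_pct.get)

# Example numbers (invented): if "target" was busy 88% of the time,
# it gets reported as the bottleneck even if the storage itself could
# go faster under a pure sequential-write benchmark.
stats = {"source": 45, "proxy": 12, "network": 30, "target": 88}
print(bottleneck(stats))  # -> target
```

The point is that "target is the bottleneck" only means the target stage waited the least relative to the others, not that the target disks are maxed out.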
Hi Yuki, in fact this may have nothing to do with storage performance, but rather with the processing mode. Can you please check which mode the target backup proxy uses to ingest data into the replica VM? This information is actually available in the log you posted, just a few lines up. Thanks!
Does the PERC have BBWC and is writeback caching enabled? If so, as suggested above, configuring a target proxy on the host so that it can use hotadd might be faster (honestly, I'd recommend having a proxy on the target host in any case). Is the screenshot above from an incremental run (it appears so based on the removal of the restore point)? If so, that's a lot of changed data for a file server (I'm assuming it's a file server based on the name FS01, I guess it could be something else). What's changing so much data?
What speed is the ESXi management network of the target host? When using NBD mode, "target" isn't just referring to the target storage itself, but rather the speed of writing to the target host, which includes the speed of sending data to the management interface on the ESXi host.
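A quick back-of-the-envelope check, assuming a 1Gbit management interface: the observed 37-50MB/s per VMDK is not far off the practical ceiling of such a link. The efficiency factor below is a rough assumption for TCP/IP and NBD overhead, not a measured value.

```python
def link_mb_per_s(gbit: float, efficiency: float = 0.9) -> float:
    """Rough usable throughput of a network link in MB/s.

    `efficiency` is an assumed fudge factor for protocol overhead
    (TCP/IP headers, NBD framing); real-world results vary.
    """
    return gbit * 1000 / 8 * efficiency

# A 1Gbit management link tops out around ~112 MB/s of payload,
# so per-disk NBD writes in the 37-50 MB/s range plausibly point
# at the management-network path rather than the RAID array.
print(round(link_mb_per_s(1.0)))  # -> 112
```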