Is my WAN accelerated copy job on LAN too slow?

Post by **cparker4486** » Nov 22, 2013 11:22 pm

I've been experimenting with the WAN accelerated copy jobs and I've been getting very low processing rates: ~9MB/s. I've tried using different WAN accelerator targets, one virtual (with varying values of RAM and vCPU) and one physical (4GB RAM, quad Xeon 3GHz). The bottleneck is always labeled as "Target WAN" but I don't know by how much or why. As far as I can tell the accelerators are not being taxed. That is, they appear to have a lot of spare capacity when it comes to the amount of data they can process.

I've experimented with different sizes of WAN target cache just to see if that would have an effect, and it hasn't. The test where the cache was only 10GB performed just as well as the ones where the cache was 50GB and 100GB.

In the screenshot below you can see that there appears to be a pretty hard read limit of ~8.6MB/s (I believe the dark green is read). If that's true, I would expect it to result in a "Source WAN" bottleneck.
(The WAN accelerator in the screenshot is a Server 2012 R2 VM with 8GB RAM and 4 vCPU with 100GB cache.)

Post by **cparker4486** » Nov 22, 2013 11:46 pm

Looks like the image is being cut off due to its width. The black line on the graph is at 8.9MB/s.

Post by **tsightler** » Nov 23, 2013 2:19 am

"Target WAN" means that the time is spent waiting on the target to process packets, typically due to I/O time on the global cache. If your global cache is just "spinning disk" then this isn't really an unexpected result; it's actually pretty good. If your global cache is on SSD then I'd expect faster. The status you posted looks like an initial full, is it? That's not exactly the scenario where I'd expect to see huge savings; most of the bandwidth savings come on incremental runs. How fast is your link? Also, is this "Patch 2"?

Post by **cparker4486** » Nov 23, 2013 9:03 pm

Hi tsightler. That's right, this is an initial full. The WAN accelerator is running off an Equallogic which has the capability of much greater than 9MB/s read/write. This is partly why I'm so unsure about the numbers I'm seeing. This is not patch 2 (v7.0.0.715). Should I upgrade before continuing to test?

edit: Just looked at the improvements in R2. Will definitely be upgrading before running the job again.

Post by **tsightler** » Nov 23, 2013 9:45 pm

cparker4486 wrote:The WAN accelerator is running off an Equallogic which has the capability of much greater than 9MB/s read/write.

It's not about MB/s, it's about I/O latency. The WAN accelerator doesn't read/write a lot of data from a MB/s perspective, but it does read/write many small chunks of data so the faster those requests can be serviced the better. If you want to get the best throughput from the WAN accelerator you need fast SSD for the global cache with a low latency profile to be able to service small block read/writes very quickly. If you have both your repository and global cache on the same disks, this is only going to make things even worse, and if you have both the source and target systems backed by the same disks, things are just getting slower.
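As a rough illustration of the latency point: if small requests are serviced with a fixed service time, throughput is capped at roughly block size divided by that time. The sketch below uses made-up block sizes and latencies purely for illustration; they are not measurements of the WAN accelerator or of any particular storage, and the function name is hypothetical.

```python
def latency_bound_mb_per_s(block_kb: float, latency_ms: float, outstanding: int = 1) -> float:
    """Upper bound on throughput (MB/s) when each request of block_kb KB
    takes latency_ms to service, with `outstanding` requests in flight."""
    requests_per_s = outstanding * 1000.0 / latency_ms
    return requests_per_s * block_kb / 1024.0

# Hypothetical numbers for illustration only.
for label, latency_ms in [("spinning disk", 8.0), ("SSD", 0.2)]:
    rate = latency_bound_mb_per_s(block_kb=8, latency_ms=latency_ms)
    print(f"{label}: ~{rate:.1f}MB/s at {latency_ms}ms per 8KB request")
```

Under these assumed numbers the slower device ends up far below its sequential MB/s rating, which is why a high read/write figure on the array says little about small-block performance.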

But yes, R2 may offer some improvements because of some additional optimizations, however, you'll only get so far if you're using spinning media for the global cache due to I/O latency. If you don't have an option for SSD, then using a global cache that's roughly half the size of RAM seems to offer a small boost. Also, the majority of savings come during incremental runs, not full runs. It will be interesting to see your results from R2.

The design of the WAN accelerator is to reduce the amount of data that has to be transferred over slow WAN links, effectively increasing throughput. For customers with links that are less than 40Mbps even spinning media can offer significant increases in performance, but it won't necessarily saturate that link. For example, you're seeing ~72Mbps, so if your real link is slower than that, it's still a benefit. Even if your link is faster than that, if it's shared with other traffic the reduced bandwidth may be desired. If you want to get maximum performance from links more than about 10Mbps, you'll likely need SSD for global cache, but you still won't likely see performance greater than around 40-50MB/s for total throughput, which is still ~4x faster than a 100Mbps link.
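The unit arithmetic behind those figures can be checked with a quick sketch. It assumes 1 byte = 8 bits and ignores protocol overhead; the function name is just for illustration.

```python
def mbytes_to_mbits(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second (1 byte = 8 bits)."""
    return mb_per_s * 8

observed = mbytes_to_mbits(9)   # the ~9MB/s job rate reported in this thread
link_100 = 100 / 8              # a 100Mbps link expressed in MB/s

print(f"9MB/s is about {observed:.0f}Mbps")
print(f"100Mbps is about {link_100:.1f}MB/s")
print(f"40-50MB/s is {40 / link_100:.1f}x to {50 / link_100:.1f}x a 100Mbps link")
```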

Post by **cparker4486** » Nov 23, 2013 11:06 pm

Makes sense regarding the disk latency. And because of that I've been considering putting together a machine suited to be the WAN target. I was thinking some decently powerful quad core CPU, 8GB RAM, single disk drive for OS (Win 8.1), and 2 x 120GB SSDs in RAID 0 for the cache. Pretty good setup?

My outbound max throughput is 35Mbps so getting an effective rate above that is great. The problem I'm having with this initial full is that it takes so long it doesn't get a chance to finish before the next job runs.

I cancelled (disabled) the copy job, deleted the copy job files on disk (via B&R UI), upgraded to R2, and then restarted the job. Screenshot below. Seems no change.
(Also, one correction to make: this WAN accelerator has been running from a 50GB cache. The break near the beginning was me increasing the cache to 100GB and then the job partially failing due to the WAN accelerator service restart. That too made no difference.)

Post by **tsightler** » Nov 23, 2013 11:27 pm

cparker4486 wrote:I was thinking some decently powerful quad core CPU, 8GB RAM, single disk drive for OS (Win 8.1), and 2 x 120GB SSDs in RAID 0 for the cache. Pretty good setup?

I would probably do more RAM, but otherwise good. I'd also highly suggest picking SSDs that are focused on delivering low service times. There may be more overhead in the RAID 0 than in just using a single, high-performance SSD, and it doubles the odds of having one fail.

cparker4486 wrote:My outbound max throughput is 35Mbps so getting an effective rate above that is great. The problem I'm having with this initial full is that it takes so long it doesn't get a chance to finish before the next job runs.

You can always seed the backup copy, or slowly build the size by excluding VMs and then slowly remove a few VMs from the exclude list each day until they're all copied.

Post by **cparker4486** » Nov 24, 2013 9:13 am

...seed the backup copy...

How is that done? Do you mean by using Direct mode first?

And I really like the idea of using exclusions.

Post by **Gostev** » Nov 24, 2013 9:18 am

By using the map backup functionality to point the job to the existing backup file. How you deliver it to the target site is up to you (normally it is an external hard drive).

Post by **cparker4486** » Nov 24, 2013 5:15 pm

Hi Gostev. Thanks for the info.
