Two-way WAN Acceleration with limited SSD

Nov 12, 2015 4:40 pm

So I had a dilemma today over how best to set up the WAN accelerators between two sites. Each site has a dedicated physical Veeam server with a limited amount of SSD (around 300GB free) and a large local disk repository on RAID 6.

Looking at the v8 best practice book (and other articles), it is suggested that the source WAN accelerator of a backup copy job benefits from the fastest IO possible (so in my case SSD), since it does a fair amount of uncompressing and re-compressing on the fly while creating and storing the digests. The amount of space used is about 2% of the VM source files, and this space can't be reserved in the WAN accelerator settings.

It also says that the destination WAN accelerator can use high-IO storage too, but spinning disks are generally fine for a one-to-one pairing (as opposed to a many-to-one configuration). This disk space is reserved according to the size you specify in the WAN accelerator setup (the Global Cache), and I believe a larger reservation helps by storing more cache (although it's not stated where the diminishing returns are likely to kick in, and my discussions with Chris D suggested that only the operating system drives are cached in v8 anyway). Since the cache pre-population feature finds 10 different OSes in my backups, I've chosen 400GB of global cache space.

So when setting this up in one direction between the servers, it's easy to select the correct paths:

One-Way Copy Job
Source Server - Digests: WAN Accelerator on C:\VeeamWAN (SSD)
Destination Server - Global Cache: WAN Accelerator on D:\VeeamWAN (RAID 6 SATA)

However, if you also need to run backup copy jobs in the reverse direction, the source and destination WAN accelerators end up using the wrong paths, since you can only have one WAN accelerator per server, with a single path/cache configuration.

Reverse copy job (equals bad configuration, which will fill up the SSD to 0 bytes remaining)
Source Server - Digests: WAN Accelerator on D:\VeeamWAN (RAID 6 SATA)
Destination Server - Global Cache: WAN Accelerator on C:\VeeamWAN (SSD)

So in order to keep the digests and global cache separate to suit the two-way nature of the backup copies, I ended up doing the following:

Modified Settings for two-way Copy Jobs
Source Server: WAN Accelerator on C:\VeeamWAN (SSD).

mklink /J C:\VeeamWAN\GlobalCache D:\VeeamWAN\GlobalCache

Destination Server: WAN Accelerator on C:\VeeamWAN (SSD).

mklink /J C:\VeeamWAN\GlobalCache D:\VeeamWAN\GlobalCache

This forces a reparse point to redirect the Global Cache onto the larger drive, leaving everything else with max IO on the SSD.

I have now kicked off a pre-population of the cache on both servers, and can immediately see the 400GB cache on the D drive, while the C drive is left with a comfortable amount of working space for Digests and Send/Recv folders.
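For anyone wanting to double-check that the junction really sends the Global Cache to the larger drive, here is a minimal Python sketch. It assumes Python 3.8 or later, where `os.path.realpath` resolves junctions on Windows; the function name `describe_redirect` and the example path are illustrative only and not part of Veeam.

```python
import os
import shutil


def describe_redirect(path):
    """Report where a folder really lives and how much space is free there."""
    real = os.path.realpath(path)      # follows symlinks (and junctions on Windows, 3.8+)
    usage = shutil.disk_usage(real)    # usage of the volume that actually holds the data
    redirected = os.path.normcase(os.path.abspath(path)) != os.path.normcase(real)
    return {
        "path": path,
        "resolves_to": real,
        "redirected": redirected,
        "free_gb": usage.free / 1024 ** 3,
    }


if __name__ == "__main__":
    # Example path taken from the setup above; adjust to your own layout.
    cache_path = r"C:\VeeamWAN\GlobalCache"
    if os.path.exists(cache_path):
        print(describe_redirect(cache_path))
```

If the junction is in place, `resolves_to` should point at the D: path and `free_gb` should reflect the RAID 6 volume rather than the SSD.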

I'll update with further confirmation that this works OK during the backup copies themselves, but hopefully this might help someone else in my situation. Originally I thought the WAN acceleration feature just used a static amount of space defined in the config, so I sized the SSDs accordingly. However, this is not the case: only the Global Cache size is fixed, and the overhead from the Digest processing can be anything up to 2% of the source VMs + 10GB.
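As a rough worked example of that sizing rule, the sketch below adds up 2% of the source VM sizes plus the 10GB overhead and compares it with the free space on the digest volume. It assumes the 2% applies to the combined size of all source VMs in the copy jobs; the function names and the VM sizes are made up for illustration.

```python
def estimate_source_digest_gb(vm_sizes_gb, fraction=0.02, fixed_overhead_gb=10.0):
    """Upper-bound estimate of source-side digest space in GB.

    Assumes the rule of thumb quoted above: 2% of the source VMs + 10GB.
    """
    if any(size < 0 for size in vm_sizes_gb):
        raise ValueError("VM sizes must be non-negative")
    return fraction * sum(vm_sizes_gb) + fixed_overhead_gb


def fits_on_volume(vm_sizes_gb, free_gb):
    """True if the estimated digest space fits in the free space given."""
    return estimate_source_digest_gb(vm_sizes_gb) <= free_gb


if __name__ == "__main__":
    # Hypothetical VM sizes in GB, not taken from a real environment.
    vm_sizes_gb = [260.0, 500.0, 3000.0, 3000.0]
    needed = estimate_source_digest_gb(vm_sizes_gb)
    print(f"Estimated digest space: {needed:.1f} GB")
    print("Fits in 300 GB of SSD:", fits_on_volume(vm_sizes_gb, free_gb=300.0))
```

Unlike the Global Cache, this figure is not capped by any setting, so it grows with every VM added to a copy job.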

Post by **chrisdearden** » Nov 12, 2015 4:53 pm

Thanks for the writeup Rob. I look forward to hearing how the testing progresses.

Post by **Amarokada** » Nov 18, 2015 11:00 am

Well, the update from me is that I've had no problems using reparse points to split the VeeamWAN folders onto separate drives. However, I underestimated the size the digests would reach.

The documentation isn't very clear on this point; it just says to allow for 2% of the source VM sizes. That could be read as 2% of the largest VM only (if the cache area is re-used for each VM), or it could mean 2% of the space the backups of the source VMs take (which would be smaller than the actual VMs). Having seen some of the digest sizes reach 25GB, I suspect that the 2% is of the raw provisioned size of all the VMs in your copy job. If the copy job is based on an original backup job, then this would be 2% of the estimated size that is shown in the VM selection dialog of the original backup job.

I could be wrong, but it seems the digest size is 2% even if the VM uses thin provisioning and has lots of empty space. Is there an easy way to match the GUID folder name in the digest folder with the VM?
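A small Python sketch like the one below can at least show how much space each GUID folder is taking, largest first. It does not map GUIDs to VM names; it only totals file sizes per subfolder. The folder layout (one subfolder per GUID under a digest root such as `C:\VeeamWAN\Digests`) is an assumption, as are the function names.

```python
import os


def folder_size_bytes(path):
    """Total size of all files under path, skipping files that cannot be read."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file removed or locked while the job is running
    return total


def digest_sizes_gb(digest_root):
    """Map each immediate subfolder name to its size in GB, largest first."""
    sizes = {}
    with os.scandir(digest_root) as entries:
        for entry in entries:
            if entry.is_dir():
                sizes[entry.name] = folder_size_bytes(entry.path) / 1024 ** 3
    return dict(sorted(sizes.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    root = r"C:\VeeamWAN\Digests"  # assumed location, adjust as needed
    if os.path.isdir(root):
        for guid, size_gb in digest_sizes_gb(root).items():
            print(f"{guid}  {size_gb:8.2f} GB")
```

Comparing the largest folders against 2% of each VM's provisioned size may be enough to guess which GUID belongs to which VM.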

So it turns out I will need a much bigger SSD! I'm currently at 420GB of digests and I'm expecting this to grow (hopefully not over 1TB).

It isn't really made clear that Veeam will need so much working digest space for backup copy jobs that use the WAN accelerator. I know it's in the docs once you find out the hard way you've run out of space, but I truly believed the cache setting in the WAN acceleration setup would be a hard limit for both source and destination. I guess I've learned a valuable lesson there.

So my next issue will be in trying to pre-seed the next DC to DC copy, since the initial WAN accelerated backup copies have taken much longer than I thought, especially since non-OS drives are not de-duped at all, and those VMs with 3TB VMDKs just block up the whole syncing process for days. Is there a fairly easy way to use an external USB drive that is shipped to the destination to pre-seed the backup copy jobs? I suspect something along the lines of backing up to the USB repository (with encryption of course), shipping the drive, copying the files from USB to the local repository, and importing the backups. The issue might be in linking those backups with the backup copy jobs in the partner DC before kicking off an incremental update.

Cheers
Rob

Post by **PTide** » Nov 18, 2015 11:17 am

Hi Rob,

Actually, SSD is no longer a requirement for the WAN accelerator cache starting with v8. Could you please provide us with a link to the v8 best practice book you mentioned in your first post?

Thank you.

Post by **Amarokada** » Nov 18, 2015 11:42 am

It's not a downloadable book, but the one that was given out at VMworld last month by Veeam. It's just called "Best Practices" Veeam Backup & Replication for VMware v8, September 2015.

On page 92 it says this about source side disk sizing for WAN accelerator:

"The disk I/O pattern on the source WAN accelerator is high and should be taken into account when planning for storage; it is recommended to deploy WAN accelerators on the fastest storage available on that side."

I had originally planned for 300GB of SSD to be used for this, and the only other disk is the RAID 6 (6x6TB drives), which is managed by a hardware RAID card with 1GB cache. I'm currently using that because I had mis-sized the SSD required. During an incremental backup copy job I see this RAID 6 drive with roughly 40MB/sec read and 5MB/sec write and about 33% active time (according to Windows Task Manager). Other than for merging operations, I had tried to spec this volume for sequential reads and writes (which reach 600MB/sec) as a main repository only.

Since backup copy jobs are all processed locally on the server before sending over the WAN, I'm a little disappointed with how slow the incrementals are, all the while the CPU never goes above 13%, memory stays at 39%, and only a few MB of data trickle across the WAN. The only thing that I can see as the bottleneck is the RAID 6 volume that all data now resides on (and even that has large gaps where it sits idle during the job). I'm hoping moving back to SSD for the digests would at least help this, especially during times when the repository is doing merges for other jobs.

A typical incremental backup copy VM has the following stats:

Hard Disk 1 (58.6GB) 5.6 GB read at 43 MB/s (03:31 duration)
57MB transferred over the network, 1.6MB obtained from WAN accelerator cache
Hard disk 2 (200GB) 10.1 GB read at 38 MB/s (05:31 duration)
502.6MB transferred over the network, 0.0 KB obtained from WAN accelerator cache
Busy: Source 87% > Source WAN 9% > Network 0% > Target WAN 98% > Target 50%

Rob

Post by **foggy** » Nov 18, 2015 12:01 pm

Amarokada wrote: So my next issue will be in trying to pre-seed the next DC to DC copy, since the initial WAN accelerated backup copies have taken much longer than I thought, especially since non-OS drives are not de-duped at all, and those VMs with 3TB VMDKs just block up the whole syncing process for days. Is there a fairly easy way to use an external USB drive that is shipped to the destination, to pre-seed the backup copy jobs? I suspect something along the lines of backing up to the USB repository (with encryption of course), shipping the drive, copying the files from USB to local repository, importing the backups. The issue might be in linking those backups with the backup copy jobs in the partner DC before kicking off an incremental update.

If you follow the backup copy job seeding procedure, you should not have any issues.

Amarokada wrote: I'm hoping moving back to SSD for the digests would at least help this, especially during times where the repository is doing merges for other jobs.

Your bottleneck stats show that the real bottleneck is Target WAN, so I'd pay attention to this side of the equation.
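To illustrate how the "Busy" line quoted earlier in the thread points at Target WAN, here is a minimal Python sketch that splits the line into stages and picks the highest percentage. The line format is taken from the stats posted above; the function names are illustrative, and the parsing assumes the format stays exactly like that.

```python
def parse_busy_line(line):
    """Turn 'Busy: Source 87% > Source WAN 9% > ...' into {stage: percent}."""
    _label, _sep, body = line.partition(":")
    stages = {}
    for part in body.split(">"):
        # The stage name may contain spaces, so split on the last space only.
        name, _space, percent = part.strip().rpartition(" ")
        stages[name] = int(percent.rstrip("%"))
    return stages


def busiest_stage(line):
    """Return the stage name with the highest busy percentage."""
    stages = parse_busy_line(line)
    return max(stages, key=stages.get)


if __name__ == "__main__":
    busy = "Busy: Source 87% > Source WAN 9% > Network 0% > Target WAN 98% > Target 50%"
    print(busiest_stage(busy))  # prints "Target WAN"
```

With Source at 87% and Target WAN at 98%, both ends are worth a look, but the target-side accelerator is the busiest stage in this job.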
