Host-based backup of Microsoft Hyper-V VMs.
Amyd80
Novice
Posts: 7
Liked: never
Joined: Dec 29, 2012 1:57 pm
Contact:

Large VHDXs, CBT and backup optimization

Post by Amyd80 »

Hi everybody, we are getting ready to deploy our first Veeam backup installation in the coming weeks, and I have a few questions about configuration details that I couldn't quite figure out from the user manual and the forums. We will have a two-node Hyper-V WS2012 cluster with 8 VMs stored on a SAS SAN, and a few Synology units as iSCSI backup targets. We intend to run Veeam in one of the VMs and give it, as well as the two cluster hosts, access to the backup iSCSI LUNs - if I understood correctly, this is both supported and recommended for best performance, and Veeam will orchestrate access to the LUNs between the backup server VM and the hosts as needed during backup.

Question 1: a couple of the VMs (file servers) will have VHDX drives over 16 TB, possibly up to 30 TB. Are there any issues with Veeam performance on large VHDX files, especially when it comes to CBT, compression and/or deduplication? I know there is a special deduplication setting for such large VMs in 6.5, but is there anything beyond that, for instance guidelines on memory or disk cache space (the hosts have RAID-1 SSDs of only 100 GB)? For that matter, are there any performance issues with CBT in general? The VMs are expected to serve high-bandwidth clients, so what kind of impact does the CBT driver have on performance?

Question 2: to load-balance a bit under normal operating conditions, the VMs will be spread over both cluster nodes via affinity rules, and over two separate CSVs (each owned by the node that also runs its VMs), which are in turn stored on separate LUNs owned by different RAID controllers in the SAN. We also have two Synology units that will be used for daily backups, so in principle we can run at least two backup jobs in parallel with no hardware contention whatsoever. Now, what would be the optimal way to configure a backup policy to use both backup targets in parallel? If it is two separate backup jobs, one per cluster node, what happens when one node fails and the VMs migrate to the other node? If it is a single backup job using the cluster as a container, can one specify different repositories for the VMs, and can Veeam automatically run the backups in parallel?

Thanks for any hints!
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Large VHDXs, CBT and backup optimization

Post by Gostev »

Hi,

1. You are correct, there is a special storage optimization setting in the job for 16+ TB backups, so just make sure you have it enabled.
The CBT driver's performance impact is negligible (a few percent for typical storage). By the way, it can be disabled on a per-host basis, if needed.

2. Since you can only specify one repository per job, two jobs (one for each repository) should be the best approach. Just use the cluster as a container when configuring the jobs. Migrating a VM from one cluster node to another will not affect your backups, of course.

Thanks.

Re: Large VHDXs, CBT and backup optimization

Post by Amyd80 »

Thanks for the quick answer!

Unfortunately, the performance-sensitive VMs are the same ones with the tens-of-terabytes data volumes, so disabling CBT would probably make it impossible to back them up within the normal backup window. Are there any specific CBT performance bottlenecks, in terms of hardware or software configuration, that we can tweak to keep the driver's impact as low as possible? For instance, is the speed of the cluster nodes' system drives the limiting factor, or something else entirely? And am I right in assuming that this small performance hit only occurs when writing to the VMs, i.e. read operations bypass the CBT driver entirely?

On point 2, just to make sure I understood correctly: schedule two simultaneous backup jobs with two different backup targets, both using the entire cluster as the source container, and then use the exclude option to remove one half of the VMs from one job and the other half from the other, right? That makes sense, but one quick follow-up question: since the container is dynamic, does this mean that adding a VM to the cluster at a later date will cause the new VM to be backed up twice automatically, unless we remember to exclude it from one of the jobs?
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Large VHDXs, CBT and backup optimization

Post by Vitaliy S. »

Let me take the second one - if you use a container as the source for your backup jobs, then new objects added to that container will also be backed up.
Amyd80 wrote: We will have a two-node Hyper-V WS2012 cluster, with 8 VMs stored on a SAS SAN
Also, since you're going to back up only a handful of VMs, why not specify these VMs directly in the backup job? In this case, even if one of the Hyper-V nodes hosting these VMs fails, you will still be able to access them through the second Hyper-V node in the cluster.

Re: Large VHDXs, CBT and backup optimization

Post by Gostev »

Amyd80 wrote: Are there any specific CBT performance bottlenecks, in terms of hardware or software configuration, that we can tweak to keep the driver's impact as low as possible? For instance, is the speed of the cluster nodes' system drives the limiting factor, or something else entirely? And am I right in assuming that this small performance hit only occurs when writing to the VMs, i.e. read operations bypass the CBT driver entirely?
Correct, changed block tracking monitors writes only. The only overhead is an occasional extra write operation to flush data on modified blocks to the CBT data files, which are stored on each host's system volume.
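To put the backup-window concern in rough numbers, here is a back-of-envelope sketch. All figures (SAN read throughput, daily change rate) are invented for illustration, not measured Veeam or Hyper-V values; the point is only that without CBT an incremental run must scan the whole 30 TB disk to find changed blocks, while with CBT it reads only the blocks flagged in the change map.

```python
# Back-of-envelope estimate of incremental backup windows with and
# without CBT. Throughput and change-rate figures are assumptions
# for illustration only.

TB = 1024**4                 # bytes per tebibyte

vhdx_size = 30 * TB          # large file-server disk from this thread
daily_change = 0.02          # assume ~2% of blocks change per day
read_speed = 500 * 1024**2   # assume 500 MiB/s sequential read from the SAN

# Without CBT, every incremental pass must read the whole virtual disk
# to discover which blocks changed.
full_scan_hours = vhdx_size / read_speed / 3600

# With CBT, only the blocks flagged in the change map are read.
cbt_read_hours = vhdx_size * daily_change / read_speed / 3600

print(f"full scan : {full_scan_hours:.1f} h")   # roughly 17.5 h
print(f"with CBT  : {cbt_read_hours:.1f} h")    # roughly 0.3 h
```

Even with generous assumptions, a full scan of a 30 TB VHDX dwarfs the CBT driver's few-percent write overhead, which is why disabling CBT on these particular VMs would blow the backup window.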
Amyd80 wrote: On point 2, just to make sure I understood correctly: schedule two simultaneous backup jobs with two different backup targets, both using the entire cluster as the source container, and then use the exclude option to remove one half of the VMs from one job and the other half from the other, right? That makes sense, but one quick follow-up question: since the container is dynamic, does this mean that adding a VM to the cluster at a later date will cause the new VM to be backed up twice automatically, unless we remember to exclude it from one of the jobs?
Correct. This is something you will have to live with if you want to split the same container between multiple backup repositories... ongoing manual management is unavoidable one way or another (whether you add VMs to each job individually, or exclude VMs from container-based jobs instead).
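The double-backup risk can be sketched with plain set logic. The VM names below are made up; the model is simply "a container-based job processes every VM in the container minus its exclusion list":

```python
# Sketch of how two container-based jobs with exclusion lists cover the
# cluster, and what happens when a new VM appears. VM names are invented.

cluster = {"fs01", "fs02", "sql01", "dc01"}

# Each job sources the whole cluster and excludes the other job's half.
job_a_excludes = {"sql01", "dc01"}
job_b_excludes = {"fs01", "fs02"}

def job_scope(cluster_vms, excludes):
    """VMs a container-based job will actually process."""
    return cluster_vms - excludes

a = job_scope(cluster, job_a_excludes)
b = job_scope(cluster, job_b_excludes)
assert a & b == set()          # today, no VM is backed up twice

# A VM is added later and nobody updates the exclusion lists:
cluster.add("new-vm")
a = job_scope(cluster, job_a_excludes)
b = job_scope(cluster, job_b_excludes)
print(sorted(a & b))           # ['new-vm'] -- picked up by both jobs
```

A periodic check like the intersection above (fed from a real inventory instead of hard-coded sets) is one way to catch VMs that drift into both jobs' scope.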

Re: Large VHDXs, CBT and backup optimization

Post by Gostev »

I've split the following conversation into a new topic, as it is completely unrelated and derails this thread.