First of all: sorry for the long text, and thanks in advance for reading.
I'm facing some tough challenges. Let me explain the situation:
We have an environment with approx. 16 TB of backed-up data (sum over all jobs in the datacenter, taken from storedsize in the database view dbo.ReportSessionsView) and a change rate of ca. 10 percent. We keep one restore point per day, 30 restore points in total. Everything should be copied off-site over a WAN link with, let's say, a bandwidth that can just handle the daily change rate when copying at full capacity for the whole one-day backup cycle (150-300 Mbit/s targeted). At the moment there are 23 backup jobs processing 511 VMs overall. Every backup job is linked to a separate backup copy job.
- Veeam B&R 9.5 U1 Enterprise Plus
- Mixed VMware environment: 5.0, 5.5, 6.0
- 23 backup jobs
- 23 backup copy jobs
- 511 VMs
- 16 TB of full backups (may be reduced by consolidating backup jobs and thus benefiting from more dedup)
- 10 percent change rate
- 1 WAN connection to process 1.6 TB within 24 hours
- 2 backup repositories on-site with 54.6 TB of capacity each (scale-out repository is not yet used)
- 2 backup copy repositories off-site with 54.6 TB of capacity each (scale-out repository is not yet used)
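As a quick sanity check of these figures, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 TB = 10^12 bytes), a link that is fully available to the copy traffic, and no further reduction from compression or WAN acceleration. The function names are just illustrative.

```python
# Back-of-the-envelope WAN sizing using the approximate figures from the list above.
FULL_TB = 16.0       # total size of the full backups (decimal TB assumed)
CHANGE_RATE = 0.10   # daily change rate
WINDOW_HOURS = 24    # one backup cycle


def required_mbit_per_s(data_tb: float, hours: float) -> float:
    """Sustained rate in Mbit/s needed to move data_tb terabytes within the given hours."""
    bits = data_tb * 1e12 * 8
    return bits / (hours * 3600) / 1e6


def transfer_hours(data_tb: float, link_mbit_per_s: float) -> float:
    """Hours needed to move data_tb terabytes over a fully used link."""
    bits = data_tb * 1e12 * 8
    return bits / (link_mbit_per_s * 1e6) / 3600


daily_tb = FULL_TB * CHANGE_RATE
print(f"daily increment: {daily_tb:.1f} TB")
print(f"rate needed for 24 h: {required_mbit_per_s(daily_tb, WINDOW_HOURS):.0f} Mbit/s")
for link in (150, 300):
    print(f"{link} Mbit/s link: {transfer_hours(daily_tb, link):.1f} h per daily increment")
```

Under these assumptions the daily increment needs a sustained rate of roughly 148 Mbit/s, so a 150 Mbit/s link would have to be busy almost around the clock, while 300 Mbit/s would leave about half of the day for merges and idle time.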
Goals to achieve:
- Decrease the number of backup jobs and copy jobs to reduce the load on the backup server and benefit more from dedup. The high number of jobs is due to the missing parallel processing feature prior to version 7. Remark: We have around 100 backup / backup copy jobs in total, because we also back up additional VMs in our plants (one backup job and one backup copy job per plant).
- Copy all incremental data off-site within a backup cycle of one day.
- A copy job only starts once the source backup job has finished, not when the first VM has finished processing, even if per-VM backup files are configured on the source repository.
- If we configure fewer backup jobs, a copy job will be idle until the first backup job has finished.
- Reverse incremental and forever forward incremental additionally extend the job duration because of the merge, either while the job is running (reverse) or at the end (forever forward).
- Weekly active fulls are only possible if we decrease the number of restore points, because of the large full backup size. The approach would be to run forward incremental with active fulls to avoid the merge, which shortens the job duration and lets the backup copy begin earlier, at the cost of additional capacity which we don't have (yet).
- When using only 2 copy jobs, the merge will take forever, and while the merge is active the WAN link is idle.
- We cannot use active GFS with a lower number of restore points in total because this would only help if we can copy with "Read the entire restore point from source backup..." option enabled. And this is not possible due to the limited WAN link.
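To put rough numbers on the capacity trade-off of weekly active fulls mentioned above, here is a small Python sketch. It is only an estimate under simplifying assumptions: every full is 16 TB, every increment is 1.6 TB, the data is spread over both on-site repositories, and exactly 30 restore points are kept (any extra points that might be retained until an old chain can be removed are ignored). The function names are illustrative.

```python
import math

FULL_TB = 16.0         # size of one full backup across all jobs (assumed constant)
INCREMENT_TB = 1.6     # size of one daily increment (10 percent change rate)
RESTORE_POINTS = 30    # retention
REPO_TB = 2 * 54.6     # combined capacity of both on-site repositories


def forever_forward_tb(points: int) -> float:
    """One full plus (points - 1) increments."""
    return FULL_TB + (points - 1) * INCREMENT_TB


def weekly_active_full_tb(points: int, days_between_fulls: int = 7) -> float:
    """One full per started week within retention, the rest increments."""
    fulls = math.ceil(points / days_between_fulls)
    return fulls * FULL_TB + (points - fulls) * INCREMENT_TB


print(f"forever forward:     {forever_forward_tb(RESTORE_POINTS):.1f} TB")
print(f"weekly active fulls: {weekly_active_full_tb(RESTORE_POINTS):.1f} TB")
print(f"on-site capacity:    {REPO_TB:.1f} TB")
```

With these assumptions a forever forward chain needs about 62.4 TB, while weekly active fulls need about 120 TB for the same 30 restore points, which is more than the 109.2 TB available on-site. That is why the active full approach would only work with fewer restore points or more capacity.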
Considerations so far:
- Configure two copy jobs (one for each on-site repository), use the current copy repository as an on-site staging repository and buy a NAS sized to hold the complete data of both copy jobs. Then copy with rsync instead of using backup copy jobs. Pros: No merge needed because both primary copy jobs have already merged all data; rsync should be able to copy only the delta changes to the off-site NAS. Cons: Unreliable; rsync isn't aware of running Veeam backup copy jobs, so rsync and backup copy jobs may overlap. I read "If rsync was any good, we would not have developed Backup Copy." -> I guess rsync wouldn't be a good option in any case, right?
- Configure two copy jobs (one for each on-site repository) that copy the data of all jobs to the current backup copy repository, but on-site, and use GFS with "Read the entire restore point from source backup..." if necessary to avoid the merge. Use two more backup copy jobs to copy the data off-site to a newly bought NAS over the WAN on even days, and an additional two copy jobs that copy the data over the WAN on odd days. In this scenario both copy pairs should have enough time to merge the 1.6 TB incremental backup once the maximum number of restore points is reached.
- Configure just two backup copy pairs (2 copy jobs x 2 repositories) that copy over the WAN, alternating daily, without an additional on-site copy repository. How would I configure alternating copy jobs, anyway? Just set cycle = two days? What if the backup cycles of both copy jobs start at the same time (e.g. at backup service start)? Is it even possible to configure alternating backup copy jobs?
- Until now we haven't used WAN acceleration because of the current number of backup copy jobs and the WAN acceleration limit of processing one task at a time. Would WAN acceleration help me in this situation as well?
Are there any suggestions how to configure the complete backup and backup copy process? What would you do if you were in my place?
Any recommendations, suggestions or other ideas welcome, thanks in advance.