Estimate time for backup copy from seed?

Post by **pkelly_sts** » Oct 29, 2013 1:43 pm

I've been pulling my hair out for the past week trying to transition from running second backups to a DR site over to backup copy jobs, and I'm getting nowhere useful in any reasonable timeframe. So I'm wondering: is it just something I'm doing completely wrong, or does a first sync just take "forever"?

Scenario is:
Main site: 3 backups of around 1TB each to local storage, plus 3 similar backups (one slightly smaller because it excludes a few VMs) to a DR site over a 100Mb link.
Smaller site: 2 backups of around 500GB each to local storage and, again, two subset jobs sending around 400GB and 300GB to the same DR site over another 10Mb link.

This was mostly running OK, taking 2-3hrs for the main->DR site backups and 5hrs or so for the small->DR site backups.

I'll forget what I've tried so far as it seems unusable to me, so I'm about to walk out of the door and head to the DR site with a fresh copy of each local backup (last Friday's synthetic full plus a couple of incrementals).

My plan for today/tomorrow is simply to reinstate the site->DR site backup jobs, just so I get things going off-site again whilst I work out what I need to do differently to get backup copy jobs working. But given the sizes above, how long would one reasonably expect a first-run copy job to take? I'm not planning on deleting the WAN accelerators for now, so am I right to assume that these should still be useful?

Storage is all reasonable speed FC SAN (local copies of large files run @ 250MB/s or so for reference).
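As a rough back-of-envelope check on the question above, the sketch below estimates how long an unseeded first pass would take if the full data sets had to cross the WAN. The sizes and link speeds come from the scenario in this post; the function name `transfer_hours` and the 0.8 link-efficiency factor are assumptions for illustration only, and the model ignores compression, deduplication and WAN acceleration entirely.

```python
def transfer_hours(size_gb: float, link_mbit: float, efficiency: float = 0.8) -> float:
    """Hours to push size_gb (decimal GB) over a link of link_mbit Mbit/s.

    efficiency is an ASSUMED usable fraction of the nominal link rate;
    0.8 is a guess, not a measured value.
    """
    megabits = size_gb * 8000  # 1 GB = 8000 Mbit in decimal units
    seconds = megabits / (link_mbit * efficiency)
    return seconds / 3600


# Sizes and link speeds as stated in the post; everything else is assumed.
sites = {
    "main (3 x ~1TB over 100Mb)": (3 * 1000, 100),
    "small (~400GB + ~300GB over 10Mb)": (400 + 300, 10),
}

for name, (size_gb, link_mbit) in sites.items():
    hours = transfer_hours(size_gb, link_mbit)
    print(f"{name}: ~{hours:.0f} h (~{hours / 24:.1f} days) without a seed")
```

Under these assumptions the raw figures come out at roughly 3.5 days for the main site and about 8 days for the small site, which is the kind of gap that seeding from a physically transported copy is meant to close. A correctly mapped seed should only need to send the changes made since the seed was taken.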

Another thing I'm struggling with on copy jobs is establishing exactly where a job is in the process. I've seen jobs with a run-time of 11+hrs which have only copied 5GB of actual data. This job will have tied up the WAN accell for all that time, whilst another (of the two other) copy jobs could otherwise have been working?

After how excited I've been at getting this all working I've been bitterly disappointed with where I've got to so far...

Paul

Post by **pkelly_sts** » Oct 29, 2013 1:49 pm

For reference another post giving more detail of my previous config is at http://forums.veeam.com/viewtopic.php?f=24&t=18747

Post by **foggy** » Oct 29, 2013 1:55 pm

Paul, did you have a chance to seed your backup copy jobs? What is the primary bottleneck shown in the job stats?

pkelly_sts wrote:This job will have tied up the WAN accell for all that time, whilst another (of the two other) copy jobs could otherwise have been working?

Note that source WAN accelerator can process only one task at a time.

Post by **pkelly_sts** » Oct 29, 2013 2:03 pm

I've been trying to, but half the time it seems to ignore the seed & initiate a full fresh copy over the WAN.

I've stopped both copy jobs for now & both of them give a bottleneck of "Target WAN".

So, if I deploy multiple source wan-x, say 1 per job, they all use the same target wan-x/global cache but the multiple sources can send 3 lower-bandwidth jobs simultaneously?

I'm resetting the whole thing as I've got so frustrated with it, so I just want to find the most efficient way to start things up again once I've got the remote backups up to date after re-seeding from physical .vbk/.vib files today (which will likely take 6-8hrs to copy to disk at the remote end).

Post by **tsightler** » Oct 29, 2013 2:05 pm

You might want to try just using direct mode for backup copy jobs and get that working first. This should be very simple and should run at least as well as your previous method.

The initial sync using WAN acceleration can take quite a long time, especially if you don't have very fast disk for the global cache or you are using the same disks for the global cache as for the repository. You have a 100Mb link and it's unlikely that WAN acceleration will be a significant benefit on this link; it will probably make the transfer slower, although it will use much less bandwidth.

So my suggestion is to configure a Backup Copy using direct mode so that your performance will be at least as good as what you had before. Then you can decide to enable WAN acceleration, perhaps for the job on the 10Mb link, and see what happens. However, even there, because of the relatively low amount of data, direct mode may still be the best option.
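The trade-off described here can be illustrated with a deliberately simple model: direct mode is limited only by the link, while an accelerated transfer is limited by whichever is slower, the reduced data on the wire or the rate at which the accelerator and its cache disks can process data. This is not how Veeam computes anything; the function names, the 80% traffic reduction and the 5 MB/s processing rate are all hypothetical values chosen only to show the shape of the argument.

```python
def direct_hours(size_gb: float, link_mbit: float) -> float:
    """Hours to send size_gb (decimal GB) over link_mbit Mbit/s, no overhead."""
    return size_gb * 8000 / link_mbit / 3600


def accelerated_hours(size_gb: float, link_mbit: float,
                      reduction: float, processing_mb_s: float) -> float:
    """Toy model: the slower of 'reduced data on the wire' and
    'accelerator/cache processing of the full data' sets the pace.

    reduction and processing_mb_s are ASSUMED inputs, not Veeam figures.
    """
    wire = direct_hours(size_gb * (1 - reduction), link_mbit)
    processing = size_gb * 1000 / processing_mb_s / 3600
    return max(wire, processing)


changed_gb = 50  # hypothetical nightly change set
for link in (100, 10):
    d = direct_hours(changed_gb, link)
    a = accelerated_hours(changed_gb, link, reduction=0.8, processing_mb_s=5)
    print(f"{link}Mb link: direct {d:.1f} h, accelerated {a:.1f} h")
```

With these made-up numbers the accelerated job is slower than direct mode on the 100Mb link and faster on the 10Mb link, which matches the intuition in the post: acceleration pays off when the link, not the cache disk, is the constraint.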

Post by **foggy** » Oct 29, 2013 2:20 pm

pkelly_sts wrote:So, if I deploy multiple source wan-x, say 1 per job, they all use the same target wan-x/global cache but the multiple sources can send 3 lower-bandwidth jobs simultaneously?

Yes, with a small correction: each pair of accelerators will use its own cache and the weakest point in your setup (Target WAN) will be assigned triple load, so probably not the best option. It is worth trying what Tom has suggested first.
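To see why running three source accelerators against one target may not help, here is a small sketch that assumes the target WAN accelerator is the only bottleneck and has a fixed processing rate. It compares running jobs one after another with running them all at once while sharing that rate equally. The job sizes, the rate and both function names are hypothetical, and real behaviour (separate caches per accelerator pair, disk contention) is likely to be worse than this idealised sharing.

```python
def serial_finish(sizes, rate):
    """Finish time of each job when jobs run one at a time at full rate."""
    elapsed, finish = 0.0, []
    for size in sizes:
        elapsed += size / rate
        finish.append(elapsed)
    return finish


def shared_finish(sizes, rate):
    """Finish time of each job when all start together and split the
    bottleneck rate equally among the jobs still running."""
    finish = [0.0] * len(sizes)
    elapsed, done_per_job = 0.0, 0.0
    active = len(sizes)
    # Jobs complete in order of size; process them smallest first.
    for size, index in sorted((s, i) for i, s in enumerate(sizes)):
        # Each active job currently receives rate / active.
        elapsed += (size - done_per_job) * active / rate
        done_per_job = size
        finish[index] = elapsed
        active -= 1
    return finish


jobs_gb = [100, 200, 300]   # hypothetical amounts each job must process
target_rate = 10            # hypothetical GB per hour at the target
print("serial:", serial_finish(jobs_gb, target_rate))
print("shared:", shared_finish(jobs_gb, target_rate))
```

In this model the last job finishes at the same time either way, and the smaller jobs actually finish later when sharing, so parallel sources add load without shortening the overall window as long as the target remains the weakest point.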

Post by **pkelly_sts** » Oct 29, 2013 5:42 pm

OK, so what I think I'll do is first get site-to-site working as I know for definite that worked and isn't dependent on a single recovery point to start with.
I'll then create a backup copy of that backup, i.e. a copy from the DR site backup to another set of spindles on the same B&R repository server which will create the required single restore point. This should hopefully run pretty quickly, i.e. more-or-less local storage speed.

I'll then either create a new copy job to copy the primary site to DR site and map it to the newly created copy at the DR site, or edit the job in the previous step but tell it to use the primary site backups as a new source. This is where my understanding of the copy jobs starts to break down a bit as, reading through the docs, it implies that copy jobs will get their source data from wherever is the most efficient place, so it's a little unclear to me exactly how you tell it where to get the source data from?

If I get onto the copy jobs fairly quickly I can probably (just about) run both in parallel for a few days to make sure things are working as they should, before I abandon the site-to-site job.

Resource-wise for proxies/wan-x's, my thinking is that vSphere resources are woefully under-utilised at night so I'm happy to make good use out of the 3 x 6-core/8GB dedicated proxy VMs I've configured at each site. What I'm also unsure of as yet is how the cache side of things works at the source side. Using this strategy (if I do go down the wan-x route based on discussions above), should I run a job through one source proxy first, then copy any cache to the remaining two source proxies before using them?

Something else I noticed is that, although I configured the target proxy with the default 100GB, the target cache for the small (10Mb) site is up to 240GB in size and the target cache for the primary site is around 130GB. (What I mean is, both of these are target caches physically on the DR site server, which serves the primary & small site. God, this stuff gets complicated to type out!)

I could use the 3 target site proxies as wan-x's too but on the basis of the disk utilisation above it would be a pretty huge tie-up of disk resources x3 which is why I assumed I'd manage it with just the one.

Post by **foggy** » Oct 29, 2013 8:21 pm

pkelly_sts wrote:This is where my understanding of the copy jobs starts to break down a bit as, reading through the docs, it implies that copy jobs will get their source data from wherever is the most efficient place, so it's a little unclear to me exactly how you tell it where to get the source data from?

Paul, you specify the scope of source restore points while adding VMs to the backup copy job (either from virtual infrastructure, from particular backup jobs, or specific backup repositories).

Post by **veremin** » Oct 30, 2013 7:44 am

pkelly_sts wrote:OK, so what I think I'll do is first get site-to-site working as I know for definite that worked and isn't dependent on a single recovery point to start with.

Actually, there is no need to create an additional backup copy job, since you can map the backup copy job to the file produced by the site-to-site backup job. If you want to keep the site-to-site backup for some time, you can create a folder at the remote location, copy the file produced by the site-to-site backup job into it, add this folder as a backup repository, and then map the backup copy job to this file.

As mentioned, if traffic saving isn’t your major concern, the direct copy mode might be the best option.

Thanks.

Post by **pkelly_sts** » Oct 31, 2013 11:23 am

Ah, I think I've found at least one cause for my confusion. I've now got site-2-site backups up & running again, so in preparation for copy jobs (whilst I'm starting from scratch & have the disk space) I've created a copy job that copies FROM the remote site backup repository to another set of spindles on the same storage. I paid more attention when adding the VMs this time and, as foggy stated, you can specify the infrastructure, job or repository to use as the source for the copy job, and that's nice & clear.

However, if you then go in to EDIT the copy job, you have no way of knowing the source of the CURRENT VMs in the job list. So, once I have the first copy pass seeded from the DR site backup repository, if I then edit the copy job, remove all the VMs, then re-add them, this time pointing the source to the primary site BACKUP JOB then it should resync & all should be set?

Getting there slowly...

Post by **foggy** » Oct 31, 2013 11:50 am

pkelly_sts wrote:So, once I have the first copy pass seeded from the DR site backup repository, if I then edit the copy job, remove all the VMs, then re-add them, this time pointing the source to the primary site BACKUP JOB then it should resync & all should be set?

Yes, if the VMs are the same, the backup copy job should continue copying incremental changes for them to the target site.
