averylarry
Veteran
Posts: 264
Liked: 30 times
Joined: Mar 22, 2011 7:43 pm
Full Name: Ted

Replication enhancements

Post by averylarry » 1 person likes this post

Are there any plans to allow partially successful replications? Clearly I have no idea how difficult it would be; I suppose VMware snapshots will end up being the problem. Part of me thinks you guys have thought of this before, but then again . . . at the very least I'd like to know Veeam's thoughts.

Example:
I have a VM with a 50 GB OS drive, a 750 GB data drive, and a 1.4 TB "special use" drive. "Something" happens and the replication process has to basically start over using the existing replica as a seed (so pretty close to the same, but needing to recalculate digests). (On a side note -- it's sad how many times this happens in my environment, for so many varied reasons.)

My production and replica storage isn't slow, but neither is it anything to brag about. WAN speed is 10 Mb/s. If I don't bring the storage to the main site to resync, here's the reality of what usually happens at my site:

Replica run #1: Recalculate digests on disk 1: 1 hour. Send changed blocks (5 GB): 10 minutes. Calculate digests on disk 2: 15 hours. Cancel the job so I can get a backup done.

Replica run #2: Disk 1 -- send changed blocks (digests already done): up to 8 GB, 15 minutes. Disk 2 -- send changed blocks: 45 GB (this is now at least 2 days' worth), 4 hours. Calculate digests on disk 3: 28 hours. Cancel the job so I can get a backup done.

Replica run #3: Disk 1: 15 minutes. Disk 2: 120 GB (this is now a minimum of 4 days' worth), 6.5 hours. Disk 3: 220 GB, 15 hours.
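
For a rough sense of the arithmetic (back-of-the-envelope, and assuming the digest pass has to read every block of the disk), here's the effective read rate those digest times imply:

```python
# Back-of-the-envelope check: what read throughput do my digest times imply?
# Sizes and hours are the figures from the runs above; nothing else is real.

def implied_throughput_mb_s(size_gb: float, hours: float) -> float:
    """Average read rate (MB/s) needed to digest size_gb in the given hours."""
    return size_gb * 1024 / (hours * 3600)

for disk, size_gb, hours in [("disk 2", 750, 15), ("disk 3", 1433.6, 28)]:
    print(f"{disk}: ~{implied_throughput_mb_s(size_gb, hours):.1f} MB/s")
```

Both disks work out to roughly 14 MB/s, which seems low even for storage that's nothing to brag about.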



Some ideas:
1) Calculate digests on all disks before doing any replication. Do this on the replica side, without a snapshot on the source side, so the source VM isn't locked.
2) And before I say it, I'm sure there's a big VMware snapshot hurdle to figure out (or at least a retention policy exception to allow for). But allow a disk that finishes its changes to be preserved if the job fails after that disk is done.
3) The most difficult idea, I'd expect: even if a job fails during a disk sync, keep the changed blocks that were already processed so they don't have to be processed again (a rough sketch of what I mean follows this list). In the 3-run scenario above, many GB of data get sent 2 or 3 times for no good reason.
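
To make idea #3 a bit more concrete, here's a rough sketch of what I mean by keeping already-sent blocks across a failed run. This is purely conceptual -- the names and the checkpoint-file approach are all made up by me and have nothing to do with how Veeam actually implements replication:

```python
import json
import os

# Conceptual sketch only: remember which changed blocks already made it to the
# replica, so a re-run after a failure doesn't send them over the WAN again.
# Every name here is hypothetical; this is not Veeam's mechanism.

CHECKPOINT = "disk2.sent-blocks.json"

def load_checkpoint() -> set:
    """Block indices already confirmed as written to the replica."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return set(json.load(f))
    return set()

def save_checkpoint(sent: set) -> None:
    with open(CHECKPOINT, "w") as f:
        json.dump(sorted(sent), f)

def replicate(changed_blocks: dict, send_block) -> None:
    """Send only the blocks a previous (failed) run hasn't already delivered."""
    sent = load_checkpoint()
    try:
        for idx, data in changed_blocks.items():
            if idx in sent:
                continue            # already on the replica -- skip the WAN hop
            send_block(idx, data)   # hypothetical transport call
            sent.add(idx)
    finally:
        save_checkpoint(sent)       # keep the progress even if the job dies
```

The hard part, obviously, is keeping that checkpoint consistent with the VMware snapshot on the source side, which is exactly why I expect this to be the most difficult of the three ideas.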
veremin
Product Manager
Posts: 20284
Liked: 2258 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: Replication enhancements

Post by veremin »

Hi, Ted.

I have two questions regarding your replication scenario, as I'm a little bit confused about why digest calculation took so long to complete:

1) Do you have a proxy on the target side? Is it properly configured and actually being used?
2) Where is your replica metadata stored?

Anyway, thanks for the feedback; it's highly appreciated.

Thanks.
averylarry
Veteran
Posts: 264
Liked: 30 times
Joined: Mar 22, 2011 7:43 pm
Full Name: Ted

Re: Replication enhancements

Post by averylarry »

1) Yes (well, I'm pretty sure anyway).
2) On a proxy in the source environment.

Another thing that could help -- figuring out a way to allow (possibly optionally) a backup to run during a replication. You could show all sorts of warnings about performance, and limit it to 2 simultaneous jobs (so running on 2 separate snapshots). I suppose it could get really complicated with virtual appliance mode; maybe force the 2nd job to network mode.

Of course, there's also the often-requested "back up live once, then replicate/back up multiple times from the backup instead of from production".
veremin
Product Manager
Posts: 20284
Liked: 2258 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: Replication enhancements

Post by veremin »

averylarry wrote:
1) Yes (well, I'm pretty sure anyway).
2) On a proxy in the source environment.

The reason I asked is that, in the overwhelming majority of cases, digest calculation takes a long time because there is no proxy on the target side, or because the replica metadata is not stored on the source proxy.
averylarry wrote:
Another thing that could help -- figuring out a way to allow (possibly optionally) a backup to run during a replication.

As you know, that's not possible at the moment: a VM can be processed by only one job at a time. If another job wants to process the same VM, it is placed in a queue and waits for the first job to complete.
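
Conceptually, the behavior looks like the sketch below -- this is purely an illustration of the "one job per VM at a time, others wait" rule, not our actual scheduler code, and every name in it is hypothetical:

```python
import threading

# Illustration only: one job can hold a given VM at a time; any other job that
# wants the same VM simply waits until the first one finishes.

class VmJobGate:
    def __init__(self) -> None:
        self._locks: dict = {}                      # vm name -> lock

    def run(self, vm_name: str, job) -> None:
        lock = self._locks.setdefault(vm_name, threading.Lock())
        with lock:                                  # a second job queues up here
            job()                                   # e.g. backup or replication
```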

Once again, thanks for the heads-up. Feedback from end users is more than welcome, and it also helps our products become more and more efficient and convenient.

Thanks.
