Host-based backup of VMware vSphere VMs.
FrancWest
Veteran
Posts: 550
Liked: 113 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Replication job being aborted by backup job

Post by FrancWest »

Hi,

Case # 03512376. It appears that the merge operation of a backup job aborts a running replication job that uses the same backup files (we replicate from a repository instead of from live VMs). In my case, a replication job had already been running for 18 hours (very large VM); once the backup job started and needed to do a merge (due to the retention policy), it aborted the replication job. Backup jobs seem to have priority over replication jobs.

For the replication to succeed, I had to disable the backup job during the replication so that a new backup session wouldn't abort it. Support says this is by design.

I have a suggestion, though: if the backup job detects that the same backup files are currently in use by a replication job, why not postpone the merge to one of the next backup sessions instead of aborting the replication?

Franc.
HannesK
Product Manager
Posts: 15598
Liked: 3445 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Replication job being aborted by backup job

Post by HannesK »

Hello,
I understand that the situation does not satisfy your needs, but postponing merges is more complicated than it first sounds.

The longer a merge is postponed, the longer it takes to finish. And what if somebody reschedules the jobs? In the end, nothing works anymore. An 18h merge looks more like a (hardware) design issue... If ReFS is possible in your environment, I can recommend it. Or use forward incremental with active fulls to avoid merges altogether.

Best regards,
Hannes
FrancWest
Veteran
Posts: 550
Liked: 113 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Replication job being aborted by backup job

Post by FrancWest »

Hi,

We are already using ReFS. The merge is not the issue; it's the replication that takes so long that it crosses two backup windows.
HannesK
Product Manager
Posts: 15598
Liked: 3445 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Replication job being aborted by backup job

Post by HannesK »

Ah sorry, I misread that. If I understood the case correctly, you are using forever forward incremental. Synthetic fulls should resolve the merge issue, but that is quite hard to test in my lab, as I cannot create a synthetic full on the same day and the throttling options in my current setup are limited...
FrancWest
Veteran
Posts: 550
Liked: 113 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Replication job being aborted by backup job

Post by FrancWest »

But won't synthetic fulls also cause the replication job to abort, since they use the same files the replication job is using?
richkovach
Novice
Posts: 3
Liked: never
Joined: Mar 22, 2019 12:36 am
Contact:

Re: Replication job being aborted by backup job

Post by richkovach »

Hi,

I happen to be troubleshooting a similar issue, but with the merge from a backup job interrupting our backup copy jobs. While I wait for support to reproduce it (they have been having trouble doing so), I have been trying different things to see if I can correlate the issue with something.

Out of curiosity, do you have BitLocker enabled on any of the volumes used by your backup repositories? No idea yet if it is related; it just happens to be the next thing I am testing, and I was curious whether a similar-sounding issue might share that data point.

Thanks
csydas
Expert
Posts: 193
Liked: 47 times
Joined: Jan 16, 2018 5:14 pm
Full Name: Harvey Carel
Contact:

Re: Replication job being aborted by backup job

Post by csydas »

Hi Franc,

I think the idea behind the synthetic full is that it avoids the potentially slow processing that can happen with a forever forward backup. We saw this a long while back, when absolutely everything with our forever forward chain was awful -- slow merges, slow FLRs, everything. We ended up enabling compact fulls, and after an awful week (yes, week) of compacting, it was like (well, literally) a brand new backup.

ReFS naturally becomes fragmented after a while, and forever forward fragmentation only makes it worse. I don't think you can solve ReFS fragmentation, but you can avoid the backup file fragmentation with synthetic fulls.
FrancWest
Veteran
Posts: 550
Liked: 113 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Replication job being aborted by backup job

Post by FrancWest »

Hi,

It's replicating from a backup job with a retention of 30 restore points. Would weekly synthetic fulls make that much of a difference?
foggy
Veeam Software
Posts: 21182
Liked: 2163 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Replication job being aborted by backup job

Post by foggy »

The issue is that if the replication job needs the same files that are being modified by the already running backup job, the former is terminated (backup jobs do have a higher priority). Synthetic fulls might prevent this from happening, since data is only read from the older part of the chain and the files themselves are not touched (provided they are not subject to backup job retention).
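
The rule described above can be illustrated with a toy model. This is only a sketch of the reasoning, not how Veeam implements it: the file names, the `BackupOperation` class and `replication_survives` are made up for illustration. It assumes a replication that reads every file in the chain and treats any overlap with files that a backup operation modifies or deletes as a conflict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupOperation:
    """Hypothetical description of what a backup job does to the chain."""
    name: str
    touched_files: frozenset  # files the operation modifies or deletes


def replication_survives(files_read: frozenset, op: BackupOperation) -> bool:
    """True if the operation touches none of the files the replication reads."""
    return files_read.isdisjoint(op.touched_files)


# Illustrative forever forward chain: one full plus three increments.
chain = frozenset({"full.vbk", "inc1.vib", "inc2.vib", "inc3.vib"})

# A merge rewrites the full and removes the oldest increment.
merge = BackupOperation("merge", frozenset({"full.vbk", "inc1.vib"}))

# A synthetic full only reads the old chain and writes a new file.
synthetic_full = BackupOperation("synthetic full", frozenset({"full_new.vbk"}))

for op in (merge, synthetic_full):
    outcome = "keeps running" if replication_survives(chain, op) else "is terminated"
    print(f"{op.name}: replication {outcome}")
```

In this model a retention cleanup that deletes old files would be just another operation whose `touched_files` may or may not overlap with what the replication is reading.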
FrancWest
Veteran
Posts: 550
Liked: 113 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Replication job being aborted by backup job

Post by FrancWest »

Sorry, but I still don't see why a synthetic full would help with the issue we are having. In both cases the retention policy is applied, so the files are locked and the replication job is aborted. The current backup job runs a merge every day, since the 30-day retention policy has already been reached.

For now, the one solution I see is pre- and post-job scripts that disable and re-enable the backup job whose files are being used by the replication job.
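
A rough sketch of such a pre/post script, written in Python as a thin wrapper around PowerShell. It assumes that the Veeam PowerShell cmdlets `Get-VBRJob`, `Disable-VBRJob` and `Enable-VBRJob` are available in the session on the backup server and that `powershell.exe` is on the PATH. The job name "Backup Job 01" and the helper names are placeholders, so treat this as untested and adapt it to your environment.

```python
import subprocess


def build_command(job_name: str, enable: bool) -> list:
    """Build the PowerShell call that enables or disables one backup job."""
    cmdlet = "Enable-VBRJob" if enable else "Disable-VBRJob"
    # Single quotes inside a single-quoted PowerShell string are doubled.
    quoted = job_name.replace("'", "''")
    script = f"Get-VBRJob -Name '{quoted}' | {cmdlet}"
    return ["powershell.exe", "-NoProfile", "-Command", script]


def set_job_state(job_name: str, enable: bool) -> None:
    """Run the command; raises CalledProcessError if PowerShell fails."""
    subprocess.run(build_command(job_name, enable), check=True)


# Show what would be executed, without actually running anything.
print(build_command("Backup Job 01", enable=False)[-1])
```

The pre-job script of the replication job would call `set_job_state("Backup Job 01", enable=False)` and the post-job script `set_job_state("Backup Job 01", enable=True)`. It is worth checking that the post-job script also runs when the replication fails, otherwise the backup job could stay disabled.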
HannesK
Product Manager
Posts: 15598
Liked: 3445 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Replication job being aborted by backup job

Post by HannesK »

"I still don't see why a synthetic full will be of benefit with the issue we are having"
Because a synthetic full does not change the existing backup chain. It only creates a new synthetic full and leaves the old restore points in place.

I tested it successfully in my lab with the following scenario:
- I had an old forever forward chain with 3 RPs
- I changed that from forever forward to forward incremental with a synthetic full scheduled for today
- I started a replication job throttled to 1 Mbit/s to simulate a long-running replication process
- Then I started a backup on the source job
- The synthetic full was created successfully and the replication job continued running :-)

Then I changed back to forever forward mode:
- I ran three backups (merges started at the end)
- The replication job stopped with "stopped by backup-job-name"

Keep in mind that backup chains with synthetic fulls can become longer than expected, so I recommend simulating first with the restore point simulator.
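
To get a feel for why the chain can grow beyond the configured retention, here is a small simulation. It is a simplified model and not a replacement for the restore point simulator: it assumes one backup per day, a synthetic full every 7th day, and that the oldest full with its increments is only deleted once the newer restore points alone satisfy the retention. The function name and parameters are made up for this sketch.

```python
def simulate(days: int, retention: int, full_every: int):
    """Return (day, restore points on disk, old chain deleted?) per day."""
    chains = []   # each entry = number of restore points in one chain
    history = []
    for day in range(days):
        if day % full_every == 0:
            chains.append(1)      # new synthetic full starts a new chain
        else:
            chains[-1] += 1       # increment is appended to the active chain
        deleted = False
        # Drop the oldest chain once the newer chains alone meet retention.
        while len(chains) > 1 and sum(chains) - chains[0] >= retention:
            chains.pop(0)
            deleted = True
        history.append((day, sum(chains), deleted))
    return history


history = simulate(days=60, retention=30, full_every=7)
peak = max(points for _, points, _ in history)
deletion_days = [day for day, _, deleted in history if deleted]
print("peak restore points on disk:", peak)
print("days on which an old chain was deleted:", deletion_days)
```

Under these assumptions the model keeps up to 36 restore points on disk for a retention of 30, and an old chain is only removed once a week.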
FrancWest
Veteran
Posts: 550
Liked: 113 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Replication job being aborted by backup job

Post by FrancWest »

Hi,

Thanks! The simulator is especially nice; I didn't know it existed ;-)

An additional question, though: with synthetic fulls enabled, will there be a point in time (e.g. when the retention period is reached) at which it must do a merge and thus lock the files? Or does it leave all the incrementals on disk forever?
HannesK
Product Manager
Posts: 15598
Liked: 3445 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Replication job being aborted by backup job

Post by HannesK »

The "tail" chain will stay until the retention is reached. In your setup that is around 30 days, and then it will be deleted.

Yes, it locks the chain, but as your replication job starts before the backup job, that's not a problem. It deletes the "inactive part" of the chain after one week (if you have one synthetic full per week), so your replication should finish before the next deletion of the chain. You can simulate every day in the restore point simulator; there is a small "manual run" checkbox.
FrancWest
Veteran
Posts: 550
Liked: 113 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Replication job being aborted by backup job

Post by FrancWest »

Ah, that explains it. Thanks!