Feature request: Backup jobs that exceed the backup window should IDLE not FAIL

Post by **Daniel2** » Feb 12, 2020 8:28 am

Backup jobs that exceed the backup window should IDLE until the next backup window, instead of just FAILING.

First example:

Imagine having an RPO of 24 hours, starting at 0:00.
You have a backup window defined from 0:00-06:00 and 18:00-23:59, due to work hours and performance impact.

At 0:00 the job starts, and if it is not done by 6:00, it fails. VMs that have not been processed by that time will not be backed up until 0:00 the following day. This means that the RPO is not met, because those VMs have 48 hours between their restore points.

Instead, the job should cancel all operations at 6:00, IDLE until 18:00, and then retry the cancelled VMs and resume the remaining backup operations in the job.
If the job is still not finished when the next backup interval starts (at 0:00), it FAILS like it normally would.
This is exactly how Backup Copy jobs already work.
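
As a rough sketch of the rule proposed above (all names are hypothetical and only model the idea, they are not part of any Veeam API): when a window closes, the job would IDLE if another window opens before the current interval ends, and FAIL otherwise.

```python
from datetime import datetime, time, timedelta

# Hypothetical model of the first example: two daily windows, given as
# (start, end) times of day.
WINDOWS = [(time(0, 0), time(6, 0)), (time(18, 0), time(23, 59))]


def next_window_start(now, windows):
    """Return the first window start that lies strictly after `now`."""
    candidates = []
    for start, _end in windows:
        candidate = datetime.combine(now.date(), start)
        if candidate <= now:
            # This window already started today, so look at tomorrow's.
            candidate += timedelta(days=1)
        candidates.append(candidate)
    return min(candidates)


def action_at_window_end(now, interval_end, windows):
    """IDLE if another window opens before the interval ends, else FAIL."""
    resume_at = next_window_start(now, windows)
    if resume_at < interval_end:
        return ("IDLE", resume_at)
    return ("FAIL", None)


# The date is arbitrary; the interval is the 24-hour RPO starting at 0:00.
interval_start = datetime(2020, 2, 12, 0, 0)
interval_end = interval_start + timedelta(hours=24)

# At 6:00 the 18:00 window is still ahead inside the interval: IDLE.
print(action_at_window_end(datetime(2020, 2, 12, 6, 0), interval_end, WINDOWS))
# At 23:59 the next window belongs to the next interval: FAIL.
print(action_at_window_end(datetime(2020, 2, 12, 23, 59), interval_end, WINDOWS))
```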

Second example:

RPO 4 hours.
The backup window is from 0:00-11:59 and 13:00-23:59.
The interval starts at 11:00; the job runs for an hour and FAILS, so the RPO will be missed.
If the job were to IDLE and resume at 13:00, it would still have until 14:59 to complete successfully.
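
The arithmetic of this second example can be checked with a few lines. The calendar date is an arbitrary assumption, and the window is treated as closing at 12:00 for simplicity.

```python
from datetime import datetime, timedelta

# Values from the second example; the date itself is arbitrary.
interval_start = datetime(2020, 2, 12, 11, 0)
rpo = timedelta(hours=4)
window_closes = datetime(2020, 2, 12, 12, 0)   # end of the 0:00-11:59 window
window_reopens = datetime(2020, 2, 12, 13, 0)  # start of the 13:00-23:59 window

# The restore point must exist before the next interval begins.
deadline = interval_start + rpo

time_before_close = window_closes - interval_start
time_after_reopen = deadline - window_reopens

print("deadline:", deadline.time())
print("usable time today (job fails at window end):", time_before_close)
print("extra time if the job idles and resumes:", time_after_reopen)
```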

Post by **Egor Yakovlev** » Feb 12, 2020 8:57 am

Hi Daniel!
I have a few thoughts here:
1. A Backup Copy Job works with passive secondary data (it copies backup files), whereas a Backup Job works with production data (a running production VM). If we keep it [idle] for 12 production hours, we will have to keep the backup snapshot on those VMs for 12 production hours, while they are under heavy load and most of the daily changes are generated. Snapshot growth over 12 hours and the increased I/O might be devastating for the production environment if we keep jobs [idle].
2. What if you set the backup window to 3 hours per day, say 12am to 3am? Should we [idle] until tomorrow if the job doesn't fit? You will miss the daily RPO anyway in this case. However, with the current [failed] status you will know you missed the RPO, whereas with [idle] you will miss it without being aware that the job finished a day later...

We have had the same feature request for physical backups with the agent, where it might be a slightly better idea because volume snapshots are used there. We will keep track of it.

/Cheers!

Post by **Daniel2** » Feb 12, 2020 10:17 am

Hi Egor,

Thanks for your points. I don't see any problems there, as there are very logical steps by which your concerns can be addressed. I believe my feature request handles this in a very natural way and is an easy concept to understand.

1. You could abort the VM backup process and remove the snapshot when you exceed the backup window; this is no change to the status quo. When the next backup window in the backup interval is entered, only the aborted VMs are retried, and the VMs that have not yet been backed up are processed. That would solve that problem. In other words, you would IDLE the job, not the individual VM backup process itself.

2. If there are no backup windows available until the next backup interval, you can fail the backup normally, like you do already in the status quo. This would be an expected failure as the backup admin decided to not define more backup windows during that interval.
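
To make the two points concrete, here is a minimal sketch of the job-level behaviour I have in mind. The class, state names and methods are purely hypothetical and do not reflect how the product is implemented.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # VM name -> "pending" | "running" | "done" | "aborted" (hypothetical states)
    vm_state: dict
    state: str = "RUNNING"

    def on_window_end(self, more_windows_in_interval: bool) -> None:
        # Point 1: abort in-flight VMs so their snapshots can be removed,
        # exactly as happens today when the window is exceeded.
        for vm, s in self.vm_state.items():
            if s == "running":
                self.vm_state[vm] = "aborted"
        unfinished = any(s != "done" for s in self.vm_state.values())
        if not unfinished:
            self.state = "SUCCESS"
        elif more_windows_in_interval:
            self.state = "IDLE"    # wait for the next window in this interval
        else:
            self.state = "FAILED"  # point 2: no window left, fail as today

    def on_window_start(self) -> list:
        """Return the VMs to process when the next window opens."""
        if self.state != "IDLE":
            return []
        self.state = "RUNNING"
        return [vm for vm, s in self.vm_state.items() if s in ("aborted", "pending")]


job = Job({"vm1": "done", "vm2": "running", "vm3": "pending"})
job.on_window_end(more_windows_in_interval=True)
print(job.state)              # job is idle, no snapshot is held
print(job.on_window_start())  # aborted and not-yet-started VMs are picked up
```

Only the job waits between windows; no VM keeps a snapshot open while the job is idle.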

I'm not sure I understand what you mean by the Agent, but installing an Agent on a VM to achieve this sounds overly complicated.

Post by **PetrM** » Feb 12, 2020 12:33 pm

Hi Daniel,

One more argument in favor of failing a job is that a consistent backup implies that the state of all VM disks corresponding to a specific point in time is fully transferred and saved into the backup file. If some external signal (user/backup window) aborts the job during data transfer, we will have an inconsistent backup.
If we remove the snapshot, we will lose this specific state, and the data processed so far will be useless anyway.

Thanks!

R&D Forums
