ejenner
Veteran
Posts: 636
Liked: 100 times
Joined: Mar 23, 2018 4:43 pm
Full Name: EJ
Location: London
Contact:

Chained backup schedules broken by one job

Post by ejenner »

On advice from technical support, to help mitigate another issue, it was decided to chain a series of backups (approx. 8 jobs).

I noticed today that a job I disabled which is early in the chain prevented all later jobs from running. So between the time I disabled the job and today when I spotted the issue the jobs haven't run. The period is a little longer than is really ideal but not catastrophic.

I check daily for failed jobs but all the jobs which haven't been running were successful on their last pass so I didn't see them amongst the 100+ jobs I have defined. They were successful but a long time ago (relatively speaking).

I accept it is certainly my fault that it didn't occur to me at the time when I was disabling the job that the other jobs whose scheduling is dependent on it would not start if it didn't run. I was replacing a corrupted job so renamed it. Then created a new job with the same name to replace it.

So even though there was a replacement job with the same name as the old job the schedule for the following jobs had updated to show the name of the disabled job. I kept the old disabled job on the system in case I would have to restore from it.

I can think of many ways around this dilemma but I'll leave the floor open to suggestions, should this scenario be sufficiently likely to make a reworking worthwhile.
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Chained backup schedules broken by one job

Post by HannesK »

Hello,
ejenner wrote: "On advice from technical support to help mitigate another issue it was decided to chain a series of backups (approx. 8 jobs)."
Hmm, do you maybe have a case number for that? Because that's bad advice in general. Job chaining should only be used in corner cases.

Best regards,
Hannes

PS: for things like that, Veeam ONE has my favorite report... the "Protected VMs" report

Re: Chained backup schedules broken by one job

Post by ejenner »

I'm not sure it was bad advice, as it seems to have kept that set of backups pretty stable over the last few months. At a high level it was essentially a resource conflict which was eventually leading to corruption. Chaining them stopped the individual jobs competing; they never run concurrently now.

Back on topic. You don't have to have a report to be able to find the problem. You can arrange the columns in B&R console to show when the last backup completed successfully. As I say, these jobs weren't failing, they were successful. It's just the case that the job they all depended on was no longer running so they weren't running either.
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Chained backup schedules broken by one job

Post by foggy »

ejenner wrote: "Chaining them stopped the individual jobs competing, they never run concurrently now."
Automatic job scheduling allows for more optimal resource usage.
ejenner wrote: "You don't have to have a report to be able to find the problem."
What Hannes means is that this report lets you see immediately if some VM backups are not occurring as a result of the behavior you've described.

Re: Chained backup schedules broken by one job

Post by ejenner »

I think I've found this sort of response on here previously where Veeam say you can do it another way so that's fine and nothing has to be done.

The problem with this attitude is that all Veeam really does at the end of the day is copy files from one place to another. So that answer could be the default answer to every suggestion for software improvement / enhancement.

I've described a situation where the design of the software has caused me to mistakenly not back up some of our data for a few backup cycles, and I've explained how I check for failed jobs each day but missed these jobs as they weren't failures. The software should facilitate the successful completion of backup jobs, and if it doesn't then I might as well use Robocopy instead, i.e. the software is a convenience tool to help get the work done. So the approach to thinking about new features should be to consider whether or not the proposed enhancement can help customers to complete their backups more successfully.

I can help with some suggestions on how to make the improvement:

1. When a job in a chained schedule changes name, the chain should remember the name of the old job instead of trying to follow the renamed job. That would cover my scenario, where a backup job became corrupted so I cloned it, then renamed and disabled the old job. The new job with the same name replaced the corrupted job.

2. If you disable a chained job, present the operator with a warning that the jobs which depend on it won't run.

3. If the chain is broken change the last result to 'Failed' on the basis that the broken schedule means these jobs can never run until the schedule is fixed. This would be complicated but it's another way of doing it.

4. Also complicated but if there were some form of atypical behavior detection in the software it could warn you if a job hasn't completed for more than 5 days when normally it would run every day.

5. If a job in a chained schedule is disabled run the subsequent dependent jobs anyway. i.e. continue the schedule. Maybe make this optional in the schedule settings.
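
To make suggestion 4 a bit more concrete, here is a minimal sketch of the kind of check I mean. It is only an illustration and does not use any real Veeam API: the `JobStatus` fields, the `find_stale_jobs` function and the `tolerance` factor are all hypothetical, and the data would have to come from whatever job history you can export. The point is that the check looks at the age of the last successful run rather than at the last result, so a job that was "Success" a long time ago still gets flagged.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class JobStatus:
    # All fields are hypothetical; they stand in for exported job history.
    name: str
    last_result: str              # e.g. "Success" or "Failed"
    last_success: datetime        # end time of the last successful run
    expected_interval: timedelta  # how often the job normally runs


def find_stale_jobs(jobs, now, tolerance=5.0):
    """Return jobs whose last success is older than tolerance * expected_interval.

    The last result is deliberately ignored: a job that succeeded long ago
    and has not run since is exactly the case a "failed jobs" check misses.
    """
    stale = []
    for job in jobs:
        age = now - job.last_success
        if age > job.expected_interval * tolerance:
            stale.append(job)
    # Oldest last success first, so the worst offenders are at the top.
    return sorted(stale, key=lambda j: j.last_success)


if __name__ == "__main__":
    now = datetime(2020, 1, 10)
    jobs = [
        JobStatus("Job A", "Success", datetime(2020, 1, 9), timedelta(days=1)),
        JobStatus("Job B", "Success", datetime(2020, 1, 1), timedelta(days=1)),
    ]
    for job in find_stale_jobs(jobs, now):
        print(f"{job.name}: last success {job.last_success:%Y-%m-%d}")
```

With a tolerance of 5 and a daily job, this corresponds to the "hasn't completed for more than 5 days" threshold in suggestion 4.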

Re: Chained backup schedules broken by one job

Post by ejenner »

Oh... and the answer isn't to stop using chained backups. The feature is in the software so if it shouldn't be used it should be removed.

Re: Chained backup schedules broken by one job

Post by HannesK »

Hello,

1. Doing anything based on names would be dangerous and uncomfortable for many scenarios. Imagine someone renames the job and then the chain breaks. So following the original job is the desired behavior. I would compare it to user management (Active Directory, for example), where a user has a unique ID. The user name can change (marriage, whatever), but the person is still the same.

2. Well, it's a fine line between showing the user "useful" messages and "flooding" them with "annoying" ones.

3. What defines "broken"? If a user deactivates a job in a chain, then the general expectation of the software is that this configuration was done intentionally.

4. We have Veeam ONE reports for that today. There are also alarms for disabled jobs.

5. Imagine a physical chain with 10m length. If you break it in the middle: would you expect that the chain has 5m now or 10m? :-)
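
The ID-based behavior from point 1, and its effect in the scenario from the first post, can be sketched as follows. This is a simplified model under stated assumptions, not the product's actual data model: the `Job` class, the `runs_after` field and the `blocked_jobs` function are hypothetical names used only for illustration. Each job references its predecessor by ID, so renaming the old job does not change what the dependent jobs point to, and a new job with the old name gets a new ID that nothing depends on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    # Hypothetical model: a chained job stores the ID of its predecessor.
    job_id: str
    name: str
    enabled: bool = True
    runs_after: Optional[str] = None  # job_id of the predecessor, if chained


def blocked_jobs(jobs):
    """Return names of enabled jobs that cannot start because a job
    earlier in their chain is disabled or missing."""
    by_id = {j.job_id: j for j in jobs}
    blocked = []
    for job in jobs:
        if not job.enabled:
            continue  # a disabled job is not "blocked", it is simply off
        seen = set()  # guards against accidental cycles in the chain
        current = job
        while current.runs_after is not None and current.job_id not in seen:
            seen.add(current.job_id)
            parent = by_id.get(current.runs_after)
            if parent is None or not parent.enabled:
                blocked.append(job.name)
                break
            current = parent
    return blocked


if __name__ == "__main__":
    jobs = [
        Job("id-1", "Job A (old)", enabled=False),  # renamed and disabled
        Job("id-9", "Job A"),                        # replacement, new ID
        Job("id-2", "Job B", runs_after="id-1"),     # still follows id-1
        Job("id-3", "Job C", runs_after="id-2"),
    ]
    print(blocked_jobs(jobs))
```

In this model the replacement "Job A" runs on its own schedule, while "Job B" and "Job C" are reported as blocked because their chain still leads back to the disabled original. A check like this is one way such a situation could be surfaced.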

A feature that is used for a few corner cases cannot be removed easily. With 450k customers, that would hit a few thousand customers even if only 1% are using that feature.

Best regards,
Hannes