Tom_LeFx
Enthusiast
Posts: 25
Liked: 1 time
Joined: Jan 13, 2023 6:50 pm

Health Check with Errors due to Backup-Job

Post by Tom_LeFx »

I recently checked my backup history and found the category "System -> Health Check".
I never got errors with my backup jobs, and so far my test restores seem OK as well - but I was startled to see that half of my jobs' health checks are marked there as "Failed".
When I click on them to investigate, I see that they all have the same error reason:

The health checks of two out of three backup jobs fail with this reason:


18.03.2023 22:00:25 Failed Failed to verify backup file Error: Stopped by job '[Name of Backup-Job]' (Backup)

The health checks of two out of three backup copy jobs fail with this reason:

17.03.2023 22:07:55 Failed Failed to verify backup file Error: Stopped by job '[Name of Backup Copy Job]' (Backup Copy) Agent failed to process method {Signature.FullRecheckBackup}.

The jobs are configured as follows, taking one of the failing ones as an example:

Daily backup at 22:00 on every day from Mo - Fr.
No termination when outside the backup window
Health check every Saturday at 22:00

Backup-Copy at "Any time (continuously)"
Health check for Copy on Friday, 22:00

One of the 3 VMs being backed up is a file server, and its synthetic fulls can take quite a while to create - the same goes for the offsite copies that follow.
A synthetic full of this one can take up to 7 hrs, and the offsite copy sometimes even a few hrs more.

Did I set things up wrongly for these errors to occur? What is the best timing for the health checks? They can take quite a while, because the fileserver vm is about 12 TB in size.
wesmrt
VeeaMVP
Posts: 29
Liked: 12 times
Joined: Aug 08, 2019 2:13 am
Full Name: Wesley Martins Silva

Re: Health Check with Errors due to Backup-Job

Post by wesmrt »

I suggest opening a support case because we need to look at the logs to be sure.

Reading your description: on which day is the synthetic full scheduled to run?
Maybe it is the synthetic full that is stopping the job's health check because, by default, the synthetic full runs on Saturdays.
Specialist Engineer @ Veeam Support
Blog: https://itproland.com.br (Brazilian Portuguese)
Tom_LeFx
Enthusiast
Posts: 25
Liked: 1 time
Joined: Jan 13, 2023 6:50 pm

Re: Health Check with Errors due to Backup-Job

Post by Tom_LeFx »

Thank you for your input. I'll try to open a ticket, but I'm only on Community Edition, so let's see about that.

I have a feeling that my setup for the timings of health checks and backup jobs is not ideal - but it wasn't fully clear to me from reading the documentation how to do it optimally.
What I don't fully understand is why the health check does not run on its own but is more or less a timer that gets checked when the backup job itself runs - yet the job itself seems to be able to interfere if the health check runs too long...

Maybe you can give me a hint here - what would be the best setup for the following job:

1 large VM (11 TB), 2 linux repos - one on-site, one off-site connected through VPN
GFS retention, 2 weekly, 6 monthly, 3 yearly
Keep 7 min. restore points
Scheduled to run from Mo - Fr, daily at 22:00
Incremental backup, create synth. full periodically on Saturday, 22:00
Perform Health-Check on Saturday 22:00

Secondary Location set, immediate copy mode
7 days min. retention
GFS scheme, 2 weekly, 3 monthly, 2 yearly
Health check on Friday, 22:00

I see that there are potential conflicts between the synth. full and the health check on the backup job - but again, I don't fully understand how the health check should be timed.
My idea is to do the health check with or right after the synth. full to make sure that one is good and also to speed it up, because according to docs and forums, the health check is the fastest on a closed full backup, instead of having to incorporate increments as well - right?
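
To make the suspected conflict concrete, here is a minimal sketch that treats each task as a weekly time window and reports which ones overlap a health check. The start times come from the job description above and the 7-hour synthetic full from the first post; every other duration is an assumption for illustration, as are the `Window` class and the `conflicts` helper. It does not query Veeam in any way.

```python
from dataclasses import dataclass

DAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]
HOURS_PER_WEEK = 7 * 24


@dataclass
class Window:
    name: str
    day: str           # weekday the task starts on
    start_hour: int    # hour of day, 0-23
    duration_h: float  # assumed run time in hours

    def span(self):
        """Return (start, end) in hours since Monday 00:00."""
        start = DAYS.index(self.day) * 24 + self.start_hour
        return start, start + self.duration_h


def overlaps(a, b):
    """True if two weekly windows intersect, including wrap past Sunday."""
    a0, a1 = a.span()
    b0, b1 = b.span()
    # Shift b by -1, 0 and +1 weeks so windows crossing the week boundary match.
    return any(a0 < b1 + k and b0 + k < a1
               for k in (-HOURS_PER_WEEK, 0, HOURS_PER_WEEK))


def conflicts(health_check, tasks):
    """Names of the tasks whose window overlaps the health check window."""
    return [t.name for t in tasks if overlaps(health_check, t)]


weekdays = ["Mo", "Tu", "We", "Th", "Fr"]

# Backup job: 1 h per incremental is assumed; 7 h synthetic full is from the thread.
backup_tasks = [Window(f"incremental {d}", d, 22, 1) for d in weekdays]
backup_tasks.append(Window("synthetic full", "Sa", 22, 7))
backup_hc = Window("backup health check", "Sa", 22, 10)  # 10 h is assumed

# Backup copy job in immediate mode: 2 h per copy is assumed.
copy_tasks = [Window(f"copy {d}", d, 22, 2) for d in weekdays]
copy_hc = Window("copy health check", "Fr", 22, 10)  # 10 h is assumed

print(conflicts(backup_hc, backup_tasks))  # ['synthetic full']
print(conflicts(copy_hc, copy_tasks))      # ['copy Fr']
```

With these numbers, both health checks start at the same moment as a task working on the same chain, which would match the "Stopped by job" errors in the first post.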
Amarokada
Service Provider
Posts: 92
Liked: 10 times
Joined: Jan 30, 2015 4:24 pm
Full Name: Rob Perry

Re: Health Check with Errors due to Backup-Job

Post by Amarokada »

I came across this as well today as a service provider. We have a lot of jobs from multi-tenant cloud environments (VCD), and since the upgrade to v12 the health check has a time of day which can be set (I suspect because it is now decoupled from the backup job itself completing?). Before, I believe the health check ran after the backup job naturally completed.

Because it defaults to starting at the same time as the backup job itself, could we perhaps have a script from Veeam that checks all jobs and configures the timing on the health check to run say 2 hours after the backup job starts?
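
The time arithmetic behind such a script could look like the sketch below. It only computes the shifted start time from a job's start time and does not read or change anything in Veeam; the `shifted_start` helper and the job names are hypothetical, and applying the result to real jobs would still have to go through whatever tooling Veeam provides.

```python
from datetime import datetime, timedelta


def shifted_start(job_start, offset_hours=2.0):
    """Return (HH:MM, day_shift) for a health check offset from the job start.

    day_shift is 1 when the offset pushes the start past midnight, so the
    health check weekday has to move along with the time.
    """
    base = datetime.strptime(job_start, "%H:%M")
    shifted = base + timedelta(hours=offset_hours)
    day_shift = (shifted.date() - base.date()).days
    return shifted.strftime("%H:%M"), day_shift


# Hypothetical job list: name -> configured start time of the backup job.
jobs = {"Tenant A": "20:00", "Tenant B": "22:00"}
plan = {name: shifted_start(start) for name, start in jobs.items()}
print(plan)  # {'Tenant A': ('22:00', 0), 'Tenant B': ('00:00', 1)}
```

The day shift matters for jobs that start late in the evening: a 22:00 job plus 2 hours lands on 00:00 of the following day.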
Tom_LeFx
Enthusiast
Posts: 25
Liked: 1 time
Joined: Jan 13, 2023 6:50 pm

Re: Health Check with Errors due to Backup-Job

Post by Tom_LeFx »

Aaaah - that is why I never had that issue before updating to v12.
Yes, I updated recently and have had these issues since then, because before, the health checks just ran with my jobs.
Tom_LeFx
Enthusiast
Posts: 25
Liked: 1 time
Joined: Jan 13, 2023 6:50 pm

Re: Health Check with Errors due to Backup-Job

Post by Tom_LeFx » 1 person likes this post

Update:
I figured things out myself - it was related to a change that came with updating from v11 to v12: since then, health checks run as separate tasks, which I appreciate. The update documentation could just mention that changes will be needed after updating.

I have now set up the jobs as follows:
* daily incremental backups
* synthetic full on Fridays
* health checks on Sunday at 12pm to give the synthetic full and its transfer to the offsite location by backup copy enough time to finish

Seems to work well now - so if anybody else encounters this, check your timings.
FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc

Re: Health Check with Errors due to Backup-Job

Post by FrancWest »

The new way health checks work in V12 is more of a hassle than it was before. Now you have to schedule the health checks in such a way that they have enough time to complete, or else they will be aborted by their backup (copy) job. Also, no notification is sent when the health check fails (this is a known bug).

The health checks start by default at 5 AM on the last Sunday of the month, so you have to reschedule them all. If you have many jobs, this makes it quite difficult, since for immediate mode backup copy jobs you have to switch back and forth to look up the time the source backup job starts and schedule accordingly.
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am

Re: Health Check with Errors due to Backup-Job

Post by mkretzer »

Why does a read-only operation like a health check have to "lock" a backup chain at all? Tape jobs do not lock a job (at least with the right registry setting in V12), why should health checks?
FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc

Re: Health Check with Errors due to Backup-Job

Post by FrancWest »

It’s not entirely read-only. If it detects corruption, it goes into retry mode to repair the backup chain.

https://helpcenter.veeam.com/docs/backu ... ml?ver=120
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am

Re: Health Check with Errors due to Backup-Job

Post by mkretzer »

First of all, I had corruption in the past and it never repaired anything (but that might be because of an old version). Also, why should functionality that is triggered 0.01% of the time be a reason to design it like this? If corruption is detected, it should alert the backup admin and fail the job that is working on the chain.
In the other 99.99% of cases it should be purely a background action.

I have a theory that many companies do not do health checks at all because of such limitations, when in fact, with a minor limitation of the functionality as described, health checks could be a no-brainer in the future.
FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc

Re: Health Check with Errors due to Backup-Job

Post by FrancWest » 1 person likes this post

I agree. The way it has changed in V12 is from giving the health check priority (the backup job won't start until the health check finishes) to giving the backup job priority by cancelling the health check. Since notifications for health checks are broken (not fixed in the latest patch), even more people will have no health checks, because they aren't aware that they are being cancelled all the time due to wrong scheduling.
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am

Re: Health Check with Errors due to Backup-Job

Post by mkretzer »

To be honest, the priority change is the one thing I like very much about this. Still, it should be 100% background - exactly like SOBR rebalance (which should theoretically be possible as well).
tyler.jurgens
Veeam Legend
Posts: 290
Liked: 128 times
Joined: Apr 11, 2023 1:18 pm
Full Name: Tyler Jurgens

Re: Health Check with Errors due to Backup-Job

Post by tyler.jurgens »

Health Checks probably lock the backups because retention policies could change that backup file. If you run another backup and it changes the backup chain that is being health checked, the health check would not be valid anymore.

Eg - 3 day retention policy. Monday (Full), Tuesday (Incremental), Wednesday (Incremental). Do a health check on Wednesday. The health check takes a long time and now you take another backup Thursday, changing that backup chain (Monday gets removed, you now have a full on Tuesday and incrementals after that). This is a simple scenario, but you could extrapolate it to larger jobs easily.
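
The scenario can be written out as a toy model. This is only a sketch of the example as described, assuming the oldest full is merged into the next restore point once the chain exceeds the retention count; the `add_restore_point` helper is made up for illustration and is not how Veeam implements retention internally.

```python
def add_restore_point(chain, day, retention):
    """Append an incremental, then merge the oldest points until the chain fits."""
    chain = chain + [(day, "Incremental")]
    while len(chain) > retention:
        # The oldest full is merged into the next point, which becomes the new full.
        chain = [(chain[1][0], "Full")] + chain[2:]
    return chain


# Chain as it looks when the health check starts on Wednesday.
before = [("Monday", "Full"), ("Tuesday", "Incremental"), ("Wednesday", "Incremental")]

# Thursday's backup runs while the health check is still working on the chain.
after = add_restore_point(before, "Thursday", retention=3)

print(after)
# [('Tuesday', 'Full'), ('Wednesday', 'Incremental'), ('Thursday', 'Incremental')]
```

In this model the Monday full that the health check started reading no longer exists after Thursday's run, so a result for the old chain would say nothing about the new one.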
Tyler Jurgens
Veeam Legend x2 | vExpert ** | VMCE | VCP 2020 | Tanzu Vanguard | VUG Canada Leader | VMUG Calgary Leader
Blog: https://explosive.cloud
Twitter: @Tyler_Jurgens BlueSky: @tylerjurgens.bsky.social
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am

Re: Health Check with Errors due to Backup-Job

Post by mkretzer »

In theory yes, but as I said, the same logic would apply to tape jobs.