jcofin13
Service Provider

health checks and best practice

Post by jcofin13 »

Simply put, I have Veeam jobs where I am wondering if it is best practice to enable the health check feature. I believe it is off by default.

The jobs in question are configured for 14 daily restore points with a synthetic full once per week. They run to a SOBR that is immutable.

In the case of a SOBR (immutable performance and capacity tiers), I understand the health check reads the performance tier directly, as of the last successful backup run, and checks the capacity tier via its metadata files.

If I have a customer with, for example, a job containing 40 TB of VMs that runs daily to a SOBR, should I enable this health check to run monthly (or more frequently)? Or would the check itself take too long due to the backup size and interfere with the job's runs on the following days?

Also, if the check runs once a month per the default schedule, would the capacity tier I/O costs shoot up due to this check? I realize it is only checking the metadata files and replacing blocks it might be missing, but I am trying to get an idea of how much I/O would increase, since that is an item most providers bill for. In this case, AWS S3.

If it finds corrupt data, I also understand that it can't "heal" that data, because it can't modify it (it's immutable).

I do remember that a long time ago the recommendation was forever forward incremental with regular SureBackup testing. With health checks, I'm just looking for the best practice and what to expect if they are enabled.


Also, if it finds corrupt data and can't heal it, I'm sure it must mark it in some way and error out. What then is the path to remedy this if the data is immutable on both the performance tier and the capacity tier? Would you need to start a new backup chain, and thus another full backup, and thus potentially use 2x the storage until the old chain ages out? If so, does Veeam track the old chain, age it out, and clean it up, or does it mark it as corrupt so that you have to go back and clean it up manually?

Apologies for all the questions. I just want to be clear on what to expect should I turn this on. I have a feeling that with backup jobs this large, any sort of error checking would take forever and maybe never complete.

I have more questions, but I'll start with these. I did read the guide on health checks, but it left me with these questions.
david.domask
Veeam Software

Re: health checks and best practice

Post by david.domask »

Hi jcofin13,

No worries about the number of questions; that's what the forums are for.

1. Health Check and Immutable Files.

You are correct: with immutable files, repair is not possible, so the remediation is to run an Active Full if there are indeed corrupted files.
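
To put rough numbers on the "2x the storage" concern from your question, here is a back-of-the-envelope sketch. All chain sizes below are made up purely for illustration:

```python
# Back-of-envelope for the "2x the storage" concern: after an Active
# Full, the old chain stays on disk until its restore points age out
# of retention, so peak usage is roughly old chain + new chain.
# All sizes below are made-up illustrative numbers.
full_tb = 30.0          # size of one full backup
incr_tb = 1.5           # average size of one daily incremental
daily_points = 14       # retention: 14 daily restore points

old_chain = full_tb + incr_tb * daily_points   # chain already on disk
peak = old_chain + full_tb                     # day the new Active Full lands
print(f"old chain: {old_chain:.1f} TB")        # 51.0 TB
print(f"peak after new full: {peak:.1f} TB")   # 81.0 TB
# Usage then drifts back toward a single chain as the old points
# expire out of retention over the following ~14 days.
```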

2. Costs on AWS

Yes, some costs will be associated with the health check, as AWS charges for API calls. Luckily, the health check for the capacity tier is a bit smarter than the normal health check: as you stated, it confirms that the metadata is correct and that all blocks are present; it does not do a full read of each block.
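
For a rough sense of scale on the API-call side, here is a minimal estimator. It assumes Veeam's default 1 MB block size, one S3 object per block (a simplification of how offloaded blocks are actually packed), and us-east-1 GET pricing; substitute your own region's rates:

```python
# Rough upper bound on S3 request cost for a capacity-tier check.
# Assumptions (not Veeam internals): 1 MB storage blocks, one object
# per block, and us-east-1 GET pricing of $0.0004 per 1,000 requests.
source_tb = 40
block_mb = 1
get_per_1000 = 0.0004

objects = source_tb * 1024 * 1024 / block_mb      # ~41.9M objects
worst_case = objects / 1000 * get_per_1000        # if every block were read

print(f"objects: {objects:,.0f}")
print(f"GET cost if every block were read: ${worst_case:,.2f}")
# Since the capacity-tier check validates metadata and block presence
# rather than reading every block back, the real request count (and
# any per-GB retrieval charge you would otherwise pay) is far lower.
```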

3. Health Check scheduling

You mention 40 TiB worth of backups, and I'm assuming this is spread across a number of machines, not just one or two machines with larger disks, correct? You might consider using a SureBackup job with the "Backup verification and content scan only" option enabled. Selecting this option runs only the content verification scan, a sort of health-check-lite: it performs the same integrity scan but does not do the repair; in your case, since the performance tier is immutable anyway, losing the repair is no loss. The big advantage of the SureBackup job is that you can spread the testing out over time by configuring the job to randomly test backups linked to it. Rather than processing all 40 TiB at once, which could make for a very long SureBackup session, it tests all machines in the backup over a few runs, breaking the work into more manageable sessions.
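
Conceptually, the scheduling behaves like the sketch below: verify a random subset of machines per run so the whole job is covered over several sessions. This is only an illustration of the idea, not Veeam's implementation:

```python
# Minimal sketch of spreading verification over several runs:
# shuffle once, then chunk, so every VM is verified exactly once
# across the cycle and no single session exceeds per_run machines.
import random

def plan_sessions(vms: list[str], per_run: int) -> list[list[str]]:
    pool = vms[:]
    random.shuffle(pool)
    return [pool[i:i + per_run] for i in range(0, len(pool), per_run)]

vms = [f"vm-{n:02d}" for n in range(1, 21)]   # 20 machines in the job
for night, batch in enumerate(plan_sessions(vms, per_run=5), start=1):
    print(f"run {night}: {batch}")            # 4 runs of 5 VMs each
```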
David Domask | Product Management: Principal Analyst
jcofin13
Service Provider

Re: health checks and best practice

Post by jcofin13 »

Thanks for the reply David.

Job VM size, as calculated in the job properties, is anywhere between 5 TB and 40 TB; it really depends on which backup job I'm looking at. Most of the jobs contain many smaller VMs, but a few have just a couple of large VMs, the largest being around 8 TB.

I guess that leads me to another question about space.

The backup job properties show the VM size if you go to the VMs area. I assume that is the size Veeam thinks all the VMs in the job are, regardless of the space actually used by the VMs. That makes sense.
If you go to Disks, right-click the job, and open Properties, you can see Total Size, and at the bottom of the properties a "Backup Size", which I assume is the stored size of all your backup data, after dedupe/compression, across all restore points. That makes sense.

If you go to Backups --> Capacity Tier --> <job name> properties for the same job, it shows the space used on the capacity tier, and I assume it fluctuates in size due to the operational restore window settings and GFS retention. I guess that makes sense.
But it also shows it in a format like this: Backup Size 20 TB (3.3 TB actual). What does that mean? Is it using 20 TB or 3.3 TB?
I want to say that if I add up all the (actual) numbers, it matches what the AWS portal says we are using, but if I add up the other number, labeled Backup Size, it doesn't appear to match. I'm just curious what "Backup Size" is versus "(actual)" with regard to the capacity tier properties?
david.domask
Veeam Software

Re: health checks and best practice

Post by david.domask »

Happy to help, jcofin13.

As for that calculation: yes, it's the actual storage usage on the target storage, and if I understood you correctly, the actual size matches your AWS portal usage, correct?

The actual value may vary depending on immutability and retention settings, but in general you can rely on it to track usage.
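
If you ever want to cross-check the "(actual)" figure against AWS yourselves, a minimal boto3 sketch like the one below works. The bucket name and prefix are hypothetical, and it assumes the prefix holds only this offload data. Note that listing tens of millions of objects itself incurs LIST request charges, so for very large buckets the free CloudWatch BucketSizeBytes metric is a cheaper cross-check:

```python
# Minimal sketch: sum object sizes in the offload bucket and compare
# with the "(X actual)" figure Veeam reports for the capacity tier.
# Bucket name and prefix are hypothetical; the comparison assumes the
# prefix contains only this offloaded data. Requires boto3 and AWS
# credentials with s3:ListBucket permission on the bucket.
import boto3

def bucket_bytes(bucket: str, prefix: str = "") -> int:
    s3 = boto3.client("s3")
    total = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            total += obj["Size"]
    return total

used = bucket_bytes("veeam-offload-bucket", prefix="Veeam/")
print(f"actual bytes in bucket: {used / 1024**4:.2f} TiB")
```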
David Domask | Product Management: Principal Analyst