DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Health check on large backups

Post by DonZoomik »

Can someone from Veeam comment on async read tuning opportunities? A larger read-ahead window could work around eventual fragmentation and allow for faster health checks on capable hardware.
mkh
Service Provider
Posts: 64
Liked: 18 times
Joined: Apr 20, 2018 6:17 am
Full Name: Michael Høyer
Contact:

Re: Health check on large backups

Post by mkh »

perjonsson1960 wrote: May 26, 2021 6:03 am I am not sure if it is possible to convert without losing all the data? I have googled a little, and some say that it is possible, and some that it is not. And isn't RAID 6 slower than RAID 5 due to the parity calculations in all writes?
Sadly, I can't answer either of these questions for you; it depends on what your RAID card can do.
In my experience, with a capable enough RAID card there is little speed difference between RAID 5 and RAID 6, since it has to calculate parity on RAID 5 anyway.
In the end, in my view there is no choice, given the high risk of not being able to rebuild the array after a failed disk. A bit slower beats the risk of total failure.
perjonsson1960
Veteran
Posts: 463
Liked: 47 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Health check on large backups

Post by perjonsson1960 »

We have actually replaced one failed disk already, maybe a year ago. The spare disk kicked in, and when the failed disk was replaced, it was rebuilt successfully. I don't remember how long it took, but there were no complications.
18436572
Influencer
Posts: 20
Liked: 8 times
Joined: Jul 25, 2017 6:52 pm
Full Name: Devin Meade
Contact:

Re: Health check on large backups

Post by 18436572 » 2 people like this post

We had these issues with Veeam over a year ago. It was our main 14TB file share for the company. I decided to do Windows DFS and split the data up into 5 file servers. Best decision ever! Now our backups/copies/replicas are easy as pie. We also re-did our project folders to put each year into one folder \\server\share\project\year with one virtual server per year. After 5 years the projects go inactive... That's a major change and it took a few months to accomplish. It was definitely worth it!
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Health check on large backups

Post by DonZoomik »

I was observing Veeam behavior today and it seems to me that there is no read-ahead at all during health check.

A few days ago I was evacuating a SOBR extent and noticed that almost all IO was attributed to the System process, which means that Windows itself was doing read-ahead. However, this repository was very fragmented and I was seeing wildly fluctuating throughput from a low-end NAS. During high fragmentation QD went up (with throughput dropping as low as 50 MB/s), while low fragmentation saw QD drop (with throughput rising to 500-600 MB/s).
Today I observed a health check and it was barely doing ~100 MB/s on a quite high-end system (HPE Apollo 4200, P816 hardware RAID, 24*16TB disks in RAID60, SSD cache) that has been in operation for only two weeks (so low fragmentation). QD was around one, with all IO attributed to the Veeam process itself. I bet that if I checked handle attributes, the file was opened with flags that disable read-ahead. This chain is also reverse incremental, so it is my understanding that the health check largely just reads the VBK file from start to end.

So is there no read-ahead at all, or is Veeam's async read worse than Windows read-ahead?
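
To illustrate the queue-depth point in general terms, here is a minimal Python sketch. It is not Veeam's code and says nothing about how its async engine works; the names `read_block` and `checksum_with_queue_depth` and the 1 MiB block size are hypothetical. It hashes a file strictly in order while keeping up to `queue_depth` block reads in flight, so `queue_depth=1` behaves like the QD ~1 pattern described above and larger values behave like an application-level read-ahead window.

```python
import hashlib
import os
from collections import deque
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1024 * 1024  # hypothetical block size (1 MiB), chosen for illustration


def read_block(path, offset, size):
    """Read one block at the given offset using its own file handle."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def checksum_with_queue_depth(path, queue_depth, block_size=BLOCK_SIZE):
    """Hash a file in order while keeping up to queue_depth reads outstanding."""
    digest = hashlib.sha256()
    pending = deque()  # futures for blocks, oldest (lowest offset) first
    with ThreadPoolExecutor(max_workers=queue_depth) as pool:
        for offset in range(0, os.path.getsize(path), block_size):
            pending.append(pool.submit(read_block, path, offset, block_size))
            # Once the window is full, consume the oldest block before
            # issuing another read, so memory use stays bounded.
            if len(pending) >= queue_depth:
                digest.update(pending.popleft().result())
        # Drain whatever is still in flight.
        while pending:
            digest.update(pending.popleft().result())
    return digest.hexdigest()
```

Both settings produce the same checksum; only the number of outstanding reads differs. Whether a deeper window actually helps depends on the storage and on what the operating system's own caching and read-ahead are already doing.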
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev »

Looks like you found a bug and async read is not initialized for health check.
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Health check on large backups

Post by DonZoomik »

Umm... does that mean that the bug is a fact or just a theory? Should I push through support or can you confirm through QA/RD?

If it's a theory then I think v10 behaved the same. I don't have a similar v10 system to check, but I pointed out similar behavior in another thread.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev » 3 people like this post

It's a fact. The dev behind the async engine has already reviewed the source code for me and confirmed that the health check job does not initialize the required parameter. So there is no need to push this through support, as the bug has already been logged.

V10 did not have the async engine at all (outside of virtual full to tape where we piloted this functionality) so yeah V11 behaves the same as V10 in that sense.
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Health check on large backups

Post by DonZoomik »

Well that's good news (as in the situation isn't hopeless).
Sounds like a minor thing that could get fixed in one of the now quite frequent patches, within a reasonable timeframe (months)?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev »

That's my hope too.
pirx
Veteran
Posts: 573
Liked: 75 times
Joined: Dec 20, 2015 6:24 pm
Contact:

Re: Health check on large backups

Post by pirx »

+1. Today my first health check of a single 8 TB VM finished. It took ~8 h (v10); in iotop I saw the Veeam Agent reading the data at ~300 MB/s. Our regular jobs are 20-40 TB, so this will take more than one day.

I'm still looking into SureBackup, but it's also not that fast, depending on the tests you run. And a simple boot + ping will not detect all possible bad blocks, right?
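
As a rough cross-check of the numbers in this post, the arithmetic can be sketched in Python. The function name `estimate_hours` is hypothetical, and the sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a constant read rate, neither of which is stated in the post.

```python
def estimate_hours(size_tb, throughput_mb_s):
    """Hours needed to read size_tb terabytes at throughput_mb_s megabytes/second."""
    size_bytes = size_tb * 10**12                # assumption: decimal terabytes
    bytes_per_second = throughput_mb_s * 10**6   # assumption: decimal megabytes
    return size_bytes / bytes_per_second / 3600


for size_tb in (8, 20, 40):
    hours = estimate_hours(size_tb, 300)
    print(f"{size_tb} TB at 300 MB/s: about {hours:.1f} h")
```

Under these assumptions 8 TB comes out at roughly 7.4 h, which is consistent with the ~8 h observed, and 40 TB at roughly 37 h, so the upper end of the 20-40 TB range does exceed a day.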
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev »

Yes, you are right.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev »

@pirx however, I should mention that SureBackup job also has an option to check all blocks. See the first checkbox on this step of the wizard. Thanks!
pirx
Veteran
Posts: 573
Liked: 75 times
Joined: Dec 20, 2015 6:24 pm
Contact:

Re: Health check on large backups

Post by pirx »

@Gostev yeah, I've seen this and even used it in my test job (it also took ages, but I guess there's not much that can be changed there). It would still be nice to have a tool where I can randomly check single VMs. Something like right click -> check, or Validator for Linux.

It would be perfect to be able to check everything, but that just doesn't seem possible. So my idea would be to randomly pick a number of VMs and have them checked. That's not really possible with SureBackup, right? Instead of checking a whole job once a month, which then runs very long and violates the RPO, let some VMs from different jobs be checked every day.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev »

This is a good idea.
pirx
Veteran
Posts: 573
Liked: 75 times
Joined: Dec 20, 2015 6:24 pm
Contact:

Re: Health check on large backups

Post by pirx »

Can I count that as a yes for a feature request? A very high-level description would be: let Veeam check a random set of x VMs each day, where a VM is rechecked only after all other VMs have been checked. So every VM (for which this check is activated) will be checked at some point. Maybe every 2 weeks, maybe only every 2 months. A priority list could hold VMs that should be checked daily/weekly.
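
The rotation described in this request could be sketched as follows. This is only an illustration of the selection logic in Python, not an existing Veeam feature or API; `plan_daily_checks`, its parameters, and the in-memory `checked` set are all hypothetical, and a real tool would have to persist the state between days.

```python
import random


def plan_daily_checks(vms, per_day, checked, priority=(), rng=random):
    """Pick today's VMs to check.

    vms:      all VM names that have the check enabled
    per_day:  how many randomly chosen VMs to check each day
    checked:  set of VMs already checked in the current cycle
    priority: VMs that are checked on every run, outside the rotation
    Returns (todays_batch, updated_checked_set).
    """
    priority = [vm for vm in priority if vm in vms]
    candidates = [vm for vm in vms if vm not in checked and vm not in priority]
    if not candidates:
        # Every VM has been checked once: start a new cycle.
        checked = set()
        candidates = [vm for vm in vms if vm not in priority]
    batch = rng.sample(candidates, min(per_day, len(candidates)))
    return priority + batch, checked | set(batch)
```

With 10 VMs and `per_day=2`, five consecutive calls (feeding the returned set back in as `checked`) cover every VM exactly once before any VM is picked again.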
tsightler
VP, Product Management
Posts: 6013
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Health check on large backups

Post by tsightler » 3 people like this post

In the interim, there's this great script from about 5 years ago that automates SureBackup via PowerShell to effectively provide this capability within the existing feature set of Veeam. Basically, you provide a list of systems in a text file and it modifies the SureBackup job every day to include X systems, testing a different set each day so that all of them are tested within the defined amount of time, say over the course of a month.

https://www.virtualtothecore.com/can-te ... urebackup/

It's been quite a while, and I suppose the script might not work without modification on the latest VBR, but most of the commands look pretty straightforward so if something is broken it would hopefully be easy to modify.
pirx
Veteran
Posts: 573
Liked: 75 times
Joined: Dec 20, 2015 6:24 pm
Contact:

Re: Health check on large backups

Post by pirx »

Yes, this looks very interesting, I'll give it a try after my vacation.
pirx
Veteran
Posts: 573
Liked: 75 times
Joined: Dec 20, 2015 6:24 pm
Contact:

Re: Health check on large backups

Post by pirx »

I checked one other job with 117 VMs where I enabled health checks and was surprised that this job finished in just 35 min.

VM size: 10,4 TB
Backup files health check has been completed 35:48

By comparison, these are jobs with just one VM each:

VM size: 8,1 TB
Backup files health check has been completed 07:38:38

VM size: 3,2 TB
Backup files health check has been completed 02:32:54

Why is that one job so fast and the others so much slower?
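
To put the three reported times side by side, the effective rate (VM size divided by health check duration) can be computed with a short Python sketch. The helper names `parse_duration` and `effective_mb_per_s` are hypothetical, and the sketch assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes). Note that "VM size" is not necessarily the amount of data actually read from the backup files, so these are not storage throughput measurements.

```python
def parse_duration(text):
    """Convert 'MM:SS' or 'HH:MM:SS' into seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def effective_mb_per_s(size_tb, duration):
    """VM size divided by elapsed time, in MB/s (decimal units assumed)."""
    return size_tb * 10**12 / parse_duration(duration) / 10**6


# (VM size in TB, reported health check duration) as quoted in the post
reports = [(10.4, "35:48"), (8.1, "07:38:38"), (3.2, "02:32:54")]
for size_tb, duration in reports:
    rate = effective_mb_per_s(size_tb, duration)
    print(f"{size_tb} TB in {duration}: about {rate:.0f} MB/s effective")
```

Under these assumptions the two single-VM jobs land around 300-350 MB/s, while the 117-VM job works out to several GB/s of VM size per second, which suggests that far less data than the nominal VM size was read, that reads ran in parallel, or both.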
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev »

Due to the presence of many empty or repeating blocks perhaps. Remember Veeam has built-in deduplication and only stores such blocks once.
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Health check on large backups

Post by DonZoomik »

Also, maybe a lot of similarly sized backups combined with per-VM chains? This provides additional parallelism, greatly improving throughput on capable storage devices.
dweide
Enthusiast
Posts: 38
Liked: 9 times
Joined: Mar 29, 2012 1:57 pm
Full Name: D. Weide
Contact:

Re: Health check on large backups

Post by dweide »

Just to add another example of real slow backup verification:

- 17.8 TB VM
- Backup files health check has been completed 67:07:03
perjonsson1960
Veteran
Posts: 463
Liked: 47 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Health check on large backups

Post by perjonsson1960 »

What happens if the health check of a backup copy job is still running when the copy interval expires?
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Health check on large backups

Post by foggy »

The interval is extended and the health check continues.
dejan.ilic@liu.se
Enthusiast
Posts: 29
Liked: never
Joined: Apr 11, 2019 11:37 am
Full Name: Dejan Ilic
Contact:

Re: Health check on large backups

Post by dejan.ilic@liu.se »

Just a quick question: why can't the health check be implemented as a separate job/process so that it doesn't break the normal backup job schedule?

If it finds an error, it wouldn't matter if that is signaled later; Veeam B&R would have to do the error handling anyway.
The worst case is that any backups made after the health check started are invalid (which the health check could detect).
If it doesn't find an error (the normal case), it wouldn't interfere with the next backup run, which could then pick up the backup data that the current "synchronous health check" implementation misses.

So in the best case the backup jobs are not interfered with, and the worst case is no worse than the current implementation, where all the jobs that should run but are missed due to the health check don't do any backups.

(We had a file server with 21 TB+ of data in one filesystem; health checks took 60 hours.)
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev » 2 people like this post

No reason why it cannot be, in fact we're working on implementing this change right now :D
garrettt12
Novice
Posts: 3
Liked: 1 time
Joined: Oct 24, 2019 3:29 pm
Full Name: Garrett
Contact:

Re: Health check on large backups

Post by garrettt12 » 1 person likes this post

Will this separation change allow us to have health checks that run after the original job's backup window would have terminated them?

We have customers with monster VMs and very "value engineered" virtual environments that need strict backup windows, but these cut off healthchecks which we'd have no problem running against backup storage during business hours otherwise.
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Health check on large backups

Post by DonZoomik »

Gostev wrote: Jun 09, 2021 9:37 pm That's my hope too.
Any news?
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Health check on large backups

Post by DonZoomik »

Gostev, any hope of this making it into v11a?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health check on large backups

Post by Gostev »

I think so. @HannesK could you please check this did not get lost?