-
- Service Provider
- Posts: 372
- Liked: 120 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: Health check on large backups
Can someone from Veeam comment on async read tuning opportunities? A larger read-ahead window could work around eventual fragmentation and allow for faster health checks on capable hardware.
-
- Service Provider
- Posts: 64
- Liked: 18 times
- Joined: Apr 20, 2018 6:17 am
- Full Name: Michael Høyer
- Contact:
Re: Health check on large backups
perjonsson1960 wrote: ↑May 26, 2021 6:03 am I am not sure if it is possible to convert without losing all the data? I have googled a little, and some say that it is possible, and some that it is not. And isn't RAID 6 slower than RAID 5 due to the parity calculations in all writes?
Sadly, I can't answer either of these questions for you; it depends on what your RAID card can do.
In my experience, with a capable enough RAID card there is little difference in speed between RAID 5 and RAID 6, since it has to calculate parity on RAID 5 anyway.
In the end, in my view there is no choice, with the high risk of not being able to rebuild the array after a failed disk. A bit slower beats the risk of total failure.
-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Re: Health check on large backups
We have actually replaced one failed disk already, maybe a year ago. The spare disk kicked in, and when the failed disk was replaced, it was rebuilt successfully. I don't remember how long it took, but there were no complications.
-
- Enthusiast
- Posts: 27
- Liked: 10 times
- Joined: Jul 25, 2017 6:52 pm
- Full Name: Devin Meade
- Contact:
Re: Health check on large backups
We had these issues with Veeam over a year ago. It was our main 14TB file share for the company. I decided to do Windows DFS and split the data up into 5 file servers. Best decision ever! Now our backups/copies/replicas are easy as pie. We also re-did our project folders to put each year into one folder \\server\share\project\year with one virtual server per year. After 5 years the projects go inactive... That's a major change and it took a few months to accomplish. It was definitely worth it!
-
- Service Provider
- Posts: 372
- Liked: 120 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: Health check on large backups
I was observing Veeam behavior today and it seems to me that there is no read-ahead at all during health check.
A few days ago I was evacuating a SOBR extent and noticed that almost all IO was attributed to the System process, which means that Windows itself was doing read-ahead. However, this repository was very fragmented and I was seeing wildly fluctuating throughput from a low-end NAS. During high fragmentation QD went up (with throughput dropping as low as 50 MB/s), while low fragmentation saw QD drop (with throughput rising to 500-600 MB/s).
Today I observed a health test and it was barely doing ~100MB/s on a quite high-end system (HPE Apollo 4200, P816 hardware RAID, 24*16TB disks in RAID60, SSD cache) that has been in operation for only two weeks (so low fragmentation). QD around one, all IO attributed to Veeam Process itself. I bet that if I checked handle attributes, the file was opened with flags that disable read-ahead. This chain is also reverse incremental so it is my understanding that health check largely just reads VBK file from start to end.
So no read-ahead at all or Veeam's async read is worse than Windows read-ahead?
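To make the queue depth point concrete, here is a minimal Python sketch of a sequential file check that keeps a configurable number of reads in flight. It is not how Veeam's async engine works (I have no insight into that code); the names `checksum_with_window`, `read_chunk` and the 1 MiB `CHUNK` size are made up for illustration, and worker threads stand in for real overlapped I/O. With `window=1` it behaves like the QD~1 pattern described above; a larger window lets capable storage service several requests at once.

```python
import os
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # 1 MiB per read request (illustrative value)


def read_chunk(path, offset, size):
    # Each request uses its own handle so reads can overlap safely.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def checksum_with_window(path, window, chunk=CHUNK):
    """CRC32 of the whole file with at most `window` reads outstanding."""
    size = os.path.getsize(path)
    crc = 0
    pending = deque()
    with ThreadPoolExecutor(max_workers=window) as pool:
        for offset in range(0, size, chunk):
            pending.append(pool.submit(read_chunk, path, offset, chunk))
            # Once the window is full, consume the oldest read before
            # issuing another one. Results are consumed in file order.
            if len(pending) >= window:
                crc = zlib.crc32(pending.popleft().result(), crc)
        while pending:
            crc = zlib.crc32(pending.popleft().result(), crc)
    return crc
```

Timing `checksum_with_window(path, 1)` against `checksum_with_window(path, 8)` on a large file that is not already in the OS cache should show whether the storage benefits from deeper queues. Note that the OS may still do its own read-ahead underneath, so this only approximates the effect.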
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
Looks like you found a bug and async read is not initialized for health check.
-
- Service Provider
- Posts: 372
- Liked: 120 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: Health check on large backups
Umm... does that mean that the bug is a fact or just a theory? Should I push through support or can you confirm through QA/RD?
If it's a theory then I think v10 behaved the same. I don't have a v10 similar system to check but I pointed to similar behavior in another thread.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
It's a fact. The dev behind the async engine has already reviewed the source code for me and confirmed that the health check job does not initialize the required parameter. So there is no need to push this through support, as the bug has already been logged.
V10 did not have the async engine at all (outside of virtual full to tape where we piloted this functionality) so yeah V11 behaves the same as V10 in that sense.
-
- Service Provider
- Posts: 372
- Liked: 120 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: Health check on large backups
Well that's good news (as in the situation isn't hopeless).
Sounds like a minor thing that could get fixed in one of the now quite frequent patches within a reasonable timeframe (months)?
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
That's my hope too.
-
- Veteran
- Posts: 599
- Liked: 87 times
- Joined: Dec 20, 2015 6:24 pm
- Contact:
Re: Health check on large backups
+1. My first health check of a single 8 TB VM finished today. It took ~8 h (v10); in iotop I saw the Veeam Agent reading the data at ~300 MB/s. Our regular jobs are 20-40 TB, so this will take more than one day.
I'm still looking into SureBackup, but it's also not that fast, depending on the tests you run. And a simple boot + ping will not detect all possible bad blocks, right?
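As a rough sanity check of the numbers in this post, the sketch below estimates how long a full read takes at a steady rate. It assumes decimal units (1 TB = 1,000,000 MB) and that the whole VM size is actually read at a constant ~300 MB/s, which is a simplification; `estimated_hours` is just an illustrative helper name.

```python
def estimated_hours(size_tb, throughput_mb_s):
    """Hours needed to read size_tb at a steady throughput (decimal units)."""
    seconds = size_tb * 1_000_000 / throughput_mb_s
    return seconds / 3600


for size_tb in (8, 20, 40):
    print(f"{size_tb} TB at 300 MB/s: {estimated_hours(size_tb, 300):.1f} h")
```

Under these assumptions 8 TB comes out at a bit over 7 hours, which matches the ~8 h observed, and the 20-40 TB jobs land at very roughly 18 to 37 hours.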
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
Yes, you are right.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
@pirx however, I should mention that SureBackup job also has an option to check all blocks. See the first checkbox on this step of the wizard. Thanks!
-
- Veteran
- Posts: 599
- Liked: 87 times
- Joined: Dec 20, 2015 6:24 pm
- Contact:
Re: Health check on large backups
@Gostev yeah, I've seen this and even used it in my test job (it also took ages, but I guess there's not much that can be changed there). It would still be nice to have a tool where I can randomly check single VMs. Something like right click -> check, or Validator for Linux.
It would be perfect to be able to check everything, but that just doesn't seem possible. So my idea would be to randomly pick a number of VMs and have them checked. That's not really possible with SureBackup, right? Instead of checking a whole job once a month, which then runs very long and violates the RPO, let some VMs from different jobs be checked every day.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
This is a good idea.
-
- Veteran
- Posts: 599
- Liked: 87 times
- Joined: Dec 20, 2015 6:24 pm
- Contact:
Re: Health check on large backups
Can I count that as a yes for a feature request? A very high-level description would be: let Veeam check a random set of x VMs each day, where a VM is rechecked only after all other VMs have been checked. So every VM (for which this check is activated) will be checked at some point. Maybe every 2 weeks, maybe only every 2 months. A priority list could hold VMs that should be checked daily/weekly.
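To illustrate the selection logic described above (not any existing Veeam feature), here is a small Python sketch. The function name `pick_daily_batch`, its parameters and the idea of persisting the `already_checked` set between runs are all assumptions made for the example.

```python
import random


def pick_daily_batch(all_vms, already_checked, per_day, priority=(), rng=random):
    """Return (VMs to check today, updated set of VMs checked this cycle).

    Priority VMs are checked every day. Other VMs are drawn at random from
    those not yet checked in the current cycle; when none are left, a new
    cycle starts.
    """
    checked = set(already_checked) & set(all_vms)
    remaining = [vm for vm in all_vms if vm not in checked and vm not in priority]
    if not remaining:
        # Everything has been checked once: start a new cycle.
        checked = set()
        remaining = [vm for vm in all_vms if vm not in priority]
    batch = rng.sample(remaining, min(per_day, len(remaining)))
    checked.update(batch)
    return list(priority) + batch, checked
```

Called once per day with the state returned by the previous call, this covers every VM before any non-priority VM is picked a second time. The cycle length is simply the number of VMs divided by `per_day`.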
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Health check on large backups
In the interim, there's this great script from about 5 years ago that automates SureBackup via PowerShell to effectively provide this capability within the existing feature set of Veeam. Basically, you provide a list of systems in a text file and it modifies the SureBackup job every day to test X systems per day, so that within the defined amount of time, say over the course of a month, all of them are tested by testing a different set each day.
https://www.virtualtothecore.com/can-te ... urebackup/
It's been quite a while, and I suppose the script might not work without modification on the latest VBR, but most of the commands look pretty straightforward so if something is broken it would hopefully be easy to modify.
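For readers who only want the gist of the batching idea, the following Python sketch splits a list of systems evenly over a cycle of days. It is not the linked script and calls no Veeam cmdlets; `batch_for_day` and its parameters are hypothetical names for illustration only.

```python
import math


def batch_for_day(systems, days_in_cycle, day_index):
    """Return the slice of `systems` to test on the given day of the cycle."""
    per_day = math.ceil(len(systems) / days_in_cycle)
    slot = day_index % days_in_cycle
    return systems[slot * per_day:(slot + 1) * per_day]
```

With the list read from a text file (one system per line) and `day_index` taken from, for example, the day of the month, each day gets a different slice and the whole list is covered once per cycle. Days at the end of the cycle may get a shorter or empty slice when the list does not divide evenly.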
-
- Veteran
- Posts: 599
- Liked: 87 times
- Joined: Dec 20, 2015 6:24 pm
- Contact:
Re: Health check on large backups
Yes, this looks very interesting, I'll give it a try after my vacation.
-
- Veteran
- Posts: 599
- Liked: 87 times
- Joined: Dec 20, 2015 6:24 pm
- Contact:
Re: Health check on large backups
I checked one other job with 117 VM's where I enabled health checks and was surprised that this job finished in just 35min.
VM size: 10,4 TB
Backup files health check has been completed 35:48
Those are jobs with just one VM.
VM size: 8,1 TB
Backup files health check has been completed 07:38:38
VM size: 3,2 TB
Backup files health check has been completed 02:32:54
Why is the one job that fast and others much slower?
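For comparison, the sketch below turns the three results above into an effective rate (VM size divided by health check duration). It assumes decimal units and uses the reported VM size, not the amount of data actually stored in the backup files, so it says nothing about real disk throughput; the helper names are illustrative.

```python
def parse_duration(text):
    """Convert 'MM:SS' or 'HH:MM:SS' to seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def effective_mb_s(size_tb, duration):
    """VM size divided by elapsed time, in MB/s (1 TB = 1,000,000 MB)."""
    return size_tb * 1_000_000 / parse_duration(duration)


results = [(10.4, "35:48"), (8.1, "07:38:38"), (3.2, "02:32:54")]
for size_tb, duration in results:
    print(f"{size_tb} TB in {duration}: {effective_mb_s(size_tb, duration):.0f} MB/s")
```

The two single-VM jobs come out at roughly 300-350 MB/s, while the 117-VM job works out to several GB/s of VM size per second, which suggests far less data was actually read than the VM size implies.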
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
Due to the presence of many empty or repeating blocks perhaps. Remember Veeam has built-in deduplication and only stores such blocks once.
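A toy illustration of why this matters for health check time: if identical blocks are stored once, the amount of data to read can be much smaller than the VM size. The Python sketch below counts unique fixed-size blocks in a byte string. The 4 KiB block size and the SHA-256 fingerprint are arbitrary choices for the example and say nothing about Veeam's actual block size or on-disk format.

```python
import hashlib

BLOCK = 4096  # illustrative block size, not Veeam's


def dedup_stats(data, block_size=BLOCK):
    """Return (total blocks, unique blocks) for fixed-size blocks of data."""
    seen = set()
    total = 0
    for start in range(0, len(data), block_size):
        total += 1
        seen.add(hashlib.sha256(data[start:start + block_size]).digest())
    return total, len(seen)


# Eight all-zero blocks plus two distinct blocks.
sample = bytes(BLOCK * 8) + b"a" * BLOCK + b"b" * BLOCK
total, unique = dedup_stats(sample)
print(f"{total} logical blocks, {unique} stored after deduplication")
```

A disk full of empty or repeated blocks behaves like the zero-filled part of `sample`: many logical blocks, very few stored ones to verify.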
-
- Service Provider
- Posts: 372
- Liked: 120 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: Health check on large backups
Also maybe a lot of similarly sized backups combined with per-vm chains? This provides additional parallelism, greatly improving throughput on capable storage devices.
-
- Enthusiast
- Posts: 38
- Liked: 9 times
- Joined: Mar 29, 2012 1:57 pm
- Full Name: D. Weide
- Contact:
Re: Health check on large backups
Just to add another example of real slow backup verification:
- 17.8 TB VM
- Backup files health check has been completed 67:07:03
-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Re: Health check on large backups
What happens if the health check of a backup copy job is still running when the copy interval expires?
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Health check on large backups
The interval is extended and the health check continues.
-
- Enthusiast
- Posts: 37
- Liked: 1 time
- Joined: Apr 11, 2019 11:37 am
- Full Name: Dejan Ilic
- Contact:
Re: Health check on large backups
Just a quick question, why can't the health check be implemented in a separate job/process so that it won't break the normal backup job schedule?
If it found an error, it wouldn't matter if it were signaled later; Veeam B&R would have to do the error handling anyway.
The worst case is that any backups made after the health check started are invalid (which the health check could detect).
If it doesn't find anything (the normal case), it wouldn't interfere with the next backup run, which would pick up backup data that the current "synchronous health check" implementation misses.
So in the best case the backup jobs are not interfered with, and the worst case is no worse than the current implementation, where jobs that should run but are missed due to the health check don't do any backups.
(We had a file server with 21 TB+ of data in one filesystem; health checks took 60 hours.)
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
No reason why it cannot be, in fact we're working on implementing this change right now
-
- Novice
- Posts: 3
- Liked: 1 time
- Joined: Oct 24, 2019 3:29 pm
- Full Name: Garrett
- Contact:
Re: Health check on large backups
Will this separation change allow us to have health checks that run after the original job's backup window would have terminated them?
We have customers with monster VMs and very "value engineered" virtual environments that need strict backup windows, but these cut off healthchecks which we'd have no problem running against backup storage during business hours otherwise.
-
- Service Provider
- Posts: 372
- Liked: 120 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: Health check on large backups
Gostev, any hope of this making it into v11a?
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Health check on large backups
I think so. @HannesK could you please check this did not get lost?