WinstonWolf
Expert
Posts: 187
Liked: 4 times
Joined: Jan 06, 2011 8:33 am
Contact:

B&R 9 : Health check of VM Backup Files needs so long time

Post by WinstonWolf » Jan 28, 2016 8:22 am

Hello,

I activated the option "Perform Backup Files Health Check", but it takes 10 hours on some backup jobs.
Why does this option take so long?

Michael

P.Tide
Product Manager
Posts: 5128
Liked: 443 times
Joined: May 19, 2015 1:46 pm
Contact:

Re: B&R 9 : Health check of VM Backup Files needs so long time

Post by P.Tide » Jan 28, 2016 10:23 am

Hi,

During a health check, VBR calculates CRC values for backup metadata and hash values for the VM disk data blocks in the backup file, and compares them with the CRC and hash values already stored in the backup file. This can take a long time for large restore points. What is the average restore point size for the backup jobs in question?

Thank you.
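
A rough illustration of the kind of verification described above: recompute a hash for each data block and compare it with the value stored when the block was written. This is only a sketch, not Veeam's implementation; the block size, the choice of SHA-256 and the names `blocks` and `stored_hashes` are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 1024 * 1024  # assumed block size, for illustration only


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hash(block: bytes) -> str:
    """Hash one block; SHA-256 is an assumption, not a documented detail."""
    return hashlib.sha256(block).hexdigest()


def find_corrupt_blocks(blocks: list[bytes], stored_hashes: list[str]) -> list[int]:
    """Return indexes of blocks whose recomputed hash differs from the stored one."""
    return [
        i
        for i, (block, stored) in enumerate(zip(blocks, stored_hashes))
        if block_hash(block) != stored
    ]


# At backup time: write the blocks and remember their hashes.
blocks = split_blocks(b"A" * 3000, block_size=1024)
stored_hashes = [block_hash(b) for b in blocks]

# Later: simulate silent corruption of block 1 on the storage.
blocks[1] = b"B" + blocks[1][1:]

corrupt = find_corrupt_blocks(blocks, stored_hashes)
print(corrupt)
```

Every block has to be read back from the repository before its hash can be recomputed, so the run time of a check like this grows with the amount of data and is bounded by the read speed of the storage.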

Post by WinstonWolf » Jan 28, 2016 12:10 pm

We have a retention time of 30 days.
The problem is that a tape job runs after the VM backup job, and it had to wait a long time for the backup files health check to finish.

I think it should be possible, maybe as a new feature, to create separate jobs for the health check option.

Post by P.Tide » Jan 28, 2016 12:50 pm

So, how big is your average restore point?

Post by WinstonWolf » Jan 28, 2016 2:03 pm

What do you mean? The .vib files or the .vbk files?
The .vbk file is 6 TB and the .vib files average 100 GB.

Post by P.Tide » Jan 28, 2016 3:22 pm

So you have a 6 TB .vbk plus 29 .vib files of about 100 GB each. To make sure that your last restore point is valid and can be used for recovery, Veeam needs to check all .vib files since your last full. In your case, if no intermediate fulls are present, that is about 9 TB of data to process. 10 hours looks reasonable for a CRC check of 9 TB of data. To reduce the time needed for the health check, consider configuring periodic fulls.

UPDATE at 3:46 PM (EST) 29/01/2016:

I checked with the devs, and it turns out that we actually check only the blocks that make up the last restore point. Since those blocks can be spread across the whole backup chain, that can still be a lot of data to check, depending on many factors. Either way, 6 TB of data (even if compressed) is still a lot of work.

Thank you.
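
To put the numbers from this post in perspective, here is a small worked calculation of the average read rate that a 10-hour health check implies. It is only a sketch: it assumes decimal units (1 TB = 10^12 bytes, which the thread does not specify) and that the whole 10 hours was spent reading backup data.

```python
TB = 10**12  # decimal terabyte; an assumption, the thread does not say TB vs TiB
GB = 10**9

full_size = 6 * TB                  # the .vbk file
increments = 29 * 100 * GB          # 29 .vib files of about 100 GB each
chain_size = full_size + increments  # 8.9 TB, rounded to 9 TB in the post

duration_s = 10 * 3600              # the 10 hours reported for the health check


def throughput_mb_s(size_bytes: int, seconds: int) -> float:
    """Average rate in MB/s needed to process size_bytes in the given time."""
    return size_bytes / seconds / 10**6


whole_chain_rate = throughput_mb_s(chain_size, duration_s)
full_only_rate = throughput_mb_s(full_size, duration_s)

print(f"{chain_size / TB:.1f} TB in 10 h -> {whole_chain_rate:.0f} MB/s")
print(f"{full_size / TB:.1f} TB in 10 h -> {full_only_rate:.0f} MB/s")
```

Under these assumptions, reading the whole chain in 10 hours would need roughly 247 MB/s sustained, and reading only 6 TB would need roughly 167 MB/s.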

Gostev
SVP, Product Management
Posts: 24297
Liked: 3331 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Post by Gostev » Jan 28, 2016 9:56 pm

Pasha, this is not correct as only the latest VM state is checked for consistency. As such, we do not read the entire content of each and every backup file in the chain. As such, periodic fulls will make no difference to the health check performance (as this will not change the amount of data that we will need to read from the backup storage). Please, double-check with the devs. Thanks!

Post by WinstonWolf » Jan 29, 2016 11:59 am

OK, then I can forget the health check option. That is too bad, because the health check time counts toward the backup job run time, and a tape job runs after this backup job.

As I said before, I think it would be a good feature to be able to create a separate "Health Check Job".

I have the feeling that something is wrong with all the new features in v9, and that other necessary features are missing :cry:

lennis40
Expert
Posts: 123
Liked: 3 times
Joined: Nov 11, 2014 11:03 pm
Full Name: Michael
Contact:

Post by lennis40 » Feb 12, 2016 12:57 pm

We've had a few lengthy health checks as well, which has brought up some questions on what might be best practice. Health checks by default are disabled on backup jobs, and enabled on copy jobs. There must be a reason why the option is available on both, even though they're using the same backup files for the health check process. Other than spreading out the times when health checks run, I assume Veeam recommends to run on copy job, as that's the default setting? Without parallel processing to cloud repository on copy jobs, would it make more sense to run a health check on the backup job where parallel processing is available?

When we run into a health check taking several hours, it's holding up the other copy jobs from running any tasks. Even running them on backup files may delay the time the copy job starts transferring during the interval. I'm just curious to get the different opinions on what might be the best way to go about alleviating some of the wait time on health checks. Is it also safe to say that if SureBackup is being used, with the backup file verification option enabled, that health checks can be disabled on both backup jobs and copy jobs? Thanks for any input!

Post by P.Tide » Feb 12, 2016 3:47 pm

lennis40 wrote:There must be a reason why the option is available on both, even though they're using the same backup files for the health check process. Other than spreading out the times when health checks run, I assume Veeam recommends to run on copy job, as that's the default setting?
The health check option was added recently and is disabled by default so that it does not change the behaviour of backup jobs that were already configured.
lennis40 wrote:even though they're using the same backup files for the health check process
That's not correct. Backup files produced by a backup job and backup files produced by a backup copy job are different sets of files. The backup copy health check protects your backup copy chain from storage corruption, whereas the backup health check protects your main backup chain from storage corruption.
lennis40 wrote:would it make more sense to run a health check on the backup job where parallel processing is available?
No, it would not. However, there are some health check improvements planned for future releases.
lennis40 wrote:Is it also safe to say that if SureBackup is being used, with the backup file verification option enabled, that health checks can be disabled on both backup jobs and copy jobs?
Yes, if you run SureBackup on both backup and backup copy jobs. Did you mean the backup validator when you said "backup file verification option"?

Thank you.

Post by lennis40 » Feb 12, 2016 4:09 pm

The backup file integrity check option, in the settings menu of the SureBackup wizard, is the validation I was talking about.

So if the copy job is checking the copy job chain, that health check is actually happening on the target repository, or in this case the cloud repository?

Thanks for the information. We certainly look forward to improvements in future releases.

bkc
Influencer
Posts: 20
Liked: 3 times
Joined: Dec 21, 2010 10:31 pm
Full Name: brad clements

[MERGED] : backup copy health check very slow

Post by bkc » Feb 15, 2016 4:57 pm

Hi,

Our backup copy job started a health check at the end of January and today I finally killed it off with only 61% complete. During the past 2 weeks the backup copy job didn't copy any jobs. (hardware details below)

I've disabled the health check feature in the job and restarted it.

After restarting I see that it is processing 4 out of 31 VMs simultaneously.

Would disabling parallel data processing reduce disk fragmentation on the target backup repository?

Would enabling use per-vm backup files possibly reduce the health check time?

infrastructure notes..

We have a 3-node vSphere 6 cluster (with CBT disabled) and a SAN. The Veeam backup software runs on a VM and stores data on an external physical Linux server with RAID 1 via a 1 Gb Ethernet connection. The main backup repository is currently using 1.2 TB of space.

The backup copy job target repository is a ReadyNAS 6 box with 6 hard drives in RAID 5, also reachable via 1 Gb Ethernet. Yeah, the ReadyNAS isn't great, but it's not terrible. Those 4 VMs are currently showing "x% completed at 3MB/sec".

The main backup repository machine is currently running 6 VeeamAgents, 0% load and 0.1 I/O wait state.

suggestions?

Post by P.Tide » Feb 15, 2016 5:48 pm

Hi,
bkc wrote:Would disabling parallel data processing reduce disk fragmentation on the target backup repository?
No, it would not. If you want to reduce fragmentation please use compact full backup file feature.
bkc wrote:Would enabling use per-vm backup files possibly reduce the health check time?
No, as the amount of data to be checked stays the same.

Please provide your bottleneck statistics.

Thank you.

Post by bkc » Feb 15, 2016 9:56 pm

Can you explain how fragmentation would not be reduced by having only one VM processed at a time? With 4 VMs all writing to the target at the same time their data blocks will be interleaved with each other as the target allocates free space to write to. Simultaneous writes is a classic cause of file fragmentation.

If I enable compact full, and all 4 vms start compacting their fulls at the same time, I'll have the same problem ... simultaneous writes to the same data store = fragmented files.
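
The interleaving argument can be illustrated with a toy model. This is a sketch under strong assumptions: the allocator simply hands out the next free block in write order, and parallel writers take turns strictly round-robin. Real file systems use techniques such as preallocation to soften this effect, so treat it as a worst case rather than a description of any particular repository.

```python
from itertools import groupby


def allocate(write_order: list[str]) -> list[str]:
    """Toy allocator: each write takes the next free block, so layout == write order."""
    return list(write_order)


def extents_per_file(layout: list[str]) -> dict[str, int]:
    """Count contiguous runs (extents) of each file in the on-disk layout."""
    counts: dict[str, int] = {}
    for name, _run in groupby(layout):
        counts[name] = counts.get(name, 0) + 1
    return counts


files = ["vm1", "vm2", "vm3", "vm4"]
blocks_per_file = 5

# One VM at a time: each file is written start to finish.
sequential = allocate([f for f in files for _ in range(blocks_per_file)])

# Four VMs at once: writes arrive round-robin (worst-case assumption).
parallel = allocate([f for _ in range(blocks_per_file) for f in files])

print(extents_per_file(sequential))  # one extent per file
print(extents_per_file(parallel))    # five extents per file
```

In this model every file written sequentially occupies a single extent, while each of the four files written in parallel is split into as many extents as it has blocks.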

regarding bottleneck stats, the backup copy job just finished the first VM, 30 vms to go.

stats are very bad: 2/15/2016 4:31:13 PM :: Busy: Source 0% > Proxy 0% > Network 0% > Target 99%

The backup copy job doesn't report much for stats.. 7.4GB read at 533Kb/sec
Would be nice to have more stats regarding the target repository .. avg write speed vs avg read speed.

I guess this ReadyNas is a dog.
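
For anyone collecting these statistics from several jobs, here is a minimal sketch of how the "Busy" line quoted above could be parsed to pick out the busiest stage. The line format is taken from this post only; the function names are made up for the example, and other versions may format the line differently.

```python
import re

LINE = "2/15/2016 4:31:13 PM :: Busy: Source 0% > Proxy 0% > Network 0% > Target 99%"


def parse_busy(line: str) -> dict[str, int]:
    """Extract {stage: busy percent} from a 'Busy: ...' statistics line."""
    _, _, stats = line.partition("Busy:")
    return {name: int(pct) for name, pct in re.findall(r"(\w+)\s+(\d+)%", stats)}


def bottleneck(busy: dict[str, int]) -> str:
    """Return the stage with the highest busy percentage."""
    return max(busy, key=busy.get)


busy = parse_busy(LINE)
print(busy)
print(bottleneck(busy))
```

The highest value only shows which stage the job is waiting on. It does not say why that stage is slow, so the target's disks, its network interface and the path to it are all worth checking.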

Post by bkc » Feb 15, 2016 10:21 pm

wait.. 2 ethernet interfaces on the Readynas box.

eth0 - rsync speed is about 600kb/sec (this is the interface veeam is using)

eth1 - rsync speed 19 MB/sec

I see lots of framing errors on eth0.. well that's hopeful, something I can actually troubleshoot

--

after reconfiguring veeam to use the good interface, I'm now seeing processing rate around 35MB/sec. yeah!

Post by P.Tide » Feb 16, 2016 10:09 am

bkc wrote:Can you explain how fragmentation would not be reduced by having only one VM processed at a time? With 4 VMs all writing to the target at the same time their data blocks will be interleaved with each other as the target allocates free space to write to. Simultaneous writes is a classic cause of file fragmentation.
My mistake. I was sure that something must have been invented to reduce fragmentation during parallel writes. My apologies.
bkc wrote:If I enable compact full, and all 4 vms start compacting their fulls at the same time, I'll have the same problem ... simultaneous writes to the same data store = fragmented files.
Your logic is correct, and that's why there is no parallel processing for compact operations.
bkc wrote:after reconfiguring veeam to use the good interface, I'm now seeing processing rate around 35MB/sec. yeah!
Good to hear that. Feel free to contact us if any issues arise.

Thank you.

Post by bkc » Feb 17, 2016 2:45 am

so the copy job has now been running for 28 hours and about 26 of those hours it's been at 99% complete state.

it seems that after the vms are copied over there's a bunch of house-keeping to perform, creating some kind of fulls or something and now it seems to be creating a GFS restore point.

I think it would be very useful if these house-keeping steps appeared after the named list of vms with clear details of what's happening, what % is completed and the I/O performance the system sees such as read or write performance of the target backup repository.

likewise during the monthly health check it would be good to see more details about what is happening, how far along it is and what the I/O performance is.

showing only 99% complete for 26 hours w/o any other performance info isn't very informative..

thanks

Post by bkc » Feb 17, 2016 2:28 pm

Job has been running 40 hours, still creating a GFS restore point, still at 99%.

I really think more feedback would be helpful..

andriktr
Enthusiast
Posts: 41
Liked: 1 time
Joined: Mar 02, 2015 11:53 am
Full Name: Andrej
Contact:

[MERGED] Health check for copy jobs

Post by andriktr » Feb 29, 2016 8:43 am

Hello,
We recently migrated our backup copy jobs from StoreOnce CIFS shares to StoreOnce Catalyst stores. As recommended, I cloned the old backup copy jobs and configured them with the new Catalyst repository, which means a completely new backup chain was started (GFS retention is used). Job performance is several times better than it was with the shares, but one thing still looks strange to me. We have enabled the health check for these jobs (executed once per month), and it takes a long time: about 5-8 hours per job. Note that these are fairly new jobs, configured a few days ago, and each VM has only 2-3 restore points. StoreOnce Catalyst requires per-VM backup files, which means we will have many more .vbk files in the repository. Could this be the reason for such a long health check? And what else can affect health check performance?
Thank you.

foggy
Veeam Software
Posts: 17931
Liked: 1512 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Post by foggy » Feb 29, 2016 2:52 pm

Depending on the backup size, this might be expected (see above). Are you saying it used to complete faster before migration to Catalyst?

Post by andriktr » Feb 29, 2016 3:18 pm

No, it wasn't faster on CIFS shares either; it also took a long time to complete. I expected that the migration to Catalyst would reduce this time somewhat.
I also hope that the completion time will not grow hugely in the future due to a larger number of backup files.

lando_uk
Expert
Posts: 302
Liked: 22 times
Joined: Oct 17, 2013 10:02 am
Full Name: Mark
Location: UK
Contact:

Post by lando_uk » Feb 29, 2016 5:21 pm

On this subject - If a health check of a copy job is fine, could one presume the last restore point of the main backup job that it was sourced from is also fine? Or could the same restore point be knackered on the backup job, but ok on the copy job?

Post by P.Tide » Feb 29, 2016 6:01 pm

Hi
lando_uk wrote:On this subject - If a health check of a copy job is fine, could one presume the last restore point of the main backup job that it was sourced from is also fine?
No.
lando_uk wrote:Or could the same restore point be knackered on the backup job, but ok on the copy job?
Yes. For example, corruption of your primary storage can occur after the backup copy sync has completed. In this case, your backup copy restore point will be OK whereas your primary backup will not.

Please refer to the Help Center:
"When a new synchronization interval starts, Veeam Backup & Replication performs a health check for the most recent restore point in the backup chain. Veeam Backup & Replication calculates checksums for data blocks in the backup file on the target backup repository and compares them with the checksums that are already stored in the backup file."
Your backup copy job produces an independent set of files, so the backup copy health check makes sure that your secondary restore point is not corrupted.

Thank you.
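
The point about independent file sets can be shown with a small model. This is a sketch only: the byte strings stand in for restore points, and a single SHA-256 digest stands in for the stored checksums, both of which are assumptions for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Checksum of a whole restore point (SHA-256 chosen for the example)."""
    return hashlib.sha256(data).hexdigest()


primary = bytearray(b"restore point data")
expected = digest(bytes(primary))

# Backup copy sync: the secondary is an independent copy of the same data.
secondary = bytearray(primary)

# Corruption hits the primary storage after the sync has completed.
primary[0] ^= 0xFF

copy_ok = digest(bytes(secondary)) == expected
primary_ok = digest(bytes(primary)) == expected
print(copy_ok, primary_ok)
```

The copy still matches its checksum while the primary does not, so a clean result on one chain says nothing about the other.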

Post by foggy » Mar 01, 2016 12:54 pm

andriktr wrote:Also hope that completion time amount will not grow up hugely in future due to a largest amount of backup files.
Per-VM backup chains option should not affect health check performance (actually, it could even be faster with per-VM, since less metadata is kept within each backup file and they are less fragmented).

Post by andriktr » Mar 07, 2016 7:53 am

Are there any recommendations for StoreOnce catalyst repository maximum concurrent tasks?

Post by Gostev » Mar 07, 2016 1:52 pm

Kindly please do not hijack this topic. This question is best directed into Catalyst support topic (and the number depends on Catalyst model anyway). Thanks!

robvs
Lurker
Posts: 1
Liked: never
Joined: Aug 07, 2017 6:52 am
Full Name: Rob
Contact:

[MERGED] Healthcheck Offside

Post by robvs » Aug 07, 2017 7:01 am

Hello,

I have some questions about the Veeam health check on copy-out jobs.

We have a VMware environment with 120 VMs. The total backup size is around 10 TB, and the VMs are split across 11 Veeam jobs (forever incremental).
All these Veeam jobs have copy-out jobs to an offsite location.
The connection between the locations is around 200 Mbit.
The current problem is that a health check on the offsite backup can take around 1 week for some jobs, which causes problems because the copy-out job will not run until the health check has finished. When it actually starts, it has to copy such a large amount of data that this also takes a long time to complete.

Is there a way to offload the health checks to an offsite server? That would eliminate the bandwidth problem that makes the health checks take so long.
If not, what would be a good way to speed this up?


Kind regards
Rob

DGrinev
Veeam Software
Posts: 1641
Liked: 201 times
Joined: Dec 01, 2016 3:49 pm
Full Name: Dmitry Grinev
Location: St.Petersburg
Contact:

Re: Healthcheck Offside

Post by DGrinev » Aug 07, 2017 1:18 pm

Hi Rob and welcome to the community!

Health check performance depends on the read speed of the storage and the backup file size. It cannot be delegated to a remote server, as it is part of the backup/copy job.
As an alternative, you can set up a SureBackup job for the source backups, as it is the best approach for recovery verification.
Please review this thread for additional information. Thanks!

egrutman
Novice
Posts: 5
Liked: never
Joined: May 31, 2016 1:22 pm
Full Name: Yevgeniy Grutman
Contact:

[MERGED] Health Check Long Duration

Post by egrutman » Oct 27, 2017 3:08 pm

Hello,

I have backups running and the health check takes days. We are backing up about 85 machines with an average size of 125 GB, so the total size is about 10 TB of data. The backups are stored on a DD2200. Is there any way to speed up the health check? 12 hours have gone by and I am only at 5%. At this rate it will take 10 days to run a health check. It is not acceptable for critical production machines to be without backups for that long, and I am trying to find any way possible to increase resources so that the job completes faster.
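
The 10-day figure follows from a simple linear extrapolation, sketched below. It assumes progress continues at the same rate, which may not hold if the remaining data is laid out differently on the storage; the function name is made up for the example.

```python
def estimate_total_hours(elapsed_hours: float, percent_done: float) -> float:
    """Linear extrapolation: total duration if progress continues at the same rate."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return elapsed_hours * 100 / percent_done


total_hours = estimate_total_hours(elapsed_hours=12, percent_done=5)
remaining_hours = total_hours - 12

print(f"total: {total_hours:.0f} h ({total_hours / 24:.0f} days)")
print(f"remaining: {remaining_hours:.0f} h")
```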

Re: Health Check Long Duration

Post by DGrinev » Oct 27, 2017 3:30 pm

Hi Yevgeniy,

Please review this discussion about the health check performance as it contains plenty of useful considerations. Thanks!
