Host-based backup of VMware vSphere VMs.
cfvonner
Enthusiast
Posts: 32
Liked: 3 times
Joined: Nov 16, 2012 9:58 pm
Full Name: Carl Von Stetten
Contact:

Health Check on Copy Backup Job stalled the job

Post by cfvonner »

I recently created a copy backup job that copies VMs from three different nightly backup jobs stored on a Nimble SAN into a single repository on a Drobo B1200i. I have archival restore points enabled, and currently have about 4.2 TB of full and incremental backups stored. I have the copy job configured to keep 90 restore points, and to keep some restore points for archival purposes (monthly, quarterly, yearly). I have not yet reached 90 restore points.

I have the "Health check" feature turned on and set to do a check on the last Sunday of the month. Since this past Sunday was the last Sunday of June, the copy backup job started a new interval on schedule, and then hung at "Starting backup file verification". The Duration counter keeps running, and I can see from the Windows Resource Monitor that the VeeamAgent has open file handles on pretty much all of the .vib files in the repository. This has been running for four days now, and no new backups have been copied by this job. Is this normal behavior?

-Carl V.
Gostev
Chief Product Officer
Posts: 31814
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by Gostev »

No, it should complete as fast as it takes to read all your backup files.
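
As a rough illustration of "as fast as it takes to read all your backup files", the sketch below divides the stored backup size by a sustained read rate. The 4.2 TB figure comes from the first post; the read rates are hypothetical, and the function name is made up for this example. It ignores checksum CPU time and any other overhead, so treat the result as a lower bound rather than a prediction.

```python
def estimate_read_hours(total_tb: float, read_mb_per_s: float) -> float:
    """Hours needed to read total_tb (decimal TB) at a sustained rate in MB/s."""
    if read_mb_per_s <= 0:
        raise ValueError("read rate must be positive")
    total_mb = total_tb * 1_000_000  # 1 TB = 1,000,000 MB in decimal units
    return total_mb / read_mb_per_s / 3600


if __name__ == "__main__":
    # 4.2 TB is the repository size quoted in the thread; rates are assumptions.
    for rate in (12, 50, 100, 200):
        hours = estimate_read_hours(4.2, rate)
        print(f"{rate:>4} MB/s -> {hours:6.1f} h")
```

Read the other way round, a check that takes about four days over 4.2 TB implies an effective read rate of only around 12 MB/s, which points at the target storage rather than at the health check logic itself.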
cfvonner
Enthusiast
Posts: 32
Liked: 3 times
Joined: Nov 16, 2012 9:58 pm
Full Name: Carl Von Stetten
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by cfvonner »

Of course, as soon as I post, it finishes and starts running a copy job... :oops:

Unfortunately, it only seems to be copying last night's backup, so I'll have a gap of a few days. Is there any way to make Veeam go back and copy the nightly backups from the previous three nights?

-Carl V.
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by foggy »

No, a backup copy job copies only the latest available state.
readie
Expert
Posts: 158
Liked: 30 times
Joined: Dec 05, 2010 9:29 am
Full Name: Bob Eadie
Contact:

[MERGED] Stopping long running Health Check

Post by readie »

I have a Backup Copy Health Check which has now been running for over 3 days and is only 70% done! I should like to stop it, as I have some server changes to make (which might actually make it faster in the long run), but unlike other jobs there is no option to 'stop session'.
Can I just stop the veeam services somewhere, or do I just have to be patient and wait another day or two?
Bob
Bob Eadie
Computer Manager at Bedford School, UK (since 1999).
Veeam user since 2009.
Shestakov
Veteran
Posts: 7328
Liked: 781 times
Joined: May 21, 2014 11:03 am
Full Name: Nikita Shestakov
Location: Prague
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by Shestakov »

Hello Bob,
The health check is performed on the target side. In general it should not take 3 days. Could you check whether CPU or memory usage is at 100% of capacity? And what is the size of the backups in the job?
What version of VBR are you at?
readie wrote:Can I just stop the veeam services somewhere, or do I just have to be patient and wait another day or two?
Even if you stop the process, health check will be started again from the beginning. You can disable it permanently though.
Thanks!
readie
Expert
Posts: 158
Liked: 30 times
Joined: Dec 05, 2010 9:29 am
Full Name: Bob Eadie
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by readie »

Thanks. Version 8.0.0.917.
I have left it and it has finished the Health Check (I've never noticed this before - is it a new feature?) and has now started its backup copy - 23% through after about 3 hours (though I am not sure when it finished the health check and started the backup copy).
I'm fairly sure it is Windows iSCSI to fairly slow SANs (Netstor and Synology) which is limiting it - the total size of the job is about 7TB!! Do these timings seem OK, or way slower than you would expect?
Can you confirm whether the health check is comparing source with target - so reading from both?
Now that I've spotted this, I shall have to think about which months it is sensible to run it in - probably December and July, when the School is on holiday. (Feature request - it would be nice to select a date rather than day of the week - e.g. starting on 23rd December would be ideal. Last day of December (whatever weekday I choose) may well run on into 3rd or 4th Jan when school might be restarting . . . . but I could manage this manually each year by changing the day of the week!)
At least I am now happier that I know what's going on.
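
To make the scheduling concern above concrete, here is a small sketch that works out which calendar date a "last <weekday> of the month" schedule lands on. It assumes the scheduler offers exactly that kind of option, as described in this thread; the function names are hypothetical and nothing here calls Veeam.

```python
import calendar
from datetime import date, timedelta


def last_weekday_of_month(year: int, month: int, weekday: int) -> date:
    """Last date in the month that falls on `weekday` (Monday=0 ... Sunday=6)."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    # Step back from the month's final day to the requested weekday.
    days_back = (last_day.weekday() - weekday) % 7
    return last_day - timedelta(days=days_back)


def earliest_last_weekday(year: int, month: int) -> tuple[str, date]:
    """Weekday whose final occurrence in the month comes earliest."""
    candidates = [(last_weekday_of_month(year, month, wd), wd) for wd in range(7)]
    run_date, wd = min(candidates)
    return calendar.day_name[wd], run_date


if __name__ == "__main__":
    for year in (2015, 2016, 2017):
        name, run_date = earliest_last_weekday(year, 12)
        print(f"{year}: earliest possible start is {name} {run_date}")
```

Because the last occurrence of any weekday always falls within the final seven days of the month, the earliest start such a schedule can give in December is the 25th. A start on the 23rd is not reachable by changing the weekday alone, which is why a fixed-date option would be needed.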

Follow-up question: if the backup copy job is not due to run on the Sunday of the health check, will it just do the health check and NOT continue to the backup copy job?
Shestakov
Veteran
Posts: 7328
Liked: 781 times
Joined: May 21, 2014 11:03 am
Full Name: Nikita Shestakov
Location: Prague
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by Shestakov »

readie wrote:Thanks. Version 8.0.0.917.
There is a newer version available. Worth updating.
readie wrote:is it a new feature?
No, it's not. Can't remember when it appeared, but it was in v7 for sure.
readie wrote:Do these timings seem OK, or way slower than you would expect?
Waiting 3 days for the health check doesn't seem OK to me, but it depends a lot on your target proxy capabilities. Please answer the question I asked in my previous post: could you check whether CPU or memory usage is at 100% of capacity?
readie wrote:Can you confirm whether the health check is comparing source with target - so reading from both?
Health check procedure is explained here. Please take a look and ask any additional questions you have.
readie wrote:(Feature request - it would be nice to select a date rather than day of the week - e.g. starting on 23rd December would be ideal. Last day of December (whatever weekday I choose) may well run on into 3rd or 4th Jan when school might be restarting . . . . but I could manage this manually each year by changing the day of the week!)
Normally it should not take several days. And if you want to be sure your backups are recoverable with no issues, I'd suggest using our new feature, SureBackup.
Thanks!
ferrus
Veeam ProPartner
Posts: 300
Liked: 44 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by ferrus »

Sorry to dig up an old thread - but do these time estimates still apply to dedupe devices?

We're about to head into the third night without a backup copy of a file server, to an EMC Data Domain - because the Health Check is still running from the weekend.
The Health Check is scheduled for once a month and costs us at least one full night of backup copies each time.
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by foggy »

There is no general estimate; everything depends on the particular storage. But 3 days does look long.
ferrus
Veeam ProPartner
Posts: 300
Liked: 44 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by ferrus »

I thought part of it may have been congestion because of (almost) all the Health Checks starting concurrently.
But it seems there was one that was delayed - and on its own, it has the same performance.

Parent job - Two file server VMs - 3.5TB
Copy job, from Veeam DAS to EMC DD2500 - both fibre connected

This morning it stands at 79% complete, after 57 hours.

Is there any checking I can do to find the cause/bottleneck?
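
As a first sanity check on the numbers above, the sketch below extrapolates the remaining time and the implied read rate from the reported progress. It assumes progress is roughly linear and that about 3.5 TB is being read; the parent job size is the only figure given, and the copy job's chain on the target may well be larger, so the rate is best read as a lower bound. The function names are made up for this example.

```python
def remaining_hours(percent_done: float, elapsed_hours: float) -> float:
    """Hours left, assuming progress stays linear."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    total_hours = elapsed_hours * 100 / percent_done
    return total_hours - elapsed_hours


def effective_rate_mb_s(data_tb: float, percent_done: float, elapsed_hours: float) -> float:
    """Average MB/s implied by the progress so far (decimal units)."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    processed_mb = data_tb * 1_000_000 * percent_done / 100
    return processed_mb / (elapsed_hours * 3600)


if __name__ == "__main__":
    # Figures from the post: 79% complete after 57 hours, 3.5 TB parent job.
    print(f"remaining: {remaining_hours(79, 57):.1f} h")
    print(f"implied rate: {effective_rate_mb_s(3.5, 79, 57):.1f} MB/s")
```

With those inputs the check would need roughly another 15 hours, at an implied average of about 13 MB/s. That is far below what fibre-attached storage can normally stream, which suggests the time is going into how the target serves reads rather than into the link.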
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by foggy »

Health check needs to rehydrate a lot of data to calculate checksums for each of the stored blocks, so slow performance is expected on dedupe storage.
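
The general idea of a per-block checksum verification can be sketched as follows. This is not Veeam's implementation: the block size, the use of CRC32, and the function names are assumptions chosen for illustration, since the thread does not describe the on-disk format. What it does show is that every block has to be read back in full, which on dedupe storage means every block has to be rehydrated.

```python
import io
import zlib
from itertools import zip_longest
from typing import BinaryIO, Iterator, List


def block_checksums(stream: BinaryIO, block_size: int) -> Iterator[int]:
    """Yield a CRC32 for each fixed-size block read from the stream."""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield zlib.crc32(block)


def find_bad_blocks(stream: BinaryIO, expected: List[int], block_size: int) -> List[int]:
    """Return indexes of blocks whose checksum differs from the stored one.

    Missing or extra blocks are reported as mismatches too.
    """
    actual = block_checksums(stream, block_size)
    return [
        index
        for index, (got, want) in enumerate(zip_longest(actual, expected))
        if got != want
    ]


if __name__ == "__main__":
    data = bytes(range(256)) * 16            # 4096 bytes of sample data
    stored = list(block_checksums(io.BytesIO(data), 1024))

    damaged = bytearray(data)
    damaged[2050] ^= 0xFF                     # flip one byte inside block 2
    print(find_bad_blocks(io.BytesIO(bytes(damaged)), stored, 1024))
```

Nothing in this loop can be skipped or cached, so the run time is governed almost entirely by how fast the storage can hand back the blocks.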
efranklin
Influencer
Posts: 23
Liked: 1 time
Joined: Jan 26, 2017 8:36 pm
Full Name: Eric Franklin
Contact:

[MERGED] Backup Copy Job Health Checks

Post by efranklin »

I manage multiple B&R servers and I've found that the monthly backup copy job health checks are taking multiple days to finish on large jobs. The majority of the destinations are a Cloud Connect server which I also manage. I'm trying to find ways to reduce the time the health check takes. Any suggestions would be appreciated.

One question I haven't been able to find an answer to: when performing a health check on a backup copy job, is it all processed on the local B&R server or on the remote destination (i.e. the Cloud Connect server)? If it runs on the local B&R server, how much does the available bandwidth affect the speed of the health check? Would WAN acceleration help speed it up?

Thanks,
Eric
PTide
Product Manager
Posts: 6551
Liked: 765 times
Joined: May 19, 2015 1:46 pm
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by PTide »

Hi,

What kind of repository is used on the Cloud Connect side? During a health check, checksums are calculated for the data blocks in the backup files stored on the target repo, so most of the work is performed at the destination (be it the repository or, in the case of a CIFS repo, the gateway). Neither increased bandwidth nor a WAN accelerator will give a significant boost, as there is not much data to transfer between sites.

Thank you
efranklin
Influencer
Posts: 23
Liked: 1 time
Joined: Jan 26, 2017 8:36 pm
Full Name: Eric Franklin
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by efranklin »

The repository on the Cloud Connect side is a Synology NAS hooked up via SMB.

If I'm understanding this correctly, I would need to improve the speed on the Cloud Connect side to improve the health check times? Should I focus more on storage I/O versus CPU & memory?
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Health Check on Copy Backup Job stalled the job

Post by foggy »

Read speed matters most during a health check. Also, please check whether the CIFS repository has a gateway server located close to it.
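
One hedged way to get a feel for the repository's read speed is to time a plain sequential read of a large file from the machine that would do the reading (for a CIFS repository, the gateway server). The sketch below is a generic measurement, not a Veeam tool; the function name and chunk size are assumptions. Results are only meaningful on a file that is not already in the operating system's cache, and a health check's access pattern may differ from a straight sequential read.

```python
import time


def measure_read_mb_per_s(path: str, chunk_size: int = 4 * 1024 * 1024) -> float:
    """Read the whole file sequentially and return the average rate in MB/s."""
    total_bytes = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffer; the OS cache can still skew results.
    with open(path, "rb", buffering=0) as handle:
        while chunk := handle.read(chunk_size):
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return total_bytes / 1_000_000 / elapsed


if __name__ == "__main__":
    import sys

    # Pass the path of a large file on the repository share as the argument.
    if len(sys.argv) > 1:
        print(f"{measure_read_mb_per_s(sys.argv[1]):.1f} MB/s")
```

Running it once from the gateway and once from a machine further away gives a rough idea of how much the gateway's placement matters.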