mkretzer
Veeam Legend
Posts: 1140
Liked: 387 times
Joined: Dec 17, 2015 7:17 am
Contact:

Corrupted backup copy

Post by mkretzer »

Hello,

Case 02404420

Yesterday we had an outage of the connection to our remote backup copy target. The outage happened while some copy jobs were copying data and others were merging. Also, a disk in a RAID 6 array had failed the day before in one of the remote location's storage systems.
There was no power outage, and the connection between the copy target host and its storage system stayed up. Only the WAN link failed.

Today the two health checks that ran both showed corrupted backups.

How likely is it that the corruption comes from the WAN link failure?

Markus
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Corrupted backup copy

Post by veremin »

An abrupt stop of a merge operation might leave backups in an inconsistent state. The support team should be able to provide the precise root cause after investigating the debug logs. Thanks.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Corrupted backup copy

Post by foggy » 2 people like this post

The next job run will repair corrupt restore points and finalize the merge correctly. This will happen automatically.

Re: Corrupted backup copy

Post by mkretzer »

@foggy:
That's what I thought. In the last 4 years we have had crashes and copy jobs disabled in the middle of a health check, merge or transfer, but our monthly health check never showed an inconsistency.
We checked more backups, and every backup file on the copy storage target shows inconsistencies for some VMs, even after merges.

Right now we are health-checking another backup copy on the same target repo server, but on a different backend storage (also a different brand).

Re: Corrupted backup copy

Post by foggy »

Then please check with support.

Re: Corrupted backup copy

Post by mkretzer »

"whether it was caused by network issues - No, it wasn't"

Sounds like the backend storage is the problem. That would be nice, because that way we can continue to trust Veeam backup copies...
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Corrupted backup copy

Post by Gostev » 1 person likes this post

An inconsistent restore point due to a failed/aborted merge is not a problem, as it will be repaired automatically by the next run (and previous restore points will not be affected regardless). As long as the backup storage itself is solid and not causing the corruption, there's nothing to worry about. This is why it is important to determine the root cause with our support. Thanks!

Re: Corrupted backup copy

Post by mkretzer »

Will do. We also suspect the storage now.

The problem is that it is a copy target, and support still has not told us how we can run the Validator without a full Veeam installation...
Eamonn Deering
Service Provider
Posts: 33
Liked: 4 times
Joined: Feb 29, 2012 1:42 pm
Full Name: EamonnD
Location: Dublin, Ireland
Contact:

Re: Corrupted backup copy

Post by Eamonn Deering »

If you suspect the storage, then let us know which storage you are using and its firmware level. Someone out there may be able to shed some light on the matter.

Re: Corrupted backup copy

Post by mkretzer »

It was an MD3600f from Dell, but that storage has no data corruption protection. We have now migrated to a HUS110 from Hitachi, which has all the protections. Still, the same backup shows as corrupted (we re-seeded it from the main site).

Main site backup was also checked and shows no error.

Re: Corrupted backup copy

Post by foggy »

So you've copied the verified backups from the source, used them to seed the job, and they are shown as corrupt on the target?

Re: Corrupted backup copy

Post by mkretzer »

No, not immediately. The first check was ok.

Here is what happened exactly:

- A backup copy job mirrors backups from the primary site (last active full ~4 weeks ago, health check yesterday) to a remote site. The remote site had a Dell MD3600f (RAID 6 config) as backup storage.
- Last week this MD3600f showed a strange warning that a disk was in a pre-failure state.
- My co-worker sadly replaced the wrong disk in the same RAID 6. We waited for the RAID rebuild to finish before we replaced the disk that was actually defective. My theory is that the first rebuild was done with data from the partly dead disk.
- After all the rebuilds we saw corruption in nearly all health checks we ran on the remote site.
- Since we had a spare Hitachi HUS110 on the main site, we used temporary backup copy jobs to create a new backup seed on that storage.
- We then brought that storage system to the remote site, created and rescanned the repo, disabled the copy jobs, removed the old, corrupt backup copies from configuration, pointed the copy jobs at the new repo and mapped the backups.
- All copy jobs did an initial health check without errors.
- After two days (and the first merge) another health check showed corruption in the same VM file as on the old storage.
- A health check on the primary site showed no issues with the primary site backups.

The only thing I can think of is that Veeam tried to "heal" the corrupt blocks on the remote site after the backup target was replaced (we kept the same jobs).

Re: Corrupted backup copy

Post by foggy »

Looks strange, please ask support to investigate.

Re: Corrupted backup copy

Post by mkretzer » 1 person likes this post

They are on it. Case 02409456 now.

Re: Corrupted backup copy

Post by mkretzer »

One more thing: How should the "auto-healing" of the copy job data work? I do not have the feeling it is doing its job!

Re: Corrupted backup copy

Post by Gostev »

mkretzer wrote: How should the "auto-healing" of the copy job data work?
https://helpcenter.veeam.com/docs/backu ... tml?ver=95

Re: Corrupted backup copy

Post by mkretzer »

So until the merge of that point is done, the health check can still find corrupted data?

Re: Corrupted backup copy

Post by Gostev »

Not sure I get the question... but health check verifies the latest restore point regardless of its placement in the backup files.

Re: Corrupted backup copy

Post by mkretzer »

But it always shows the error in the vbk:
06.12.2017 09:54:42 :: Disk wsus_1-flat.vmdk of VM wsus.sw.buhl-data.com is corrupted, possible reason: Storage I/O issue. Corrupted data is located in the following backup files: wsus.vm-781264D2017-12-04T132056.vbk

If there is an incremental point after that (which fixes the "chain" but not the vbk until the merge) and another health check runs, will Veeam still show the corruption in this health check?
DGrinev
Veteran
Posts: 1943
Liked: 247 times
Joined: Dec 01, 2016 3:49 pm
Full Name: Dmitry Grinev
Location: St.Petersburg
Contact:

Re: Corrupted backup copy

Post by DGrinev »

Hi,

No. Since we read the backup chain starting from the latest increment (which consists of healthy data blocks), the corrupted data blocks are not needed anymore. Thanks!
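
To illustrate the answer above, here is a minimal toy model of resolving the latest restore point from the newest file backwards. It is only a sketch and not Veeam's code or file format: the `Block` type, the dict-per-file layout and all function names are assumptions made up for this example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    data: bytes
    checksum: str  # digest recorded when the block was written


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def make_block(data: bytes) -> Block:
    return Block(data, digest(data))


def resolve_latest(chain):
    """chain is ordered oldest (full) to newest (increment).
    Each file maps block index -> Block; the newest copy of a block wins."""
    resolved = {}
    for backup_file in reversed(chain):
        for index, block in backup_file.items():
            resolved.setdefault(index, block)
    return resolved


def check_latest_restore_point(chain):
    """Return indexes of corrupt blocks that the latest restore point needs."""
    resolved = resolve_latest(chain)
    return sorted(i for i, b in resolved.items() if digest(b.data) != b.checksum)


# Hypothetical data: block 1 of the full file is damaged on disk.
full = {0: make_block(b"aaaa"), 1: make_block(b"bbbb")}
full[1].data = b"bb"
increment = {1: make_block(b"BBBB")}  # healthy, newer copy of block 1

corrupt_full_only = check_latest_restore_point([full])
corrupt_with_increment = check_latest_restore_point([full, increment])
```

In this model the damaged block is reported while the full file is the only restore point, and it is no longer reported once a newer increment carries a healthy copy of the same block.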

Re: Corrupted backup copy

Post by mkretzer »

Then this does not seem to work. In today's copy job run, corruption of a smaller VM was detected and the new increment was transferred, but the Validator still shows the corruption. I really have the feeling that the Validator checks only the vbk, or at least starts with the vbk.

I will try to get a health check running and see if the result differs.

Re: Corrupted backup copy

Post by foggy »

Veeam Backup Validator is a different thing: unlike the health check, it recalculates checksums for the entire backup chain.
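
As a rough illustration of that difference, the sketch below checks every block in every file of a chain, regardless of whether a newer file supersedes it. It is a toy model under stated assumptions, not the Validator's implementation: the file names, the tuple layout and `validate_whole_chain` are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def validate_whole_chain(chain):
    """chain: list of (file_name, {block_index: (data, stored_checksum)}).
    Returns every (file_name, block_index) whose data no longer matches."""
    bad = []
    for file_name, blocks in chain:
        for index, (data, stored) in sorted(blocks.items()):
            if digest(data) != stored:
                bad.append((file_name, index))
    return bad


# Hypothetical data: block 1 in the full file is damaged,
# and a newer increment holds a healthy copy of block 1.
full = {0: (b"aaaa", digest(b"aaaa")), 1: (b"bb", digest(b"bbbb"))}
increment = {1: (b"BBBB", digest(b"BBBB"))}

findings = validate_whole_chain([("full.vbk", full), ("inc.vib", increment)])
```

Because every file is checked, the damaged block in the full file is still reported even though the latest restore point no longer depends on it.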

Re: Corrupted backup copy

Post by mkretzer »

So in that case, after the backup with the correction is merged, the Validator should show no errors anymore?

Re: Corrupted backup copy

Post by DGrinev »

After a brief discussion with the QA team, there are no errors expected after the merge of healthy blocks is done. Thanks!
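
A small sketch of why a merge can clear such a finding, assuming the oldest increment contains a healthy copy of the damaged block. This is a simplified model with made-up names (`corrupt_blocks`, `merge_oldest_increment`), not how the product stores or merges data.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def corrupt_blocks(chain):
    """chain: list of {block_index: (data, stored_checksum)}, oldest file first.
    Returns (file_position, block_index) for every block failing its checksum."""
    return [
        (pos, index)
        for pos, backup_file in enumerate(chain)
        for index, (data, stored) in sorted(backup_file.items())
        if digest(data) != stored
    ]


def merge_oldest_increment(chain):
    """Fold the oldest increment into the full file; increment blocks overwrite."""
    merged = dict(chain[0])
    merged.update(chain[1])
    return [merged] + chain[2:]


# Hypothetical data: block 1 of the full file is damaged,
# and the increment carries a healthy copy of block 1.
full = {0: (b"aaaa", digest(b"aaaa")), 1: (b"bb", digest(b"bbbb"))}
increment = {1: (b"BBBB", digest(b"BBBB"))}

before_merge = corrupt_blocks([full, increment])
after_merge = corrupt_blocks(merge_oldest_increment([full, increment]))
```

In this model the whole-chain check finds the damaged block before the merge and nothing afterwards, because the merge overwrites it with the healthy copy.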

Re: Corrupted backup copy

Post by mkretzer »

Strange: according to support, the health check might not really fix these issues:
"About the HealthCheck - we can't be sure that it will be able to heal this particular corruption, and probably BackupCopyJob chain should be recreated. "

Re: Corrupted backup copy

Post by DGrinev »

Unfortunately, there are several unique conditions that lead to an inconsistency of the backup chain which Health Check cannot fix.
I assume the support team found those conditions during the investigation and reported the negative results to you. Thanks!

Re: Corrupted backup copy

Post by mkretzer »

OK, that leads to the question: what can the health check fix and what can it not? From what I understand, there was just one backup block which was too short after decompression. Why should this not be fixable?

Re: Corrupted backup copy

Post by DGrinev »

That's a good question. I will try to find out next week and give you an example. Thanks!

Re: Corrupted backup copy

Post by Gostev » 2 people like this post

The answer to this question is in the name of the feature, which is "storage-level corruption guard". And it does just that, nothing more and nothing less: detects silent data corruptions in your backup storage, and attempts to make the latest restore point restorable by copying all corrupted blocks over again from the source.
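
The detect-then-recopy idea described above can be sketched as follows. This is a toy model only: the dict-based "storage", the separate `checksums` map and the function names are assumptions for illustration, not the product's design.

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def find_corrupt(target, checksums):
    """Return indexes of target blocks whose data no longer matches its checksum."""
    return sorted(i for i, data in target.items() if digest(data) != checksums[i])


def heal_from_source(target, checksums, source):
    """Copy every corrupt block again from the source. Returns healed indexes."""
    healed = []
    for index in find_corrupt(target, checksums):
        fresh = source[index]
        if digest(fresh) == checksums[index]:  # only accept a verified block
            target[index] = fresh
            healed.append(index)
    return healed


# Hypothetical data: the target copy of block 1 was silently damaged.
source = {0: b"aaaa", 1: b"bbbb", 2: b"cccc"}
checksums = {i: digest(d) for i, d in source.items()}
target = dict(source)
target[1] = b"bb"

healed = heal_from_source(target, checksums, source)
remaining = find_corrupt(target, checksums)
```

The sketch only repairs blocks for which the source still holds matching data, which mirrors the "attempts to" wording in the post: a block that cannot be fetched again in a verified state stays corrupt.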

Re: Corrupted backup copy

Post by mkretzer » 1 person likes this post

And it works :-)
Last restore point shows clean health even without AF.