-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Corrupted backup copy
Hello,
Case 02404420
Yesterday we had an outage of the connection to our remote backup copy target. The outage happened while some copy jobs were copying data and others were merging. Also, the day before, we had a defective disk (in a RAID 6) in one of the remote location's storage systems.
There was no power outage and no loss of connection between the copy target host and its storage system; only the WAN link failed.
Today the two health checks that ran both showed corrupted backups.
How likely is it that the corruption comes from the WAN link failure?
Markus
-
- Product Manager
- Posts: 20450
- Liked: 2318 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: Corrupted backup copy
An abrupt stop of a merge operation might leave backups in an inconsistent state. The support team should be able to provide the precise root cause after investigating the debug logs. Thanks.
-
- Veeam Software
- Posts: 21144
- Liked: 2143 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Corrupted backup copy
The next job run will repair corrupt restore points and finalize the merge correctly; this happens automatically.
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
@foggy:
That's what I thought. In the last 4 years we had crashes and copy jobs disabled in the middle of a health check, merge, or transfer, but our monthly health check never showed an inconsistency.
We checked more backups, and every backup file on the copy storage target shows inconsistencies for some VMs, even after merges.
Right now we are health-checking another backup copy on the same target repo server but on a different backend storage (also a different brand).
-
- Veeam Software
- Posts: 21144
- Liked: 2143 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Corrupted backup copy
Then please check with support.
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
"whether it was caused by network issues - No, it wasn't"
Sounds like the backend storage is the problem. That would be nice, because then we can continue to trust Veeam backup copies...
-
- Chief Product Officer
- Posts: 31905
- Liked: 7402 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Corrupted backup copy
Inconsistent restore point due to failed/aborted merge is not a problem as it will be repaired automatically by the next run (and previous restore points will not be affected regardless). As long as the backup storage itself is solid and not causing the corruption, there's nothing to worry about. This is why it is important to determine the root cause with our support. Thanks!
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
Will do. We also suspect the storage now.
The problem is that it is a copy target, and support still has not told us how we can run the Validator without a full Veeam installation...
-
- Service Provider
- Posts: 33
- Liked: 4 times
- Joined: Feb 29, 2012 1:42 pm
- Full Name: EamonnD
- Location: Dublin, Ireland
- Contact:
Re: Corrupted backup copy
If you suspect the storage, then let us know what storage you are using and at what firmware level. Someone out there may be able to throw some light on the matter.
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
It was an MD3600f from Dell, but that storage has no data corruption protection. We have now migrated to a HUS110 from Hitachi, which has all the protections. Still, the same backup shows as corrupted (we re-seeded it from the main site).
Main site backup was also checked and shows no error.
-
- Veeam Software
- Posts: 21144
- Liked: 2143 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Corrupted backup copy
So you've copied the verified backups from source, used them to seed the job, and they are shown as corrupt on target?
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
No, not immediately. The first check was ok.
Here is what happened exactly:
- Backup copy job mirroring backup from the primary site (last active full ~ 4 weeks ago, health check yesterday) to a remote site. Remote site had a Dell MD3600f (RAID 6 config) as backup storage.
- Last week this MD3600f showed a strange warning that a disk was in a pre-failure state.
- My co-worker sadly replaced the wrong disk in the same RAID 6. We waited for the RAID rebuild to finish before we replaced the actually defective disk. My theory is that the first rebuild was done with data from the partly dead disk.
- After all the rebuilds we saw corruption in nearly all health checks we ran on the remote site.
- Since we had a free Hitachi HUS110 on the main site, we used temporary backup copy jobs to create a new backup seed on that storage.
- We then brought that storage system to the remote site, created and rescanned the repo, disabled the copy jobs, removed the old, corrupt backup copies from the configuration, pointed the copy jobs at the new repo, and mapped the backups.
- All copy jobs ran an initial health check without errors.
- After two days (and the first merge) another health check showed corruption in the same VM file as on the old storage.
- A health check on the primary site showed no issues with the primary site backups.
The only thing I can think of is that Veeam tried to "heal" the corrupt blocks on the remote site after the backup target was replaced (we kept the same jobs).
-
- Veeam Software
- Posts: 21144
- Liked: 2143 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Corrupted backup copy
Looks strange, please ask support to investigate.
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
They are on it. Case 02409456 now.
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
One more thing: How should the "auto-healing" of the copy job data work? I do not have the feeling it is doing its job!
-
- Chief Product Officer
- Posts: 31905
- Liked: 7402 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Corrupted backup copy
mkretzer wrote: How should the "auto-healing" of the copy job data work?
https://helpcenter.veeam.com/docs/backu ... tml?ver=95
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
So until the merge of that point is done, the health check can still find corrupted data?
-
- Chief Product Officer
- Posts: 31905
- Liked: 7402 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Corrupted backup copy
Not sure I get the question... but health check verifies the latest restore point regardless of its placement in the backup files.
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
But it always shows the error in the vbk:
06.12.2017 09:54:42 :: Disk wsus_1-flat.vmdk of VM wsus.sw.buhl-data.com is corrupted, possible reason: Storage I/O issue. Corrupted data is located in the following backup files: wsus.vm-781264D2017-12-04T132056.vbk
If there is an incremental point after that (which fixes the "chain" but not the vbk until a merge), and another health check runs, will Veeam still show the corruption in that health check?
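When several VMs are affected, it can help to pull the disk, VM, and backup file names out of health check messages like the one quoted above. Here is a minimal Python sketch. The regular expression is derived only from that single message, and the DD.MM.YYYY date reading and the comma-separated file list are assumptions on my side, not a documented Veeam log format. The function name `parse_corruption_line` is hypothetical.

```python
import re
from datetime import datetime

# Pattern derived only from the one message quoted in this thread (assumption).
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) :: "
    r"Disk (?P<disk>\S+) of VM (?P<vm>\S+) is corrupted, "
    r"possible reason: (?P<reason>.+?)\. "
    r"Corrupted data is located in the following backup files: (?P<files>.+)$"
)


def parse_corruption_line(line):
    """Return a dict describing the corruption message, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        # Assumes a day.month.year timestamp.
        "time": datetime.strptime(m.group("ts"), "%d.%m.%Y %H:%M:%S"),
        "disk": m.group("disk"),
        "vm": m.group("vm"),
        "reason": m.group("reason"),
        # Assumes multiple files would be separated by commas.
        "files": [f.strip() for f in m.group("files").split(",")],
    }


sample = (
    "06.12.2017 09:54:42 :: Disk wsus_1-flat.vmdk of VM wsus.sw.buhl-data.com "
    "is corrupted, possible reason: Storage I/O issue. Corrupted data is located "
    "in the following backup files: wsus.vm-781264D2017-12-04T132056.vbk"
)
info = parse_corruption_line(sample)
print(info["vm"], info["disk"], info["files"])
```

Collecting the `files` values over several runs would show whether the reported corruption stays in the same vbk or moves to newer files.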
-
- Veteran
- Posts: 1943
- Liked: 247 times
- Joined: Dec 01, 2016 3:49 pm
- Full Name: Dmitry Grinev
- Location: St.Petersburg
- Contact:
Re: Corrupted backup copy
Hi,
No. Since we read the backup chain starting from the latest increment (which consists of healthy data blocks), the corrupted data blocks are not needed anymore. Thanks!
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
Then this does not seem to work. In today's copy job a corruption of a smaller VM was detected and the new increment was transferred, but the validator still shows the corruption. I really have the feeling that the validator checks only the vbk, or at least starts with the vbk.
I will try to get a health check running and see if the result differs.
-
- Veeam Software
- Posts: 21144
- Liked: 2143 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Corrupted backup copy
Veeam Backup Validator is a different thing: unlike the health check, it recalculates checksums for the entire backup chain.
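To illustrate why the two tools can disagree, here is a toy Python model of the difference in scope described in this thread: one check only looks at the blocks the latest restore point actually reads, the other looks at every block in every file. This is a sketch under my own assumptions, not how Veeam stores or verifies data internally. `BackupFile`, `check_latest_point`, and `check_whole_chain` are hypothetical names, and SHA-256 is just a stand-in checksum.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass
class BackupFile:
    """Hypothetical stand-in for one file in a chain (a full or an increment)."""
    name: str
    blocks: Dict[int, bytes]                                   # block index -> stored data
    recorded: Dict[int, str] = field(default_factory=dict)    # checksums taken at write time

    def __post_init__(self) -> None:
        if not self.recorded:
            self.recorded = {i: digest(d) for i, d in self.blocks.items()}

    def is_bad(self, index: int) -> bool:
        return digest(self.blocks[index]) != self.recorded[index]


def check_latest_point(chain: List[BackupFile]) -> List[Tuple[str, int]]:
    """Verify only the blocks the latest restore point reads (newest file wins)."""
    seen: Set[int] = set()
    bad = []
    for f in reversed(chain):                 # chain is ordered oldest -> newest
        for i in sorted(f.blocks):
            if i not in seen:
                seen.add(i)
                if f.is_bad(i):
                    bad.append((f.name, i))
    return bad


def check_whole_chain(chain: List[BackupFile]) -> List[Tuple[str, int]]:
    """Verify every block of every file, including superseded ones."""
    return [(f.name, i) for f in chain for i in sorted(f.blocks) if f.is_bad(i)]


full = BackupFile("full.vbk", {0: b"aaaa", 1: b"bbbb", 2: b"cccc"})
full.blocks[1] = b"bb"                        # simulate silent corruption on storage
inc = BackupFile("inc.vib", {1: b"BBBB"})     # healthy replacement for block 1

print(check_latest_point([full]))         # [('full.vbk', 1)]
print(check_latest_point([full, inc]))    # []
print(check_whole_chain([full, inc]))     # [('full.vbk', 1)]
```

In this model the latest-point check turns clean as soon as a healthy increment supersedes the bad block, while the whole-chain check keeps reporting the vbk until the bad block is replaced there.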
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
So in that case, after the backup with the correction is merged, the validator should show no error anymore?
-
- Veteran
- Posts: 1943
- Liked: 247 times
- Joined: Dec 01, 2016 3:49 pm
- Full Name: Dmitry Grinev
- Location: St.Petersburg
- Contact:
Re: Corrupted backup copy
After a brief discussion with the QA team, there are no errors expected after the merge of healthy blocks is done. Thanks!
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
Strange: according to support, the health check might not really fix these issues:
"About the HealthCheck - we can't be sure that it will be able to heal this particular corruption, and probably BackupCopyJob chain should be recreated. "
-
- Veteran
- Posts: 1943
- Liked: 247 times
- Joined: Dec 01, 2016 3:49 pm
- Full Name: Dmitry Grinev
- Location: St.Petersburg
- Contact:
Re: Corrupted backup copy
Unfortunately, there are several unique conditions that lead to an inconsistency of the backup chain which the health check cannot fix.
I assume the support team found those conditions during the investigation and reported the negative results to you. Thanks!
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
OK, that leads to the question: what can the health check fix and what not? From what I understand there was just one backup block which was too short after decompression. Why should this not be fixable?
-
- Veteran
- Posts: 1943
- Liked: 247 times
- Joined: Dec 01, 2016 3:49 pm
- Full Name: Dmitry Grinev
- Location: St.Petersburg
- Contact:
Re: Corrupted backup copy
That's a good question. I will try to find out next week and give you an example. Thanks!
-
- Chief Product Officer
- Posts: 31905
- Liked: 7402 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Corrupted backup copy
The answer to this question is in the name of the feature, which is "storage-level corruption guard". And it does just that, nothing more and nothing less: detects silent data corruptions in your backup storage, and attempts to make the latest restore point restorable by copying all corrupted blocks over again from the source.
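The detect-and-recopy idea can be sketched in a few lines of Python. This is only a toy model under my own assumptions, not the product's implementation. `heal_from_source` is a hypothetical name, SHA-256 stands in for whatever checksum is really used, and the sketch ignores whether re-copied blocks are written in place or into a new increment. The "unrecoverable" branch is there only to show that re-copying needs a good copy on the source side.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).hexdigest()


def heal_from_source(target, recorded, source):
    """Re-copy every target block whose checksum no longer matches.

    target   -- block index -> data on the copy target (modified in place)
    recorded -- block index -> checksum stored when the block was written
    source   -- block index -> data still available on the source side
    Returns (repaired, unrecoverable) lists of block indexes.
    """
    repaired, unrecoverable = [], []
    for index in sorted(target):
        if digest(target[index]) == recorded[index]:
            continue                                   # block is healthy
        good = source.get(index)
        if good is not None and digest(good) == recorded[index]:
            target[index] = good                       # copy the block over again
            repaired.append(index)
        else:
            unrecoverable.append(index)                # source cannot supply it
    return repaired, unrecoverable


source = {0: b"alpha", 1: b"beta", 2: b"gamma"}
recorded = {i: digest(d) for i, d in source.items()}
target = dict(source)
target[1] = b"be"                                      # e.g. a block that came back too short

print(heal_from_source(target, recorded, source))      # ([1], [])
print(target == source)                                # True
```

Note that the sketch only repairs what it can detect through a checksum mismatch and only when the source still holds matching data.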
-
- Veeam Legend
- Posts: 1207
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: Corrupted backup copy
And it works.
The last restore point shows a clean health check even without an active full (AF).