unsichtbarre
Service Provider
Posts: 226
Liked: 39 times
Joined: Mar 08, 2010 4:05 pm
Full Name: John Borhek
Contact:

Veeam: Whoops your backup restore chain is corrupted

Post by unsichtbarre »

Support case#: 02029131

Veeam 9.0.0.1715
vSphere 6.0 4192238

We woke up recently to an entire job that had failed and would not continue with a message that the "restore chain is corrupted". There were no problems or data corruption on the Repository where the backup was stored. In other words, "whoops, your backups and long-term archive are now garbage." The implications:

Job would not continue or run subsequently
Backup copy job was subsequently rendered invalid
Restores would not work

We asked for a Root Cause analysis, because obviously "whoops, it's gone" is not an acceptable answer for any enterprise solution.
Unfortunately, Veeam support literally came up with nothing.

Any help or ideas?
John Borhek, Solutions Architect
https://vmsources.com
ekisner
Expert
Posts: 202
Liked: 34 times
Joined: Jul 26, 2012 8:04 pm
Full Name: Erik Kisner
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by ekisner » 1 person likes this post

Disk corruption happens... that would be my first guess. A very bitter pill to swallow for sure. I would suggest that tape archives are a better idea than disk backups (I know many who've seen disk-only backup shops get decimated exactly like you just did).

A tape library isn't remarkably expensive, and it offers a peace of mind that disk backups will never offer. I do both. 10 days worth of backups to the disk array, plus a GFS tape job. Worst case I lose 7 days of data going back to the last tape.

Barring that, ReFS v3, recently released with Windows Server 2016, offers a great deal in the way of data integrity. Assuming it was disk corruption and you wanted to stay with disk-only backups, that would definitely be a good thing to look into (it also increases backup performance with COW).

Something that I've never used or been able to get working right would be Veeam's SureBackup. If a VM can power on from a backup, there's definitely no corruption (you can of course write additional scripts to test things more thoroughly if so chosen). But again I've never had any success getting that to work right.
Mike Resseler
Product Manager
Posts: 8044
Liked: 1263 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Mike Resseler »

Hi John,

I looked at the support case and it seems that support is doing additional research and testing. Please keep working with your support engineer to figure out what has happened.

Erik, please create a separate thread for SureBackup. We can try to figure out why it doesn't work for you because it is a powerful tool and I would prefer that it works in your environment :-)

Thanks
Mike
unsichtbarre
Service Provider
Posts: 226
Liked: 39 times
Joined: Mar 08, 2010 4:05 pm
Full Name: John Borhek
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by unsichtbarre »

ekisner wrote:Disk corruption happens... that would be my first guess.
Thanks for the input, however we choose to store daily incremental backups on disk for ease of access. The tape library is for the Backup Copy job. Unfortunately, when the initial backup chain is corrupted, the Backup Copy job becomes useless!

We have also ruled out disk corruption (on the backup repository) in this case.

We will wait for our Veeam support engineer and continue to solicit input from the community, in case anyone else has experienced this.

Thanks!
John Borhek, Solutions Architect
https://vmsources.com
Delo123
Veteran
Posts: 361
Liked: 109 times
Joined: Dec 28, 2012 5:20 pm
Full Name: Guido Meijers
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Delo123 »

How big are the volumes / backups? Backing up to NTFS? Did you format the volumes with /L? What is the underlying storage on the backup repository?
itrabbit
Influencer
Posts: 20
Liked: 6 times
Joined: Nov 24, 2016 6:50 am
Full Name: Matt Dunleavy
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by itrabbit »

I am watching this thread, because we are just starting down the road of deploying Veeam for our virtual machines and physicals.

I do know that it has self-healing mechanisms that you can enable. Were these enabled? But I don't really know what that would do other than say it's corrupt.

Also if the previous backup backed up correctly, couldn't you delete the latest file and then try again? You must be able to go up the chain to recover some good data.

In all honesty, a "whoops your chain is corrupt try again" is not an acceptable result.

Do you do full weekly backups, or full synthetic backups?
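
To make the "go up the chain" idea concrete, here is a minimal sketch of which restore points stay usable when one file in a plain forward-incremental chain is damaged. It is a simplified model, not Veeam's on-disk format or logic: the `RestorePoint` class, its fields, and the point names are hypothetical, and it assumes each increment depends on every file back to the most recent full.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RestorePoint:
    name: str         # hypothetical label, not a real file name
    is_full: bool     # True for a full backup, False for an increment
    is_healthy: bool  # False if this file is damaged or unreadable


def restorable_points(chain: List[RestorePoint]) -> List[str]:
    """Return the names of points that can still be restored.

    Assumption: a full depends only on itself, and an increment depends
    on every file back to (and including) the most recent full.
    """
    usable = []
    chain_ok = False
    for point in chain:
        if point.is_full:
            chain_ok = point.is_healthy  # a full starts a new chain
        else:
            chain_ok = chain_ok and point.is_healthy
        if chain_ok:
            usable.append(point.name)
    return usable


chain = [
    RestorePoint("mon-full", is_full=True, is_healthy=True),
    RestorePoint("tue-inc", is_full=False, is_healthy=True),
    RestorePoint("wed-inc", is_full=False, is_healthy=False),  # damaged
    RestorePoint("thu-inc", is_full=False, is_healthy=True),
]
print(restorable_points(chain))  # ['mon-full', 'tue-inc']
```

In this model the points written before the damaged increment remain restorable, while everything after it is lost until a new full starts a fresh chain. Whether a real product lets you discard the bad tail and continue is a separate question that the sketch does not answer.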
infused
Service Provider
Posts: 178
Liked: 13 times
Joined: Apr 20, 2013 9:25 am
Full Name: Hayden Kirk
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by infused »

Would be interested in your setup. Been running Veeam now for about 50-60 VMs for around 5 years without issue.
Mapleuser
Influencer
Posts: 16
Liked: never
Joined: Sep 17, 2015 2:19 am
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Mapleuser »

I've been predicting that this could happen.
Which is why I insist on doing an active full backup regularly rather than going on forever incremental.
At least if there is any corruption, the last active full should be a good copy?
I raised this with my Veeam sales person multiple times and they always shrug it off as "it will never happen".

Now, I'm wondering how SureBackup will prevent this because what it does is simply verification?
I need a prevention tool rather than a verification tool.
stevenrodenburg1
Expert
Posts: 135
Liked: 20 times
Joined: May 31, 2011 9:11 am
Full Name: Steven Rodenburg
Location: Switzerland
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by stevenrodenburg1 » 1 person likes this post

"Which is why I insist on doing an active full backup regularly rather than going on forever incremental."
Which is exactly what anyone should do.
Software and hardware are made by people and are prone to failure. End of discussion. One should ALWAYS have regular fulls ("fresh chains", so to say), regardless of product.

"I raised this with my Veeam sales person multiple times and they always shrug it off as: it will never happen"
Who, in his right mind, believes what salespeople say? With all due respect, they'll say anything... Listen to your common sense instead, which you did. Common sense always prevails.
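
As a rough illustration of the "fresh chains" argument, the sketch below counts how many daily restore points become unusable when a single backup file is damaged, with and without periodic fulls. It is a toy model under stated assumptions: one restore point per day, day 0 is a full, only one file is damaged, and merge or transform operations are ignored. The function name and parameters are hypothetical. It does not cover the case where the storage itself is failing, in which a new full can be damaged as well.

```python
from typing import Optional


def points_lost(total_days: int, full_every: Optional[int], bad_day: int) -> int:
    """Count restore points made unusable by one damaged file.

    total_days: number of daily restore points (days 0 .. total_days - 1)
    full_every: interval in days between fulls, or None for no later fulls
    bad_day:    day whose backup file is damaged
    """
    lost = 0
    for day in range(bad_day, total_days):
        starts_new_chain = full_every is not None and day % full_every == 0
        if day > bad_day and starts_new_chain:
            break  # a fresh full does not depend on the damaged file
        lost += 1
    return lost


# 30 days of backups, file from day 10 is damaged.
print(points_lost(30, None, 10))  # 20: every later point depends on day 10
print(points_lost(30, 7, 10))     # 4: days 10-13, then day 14 is a new full
```

Under these assumptions, a weekly full caps the damage at one week of restore points, whereas a chain with no later fulls loses everything from the damaged file onward.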
epaape
Veeam ProPartner
Posts: 3
Liked: never
Joined: Feb 17, 2014 9:24 am
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by epaape »

Hi,
I had a situation where the customer was using the "per VM backup" option (backup to a scale-out repository, then copy to Data Domain for LTR). There is an issue with SQL Express and Veeam 9.0: during the backup, too many sessions were open with the database and the backup info wasn't updated in the database. So there were new restore points in our repository and only the old restore points in the database. Support ended up writing a sequence of SQL statements to update the database. In the end we had to update to Veeam 9.5. This issue is known in 9.0. I can't remember all the exact details; if you need anything else, I can take a look at my notes on this.
In the end we only lost one backup chain... Good thing tape-out exists. 30 days of backups gone still isn't fun, though.
Regards Eike
Mike Resseler
Product Manager
Posts: 8044
Liked: 1263 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Mike Resseler »

@Mapleuser: SureBackup CANNOT prevent corruption. It verifies whether your backup can be restored and started. That means you don't just have some data: the VM is actually booted (in a quarantined environment) and tested with scripts, so that you are sure your backup is recoverable and not corrupt.

@StevenRodenburg1: There are different topics on this forum on the importance of an Active Full Backup or a Synthetic Full Backup, but I certainly won't disagree with you that forever incremental might not always be the best idea. In case of corruption you can have massive issues.

@epaape: Thanks for this information. It would be great if you could tell us more about this. Was this on 9.0 GA or on an update rollup? So folks, please all upgrade to 9.5 :-)
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Gostev » 1 person likes this post

unsichtbarre wrote:We have also ruled out disk corruption (on the backup repository) in this case.
Hmm... how has it been ruled out, if I may ask?
unsichtbarre wrote:Unfortunately, when the initial backup chain is corrupted, the Backup Copy job becomes useless!
That's not a correct statement, because Backup Copy is technically a forward incremental job, so previously created restore points cannot be affected by newly appearing corruption in the source backups.
Mapleuser wrote:I've been predicting that this could happen.
Which is why I insist on doing an active full backup regularly rather than going on forever incremental.
At least if there is any corruption, the last active full should be a good copy?
No, not when your backup storage is experiencing issues - as then even the last active full will be bad. This is why you want to use SureBackup, or at least have storage-level corruption guard enabled in the backup job settings (while not a replacement for SureBackup, it will catch storage-level corruptions or any backup file chain inconsistencies).

As far as Active Full, it is largely useless unless you want some extra protection from possible yet unknown hypervisor-based changed block tracking bugs - although there have been bugs in VMware CBT before that resulted in corrupted active full backups. Really, only SureBackup can guarantee your VM backups (and most importantly applications running in those VMs) are recoverable.
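
For readers wondering what a storage-level check can and cannot tell you, here is a minimal, generic sketch of checksum-based verification of backup files. It is not how Veeam's corruption guard is implemented: the function names, the manifest format, and the dummy file names are all assumptions made for illustration. It records a SHA-256 digest per file when the file is written and later reports any file whose content no longer matches.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large backup files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_manifest(paths):
    """Map each file path to its digest at the time the backup was written."""
    return {str(p): sha256_of(p) for p in paths}


def find_mismatches(manifest):
    """Return paths that are missing or whose content changed since recording."""
    bad = []
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            bad.append(name)
    return sorted(bad)


# Demo with dummy files (hypothetical names, not real backup data).
with tempfile.TemporaryDirectory() as tmp:
    full = Path(tmp) / "demo-full.vbk"
    inc = Path(tmp) / "demo-inc.vib"
    full.write_bytes(b"full backup payload")
    inc.write_bytes(b"incremental payload")
    manifest = record_manifest([full, inc])

    inc.write_bytes(b"incremental pay1oad")  # simulate silent corruption
    mismatches = [Path(name).name for name in find_mismatches(manifest)]

print(mismatches)  # ['demo-inc.vib']
```

A check like this only shows that the bytes on disk still match what was written. It says nothing about whether the data was already wrong when it was written, or whether the guest OS and applications will actually start, which is the gap a boot-and-test verification is meant to cover.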
arosas
Enthusiast
Posts: 63
Liked: 10 times
Joined: Jun 09, 2015 9:33 pm
Full Name: Tony Rosas
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by arosas » 2 people like this post

We ran into something similar. We use long term retention on disk with GFS copy jobs. We ended up removing the data from backups, then re-importing manually into Veeam. This allowed us to run restore against the "supposedly" corrupt data. The problem with this is we now have to manually manage the old data and manually delete expired data. You will also have a new chain to start with an active full so when running restores you would need to look in two places for the data which is not a burden. It's not a solution but at least it's a workaround should you need to run restores against it which worked in our case.

Veeam support was really good with working with our storage vendor, finding the issue and issuing a patch to the storage array Exagrid. Although it took extremely long for this investigation and patch process to happen, it worked out in the end.
mkaec
Veteran
Posts: 462
Liked: 133 times
Joined: Jul 16, 2015 1:31 pm
Full Name: Marc K
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by mkaec »

Gostev wrote: As far as Active Full, it is largely useless unless you want some extra protection from possible yet unknown hypervisor-based changed block tracking bugs...
As was mentioned in another post, hardware and software are made by people, and people make mistakes. An active full operation is easier to implement than more complex operations that require merging. Periodic active fulls provide potential protection from bugs in the merge logic. I think you're right that SureBackup will provide that as well. One thing to keep in mind is that, before Veeam came along, the backup software industry was known for buggy products that regularly resulted in corrupted backup files. Some of us still have the scars from those times.
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Gostev »

mkaec wrote:As was mentioned in another post, hardware and software are made by people, and people make mistakes. An active full operation is easier to implement than more complex operations that require merging. Periodic active fulls provide potential protection from bugs in the merge logic.
Agree, but storage-level corruption guard feature provides the same protection and more, so I like it better ;) especially since I've seen plenty of customers where active full would require a few days to complete due to the immense size of their environment.

Note that I am not advocating against using active fulls here, rather I'm just saying that there is a better way to ensure consistency of the latest restore point than doing an active full (which it definitely won't hurt to continue doing in addition if it meets your backup window).
RubinCompServ
Service Provider
Posts: 259
Liked: 65 times
Joined: Mar 16, 2015 4:00 pm
Full Name: David Rubin
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by RubinCompServ »

Gostev wrote:...storage-level corruption guard feature...
Gostev,

Where would I go to configure the Corruption Guard feature?
tdewin
Veeam Software
Posts: 1775
Liked: 646 times
Joined: Mar 02, 2012 1:40 pm
Full Name: Timothy Dewin
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by tdewin »

You can find it here : https://helpcenter.veeam.com/docs/backu ... tml?ver=95 (part of the advanced settings under the storage section)
RubinCompServ
Service Provider
Posts: 259
Liked: 65 times
Joined: Mar 16, 2015 4:00 pm
Full Name: David Rubin
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by RubinCompServ »

Thanks!
Mike72677
Novice
Posts: 7
Liked: never
Joined: Nov 09, 2015 7:25 pm
Full Name: Mike T
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Mike72677 »

I've been dealing with a Backup Copy Job issue where the monthly health check would say there is data corruption and a possible storage I/O issue. I've had a case open with support for 2 - 3 months now. The last steps I took were to wipe out the data set, start a new one, and have the Health Check run daily. So far no issues, but also so far no reason/cause for the issue. Disks at the repository all test out fine, and no I/O issues are being reported by the OS.
larry
Veteran
Posts: 387
Liked: 97 times
Joined: Mar 24, 2010 5:47 pm
Full Name: Larry Walker
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by larry » 3 people like this post

Had lightning hit the roof, go through the AC into the server room, jump to my server, and cook it. Don't care who made the unit, as it was melted. That is why I do two local backups to two different servers, one remote backup, and tape every night, and throw in SAN snapshots both local and remote as well. For the tape job I back up to disk and then send files to tape, to keep extra days of restore points on tape and to have it as its own backup job. To me, my most important data is the current data, of which I always have 3 or more copies, each in its own backup chain. We use SAN integration so we only send data off site once and back up from SAN snapshots at both sites. I need to sleep at night. After using Veeam for years I have never had a corrupted backup, but still...
Martin Damgaard
Influencer
Posts: 19
Liked: 8 times
Joined: Aug 31, 2015 6:31 am
Full Name: Martin Damgaard
Contact:

Re: Veeam: Whoops your backup restore chain is corrupted

Post by Martin Damgaard »

arosas wrote:We ran into something similar. We use long term retention on disk with GFS copy jobs. We ended up removing the data from backups, then re-importing manually into Veeam. This allowed us to run restore against the "supposedly" corrupt data. The problem with this is we now have to manually manage the old data and manually delete expired data. You will also have a new chain to start with an active full so when running restores you would need to look in two places for the data which is not a burden. It's not a solution but at least it's a workaround should you need to run restores against it which worked in our case.

Veeam support was really good with working with our storage vendor, finding the issue and issuing a patch to the storage array Exagrid. Although it took extremely long for this investigation and patch process to happen, it worked out in the end.
THIS, is exactly what we have experienced too!

Something has gone rotten somewhere in the pipe. No fingers pointing at Veeam; probably a power shortage, network problems, cosmic rays, whatever...
Everything checks out OK at the repository (I have experienced this on ZFS-based and Linux-based NASes and on MS Windows repositories, both locally and offsite), and I am 100% certain this is NOT due to HW failure! Disk checks, checksums, scrubs, etc. all show no problem with the volume or the VBK files.

Anyway, the result is a broken backup chain. OK, no problem, I can live with losing the latest backup chain (because this is long-term archival GFS storage).
Missing a couple of days, weeks, even a month or two worth of archive backups, i can live with.
But not being able to somehow manually discard/delete or ignore the missing part of the chain, until the GFS scheme catches up, is really a pain in the bottom hole!

- Martin