-
- Novice
- Posts: 5
- Liked: never
- Joined: Nov 18, 2009 9:36 pm
- Full Name: Olof Christensson
- Contact:
All instances of the storage metadata are corrupted.
Hi,
I am getting an error that first occurred yesterday, after the Veeam backup server rebooted during a backup. We now get this message in every backup, and we cannot do any restore from the existing backup job. New backup jobs work fine, however.
Dir failed, storageFileName "F:\Backup\Backup.vbk"
All instances of the storage metadata are corrupted.
How can I recover from this?
-
- Product Manager
- Posts: 22014
- Liked: 1354 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: All instances of the storage metadata are corrupted.
Hello Olof,
Please contact our support (support@veeam.com) regarding this question, and do not forget to provide all the logs from Help->Support Information. It seems that your backups were corrupted by that reboot, but let us look at the logs to see whether we can help you restore from them.
Thank you.
-
- SVP, Product Management
- Posts: 23615
- Liked: 3117 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: All instances of the storage metadata are corrupted.
Also, please include information on which storage you are backing up to. We have specifically tested all available backup storage types in the scenario of hard-resetting the Veeam Backup server while a backup job is running, so the chance of such corruption should be extremely low. We should be able to investigate why this happened from the job logs.
-
- Expert
- Posts: 367
- Liked: 41 times
- Joined: May 15, 2012 2:21 pm
- Full Name: Arun
- Contact:
Re: All instances of the storage metadata are corrupted.
I encountered the same error while trying to import backups from an offsite external drive; they had been copied there by a backup copy job in v7. The job statistics in Veeam say the backup copy job completed successfully. However, the same error appears during the import.
-
- Product Manager
- Posts: 22014
- Liked: 1354 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: All instances of the storage metadata are corrupted.
What is the external drive? Is it a regular disk? Do you have a support case ticket opened on this?
-
- Expert
- Posts: 367
- Liked: 41 times
- Joined: May 15, 2012 2:21 pm
- Full Name: Arun
- Contact:
Re: All instances of the storage metadata are corrupted.
The external drive is a Western Digital My Book Essentials. I have a support case open on this, Case # 00439056. The backup copy job statistics say that the job completed successfully and that the full transformation was completed.
But the import does not succeed. By default, the number of restore points is set to 2.
So once the backup copy completed for this particular job, I could see a vbk and a vbm file on the target repository (the external drive). However, for other jobs that also copied to the same drive with similar retention settings, there is a vbk, a vib and a vbm file. So does this mean the next restore point was not copied? If so, strangely, there were no warnings or indications about this.
Thanks
-
- Product Manager
- Posts: 22014
- Liked: 1354 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: All instances of the storage metadata are corrupted.
Arun,
zak2011 wrote: However for other jobs also that copied to the same drive with similar retension settings
So only 1 backup copy job is affected by this? Or are the other jobs just regular backup jobs?
Thanks!
-
- Expert
- Posts: 367
- Liked: 41 times
- Joined: May 15, 2012 2:21 pm
- Full Name: Arun
- Contact:
Re: All instances of the storage metadata are corrupted.
Hi Vitaliy,
Two backup copy jobs were affected. All other backup copy jobs seem to be fine.
Thanks.
-
- Product Manager
- Posts: 22014
- Liked: 1354 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: All instances of the storage metadata are corrupted.
OK, we need to understand what the difference between these jobs is. Please keep us updated on the support team's research.
-
- Expert
- Posts: 367
- Liked: 41 times
- Joined: May 15, 2012 2:21 pm
- Full Name: Arun
- Contact:
Re: All instances of the storage metadata are corrupted.
Yes, will do Vitaliy.
-
- Expert
- Posts: 152
- Liked: 24 times
- Joined: May 16, 2011 4:00 am
- Full Name: Alex Macaronis
- Location: Brisbane, Australia
- Contact:
Re: All instances of the storage metadata are corrupted.
I've had this with a v6.5 backup chain. Unfortunately, the response from support was eventually "Oh dear, looks like they're un-recoverable".
-
- Expert
- Posts: 108
- Liked: 5 times
- Joined: Mar 27, 2012 10:13 pm
- Full Name: Chad Killion
- Contact:
Re: All instances of the storage metadata are corrupted.
Just had the same experience. Nothing could be done, and we had to abandon the backup chain and begin a new one. Support suggested it was a storage problem and told us to check the storage. We are writing to a 4-head distributed Gluster filesystem with over 200TB. That will prove to be difficult, to say the least.
-
- SVP, Product Management
- Posts: 23615
- Liked: 3117 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: All instances of the storage metadata are corrupted.
You can catch issues like this very early on by running a periodic SureBackup job with the full scan option enabled (a new v7 feature). This physically reads all data blocks holding the tested restore point and ensures that the content of each block matches the CRC stored with the block.
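To illustrate the idea of checking each block against a CRC stored alongside it, here is a minimal sketch in Python. It is not how SureBackup or the Veeam file format is implemented; the block size, the function names `write_blocks` and `full_scan`, and the in-memory list standing in for a backup file are all assumptions made for the example.

```python
import zlib

BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


def write_blocks(data):
    """Split data into blocks and keep a CRC32 next to every block."""
    blocks = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        blocks.append((block, zlib.crc32(block)))
    return blocks


def full_scan(blocks):
    """Re-read every block and return the indexes whose content no longer matches its CRC."""
    return [
        index
        for index, (block, stored_crc) in enumerate(blocks)
        if zlib.crc32(block) != stored_crc
    ]


# Hypothetical "backup file": three blocks, all healthy right after writing.
stored = write_blocks(b"MOMMOMMOMMOM")
healthy_result = full_scan(stored)

# Simulate silent corruption: one bit of block 1 flips, the stored CRC stays as it was.
block, crc = stored[1]
stored[1] = (bytes([block[0] ^ 0x01]) + block[1:], crc)
damaged_result = full_scan(stored)
```

The point of the sketch is that the scan has to read the data back; a job that only trusts the success codes returned at write time cannot notice the flipped bit.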
-
- Expert
- Posts: 367
- Liked: 41 times
- Joined: May 15, 2012 2:21 pm
- Full Name: Arun
- Contact:
Re: All instances of the storage metadata are corrupted.
I have had the same experience, and support says nothing can be done and there is no way to determine at which step and on which drive the data gets corrupted.
If the Veeam job statistics say the job completed successfully and the backup was copied, why does this happen?
-
- SVP, Product Management
- Posts: 23615
- Liked: 3117 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: All instances of the storage metadata are corrupted.
Most likely this is some issue with the storage. In past years we have mostly seen this with low-end NAS featuring questionable performance optimizations. For example, some storage will not handle the FLUSH command correctly, returning success before the data hits the disks in order to achieve better performance numbers. Or it can be as simple as the storage writing some other data instead of the data we asked it to write, due to faulty hardware (in 6 years and with 80'000+ customers, we've seen it all by now). It is very important that the storage does not "cheat" us.
The Veeam backup file format includes two identical records of storage metadata for redundancy, and they are never updated at the same time, so that at least one remains "good" in case of a crash or data corruption occurring during the file update.
So, it is simply impossible to break both metadata instances when:
1) Storage writes the data we asked it to write (not the case with faulty RAID controllers)
2) Storage medium stores the data correctly (not the case with silent corruption issues, aka bit rot)
3) Storage properly processes the FLUSH command. This is a synchronous call for us, meaning we issue the FLUSH command and wait for the data to hit the disks before moving on. If the storage returns success but the flush does not really happen, we start updating the other instance of the metadata while the first one is still unsaved, and this is one way to run into the said issue.
In human language, the issues look like:
1) We ask storage to write "MOM", but it writes "DAD" instead and returns success.
2) We ask storage to write "MOM", and it writes "MOM" and returns success, but if you try to read the data block, you will get "MAM".
3) We ask storage to commit the write of "MOM" to disks, and it returns success but does not actually write data to disks, keeping it in buffer for a short period of time for performance optimization purposes.
To answer your question: we can only judge by these reported successes, so we mark the job as successful. This is why it is very important to use SureBackup to verify that what was written into the backup file is what we asked for, especially once you have seen this error at least once and your backup storage has become a suspect.
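The two-copy scheme described above can be sketched roughly as follows. This is an illustration only, not the actual Veeam file format: the slot layout, the sequence number, and the names `write_metadata`, `read_metadata`, `corrupt_byte` and `MetadataCorrupted` are all assumptions made for the example. Each update overwrites only the older slot and calls `os.fsync` before returning, so the other slot stays intact while the write is in flight.

```python
import os
import struct
import tempfile
import zlib

SLOT_SIZE = 64                   # assumed fixed size of one metadata copy
HEADER = struct.Struct("<QII")   # sequence number, payload length, CRC32 of payload


class MetadataCorrupted(Exception):
    pass


def _encode(seq, payload):
    if len(payload) > SLOT_SIZE - HEADER.size:
        raise ValueError("payload too large for one slot")
    raw = HEADER.pack(seq, len(payload), zlib.crc32(payload)) + payload
    return raw.ljust(SLOT_SIZE, b"\0")


def _decode(raw):
    """Return (seq, payload) if the slot is intact, else None."""
    if len(raw) < HEADER.size:
        return None
    seq, length, crc = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return seq, payload


def _read_slots(f):
    f.seek(0)
    return [_decode(f.read(SLOT_SIZE)) for _ in range(2)]


def read_metadata(path):
    """Return the newest intact copy; fail only if both copies are bad."""
    with open(path, "rb") as f:
        valid = [slot for slot in _read_slots(f) if slot is not None]
    if not valid:
        raise MetadataCorrupted("All instances of the storage metadata are corrupted.")
    return max(valid, key=lambda slot: slot[0])


def write_metadata(path, payload):
    """Overwrite only the older (or invalid) slot, then flush to disk."""
    with open(path, "r+b") as f:
        seqs = [slot[0] if slot else -1 for slot in _read_slots(f)]
        target = seqs.index(min(seqs))
        f.seek(target * SLOT_SIZE)
        f.write(_encode(max(seqs) + 1, payload))
        f.flush()
        os.fsync(f.fileno())  # only helps if the storage honours the flush


def corrupt_byte(path, offset):
    """Test helper: invert one byte to simulate silent corruption."""
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(1)
        f.seek(offset)
        f.write(bytes([original[0] ^ 0xFF]))


fd, path = tempfile.mkstemp()
os.close(fd)
write_metadata(path, b"restore point 1")     # lands in slot 0
write_metadata(path, b"restore point 2")     # lands in slot 1
newest = read_metadata(path)

corrupt_byte(path, SLOT_SIZE + HEADER.size)  # damage the newer copy
fallback = read_metadata(path)               # the older copy still reads fine

corrupt_byte(path, HEADER.size)              # damage the older copy as well
try:
    read_metadata(path)
    both_bad = False
except MetadataCorrupted:
    both_bad = True
os.remove(path)
```

Note that the sketch can only ask for a flush. If the device acknowledges `os.fsync` without actually committing the data, the code has no way to tell, which is the third failure case listed above.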