Host-based backup of VMware vSphere VMs.
DEG
Influencer
Posts: 14
Liked: 1 time
Joined: Jan 29, 2016 11:02 am
Full Name: Michael Thorsen
Contact:

Backup Copy Job - Corrupted Data

Post by DEG »

Hello,

I've been using VBR and Backup Copy Jobs for years without any issues.
We have had the same infrastructure (HPE 3PAR 8200/7200, a dedicated physical HPE Veeam Backup & Replication server, and offsite servers) for 3 years with no change to the physical hardware. The firmware and software have of course been updated and are fully up to date today (vSphere 6.7 latest update, Veeam Backup & Replication 10).

Backup Jobs:
I have the same backup strategy for all VMs:
Full Weekly and Monthly Backup and daily incremental.
- Create active full backups automatically "First Saturday, All Month".

This is working 100%.

Backup Copy Jobs:
I have the same backup strategy for all VMs:
I copy backups to offsite servers.
Periodic copy (pruning) at set intervals.

This Monday morning, the Backup Copy Job warning email showed this error for 3 different virtual machines:
Error code:
Disk xyz.domain.local-flat.vmdk on host xyz.domain.local is corrupted. Possible reason: Storage I/O issue. Corrupted data is located in the following backup files: xyz.domain.local BCJD2020-11-01T000000_A28A_M.vbk

I do 1 Yearly, 6 Monthly and 4 Weekly Full Backups to keep for archival purposes.
I use the setting: "Read the entire restore point from source backup instead of synthesizing it from increments".

We (myself and another IT admin) have been going through our VMware and 3PAR storage and no errors were found.
All other Backup Jobs and Backup Copy Jobs, covering the other 50+ VMs, were successful.

Questions
Why am I seeing this error on 3 VMs, and only in the Backup Copy Job, not in the actual Backup Job?
- Does the error point to my production storage (3PAR), my backup storage (HPE internal disks), or my Backup Copy storage (2 offsite servers providing offsite repositories)?

How can I manually check that the backup files are OK, both the actual backup files on the VBR server and the Backup Copy Job files?
- I know about SureBackup, but I wanted to hear if there is a simpler way for specific files.

Is there a better way to do the backup than weekly fulls and daily incrementals, with the weekly active fulls being copied by Backup Copy Jobs to the offsite servers?

Thanks in advance
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Backup Copy Job - Corrupted Data

Post by Gostev »

It is impossible to say for sure from the tiny log snippet you provided where the corrupted file is located. Feel free to open a support case, and our engineer will help you to identify what storage is malfunctioning.

We recommend using the storage-level corruption guard functionality for periodic automatic health checks of your backup files.

The easiest way to manually check that the backup files are OK is to create a SureBackup job with the option to check the entire backup file enabled. If your product edition does not include SureBackup functionality, you can use the Veeam Backup Validator command-line utility.

Thanks!
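
For readers who want to script the command-line check mentioned above, here is a minimal Python sketch that builds a Veeam Backup Validator call for one backup file. The install path, the `/file:` parameter, and the example `.vbk` path are assumptions to verify against the Veeam user guide for your version. The helper names `build_validator_command` and `run_validator` are hypothetical.

```python
import subprocess
from typing import List

# Assumed default install location on the backup server; adjust as needed.
VALIDATOR_EXE = r"C:\Program Files\Veeam\Backup and Replication\Backup\Veeam.Backup.Validator.exe"


def build_validator_command(backup_file: str, exe: str = VALIDATOR_EXE) -> List[str]:
    """Return the argument list for validating a single backup file.

    Assumes the validator accepts a /file:<path> parameter.
    """
    return [exe, f"/file:{backup_file}"]


def run_validator(backup_file: str) -> subprocess.CompletedProcess:
    """Run the validator and return the finished process for inspection."""
    return subprocess.run(
        build_validator_command(backup_file),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Hypothetical path to one of the flagged backup copy files.
    result = run_validator(r"E:\BackupCopy\example-vm.vbk")
    print(result.returncode)
    print(result.stdout)
```

The sketch returns the raw exit code and output without interpreting them, since how the utility reports corruption should be confirmed from its documentation.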
DEG
Influencer
Posts: 14
Liked: 1 time
Joined: Jan 29, 2016 11:02 am
Full Name: Michael Thorsen
Contact:

Re: Backup Copy Job - Corrupted Data

Post by DEG »

Hi Gostev,

I know exactly where the file is located.
I know and have access to the vmware servers, the Veeam backup server, the offsite backup servers. I can find all files.

It was more to figure out why this happens.
The files do exist and on the VBR10 server, the job went well. On the offsite server for Backup Copy Jobs this error is thrown for 3 specific Windows servers (VM guests) when doing the offsite Backup Copy Job.

Storage-level corruption guard: I understand this, but is it necessary for the Active Full backups I run weekly? I had it enabled in the past, but I was advised in my previous threads on this forum not to enable it for active full backups.

I have the Enterprise Plus license which covers SureBackup so might as well look into that.

Thanks
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Backup Copy Job - Corrupted Data

Post by Gostev »

It happens because storage is not somehow magically immune to issues. Like everything else in IT, it will fail you periodically. Thanks to the law of large numbers, with over 850K active Veeam Backup & Replication installs we see this almost every day in support, while from an individual user's perspective it translates into an issue maybe once in 5-10 years.
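
The arithmetic behind that statement can be sketched in a few lines of Python. This is only a back-of-the-envelope model: it assumes every install is hit independently and uniformly, on average once per 5 or 10 years, and the function name is hypothetical.

```python
def expected_incidents_per_day(installs: int, years_between_incidents: float) -> float:
    """Expected number of installs hitting a storage issue on a given day.

    Assumes each install is affected independently, on average once
    every `years_between_incidents` years (a deliberate simplification).
    """
    daily_probability = 1.0 / (years_between_incidents * 365)
    return installs * daily_probability


for years in (5, 10):
    per_day = expected_incidents_per_day(850_000, years)
    print(f"once per {years} years -> about {per_day:.0f} installs per day")
# once per 5 years -> about 466 installs per day
# once per 10 years -> about 233 installs per day
```

Under these assumptions, an event that is rare for any one install still occurs somewhere in the installed base hundreds of times a day, so even a small fraction reaching support would be enough to show up there daily.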

Whether you do active or synthetic fulls obviously makes zero difference when your backup storage can't read the backup file back. This is exactly why we keep saying that doing active fulls is largely pointless, and the only thing that actually helps is testing your backups. You're extremely lucky that the Backup Copy job detected the storage issue as a consequence of just trying to do what it needs to do... imagine finding out the same only at restore time instead.
DEG
Influencer
Posts: 14
Liked: 1 time
Joined: Jan 29, 2016 11:02 am
Full Name: Michael Thorsen
Contact:

Re: Backup Copy Job - Corrupted Data

Post by DEG »

Thanks Gostev.

It might be my English skills, but I am unsure about this specific error message: "Disk winserver1.domain.local_1-flat.vmdk on host winserver1.domain.local is corrupted. Possible reason: Storage I/O issue. Corrupted data is located in the following backup files: winserver1.domain.local Backup Copy JobD2020-10-31T200000_4193_W.vbk"

Which "host" does it refer to?
Which of the three servers/storage systems is the issue located on?
The production SAN hosting the VMware guest's VMDK file, the backup server storing the .VBK file, or the storage where I offload the Backup Copy Job .VBK files?

I don't understand which storage/server the error message refers to.

I would understand if it's the server hosting the Backup Copy Job.
But I don't understand it if it's the SAN: how does Veeam know the VMDK file is corrupted? The servers (VM guests) with this error are running fine as we speak.

Thanks.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Backup Copy Job - Corrupted Data

Post by Gostev »

As I already explained above, we cannot answer any of these questions just from the single error message you posted. You do need to open a support case to have our support engineers get these details for you from the debug logs. Also, please include the support case ID here, as requested when you click New Topic. As explained there, topics about technical issues without a support case ID are eventually removed by moderators. Thanks!
DEG
Influencer
Posts: 14
Liked: 1 time
Joined: Jan 29, 2016 11:02 am
Full Name: Michael Thorsen
Contact:

Re: Backup Copy Job - Corrupted Data

Post by DEG »

Great, thanks a lot.

I just thought it was a generic error which, per your second answer, is routine and might "always point towards", say, BCJ storage destinations.

I will contact support.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Backup Copy Job - Corrupted Data

Post by Gostev »

Storage issues are routine, but they manifest in a wide variety of ways ;) and unfortunately, in the majority of cases they become known only at restore time. This is exactly why I said you were so lucky that some other process caught the issue, considering you don't test your backups.
NightBird
Expert
Posts: 242
Liked: 57 times
Joined: Apr 28, 2009 8:33 am
Location: Strasbourg, FRANCE
Contact:

Re: Backup Copy Job - Corrupted Data

Post by NightBird »

A scheduled backup health check (Maintenance tab) will catch this kind of issue, won't it?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Backup Copy Job - Corrupted Data

Post by Gostev »

That is correct, but it normally does not run every day due to its significant I/O requirements, so you could go for a week or a month before you know about the issue. SureBackup, however, is normally performed daily, if not for all VMs then at least for a subset of the most critical ones. So it will catch backup storage misbehaving much sooner (as well as many other recoverability issues that the health check can never possibly detect).
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Backup Copy Job - Corrupted Data

Post by foggy »

DEG wrote: Nov 02, 2020 1:09 pm But I don't understand it if it's the SAN: how does Veeam know the VMDK file is corrupted? The servers (VM guests) with this error are running fine as we speak.
It means that some of the blocks stored in the mentioned backup file and related to this VMDK are corrupt: they either cannot be read or do not match their CRC.
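
To illustrate the general idea, the Python sketch below stores a CRC32 next to each data block and later flags any block whose content no longer matches it. This is a generic illustration, not Veeam's actual on-disk format; the block size, function names, and sample data are all made up for the example.

```python
import zlib

BLOCK_SIZE = 4  # tiny block size, for demonstration only


def write_blocks(data: bytes):
    """Split data into blocks and store a CRC32 alongside each block."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [(block, zlib.crc32(block)) for block in blocks]


def find_corrupt_blocks(stored):
    """Return indexes of blocks whose content no longer matches the stored CRC32."""
    return [i for i, (block, crc) in enumerate(stored) if zlib.crc32(block) != crc]


stored = write_blocks(b"vmdk block data!")

# Simulate the storage silently flipping one bit in block 2 after the write.
block, crc = stored[2]
stored[2] = (bytes([block[0] ^ 0x01]) + block[1:], crc)

print(find_corrupt_blocks(stored))  # prints [2]
```

The check compares what is read back from the backup file against a checksum recorded at write time, which is why a mismatch can be reported while the production VM itself keeps running normally.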