Last month we upgraded from Veeam B&R 8.0 to 9.0U1 (we plan to be on 9.5U1 in the next month or so), and we experienced an odd issue with our remote site backup copy jobs.
For context, we have five remote sites that are several hundred km away, and we perform backup copy jobs over the WAN. The remote sites are connected via DMVPN on 10Mb/10Mb links to our primary sites, which are connected via 100Mb/100Mb links.
At our remote sites we're performing daily forward incremental with weekly active full backups.
These are all copied via a backup copy job that runs daily from 4am-8am. We have it keeping a two-week chain before pruning the last active full, with a health check performed weekly on Sunday. All backups go to our Data Domain using DD Boost.
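For a sense of scale, here is a rough back-of-the-envelope calculation of how much data can cross one remote link in that copy window. It is only a sketch: it assumes "10Mb" means 10 megabits per second, and it ignores protocol overhead, DMVPN encapsulation, compression and DD Boost deduplication. The function name is just something I made up for illustration.

```python
def max_transfer_gb(link_mbps, window_hours):
    """Theoretical ceiling on data moved in the window, in decimal GB.

    Assumes link_mbps is megabits per second and the link is fully
    available to the copy job for the whole window.
    """
    bits = link_mbps * 1_000_000 * window_hours * 3600
    return bits / 8 / 1_000_000_000


# 10Mb/10Mb remote link, 4am-8am window (4 hours)
print(max_transfer_gb(10, 4))  # 18.0
```

So at best roughly 18 GB of changed data per remote site fits in a single 4am-8am run before compression and deduplication are taken into account.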
Prior to the upgrade we'd never had a reported issue with backup copy jobs.
After the upgrade we noticed that the first weekly full copy reported "Processing $server Error: The storage file '/$Job_name/$Backup_File.vib' was not verified."
For troubleshooting I ran new local fulls to be copied over and new active fulls from the copy job side, but eventually they would all report not verified again.
I cloned a new job from the old job; at first the job was successful, but then it failed again with not verified.
I ran the Veeam Backup Validator from https://www.veeam.com/kb2086 and it reported the backups as valid, but the job was still saying not verified.
While working through this I also opened a support case (02053972), as I was stumped on what could be causing the issue, since I had effectively created a whole new chain, new folder structure and new copies.
Eventually the only thing that seemed to resolve this was manually creating a new job, setting it up the same as the original job, and running copies with that.
So I have a few questions I am hoping someone can answer.
Why would the Veeam Validator report a valid backup job on the target repository but Veeam B&R report the backup copy job is invalid?
Support advised me this could be caused by a number of issues, such as an unstable connection or disk write errors, and alluded to more potential causes. How do we determine the root cause so we can fix it rather than continually addressing the symptom?
Why does this happen in general? My understanding was that Veeam B&R had some form of in-flight error checking. And if the copy is invalid, why wouldn't it just re-copy the whole backup to get a valid copy if it couldn't determine which blocks were bad?
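To make that last question concrete, this is roughly the behaviour I had in mind: compare per-block checksums between source and target, then re-copy only the blocks that differ. It is just a sketch of the general idea in Python and not a claim about how Veeam B&R or DD Boost work internally. The block size, the use of SHA-256 and all of the function names are my own assumptions.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, for illustration only


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def find_bad_blocks(source, target, block_size=BLOCK_SIZE):
    """Return indexes of blocks that are missing or differ on the target."""
    src = block_digests(source, block_size)
    tgt = block_digests(target, block_size)
    count = max(len(src), len(tgt))
    return [
        i for i in range(count)
        if i >= len(src) or i >= len(tgt) or src[i] != tgt[i]
    ]


def repair(source, target, block_size=BLOCK_SIZE):
    """Re-copy only the bad blocks from source into a copy of target."""
    repaired = bytearray(target[:len(source)])
    # Pad a short target so every source block has somewhere to land.
    repaired.extend(b"\0" * (len(source) - len(repaired)))
    for i in find_bad_blocks(source, target, block_size):
        start = i * block_size
        repaired[start:start + block_size] = source[start:start + block_size]
    return bytes(repaired)


source = b"abcdefghijkl"
target = b"abcdXXXXijkl"  # second block damaged in transit
print(find_bad_blocks(source, target))  # [1]
print(repair(source, target) == source)  # True
```

If the bad blocks cannot be identified at all, the fallback I would expect is simply re-copying the whole file, which is the other half of my question.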