I posted the basic issue here:
http://forums.veeam.com/veeam-backup-re ... 26249.html
In short I was getting:
Code: Select all
Exception from server: Failed to decompress LZ4 block: Incorrect decompression result or length (result: '-865272', expected length: '1048576').
Unable to receive file.
(https://www.veeam.com/kb1795)
Since Veeam performs an integrity check on its own backups, I can only assume that Veeam believed this file to be a healthy backup file. As such, it is possible that a disk issue corrupted this backup at some point after verification.
What I want is to prevent this from happening again, so I am looking to do one or both of the following:
1) Validate that backup files are good without having to actually restore them. I can't believe a full restore is required just to verify that a file can be properly read and decompressed - there must be a way to stream through it without actually restoring (something along the lines of the sketch after this list is what I have in mind). How can this be done?
2) I need a good strategy to prevent this from happening again - or at least mitigate this as much as possible with the existing infrastructure we have.
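To illustrate what I mean by "stream through" the files, here is a rough sketch of the kind of out-of-band check I have in mind: record SHA-256 digests of the backup files right after a job finishes, so a later re-read can catch silent on-disk corruption without a restore. This obviously doesn't decompress Veeam's LZ4 blocks the way a real verification would, and the paths and manifest filename are made up for illustration.
Code: Select all
# Sketch only: record SHA-256 digests of backup files after a job completes.
# The repository path and manifest filename below are hypothetical.
import hashlib
import json
import pathlib

BACKUP_DIR = pathlib.Path(r"D:\Backups\StorageA")  # hypothetical repository path
MANIFEST = BACKUP_DIR / "digests.json"             # hypothetical manifest file

def sha256_of(path, chunk_size=4 * 1024 * 1024):
    """Stream the file in chunks so large .vbk/.vib files don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_digests():
    # Hash every full (.vbk) and incremental (.vib) file in the repository.
    manifest = {}
    for pattern in ("*.vbk", "*.vib"):
        for backup_file in BACKUP_DIR.glob(pattern):
            manifest[backup_file.name] = sha256_of(backup_file)
    MANIFEST.write_text(json.dumps(manifest, indent=2))

if __name__ == "__main__":
    record_digests()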
At present we have two on-site storage locations that can hold a full backup of all VMs. Let's call them StorageA and StorageB. And, for ease of discussion let's just say they are both 1TB and that a standard full backup of all data is approximately 500GB.
Ideally, we don't want to dedicate all of StorageA and StorageB to these backups - only about half of each.
I was thinking that I could do a 500GB backup to StorageA and then do a copy job to StorageB. However, this would seemingly not protect against the case where the file itself is corrupt - I believe the copy job would simply carry the corrupt file across.
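If I did go the copy-job route, I imagine I'd want something like the check below to run before each copy, so a file that has silently rotted on StorageA is flagged rather than faithfully copied to StorageB. Again, just a sketch that re-checks the digests recorded above; the paths and manifest name are hypothetical.
Code: Select all
# Sketch only: re-verify backup files on StorageA against the stored digests
# before letting a copy to StorageB proceed. Paths are made up for illustration.
import hashlib
import json
import pathlib
import sys

BACKUP_DIR = pathlib.Path(r"D:\Backups\StorageA")  # hypothetical repository path
MANIFEST = BACKUP_DIR / "digests.json"             # manifest written after the backup job

def sha256_of(path, chunk_size=4 * 1024 * 1024):
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_against_manifest():
    # Compare every file's current digest with the digest recorded when it was known good.
    expected = json.loads(MANIFEST.read_text())
    return [name for name, known in expected.items()
            if sha256_of(BACKUP_DIR / name) != known]

if __name__ == "__main__":
    corrupt = verify_against_manifest()
    if corrupt:
        print("Do NOT copy; these files no longer match their recorded digests:", corrupt)
        sys.exit(1)
    print("All files match their recorded digests; safe to copy.")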
Then I thought of setting up two independent backup jobs on staggered days: one that backs up to StorageA and the other to StorageB. However, I'm not sure how, say, an incremental from one job is going to affect the next incremental from the other job, since I'm not sure how Veeam marks the data that has already been backed up.
The other possibility is keeping more than one full backup from a single job at any given time. Doing this, I can use the entirety of StorageA (but of course this won't mitigate a full disk failure).
I realize that there are best practices, but I'm confined by the available resources.
What technique is best so that I am most likely to be able to recover from another copy if I hit the above issue again?
Thanks.