jake1110
Enthusiast
Posts: 40
Liked: 2 times
Joined: Sep 20, 2012 6:19 pm
Full Name: Jake Ernst
Location: Des Moines, IA
Contact:

Zlib Decompression Error - Why doesn't Veeam warn me?

Post by jake1110 »

Would it be possible to make Veeam more proactive about detecting and fixing corruption within its own backup files? The reason I ask is that I have around 250 repositories in my environment, and this zlib decompression error has struck at the worst possible time: during a restore. Chkdsk always comes back clean and we never see errors on these volumes. I read a while back that there was an issue with Server 2012 hosts, which one of my corruptions was on (the other was on 2012 R2). Is this still an issue?

Why is it that I only find out there's an issue when attempting a restore? I assumed Veeam checked for corruption while running the jobs. When Veeam tells me my backups are successful and there are no errors, I'd like to believe it. But over time, this sporadic corruption of Veeam's own backups is shaking my confidence. Twice I've lost servers completely and had to rebuild. I have zero insight into which backups are healthy and which are corrupt.

I'm always told it's file system corruption, but I never find any. Yes, I plan on opening a new support ticket for this as well. I'd just like to suggest better checking for corruption in the backup data!
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Zlib Decompression Error - Why doesn't Veeam warn me?

Post by Gostev »

Hi, Jake.

Storage-level corruption is indeed quite common.

Let me explain what happens in your case. When writing each data block to disk, we also write a checksum of that block to a designated area of the backup file (in fact, we write the checksum twice, for redundancy). When restoring, we verify the data read from disk against that checksum. Essentially, in your case the storage is returning completely different data compared to what we wrote when the backup file was created.
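
To make this concrete, here is a rough sketch of the idea in Python. It is purely illustrative: the block layout, field sizes and CRC choice are assumptions made for the example, not our actual backup file format.

Code: Select all

import zlib

def write_block(f, data: bytes) -> None:
    # Compress the block, then store its checksum twice (for redundancy)
    # ahead of the compressed payload. Layout is illustrative only.
    payload = zlib.compress(data)
    crc = zlib.crc32(payload)
    f.write(len(payload).to_bytes(4, "big"))
    f.write(crc.to_bytes(4, "big") * 2)  # two copies of the checksum
    f.write(payload)

def read_block(f) -> bytes:
    # Verify what the storage returned against the stored checksum
    # *before* decompressing; silent bit rot surfaces here instead of
    # coming back as a "successful" read of wrong data.
    size = int.from_bytes(f.read(4), "big")
    crc1 = int.from_bytes(f.read(4), "big")
    crc2 = int.from_bytes(f.read(4), "big")
    payload = f.read(size)
    if zlib.crc32(payload) not in (crc1, crc2):
        raise IOError("checksum mismatch: storage returned different data")
    return zlib.decompress(payload)  # a zlib.error here means a corrupt stream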

Do you have SureBackup jobs set up? SureBackup is specifically designed to test your backups for recoverability.

Thanks!
jake1110
Enthusiast
Posts: 40
Liked: 2 times
Joined: Sep 20, 2012 6:19 pm
Full Name: Jake Ernst
Location: Des Moines, IA
Contact:

Re: Zlib Decompression Error - Why doesn't Veeam warn me?

Post by jake1110 »

Thanks for the reply. Nope, no SureBackup in use.

The reason for this is simply the environment here: we'd have 250+ more jobs to add on top of the nightly backup jobs. The main issue I saw is that a Veeam server starts to buckle under the load of about 100 concurrent jobs. Things just begin to error out at that point. This is also the reason I use Hyper-V replication rather than Veeam, despite the I/O hit :cry:

I'm hoping V8 is more scalable for us. Our DB and application servers easily handle the load; it's just the Veeam application itself that has problems. We've mostly worked around this.

Do you know of any other customers with a "branch office" environment like ours running SureBackup jobs everywhere? It just adds so much more complexity. For instance, when Veeam runs a restore using "Other OS" and creates the Linux appliance, it occasionally doesn't remove the appliance after the job ends, and we have to go clean it up manually. I'm worried we'd see more problems like that introduced.

Would running active full backups clear out any corruption in the backup chain? If it builds an entirely new chain, I would consider that an acceptable risk: I could simply run an active full each week to limit my exposure.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Zlib Decompression Error - Why doesn't Veeam warn me?

Post by foggy »

An active full will indeed start the chain from scratch, so it is not affected by any storage-level corruption that occurred while the previous chain was being created. However, it does not protect you from new corruption that could occur afterwards.
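
To illustrate the dependency, here is a minimal sketch (assuming a plain full + incremental chain; the file names are made up for the example):

Code: Select all

# Restoring point N from an incremental chain requires the full backup
# plus every increment up to N, so one corrupt file in the old chain can
# break all later restore points. A new active full depends only on itself.
old_chain = ["full_1.vbk", "inc_1.vib", "inc_2.vib"]
new_chain = ["full_2.vbk"]  # created by the active full

def files_needed(chain, point):
    return chain[: point + 1]

print(files_needed(old_chain, 2))  # ['full_1.vbk', 'inc_1.vib', 'inc_2.vib']
print(files_needed(new_chain, 0))  # ['full_2.vbk'] -- old corruption irrelevant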

Could you temporarily switch to some other repository to see whether the problem persists?
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Zlib Decompression Error - Why doesn't Veeam warn me?

Post by Gostev »

jake1110 wrote:The main issue I saw is that a Veeam server starts to buckle under the load of about 100 concurrent jobs. Things just begin to error out at that point.
This is not due to heavy server load or a lack of scalability, though. Windows simply has a hard limit on how many new processes any given service can spawn (each job is a separate process). From what I've heard, v8 will address that by changing the way those job processes are spawned. And there is a workaround now that lets you increase this limit in Windows through a registry edit (our support has the info).
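
For reference, here is a quick way to inspect the relevant setting. This sketch assumes the limit in question is the non-interactive desktop heap (a common cause of service-spawned processes failing past a certain count); please confirm the exact value to change with support before editing anything.

Code: Select all

import winreg

# The non-interactive desktop heap caps how many processes a Windows
# service can host. Assumption for this sketch: that is the limit
# described above -- verify against KB1909 / support guidance.
KEY = r"SYSTEM\CurrentControlSet\Control\Session Manager\SubSystems"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as key:
    value, _ = winreg.QueryValueEx(key, "Windows")

# Prints e.g. "SharedSection=1024,20480,768"; the third number is the
# size (in KB) of each non-interactive desktop heap.
print(next(t for t in value.split() if t.startswith("SharedSection=")))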
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Zlib Decompression Error - Why doesn't Veeam warn me?

Post by Gostev »

Just ran across the corresponding KB article
https://www.veeam.com/kb1909
jake1110
Enthusiast
Posts: 40
Liked: 2 times
Joined: Sep 20, 2012 6:19 pm
Full Name: Jake Ernst
Location: Des Moines, IA
Contact:

Re: Zlib Decompression Error - Why doesn't Veeam warn me?

Post by jake1110 »

Yes, I noticed it always spawned a new process for each job. My previous backup solution didn't do this, so I assumed it was a software limitation. I wasn't aware of this registry key, and nobody mentioned it when I was scoping our deployment with Veeam; I'll have to give it a shot. I don't want to come across as upset, as I've been mostly successful with Veeam and my environment is pretty reliable now. That's great to hear about v8 and I can't wait to try it, especially after all the improvements I saw from 6.5 to 7. Perhaps I'll be able to test running all my replica jobs in Veeam too and finally consolidate things a bit more!

Foggy - thanks for the info! I actually can't switch repositories for these jobs. I have a branch-office type of setup, where each branch has its own storage device, and there's not enough bandwidth between locations to back up across sites. Our HQ office has many backup repositories that never seem to have this corruption, but they're all large SAN devices. I'll probably set up weekly active fulls for our branch sites; we wouldn't be in too bad a shape if we had to go back up to a week. It's a lot better than having nothing at all, which is what happened to me before.