Discussions specific to the VMware vSphere hypervisor
braddock
Influencer
Posts: 13
Liked: 2 times
Joined: May 10, 2016 3:06 pm
Contact:

Possibility of corruption from taking frequent snapshots?

Post by braddock » Jun 08, 2016 7:35 pm

Hi,
Can taking a snapshot corrupt a VM at the vmdk level?

We have a Linux VM that we want to replicate with Veeam to a new host every 30 minutes. I know there was an issue when vSphere 6 first came out where taking a snapshot could corrupt the original VM, though I think that was fixed a long time ago.

Is there an increased chance of something going wrong the shorter a job's replication interval is, or does Veeam not touch the original VMs in any way?

Thanks.

Re: Possibility of corruption from taking frequent snapshots

Post by braddock » Jun 09, 2016 2:53 pm

Hi again, let me rephrase this question. I've researched this a bit more on the forum.

What is the safest way to back up a Linux VM that runs a transactional database (Progress) with Veeam in 2016? By safest I mean the method that is least likely to 'break' the guest VM.

Gostev said in this post from 2009 (veeam-backup-replication-f2/transaction ... t1557.html):
Gostev wrote: "The general consensus for hot backups of Linux VM is the following:
- If you are NOT running databases, mail servers or other transactional applications in VM - hot backups using "Enable VMware Tools Quiescence" is fine.
- If you are running transactional applications in VM - you should use pre-freeze and post-thaw scripts to stop/suspend and start/resume services before/after snapshot creation, as described here for instance."
Gostev also says in that thread that taking a backup without VMware Tools quiescence will not affect the production VM (though this is in regard to Windows):
Gostev wrote: "3. Next is to run backup without VMware Tools quiescence. Backup will be crash consistent but often restorable, production VM will not be affected."
In our case we have tested the pre-freeze and post-thaw scripts and they do work, but they freeze the database for 5-15 seconds, which is very noticeable to users during working hours. So we want to take crash-consistent backups instead (using neither VMware Tools quiescence nor the pre-freeze and post-thaw scripts) and rely on the built-in application-level database backups in the event of a restore.
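For anyone who wants to measure that freeze window on their own system, here is a minimal sketch of a combined pre-freeze/post-thaw hook that records how long the database stayed frozen. It is not the script used above: the `db-quiesce` command, the stamp file path, and the `run_hook` function are all hypothetical placeholders, and it assumes Python 3 is available in the guest.

```python
#!/usr/bin/env python3
"""Sketch of a pre-freeze / post-thaw hook that times the database freeze."""
import subprocess
import sys
import time

# Placeholder commands (assumptions): substitute whatever suspends and
# resumes writes for your own database.
FREEZE_CMD = ["/usr/local/bin/db-quiesce", "enable"]   # hypothetical
THAW_CMD = ["/usr/local/bin/db-quiesce", "disable"]    # hypothetical
STAMP_FILE = "/tmp/db-freeze.started"                  # hypothetical path


def run_hook(phase, runner=subprocess.call, clock=time.time,
             stamp_file=STAMP_FILE):
    """Run the freeze or thaw command.

    Returns (exit_code, frozen_seconds); frozen_seconds is None for the
    freeze phase or when no start timestamp could be read.
    """
    if phase == "freeze":
        # Record the start time first so the measured window includes
        # the time the freeze command itself takes.
        with open(stamp_file, "w") as fh:
            fh.write(repr(clock()))
        return runner(FREEZE_CMD), None
    if phase == "thaw":
        rc = runner(THAW_CMD)
        try:
            with open(stamp_file) as fh:
                frozen = clock() - float(fh.read())
        except (OSError, ValueError):
            frozen = None
        return rc, frozen
    raise ValueError("phase must be 'freeze' or 'thaw'")


if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("freeze", "thaw"):
    rc, frozen = run_hook(sys.argv[1])
    if frozen is not None:
        print("database was frozen for %.1f s" % frozen)
    sys.exit(rc)
```

The idea is to call it with `freeze` from the pre-freeze script and with `thaw` from the post-thaw script, then compare the logged durations against what users can tolerate during working hours.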

So, going back to my question above: can taking a snapshot without VMware Tools quiescence corrupt a PRODUCTION Linux VM?

Thanks

Re: Possibility of corruption from taking frequent snapshots

Post by braddock » Jun 13, 2016 7:27 pm

No replies : ) I guess most people on here use Windows with Veeam.

Anyway, I worked this out myself and will put it here for anyone else who comes across this via a Google search:

In certain situations, taking a snapshot with VMware Tools quiescence can and occasionally will crash the guest OS (if it is Linux). We run RHEL 5, which has an incompatibility with VMware Tools (vmtoolsd): we can either upgrade the Red Hat version or disable VMware Tools on the guest OS. Unfortunately, because of the application we run, upgrading the OS is not possible for us, so if we want to use Veeam, taking crash-consistent backups is the only option.

We tested this for 2 weeks, taking a backup through Veeam every 15 minutes, and the guest OS crashed twice when a snapshot was initiated (out of hundreds of snapshots).
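To put the two crashes in proportion, here is a rough sketch of the arithmetic. The post only says "hundreds" of snapshots, so the total below is an assumption: it is the upper bound you would get if the job ran around the clock for the full 14 days. The function names are illustrative only.

```python
def snapshots_taken(interval_min, days, hours_per_day=24):
    """Number of job runs in the period, one per interval."""
    return days * hours_per_day * 60 // interval_min


def crash_rate(crashes, snapshots):
    """Fraction of snapshots that coincided with a guest crash."""
    return crashes / snapshots


# Assumption: one run every 15 minutes, 24 hours a day, for 14 days.
total = snapshots_taken(15, 14)
print(total, "snapshots,", "%.2f%% crashed" % (100 * crash_rate(2, total)))
# prints "1344 snapshots, 0.15% crashed"
```

If the job actually ran fewer hours per day, pass a smaller `hours_per_day`; the per-snapshot crash rate rises accordingly. Either way the rate per snapshot is small, but at this frequency it still means an outage every week or so.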

Links:

https://kb.vmware.com/selfservice/micro ... Id=2038606

https://access.redhat.com/solutions/484303

ukguy
Novice
Posts: 9
Liked: never
Joined: Jul 26, 2016 2:30 pm
Contact:

Re: Possibility of corruption from taking frequent snapshots

Post by ukguy » Jun 10, 2018 10:42 pm

I know this is an old thread but it’s the first time I’ve found anyone who is directly addressing this.
We are finding the corruption is on the replicas.

On one VM (CentOS 6) we don’t use quiescing because of the fsfreeze kernel bug, which crashes the VM.

On the other VM, we do use VMware Tools quiescing and pre-freeze scripts, yet there is still corruption.

Like the OP, we can’t use a pre-freeze script that freezes the database on a busy production server.

So far we have no solution, but we have just switched to VMware Paravirtual to see if it reduces corruption.
I thought it had, but on checking I can still see metadata inode issues and slow boots, and on reboot it jumps into fsck.

It was taking ages to fsck, so I’m about to test whether it still does.
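For checking a replica without changing it, a small sketch along these lines may help. It runs fsck with `-n` (report only, no repairs) and decodes the exit status, whose bit meanings are taken from the fsck(8) man page. The function names are made up for illustration, and the check should be run against an unmounted filesystem on the replica, not on the production VM.

```python
import subprocess

# Bit meanings of the fsck exit status, per the fsck(8) man page.
FSCK_BITS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the list of conditions encoded in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_BITS.items()) if status & bit]


def check_readonly(device, runner=subprocess.call):
    """Run a report-only check (fsck -n) on device and decode the result."""
    return decode_fsck_status(runner(["fsck", "-n", device]))
```

For example, `check_readonly("/dev/sdb1")` (the device name is a placeholder) returns `["filesystem errors left uncorrected"]` when fsck exits with status 4, which is what a report-only run gives when it finds damage.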

Crash consistent... OK until you need them for “immediate” failover. Backup vendors promote the great features, but the underlying CentOS/VMware/etc. issues prevent these features from being as great as they are made out to be.
