Discussions specific to the VMware vSphere hypervisor
russwatkins
Influencer
Posts: 19
Liked: 1 time
Joined: Jan 27, 2012 11:37 am
Full Name: Russell Watkins
Contact:

Length of time to remove snapshot.

Post by russwatkins » Jul 16, 2012 8:07 am

Hi all,

I had an issue last week which I need to gather some information on.

I have just added a new VM to our Veeam backups. The VM runs Windows 2003 Enterprise and Oracle 9i. I am not using any freeze/thaw or anything fancy, just a straightforward Veeam backup alongside our other backup jobs. The snapshot phase worked perfectly fine, taking seconds to create and remove. Whilst this server runs Oracle, only a handful of users use it, and none whilst the backups are taking place.

The backup was fine for a week or so. We then carried out a change on this system (installing another version of Oracle) and a snapshot was taken. The backups were not paused whilst this maintenance work was being carried out. Subsequently the backups ran and the snapshot took 5 hours to remove.

My presumption is that one of the following has occurred:

- Oracle 9i is not fully compatible with snapshotting in this way,

- Too many changes were taking place, hence the snapshot removal taking so long.


In trying to get to the root cause of this issue, I need to analyse why the snapshot took so long to remove. Other members of my team see this as a failure of the snapshot/backup system, although I'm pretty sure the system is not the problem and this is "by design"!

Can anyone point me to any Windows/Veeam/VMware logs which will allow me to analyse the issue?


Many thanks



Russell

foggy
Veeam Software
Posts: 17827
Liked: 1492 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Length of time to remove snapshot.

Post by foggy » Jul 16, 2012 9:11 am

Russell, if the backup ran at the same time as the other version of Oracle was being installed on this system, that could have caused a lot of changes inside the VM while it was running on the snapshot created by the backup job. In that case, it is expected that the snapshot commit took longer than usual.


Re: Length of time to remove snapshot.

Post by russwatkins » Jul 16, 2012 10:50 am

Hi Foggy,

Many thanks for your answer. It seems like it is as I thought - the system was functioning as it should. A training issue within the department I think!

5 hours seems a little excessive however (in my manager's eyes), so I wondered if there are any logs that I can look at to prove that everything was working as it should?

J1mbo
Expert
Posts: 261
Liked: 29 times
Joined: May 03, 2011 12:51 pm
Full Name: James Pearce
Contact:

Re: Length of time to remove snapshot.

Post by J1mbo » Jul 16, 2012 10:58 am

Snapshot consolidation speed is greatly influenced by storage configuration (disk type/RAID level/spindle count) vs active IO load at the time. It's not controlled by Veeam though.

averylarry
Expert
Posts: 261
Liked: 28 times
Joined: Mar 22, 2011 7:43 pm
Full Name: Ted
Contact:

Re: Length of time to remove snapshot.

Post by averylarry » Jul 16, 2012 4:59 pm

You should note that snapshot removal technically isn't part of Veeam. Veeam just tells VMware to remove the snapshot and then waits for VMware to confirm (at least that's how I understand it) . . .

During the snapshot removal, you can look at the datastore to see the size of the snapshot files so you can tell just how much (probably-not-sequential) data has to be read and then written into the primary .vmdk file. Also during the snapshot removal you can look at the performance graphs to see how much I/O is going on. If you have vCenter, you may be able to look at the history and see what it was doing. For instance perhaps 20 Mb/s for 5 hours.
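To put rough numbers on that, here is a minimal Python sketch, not a definitive tool. It assumes the figure above means 20 megabytes per second, that the snapshot data sits in files named `*-delta.vmdk` inside the VM's folder, and that you can reach that folder from wherever Python runs. The function names are made up for illustration. Since a commit both reads the delta and writes into the base disk while the VM keeps doing its own I/O, treat the result as a best case rather than a prediction.

```python
from pathlib import Path

MB = 1000 * 1000  # decimal megabyte, in bytes


def delta_bytes(vm_dir):
    """Total size in bytes of snapshot delta files in a VM folder.

    Assumes the delta files follow the '*-delta.vmdk' naming pattern.
    """
    return sum(p.stat().st_size for p in Path(vm_dir).glob("*-delta.vmdk"))


def data_moved_gb(rate_mb_s, hours):
    """Data moved (decimal GB) at a sustained rate for a given duration."""
    return rate_mb_s * hours * 3600 / 1000


def commit_hours(size_bytes, rate_mb_s):
    """Best-case hours needed to move size_bytes at rate_mb_s."""
    return size_bytes / (rate_mb_s * MB) / 3600


if __name__ == "__main__":
    # 20 MB/s sustained for 5 hours
    print(data_moved_gb(20, 5), "GB moved")
    # Going the other way: a 360 GB delta at 20 MB/s
    print(commit_hours(360 * 10**9, 20), "hours")
```

If the delta files on the datastore add up to a few hundred GB and the performance graphs show roughly that throughput during the commit, a multi-hour removal is consistent with plain arithmetic rather than a fault.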

dellock6
Veeam Software
Posts: 5650
Liked: 1586 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Length of time to remove snapshot.

Post by dellock6 » Jul 16, 2012 5:47 pm

As we have said many times in several threads, you can replicate this behaviour by simply taking a snapshot of that VM in vCenter, waiting for some time (basically the same amount Veeam would have taken to complete the backup) and then committing the snapshot. You would see the same behaviour with and without involving Veeam.

This is part of a good vSphere/Veeam design: if you know you are going to back up that VM, you need to properly size the underlying storage in terms of space AND performance to complete snapshot operations without impacting VM operations.

Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2019
Veeam VMCE #1

rbrambley
Veeam Software
Posts: 455
Liked: 54 times
Joined: Jun 16, 2009 1:23 pm
Full Name: Rich Brambley
Contact:

Re: Length of time to remove snapshot.

Post by rbrambley » Jul 16, 2012 11:23 pm

Excessive time to commit a VMware snapshot, especially with a VM hosting a heavy I/O database or messaging workload, is often the result of too much I/O for the underlying datastore. You've got the normal I/O of the VM combined with the extra I/O of committing the changes from the 0001.vmdk (for example) made during the time the snapshot was open (the backup). I can't explain why it was fine at first for you though - did you add more data, more users, etc. to the Oracle VM, or did you add more VMs to the same datastore where the Oracle VM lives, resulting in more "activity"?

What version of vSphere are you using?

If you are using vSphere 4 or earlier, the snapshot is actually created on the datastore where the VM's .vmx file lives. I've seen the time it takes to commit snapshots drastically reduced just by moving the VM's .vmx file to another datastore. Oversimplified, you are then dividing the production and snapshot I/O across the different datastores. If you have different spindles and controllers for the different datastores it gets even better.

Is your Oracle VM's .vmx file on the same datastore as the .vmdks? If so try moving it.

Beware - make sure the datastore you move the .vmx to has a large enough block size and enough free space to handle the snapshot.

I've known customers to build a 1TB datastore just for .vmx files so VMware snapshots would perform better.

If you are using vSphere 5 then the snapshots are created on each datastore where there is a vmdk. Does your Oracle VM have multiple .vmdks you can separate to different datastores?


Re: Length of time to remove snapshot.

Post by dellock6 » Jul 17, 2012 7:22 am

I have had some customers add local SSD storage to ESXi 5.0 servers just to redirect snapshots there, so the main storage is hit only for reads while the VM is running on a snapshot, and snapshots can therefore be consolidated faster. We did it once using Fusion-io cards for heavy-I/O Exchange and SQL servers and the results were quite impressive.

Luca.
