-
- Influencer
- Posts: 19
- Liked: 1 time
- Joined: Jan 27, 2012 11:37 am
- Full Name: Russell Watkins
- Contact:
Length of time to remove snapshot.
Hi all,
I had an issue last week which I need to gather some information on.
I have just put a new VM on to our Veeam backups. The VM consists of Windows 2003 Enterprise and Oracle 9i. I am not using any freeze/thaw or anything fancy, just a straightforward Veeam backup in with our other backup jobs. The snapshot phase worked perfectly fine, taking seconds to create and remove. Whilst this server is Oracle, there are only a handful of users who use it and none whilst the backups are taking place.
The backup was fine for a week or so. We were then completing a change on this system (installing another version of Oracle) and a snapshot was taken. The backups were not paused whilst this maintenance work was being carried out. Subsequently the backups ran and the snapshot took 5 hours to remove.
My presumption is that one of the following has occurred:
- Oracle 9i is not fully compatible with snapshotting in this way,
- Too many changes were taking place, hence the snapshot removal taking so long.
In the process of trying to get to the root cause of this issue, I need to try and analyse why the snapshot took so long to remove. Other members of my team see this as a failure of the snapshot/backup system, although I'm pretty sure the system is not the problem and this is "by design"!
Can anyone point me to any Windows/Veeam/VMware logs which will allow me to analyse the issue?
Many thanks
Russell
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Length of time to remove snapshot.
Russell, if the backup ran right at the moment another version of Oracle was being installed on this system, that could have caused a lot of changes inside the VM while it was running on the snapshot created by the backup job. In this case, it is expected that the snapshot commit took longer than usual.
-
- Influencer
- Posts: 19
- Liked: 1 time
- Joined: Jan 27, 2012 11:37 am
- Full Name: Russell Watkins
- Contact:
Re: Length of time to remove snapshot.
Hi Foggy,
Many thanks for your answer. It seems like it is as I thought - the system was functioning as it should. A training issue within the department I think!
5 hours seems a little excessive however (in my manager's eyes), so I wondered if there are any logs that I can look at to prove that everything was working as it should?
-
- Veteran
- Posts: 261
- Liked: 29 times
- Joined: May 03, 2011 12:51 pm
- Full Name: James Pearce
- Contact:
Re: Length of time to remove snapshot.
Snapshot consolidation speed is greatly influenced by storage configuration (disk type/RAID level/spindle count) vs active IO load at the time. It's not controlled by Veeam though.
-
- Veteran
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
- Contact:
Re: Length of time to remove snapshot.
You should note that snapshot removal technically isn't part of Veeam. Veeam just tells VMware to remove the snapshot and then waits for VMware to confirm (at least that's how I understand it) . . .
During the snapshot removal, you can look at the datastore to see the size of the snapshot files so you can tell just how much (probably-not-sequential) data has to be read and then written into the primary .vmdk file. Also during the snapshot removal you can look at the performance graphs to see how much I/O is going on. If you have vCenter, you may be able to look at the history and see what it was doing. For instance perhaps 20 Mb/s for 5 hours.
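To put rough numbers on that, here is a minimal Python sketch. It rests on a few assumptions that are mine rather than from the posts above: the VM folder is readable as an ordinary directory, the snapshot delta disks use the "-delta.vmdk" or "-sesparse.vmdk" naming, and the throughput figure is treated as megabytes per second. The function names are made up for illustration.

```python
from pathlib import Path

# Assumption: snapshot delta disks follow the "-delta.vmdk" or
# "-sesparse.vmdk" naming and the VM folder can be listed like a
# normal directory (for example from a host that mounts the datastore).
DELTA_SUFFIXES = ("-delta.vmdk", "-sesparse.vmdk")


def snapshot_delta_bytes(vm_dir):
    """Total size in bytes of the snapshot delta files in one VM folder."""
    return sum(
        p.stat().st_size
        for p in Path(vm_dir).iterdir()
        if p.is_file() and p.name.endswith(DELTA_SUFFIXES)
    )


def data_moved_gb(rate_mb_per_s, hours):
    """Data moved at a sustained rate, in decimal GB (1 GB = 1000 MB)."""
    return rate_mb_per_s * hours * 3600 / 1000


# The example figure above: 20 MB/s sustained for 5 hours.
print(data_moved_gb(20, 5))  # 360.0
```

If the delta files on the datastore add up to something of that order, a five hour commit at that rate is simply the arithmetic working out, not a fault.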
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Length of time to remove snapshot.
As we have said many times in several threads, you can reproduce this behaviour by simply taking a snapshot of that VM in vCenter, waiting for some time (basically as long as Veeam would have taken to complete the backup), and then committing the snapshot. You would see the same behaviour with or without Veeam involved.
This is part of good vSphere/Veeam design: if you know you are going to back up that VM, you need to size the underlying storage properly, in terms of both space AND performance, so that snapshot operations complete without impacting VM operations.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Veeam Software
- Posts: 481
- Liked: 57 times
- Joined: Jun 16, 2009 1:23 pm
- Full Name: Rich Brambley
- Contact:
Re: Length of time to remove snapshot.
Excessive time to commit a VMware snapshot, especially for a VM hosting an I/O-heavy database or messaging system, is often the result of too much I/O for the underlying datastore. You've got the normal I/O of the VM combined with the extra I/O of committing the changes written to the snapshot file (0001.vmdk, for example) while the snapshot was open (during the backup). I can't explain why it was fine at first for you, though. Did you add more data or more users to the Oracle VM, or add more VMs to the datastore where the Oracle VM lives, resulting in more activity?
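As a back-of-envelope illustration of that point, the Python sketch below treats the commit as getting only whatever throughput is left over after the VM's normal I/O. This is a deliberately crude model and entirely an assumption: it ignores that consolidation both reads and writes, that I/O is mostly random, and that new writes keep arriving during the commit. The function name and the figures are hypothetical.

```python
def commit_hours_under_load(delta_gb, datastore_mb_per_s, production_mb_per_s):
    """Estimate hours to commit a snapshot of delta_gb (decimal GB).

    Assumption: the commit only gets the datastore throughput that the
    production workload leaves unused. Returns infinity with no headroom.
    """
    headroom = datastore_mb_per_s - production_mb_per_s
    if headroom <= 0:
        return float("inf")
    return delta_gb * 1000 / headroom / 3600


# Hypothetical figures: a 90 GB delta on a datastore good for 60 MB/s.
quiet = commit_hours_under_load(90, 60, 10)  # little production I/O
busy = commit_hours_under_load(90, 60, 55)   # heavy production I/O
print(quiet, busy)
```

With these made-up numbers the same snapshot commits in about half an hour on a quiet datastore and about five hours on a busy one, which is why the load at the time matters as much as the snapshot size.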
What version of vSphere are you using?
If you are using vSphere 4 or earlier, the snapshot is actually created on the datastore where the VM's .vmx file lives. I've seen the time it takes to commit snapshots drastically reduced just by moving the VM's .vmx file to another datastore. Oversimplified, you are then dividing the production and snapshot I/O across the different datastores. If you have different spindles and controllers for the different datastores it gets even better.
Is your Oracle VM's .vmx file on the same datastore as the .vmdks? If so try moving it.
Beware - make sure the datastore you move the .vmx to has a large enough block size and enough free space to handle the snapshot.
I've known customers to build a 1TB datastore just for .vmx files so VMware snapshots would perform better.
If you are using vSphere 5 then the snapshots are created on each datastore where there is a vmdk. Does your Oracle VM have multiple .vmdks you can separate to different datastores?
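The placement rule described in this post can be written down as a small Python sketch. It encodes only the behaviour as stated above (vSphere 4 and earlier: next to the .vmx; vSphere 5: next to each .vmdk) and assumes paths in the usual "[datastore] folder/file" form. The function names and datastore names are hypothetical.

```python
import re


def datastore_of(path):
    """Extract the datastore name from a path such as '[ds1] vm/vm.vmx'."""
    match = re.match(r"\[([^\]]+)\]", path.strip())
    if match is None:
        raise ValueError(f"not a datastore path: {path!r}")
    return match.group(1)


def snapshot_datastores(vsphere_major, vmx_path, vmdk_paths):
    """Datastores where snapshot files land, per the rule described above."""
    if vsphere_major <= 4:
        # vSphere 4 or earlier: snapshots go where the .vmx lives.
        return {datastore_of(vmx_path)}
    # vSphere 5: snapshots go on each datastore that holds a .vmdk.
    return {datastore_of(p) for p in vmdk_paths}


# Hypothetical layout: .vmx on its own datastore, disks on two others.
vmx = "[ds-config] ora/ora.vmx"
disks = ["[ds-data1] ora/ora.vmdk", "[ds-data2] ora/ora_1.vmdk"]
print(snapshot_datastores(4, vmx, disks))
print(snapshot_datastores(5, vmx, disks))
```

On vSphere 4 only the .vmx datastore takes the snapshot I/O in this layout, whereas on vSphere 5 it is split across the two disk datastores.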
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Length of time to remove snapshot.
I have some customers who added local SSD storage to their ESXi 5.0 servers just to redirect snapshots there, so the main storage is hit only for reads while the VM is running on a snapshot, and consolidation completes faster. We did it once using Fusion-IO cards for heavy-I/O Exchange and SQL servers, and the results were quite impressive.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1