-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Snapshot Removal slowness
We have a VM which is around 850 GB. It has lots of changes daily, which means that the VM replication can sometimes take a bit of time. The actual data transfer for the replication takes around 1 hr 20 min, but the total time for the job can be over 6 hrs! The rest of the time is spent removing the snapshot (according to vSphere), which I'm assuming means it is doing some merging? I'm starting to wonder whether a straight full replication every time would be quicker than doing the merge process.
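To put the numbers above in proportion, here is a minimal Python sketch of the arithmetic. It assumes, as the post does, that everything outside the data transfer is snapshot removal, and it takes 6 hrs as the total even though the job can run longer. The function name `snapshot_removal_share` is made up for illustration.

```python
from datetime import timedelta


def snapshot_removal_share(total: timedelta, transfer: timedelta) -> float:
    """Fraction of the job's wall-clock time that is not data transfer."""
    if total <= timedelta(0) or transfer > total:
        raise ValueError("total must be positive and at least as long as transfer")
    # Dividing one timedelta by another yields a plain float.
    return (total - transfer) / total


# Figures from the post: ~1 hr 20 min of transfer out of a 6 hr job.
total = timedelta(hours=6)
transfer = timedelta(hours=1, minutes=20)
share = snapshot_removal_share(total, transfer)
print(f"Time outside data transfer: {share:.0%}")  # prints "Time outside data transfer: 78%"
```

Since the job can take more than 6 hrs, the real share is at least this large.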
Replication Details
1 Restore Point
Data Transfer = Direct (WAN Accelerator is greyed out)
Any suggestions or alternatives on how to speed this up?
Thanks
-
- Veteran
- Posts: 1943
- Liked: 247 times
- Joined: Dec 01, 2016 3:49 pm
- Full Name: Dmitry Grinev
- Location: St.Petersburg
- Contact:
Re: Snapshot Removal slowness
Hi,
B.F. wrote: I'm starting to wonder if we did just a straight full replication every time if it will be quicker than doing the merge process.
You cannot avoid the merge process.
You can reduce the snapshot consolidation time by integrating with your storage system (if you're running storage that is supported for integration).
Or you can try to offload the datastore where this VM runs.
Also, I'd recommend retaining more than 1 restore point for replication: if the only RP gets corrupted (by malware, etc.), failover becomes unavailable.
Please review the existing discussion; you might find useful information there. Thanks!
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: Snapshot Removal slowness
DGrinev wrote: Or you can try to offload the datastore where this VM runs.
Not sure if I understand what that means.
Another thought was to replicate more often throughout the day. That way the changes would be incorporated in smaller chunks throughout the day instead of in one 6+ hour chunk once a day?
Thanks
-
- Veeam Software
- Posts: 26
- Liked: 12 times
- Joined: Jun 26, 2014 7:02 pm
- Full Name: Denis Churaev
- Location: Bucharest, Romania
- Contact:
Re: Snapshot Removal slowness
Hello B.F.,
DGrinev meant that your datastore is probably quite busy with active I/O, which would be the most likely explanation for the slow snapshot removal.
If you can find a way to put this highly transactional VM on a faster or less I/O-intensive datastore, the issue may go away altogether.
Another possibility to consider is running the replication job at a different time, when the VMs on the datastores are not as active, so that the snapshot operations would be faster. This only works if you have an analysis of datastore latency/IO during the day and can see that the datastore is less loaded at certain times, e.g. if the users are less active during the night. Of course, in our AlwaysOn 24/7 world it is not always possible to have this "breather".
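As a rough illustration of that kind of analysis, here is a minimal Python sketch that picks the quietest contiguous window from 24 hourly average latency readings. The latency values are made-up sample data, not measurements from any real datastore, and the function name `quietest_window` is hypothetical. You would substitute figures exported from whatever monitoring you already use.

```python
def quietest_window(hourly_latency_ms, window_hours):
    """Return (start_hour, mean_latency_ms) of the contiguous window with
    the lowest mean latency. Windows may wrap around midnight."""
    n = len(hourly_latency_ms)
    if n != 24:
        raise ValueError("expected 24 hourly samples")
    if not 1 <= window_hours <= n:
        raise ValueError("window_hours must be between 1 and 24")
    best = None
    for start in range(n):
        window = [hourly_latency_ms[(start + i) % n] for i in range(window_hours)]
        mean = sum(window) / window_hours
        if best is None or mean < best[1]:
            best = (start, mean)
    return best


# Hypothetical average datastore latency (ms) for hours 00:00 through 23:00.
sample_latency_ms = [12, 9, 5, 4, 4, 6, 10, 18, 25, 30, 28, 27,
                     26, 29, 31, 30, 27, 22, 18, 15, 14, 13, 20, 16]

start_hour, mean_ms = quietest_window(sample_latency_ms, window_hours=6)
print(f"Quietest 6-hour window starts at {start_hour:02d}:00")
```

With this sample data the quietest 6-hour window starts at 01:00. Whether the job actually finishes faster in that window still depends on the snapshot operations being I/O-bound on that datastore.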
You could also look deeper into what actually makes the disk operations slow. The underlying issue may be the fragmentation/block size of the VMFS datastore, the connection between the host and the storage that holds the datastore, or poor random read performance of the disk device. However, this type of analysis usually requires a specialist who works closely with performance troubleshooting and knows VMware technology well, so it's not always trivial. And in the end you need to decide what to change, which may lead to an uncomfortable decision, e.g. having to buy faster disks or reformat the datastore, which is not always possible.
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: Snapshot Removal slowness
The datastore where this job places all the replicas is actually isolated from other active VMs.
I did a little more digging on the one particular VM that consistently takes much longer. It has SQL Server installed, and the install was done by another vendor. I looked into the scheduled SQL tasks and discovered that every night it does an Index Reorganization along with Updating the Statistics. I'm assuming this process churns a lot of data changes. I disabled that job and, sure enough, the job time dropped by 2 hrs! I've adjusted the schedule so that the re-index is only done once a week (daily is unnecessary from what I've read).
Thanks
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Snapshot Removal slowness
I also wonder whether it is the temporary snapshot that is created for the source VM during backup, or the replica VM restore point (snapshot) that is merged when retention is applied at the end of the job.
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: Snapshot Removal slowness
According to vCenter, it's the replica VM doing disk consolidation when removing the snapshot.
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Snapshot Removal slowness
Then look at the target datastore performance, since there's no ability to perform full replication each time.
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: Snapshot Removal slowness
Perhaps backing up to a ReFS storage would be a better option in this case?
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Snapshot Removal slowness
It depends on your requirements, since backup and replication serve different scenarios.