-
- Enthusiast
- Posts: 42
- Liked: 9 times
- Joined: Jun 16, 2009 11:36 am
- Full Name: Joep Piscaer
Latency Control for distributed / hyper-converged datastores
Hi guys,
We're running Nutanix as our production storage environment. We'd like to enable storage latency control for our Nutanix NFS datastores, but we're wary of doing so given the distributed nature of the datastores.
1. Can we safely enable storage latency control for Nutanix datastores?
2. If not, can we file a feature request to support distributed datastores by monitoring the latency for any given datastore from all hosts in a vSphere cluster?
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Latency Control for distributed / hyper-converged datast
Joep, since Veeam B&R works with the latency data provided by VMware, I'd be more interested in whether VMware itself correctly reports latency for such datastores.
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Latency Control for distributed / hyper-converged datast
Uhm, interesting. The datastore is shared between all the nodes in the cluster, and even if each CVM exposes its own "local" view of the datastore, at the vCenter layer it all appears as a single large shared datastore. It all comes down to how Nutanix exposes read latency information to vCenter: is this value an average of the entire volume, or something else? We read the read latency from vCenter, so whatever Nutanix exposes there is what we take for granted.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Enthusiast
- Posts: 42
- Liked: 9 times
- Joined: Jun 16, 2009 11:36 am
- Full Name: Joep Piscaer
Re: Latency Control for distributed / hyper-converged datast
I figured you'd do your own stats vs. pulling them up from vCenter. Makes it a bit more complicated, although I do still see a use case for per-host datastore metrics to optimize for distributed systems.
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Latency Control for distributed / hyper-converged datast
Veeam uses per-host metrics from vCenter for this. Here's a simple example from the Veeam log when we set up a monitor for datastore performance:
Code:
[25.10.2015 01:06:40] <43> Info [DatastoreIO] Checking availability host 'host-13413' metrics. Metrics ids: [144,145], interval: 20, startTime: '01.06.21.032', endTime: '01.06.41.032'
In this example host-13413 is "esx03" in my cluster and we are monitoring metrics 144 & 145, which correspond to "totalReadLatency" and "totalWriteLatency" for that datastore as that host sees it (i.e. as measured by VMware). Obviously, though, each host can have a totally different view of latency for a given datastore, as seen in the below two screenshots showing the same time frame (well almost, off by 1 minute) for two different hosts within the same cluster based on their view of the same datastore (latency is very high because of some ongoing testing, no worries):
So we already have a "per-host" view of the datastore. However, I agree that from Veeam's perspective we would not have enough knowledge of the underlying per-host caching model in such a distributed environment, and thus, if latency for that datastore is high on one node, we most likely would not assign any additional tasks to other nodes in the cluster. You would almost want a case where we would assign at least one task per host (assuming there are such tasks), and only throttle/limit beyond that.
Interestingly, I haven't really seen this become an issue in the field on Nutanix, and I've used I/O control there. It may simply be that it takes far more than a single task to overload the latency of a given host anyway (in my experience), so there's always plenty of headroom. It would definitely be something good to test in more detail.
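To make the "at least one task per host, throttle beyond that" idea above concrete, here is a minimal Python sketch. It is purely illustrative and is not how Veeam B&R actually schedules tasks. The names `HostView`, `can_assign` and `assign_tasks`, the 20 ms threshold, the per-host cap, and the hosts other than esx03 are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class HostView:
    """One host's own view of a shared datastore (hypothetical structure)."""
    name: str
    latency_ms: float      # datastore latency as measured from this host
    running_tasks: int = 0


def can_assign(host: HostView, threshold_ms: float, max_per_host: int) -> bool:
    """A host always gets its first task; further tasks need latency headroom."""
    if host.running_tasks >= max_per_host:
        return False
    if host.running_tasks == 0:
        return True
    return host.latency_ms < threshold_ms


def assign_tasks(hosts, pending, threshold_ms=20.0, max_per_host=2):
    """Hand out tasks round-robin; return (assignments, tasks left waiting).

    Note: this updates running_tasks on the HostView objects passed in.
    """
    assignments = {h.name: [] for h in hosts}
    queue = list(pending)
    progress = True
    while queue and progress:
        progress = False
        for host in hosts:
            if not queue:
                break
            if can_assign(host, threshold_ms, max_per_host):
                assignments[host.name].append(queue.pop(0))
                host.running_tasks += 1
                progress = True
    return assignments, queue


# Hypothetical cluster: esx02 sees high latency on the same datastore.
hosts = [HostView("esx01", 5.0), HostView("esx02", 80.0), HostView("esx03", 12.0)]
assignments, waiting = assign_tasks(hosts, ["t1", "t2", "t3", "t4", "t5", "t6"])
print(assignments)  # {'esx01': ['t1', 't4'], 'esx02': ['t2'], 'esx03': ['t3', 't5']}
print(waiting)      # ['t6']
```

The point of the sketch is that a high reading on one host (esx02 here) only limits that host beyond its first task, instead of blocking new tasks for every host that shares the datastore.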