-
- Influencer
- Posts: 24
- Liked: 2 times
- Joined: Feb 18, 2020 5:45 pm
- Full Name: Kevin Chubb
- Contact:
Help me understand datastore latency alarms
Lately I've been getting alarms for datastore read and write latency and I'm having trouble understanding what I'm seeing.
For example I have an alarm email at 8:50 PM with details stating: "Disk/Datastore: Datastore Read Latency" (102.0 Milliseconds) is above a defined threshold (100.0 Milliseconds)
Here are the 'Datastore read latency' alarm rules (pretty sure they're the default).
Here is the 'Datastore Read Latency' performance graph for 'Last day.' You can see that there is a spike up to 7 ms at about 8:50 PM.
So why does the email alarm say 102.0 ms but the performance graph only shows 7 ms?
-
- Veeam Software
- Posts: 742
- Liked: 188 times
- Joined: Nov 01, 2016 11:26 am
- Contact:
Re: Help me understand datastore latency alarms
Hello Kevin,
As far as I remember, the alarm checks the max_value metric, while the performance graphs aggregate several current_value samples.
Given that, the alarm is the more precise of the two: you did hit 102 ms around 8:50:00, whereas the graph took the values sampled at 8:49:10, 8:49:30 ... 8:50:10 and plotted a single averaged value for that point on the graph.
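To make the difference concrete, here is a rough Python sketch of how a single short spike can vanish from an averaged graph point while still tripping a max-based alarm. The sample values and the number of samples per graph point are invented so that they line up with the 102 ms and 7 ms figures in this thread; they are not taken from the product.

```python
from statistics import fmean

# Hypothetical 20-second samples (ms) around 8:50 PM:
# one spike to 102 ms, everything else quiet at 2 ms.
samples_ms = [2.0] * 9 + [102.0] + [2.0] * 10

alarm_value = max(samples_ms)    # what a max-based alarm rule would compare
graph_value = fmean(samples_ms)  # what one averaged graph point would show

print(f"alarm sees {alarm_value} ms, graph plots {graph_value} ms")
# prints: alarm sees 102.0 ms, graph plots 7.0 ms
```

With 20 samples folded into one point, the 102 ms spike contributes only about 5 ms to the average, which is why the graph can look calm at the same moment the alarm fires.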
Thanks
-
- Influencer
- Posts: 24
- Liked: 2 times
- Joined: Feb 18, 2020 5:45 pm
- Full Name: Kevin Chubb
- Contact:
Re: Help me understand datastore latency alarms
Okay that's helpful, thank you.
If it's checking the max value then what's the purpose of the 15 minute time period?
-
- Influencer
- Posts: 24
- Liked: 2 times
- Joined: Feb 18, 2020 5:45 pm
- Full Name: Kevin Chubb
- Contact:
Re: Help me understand datastore latency alarms
Also if it's checking the max value then why is the field called Aggregation?
-
- Veeam Software
- Posts: 742
- Liked: 188 times
- Joined: Nov 01, 2016 11:26 am
- Contact:
Re: Help me understand datastore latency alarms
Hello Kevin,
Aggregation is just a general label in the alarm rule. The rule can apply the min, max, or avg function to the data set, because the same rule can be used by multiple alarms with different customization.
So the alarm rule collects the performance data and selects the min, max, or avg of all the values for that period. Then it checks the thresholds and starts again. In practice, 15 minutes is enough to prevent an alarm storm.
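As a minimal sketch of that evaluation cycle, assuming 20-second samples and a 15-minute window (so 45 values per check), the logic could look like the Python below. The function name `rule_triggers`, the `AGGREGATIONS` table, and the sample data are all hypothetical and only illustrate the idea; this is not the actual implementation.

```python
from statistics import fmean

# Hypothetical mapping from the rule's Aggregation setting to a function.
AGGREGATIONS = {"min": min, "max": max, "avg": fmean}

def rule_triggers(window_ms, aggregation, threshold_ms):
    """Reduce one time window to a single value, then compare to the threshold."""
    value = AGGREGATIONS[aggregation](window_ms)
    return value > threshold_ms

# One 15-minute window: 44 quiet samples at 2 ms and a single 102 ms spike.
window_ms = [2.0] * 44 + [102.0]

for name in ("min", "max", "avg"):
    print(name, rule_triggers(window_ms, name, threshold_ms=100.0))
```

On this data only the max aggregation exceeds the 100 ms threshold. Because the whole window collapses to one value per check, several spikes inside the same 15 minutes still produce a single evaluation rather than one alarm per spike.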
Thanks
-
- Influencer
- Posts: 24
- Liked: 2 times
- Joined: Feb 18, 2020 5:45 pm
- Full Name: Kevin Chubb
- Contact:
Re: Help me understand datastore latency alarms
Okay I understand now, thank you.