Real-time performance monitoring and troubleshooting
JRRW
Enthusiast
Posts: 76
Liked: 45 times
Joined: Dec 10, 2019 3:59 pm
Full Name: Ryan Walker
Contact:

Metrics skewed - Hyper-V 2019 Cluster

Post by JRRW »

All,
I'm on the Community Edition (CE), so I figured I'd start here before going to support:

Five Hyper-V 2019 hosts (full install, not Core) in a cluster, with ~20 CSVs, running 2x 8 Gb FC to a 3PAR 7200c, and 2x 8 Gb to the SAN and a Pure x50.

I'm seeing some really odd metrics that don't match Task Manager and/or other monitoring in two key areas:
-Disk latency
-Memory usage

For disk latency, when I compare max to WhatsUp Gold monitoring the same systems, I see a huge difference: WUG shows latency of <1-2 ms and Veeam reports >7-20 ms. That doesn't make any sense. Our SAN switches don't show even half the bandwidth being used, and our SANs are asleep (the Pure x50 is an all-NVMe array, which makes high latency next to impossible outside of network congestion).
When we run weekly performance reports, multiple CSVs show 'error' on latency, with max latency way beyond what they should be hitting, so I'm wondering if the reported values are wrong.

For memory, the errors are mostly on Hyper-V services memory usage, but these hosts all have at LEAST 150 GB of RAM free at all times; they have 512 GB of RAM and rarely use more than 300-330 GB per host.

Anyone see a similar issue in their environments?
HannesK
Product Manager
Posts: 14314
Liked: 2889 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Metrics skewed - Hyper-V 2019 Cluster

Post by HannesK »

Hello,
Task Manager has different metrics, yes. Can you maybe tell us what "WUG" is? I tried to google it, but no luck.

Veeam ONE uses the values that Microsoft / Hyper-V (or VMware, if used) gives us. If you don't trust them, then I can only recommend checking with support.

Best regards,
Hannes
JRRW

Re: Metrics skewed - Hyper-V 2019 Cluster

Post by JRRW »

Hi Hannes,

WUG=Whatsup Gold (I reference it in 'when I compare max to whatsup gold monitoring') - sorry for the acronym usage!

I just sat through an SAP presentation and wanted to throw things in anger over how many acronyms they use... :D

So my question is whether there is documentation on where these metrics are pulled from and how they are combined. Even within Veeam ONE itself, many of the alarms are defined very vaguely.

Specifically, for disk latency on CSVs: when I look at the metrics in WhatsUp Gold, which polls CSV latency on each host individually (not all hosts combined), I don't see the latency Veeam ONE is reporting. Is that because Veeam ONE combines the CSV latency of ALL hosts into 'one' metric? If so, the alarms/warnings are super misleading and not all that helpful.
For example, during a poll on Thursday at 23:45:
  • Host01 -> CSV01 = 2.5ms Latency
  • Host02 -> CSV01 = 1.5ms Latency
  • Host03 -> CSV01 = .5ms Latency
  • Host04 -> CSV01 = .25ms Latency
  • Host05 -> CSV01 = .25ms Latency
Does Veeam report under Hyper-V Cluster -> Cluster Shared Volumes -> CSV01 = 2.5 ms (as the 'max' during that poll), or does it show 5 ms (2.5 + 1.5 + 0.5 + 0.25 + 0.25 = 5)?
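
The two candidate aggregations in the question can be checked with a few lines of Python. This is only a sketch using the example values from the poll above; the dictionary and variable names are illustrative, and nothing here reflects how Veeam ONE actually computes the CSV figure.

```python
# Per-host latency for CSV01 from the example poll, in milliseconds.
# Names are illustrative; the values come from the list above.
csv01_latency_ms = {
    "Host01": 2.5,
    "Host02": 1.5,
    "Host03": 0.5,
    "Host04": 0.25,
    "Host05": 0.25,
}

# Interpretation 1: report the worst single host for that poll.
max_ms = max(csv01_latency_ms.values())

# Interpretation 2: add the latency of every host together.
sum_ms = sum(csv01_latency_ms.values())

print(f"max = {max_ms} ms")  # max = 2.5 ms
print(f"sum = {sum_ms} ms")  # sum = 5.0 ms
```

If the reported value tracks the sum rather than the max, it would grow with the number of hosts touching the volume, even when every individual host sees low latency.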

For memory: is this Hyper-V services memory? This page defines the alarm as "Average Hyper-V Services memory usage for 15 minutes is above 80%." with a reason of "This host is low on available memory.", which is not true in 99% of the cases where we get this warning. The hosts have plenty of unused memory remaining.
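
As a rough sanity check on the host-level numbers, the sketch below compares the usage figures mentioned earlier (300-330 GB used out of 512 GB) with the 80% alarm threshold. It assumes the percentage is measured against total host RAM, which the alarm definition does not confirm; the constant and function names are hypothetical.

```python
TOTAL_GB = 512            # installed RAM per host, from the first post
ALARM_THRESHOLD_PCT = 80  # threshold quoted in the alarm definition


def usage_pct(used_gb, total_gb=TOTAL_GB):
    """Return memory usage as a percentage of total host RAM (assumed base)."""
    return 100 * used_gb / total_gb


for used_gb in (300, 330):
    pct = usage_pct(used_gb)
    state = "above" if pct > ALARM_THRESHOLD_PCT else "below"
    print(f"{used_gb} GB used = {pct:.1f}% ({state} the {ALARM_THRESHOLD_PCT}% threshold)")

# Usage that would be needed to cross the threshold under this assumption.
print(f"threshold = {TOTAL_GB * ALARM_THRESHOLD_PCT / 100} GB")
```

Under that assumption, typical usage sits well below the threshold, which suggests the alarm is based on a different counter or a different base than total host memory.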

~PerplexedInArms
wishr
Veteran
Posts: 3077
Liked: 453 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: Metrics skewed - Hyper-V 2019 Cluster

Post by wishr »

Hi Ryan,

Could you please create a support case and let me know your case ID? I'll ask our support team to collect all the details and take a look at them.

Thanks