Host-based backup of Microsoft Hyper-V VMs.

Veeam Restore Hyper-V Failover Cluster High CPU

Post by kosta88 »

Hello,
Is there a known issue with restoring a Windows virtual machine into a Hyper-V Failover Cluster (in our case, Azure Stack HCI 22H2) that might drive the whole cluster to high CPU usage?
Note that we did not remove the virtual machine from the failover cluster before restoring it, but we were also not aware that this might be necessary. We have never done so for previous VM restores.
We have the following suspicion:
When Veeam starts to restore a VM that is still an active part of the failover cluster (FC), some (or all) nodes spike to 100% CPU usage. If we are lucky, some nodes remain at low CPU usage, and we may be able to live migrate VMs off the nodes that are at 100%, thus lowering the CPU load.
The situation apparently resolves once any of the nodes drops below 100%; then the whole cluster calms down. One of the services that seems to be at the top of the process list is clussvc.
This has now happened twice in connection with the restore of one VM that was still part of the cluster.
Note also that even after the restore, shutting down the restored VM and removing it from both the FC and Hyper-V does not relieve the situation.

We currently cannot say with 100% certainty that the restore is the cause, because we have also had issues when Veeam was not restoring anything, but the timing was spot on both times Veeam was involved.

Thanks
johan.h
Veeam Software
Full Name: Johan Huttenga

Re: Veeam Restore Hyper-V Failover Cluster High CPU

Post by johan.h »

If you've also had issues without Veeam, it could simply be resource allocation on your cluster, the cluster state, or (a remote possibility) integration with monitoring tools like SCVMM.

If clussvc is at 100%, it would make me wonder about the general health of the cluster. When running normal production workloads, does overall cluster resource usage (CPU, RAM, networking, etc.) stay around or below 80%?

You say it gets better when Live Migrating away from the over-provisioned node? Another possibility is an issue with how I/O threads are handled. Did you enable Fix 636159629 as part of the February 2025 CU? https://www.veeam.com/kb4717

Re: Veeam Restore Hyper-V Failover Cluster High CPU

Post by kosta88 »

Of course it could be the cluster itself. That is why I called it a suspicion and asked whether there is a known issue. We do have issues with the cluster, quite often actually; the 100% CPU problem just came on top of them. The whole cluster was deployed by a rather reputable company, but of course that doesn't have to mean anything.
clussvc itself is not at 100%, but it is at the top of the process list together with svchost. What I observed is that if I move a certain number of VMs off the node, the node "catches up" as CPU frees up, and all is well again. We currently have an Event Log issue being analyzed. And to answer your question: no, the nodes are usually well under 50% CPU load.

My only thought is that it apparently happened when I started a VM restore into the cluster. It's quite hard to test, since it basically brings half of our customers down when it happens.

Our Veeam server and the HCI nodes are up to date with Windows updates.

We are currently investigating the issue intensively; I am also checking for any connection to Veeam, in case it was the trigger.