-
dasfliege
- Service Provider
- Posts: 328
- Liked: 69 times
- Joined: Nov 17, 2014 1:48 pm
- Full Name: Florin
- Location: Switzerland
- Contact:
Failover Cluster crashing during backup after 2025 update
We've in-place upgraded our 6-node Hyper-V 2022 cluster to 2025. Everything was working fine until we raised the cluster functional level to 2025.
Since then, when we start the Veeam backups (on-host), some of the nodes immediately go to 99% CPU, stop responding, and disconnect (timeout) several CSVs.
We are already troubleshooting the issue with MS, but I wondered whether other people are facing the same problem, since it seems to be directly related to backup operations or other heavy-load operations on the hosts.
Veeam Case 07912704
-
david.domask
- Veeam Software
- Posts: 3148
- Liked: 720 times
- Joined: Jun 28, 2016 12:12 pm
- Contact:
Re: Failover Cluster crashing during backup after 2025 update
Hi Florin,
Thank you for sharing the case number and sorry to hear about the difficulties.
I can see Support has already begun the investigation and requested additional information, which you have provided, so please continue with them; they will be able to comment better after reviewing the debug logs. At first blush, I am not aware of any issues related specifically to raising the cluster functional level.
David Domask | Product Management: Principal Analyst
-
Frosty
- Expert
- Posts: 212
- Liked: 46 times
- Joined: Dec 22, 2009 9:00 pm
- Full Name: Stephen Frost
- Contact:
Re: Failover Cluster crashing during backup after 2025 update
Probably doesn't help -- but -- we run a 2-Node HyperV Cluster which was built fresh on Windows Server 2025 (not upgraded from 2022) -- we have not noticed any problems with Veeam backups.
-
dasfliege
- Service Provider
- Posts: 328
- Liked: 69 times
- Joined: Nov 17, 2014 1:48 pm
- Full Name: Florin
- Location: Switzerland
- Contact:
Re: Failover Cluster crashing during backup after 2025 update
Just wanted to give a quick overview of what we've found out so far, in case some others run into the same issues:
We are seeing severe instability in a 6-node Hyper-V cluster after upgrading the Cluster Functional Level from 2022 to 2025.
Environment:
- 6-node Hyper-V cluster, Windows Server 2025
- CSVs on shared SAN (no SAN / MPIO errors)
- Veeam Backup & Replication (on-host backups)
- Backups ran stable for a long time on FL 2022 with the same load
Symptoms:
- Immediately after backup jobs start (before significant data transfer):
- One cluster node suddenly goes to ~99% CPU
- Host becomes almost unresponsive
- No single process shows constant high CPU, but the Failover Cluster Service and WMI-related activity spike intermittently
- CSVs may later enter paused / redirected state (STATUS_IO_TIMEOUT)
- The affected node varies between runs (not host-specific)
Key Observation:
- On the affected node, we consistently see a very high number of VeeamHVWMIProxy processes (e.g. 20+), while healthy nodes show only 2–5 instances.
- During the incident, WMI becomes extremely slow or unresponsive on the affected node.
- Once WMI responsiveness recovers, the node stabilizes.
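For anyone who wants to check their own nodes for the same pattern, here is a minimal sketch that counts these proxy processes per node. It assumes the process image name starts with "VeeamHVWMIProxy" (matched case-insensitively; verify the exact name on your hosts) and relies on the Windows `tasklist` CSV output. The threshold of 20 is taken from the observation above; it is not an official limit.

```python
import csv
import io
import subprocess

# Assumption: the image name begins with this string (compared in lower case).
PROXY_PREFIX = "veeamhvwmiproxy"


def count_proxy_processes(tasklist_csv: str, prefix: str = PROXY_PREFIX) -> int:
    """Count rows of `tasklist /FO CSV /NH` output whose image name starts with prefix."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    return sum(1 for row in rows if row and row[0].lower().startswith(prefix))


def flag_nodes(counts: dict[str, int], threshold: int = 20) -> list[str]:
    """Return the node names whose process count is at or above the threshold."""
    return sorted(node for node, n in counts.items() if n >= threshold)


def query_node(node: str) -> str:
    """Run tasklist against a remote node (Windows only, needs suitable rights)."""
    result = subprocess.run(
        ["tasklist", "/S", node, "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a management host this could be used as `flag_nodes({n: count_proxy_processes(query_node(n)) for n in nodes})`, where `nodes` is your own list of cluster node names. Keep in mind that WMI and the host itself may be too slow to answer during an incident, so the query can time out exactly when the count is most interesting.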
Mitigations tested:
- Sophos XDR completely removed from all nodes
- Windows Defender exclusions for:
- All Veeam folders & processes
- C:\ClusterStorage\*
- Issue still reproducible
Notable behavior:
- Disabling Windows Defender on the affected node during the incident leads to:
- CPU dropping
- Gradual host recovery
- Backups continuing without stopping
Microsoft feedback:
- MS confirmed that Cluster FL 2025 introduces stricter control-path and resiliency behavior (faster hang detection, CSV auto-pause).
- Workloads that were stable under FL 2022 may now trigger protective behavior under FL 2025 when backup/WMI activity and filter stacks are involved.
- No official regression article yet.
-
_tcpip_
- Lurker
- Posts: 2
- Liked: never
- Joined: Sep 28, 2023 1:33 pm
- Contact:
Re: Failover Cluster crashing during backup after 2025 update
Hi,
Which file system are you using?
That sounds similar to the ReFS bug under 2025.
veeam-backup-replication-f2/server-2022 ... 96912.html
If it's ReFS, was the ReFS version upgraded during the in-place upgrade?
Could it be?
-
dasfliege
- Service Provider
- Posts: 328
- Liked: 69 times
- Joined: Nov 17, 2014 1:48 pm
- Full Name: Florin
- Location: Switzerland
- Contact:
Re: Failover Cluster crashing during backup after 2025 update
I've seen this thread as well, but I don't think it has anything to do with our case, since our problems appear on the Hyper-V hosts and not on the backup server.
On our Hyper-V nodes we are using NTFS for the boot partition and CSVFS for the CSVs.
-
_tcpip_
- Lurker
- Posts: 2
- Liked: never
- Joined: Sep 28, 2023 1:33 pm
- Contact:
Re: Failover Cluster crashing during backup after 2025 update
I think CSVFS works on top of an underlying file system (NTFS or ReFS).
And if it's ReFS, it doesn't matter whether it's a backup server or a Hyper-V host, because the error relates to ReFS itself.
If it is based on NTFS, this is irrelevant.
An NTFS boot partition is fine; booting from ReFS isn't possible yet. That should come in the next version.