We have a problem with backup, and only with backup, on CSV storage in a Windows Server 2012 R2 cluster that is up to date with the latest KBs.
During Veeam backups we see high latency, up to 5 seconds and sometimes 15 seconds!
Perfmon shows latency only on redirected IO, so the storage itself does not seem to be the problem.
Redirected IO also shows high latency with only one node up!
I don't understand why there is redirected IO when only one node is up. Why doesn't CSV use direct IO?
Do you use a software or hardware VSS provider? Also, I'm not sure I fully understand what you mean when you say that only one node is running. Can you please clarify this?
We use the software VSS provider. There is no hardware VSS provider for our ZFS SAN.
I thought that with only one node up, redirected IO would not be possible. But even with only one node up there is redirected IO with high latency.
Today I checked Perfmon, and the "Direct IO Failure redirection" counter is not incrementing. So this is proof that the cause of the latency lies between CSV and Hyper-V.
Also, Get-StorageReliabilityCounter shows latency up to 50 ms.
Node1 is the owner of the CSV.
When node1 and node2 are both up, we have high latency on redirected IO.
When node1 is alone (node2 is halted), we have the same high latency on redirected IO.
I meant to ask whether you have ever had direct IO working during a backup job. Assuming that everything below is correct, please contact our technical support team to review your setup:
1. You're using Software VSS provider
2. You don't have Event ID = 5125 on the Hyper-V node acting as the owner of CSV
3. When one host (not the owner of the CSV) is shut down and you back up VMs residing on the CSV owner host, you still observe redirected IO all the time
After reviewing your setup, our team should be able to understand what is causing this behavior of the Hyper-V host. Since we do not control this process (redirecting traffic), it would be beneficial for all community members to know the results of your investigation.
There is no solution for this problem without using a hardware VSS provider.
We have this problem with 4 SANs (10 Gbps, InfiniBand and FC): ZFS SAN, P2000, MSA2000 and HP 3PAR.
This happens only with CSV on a Windows Server 2012 R2 cluster. The IO stats in Windows show latency ONLY on redirected IO, but the CSV status never changes to "io redirect".
We seem to have a similar issue: we get huge response times on all VMs in the CSV during the backup. By huge response times I mean that they sometimes freeze for 10 seconds. I do not see any performance stress on the SAN (it reports <<10 ms on the disks) or on the network (2x 10 Gbit team, SAN connected through 2x 10 Gbit MPIO), and there is no CPU load.
The largest latency spikes always appear in
\Device\Harddisk\Volume6\System Volume Information\... which corresponds to the CSV that hosts the VMs.
They occur as soon as I perform any action in the VMs: logging in, opening Internet Explorer, or logging out is all that's needed to cause this.
Case ID is 02053685.
Could someone let me know how I can check whether the IOs are redirected in our case as well? It does not show up in Failover Cluster Manager. Which metric should I check in Perfmon?
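One possible way to look at this offline, offered as a rough sketch rather than a definitive answer: record the cluster CSV counters in Perfmon during a backup window, export the log to CSV (for example with relog), and then scan the export for any column whose name mentions redirection. The Python below assumes the usual Perfmon CSV layout (a timestamp in the first column, one counter path per further column, blank cells for missed samples). The function name, the "redirect" search string and the counter paths in SAMPLE are illustrative assumptions, not verified counter names from a real cluster.

```python
import csv
import io

# Hypothetical export: these counter paths are made-up examples,
# not verified counter names from a real cluster.
SAMPLE = r'''"Time","\\NODE1\Example CSV Set(Volume1)\Redirected Read Latency","\\NODE1\Example CSV Set(Volume1)\Read Latency"
"10:00:00","0.002","0.001"
"10:00:01"," ","0.001"
"10:00:02","5.1","0.003"
'''


def summarize_redirected(csv_text, needle="redirect"):
    """Return {counter_path: (peak_value, nonzero_samples)} for every
    column whose header contains `needle` (case-insensitive)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return {}  # empty export, nothing to report

    # Column 0 is the timestamp, so only later columns are candidates.
    wanted = [i for i, name in enumerate(header)
              if i > 0 and needle.lower() in name.lower()]
    stats = {header[i]: [0.0, 0] for i in wanted}

    for row in reader:
        for i in wanted:
            if i >= len(row):
                continue  # short row
            try:
                value = float(row[i])
            except ValueError:
                continue  # blank or non-numeric sample
            entry = stats[header[i]]
            entry[0] = max(entry[0], value)
            if value > 0:
                entry[1] += 1

    return {name: (peak, hits) for name, (peak, hits) in stats.items()}


if __name__ == "__main__":
    for name, (peak, hits) in summarize_redirected(SAMPLE).items():
        print(name, "peak:", peak, "non-zero samples:", hits)
```

For a real export, read the file with open(path, newline="") and pass its contents to summarize_redirected. If the redirected columns show non-zero samples during the backup window while Failover Cluster Manager still reports the volume as online, that would match what was described earlier in this thread.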