-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
For the "Failed to map guest I/O buffer for write access with status 0xC0000044"
I see those occasionally as well... Double check your permissions on the VHDX
(see url) https://redmondmag.com/articles/2017/08 ... blems.aspx
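For what it's worth, a minimal sketch of checking (and, if needed, re-granting) those VHDX permissions from an elevated PowerShell prompt; the path and VM name here are placeholders, not values from this thread:
# List the ACL on the VHDX - the VM's own SID (NT VIRTUAL MACHINE\<VM GUID>) should appear with access
icacls "D:\VMs\MyVM\MyVM.vhdx"
# If the VM's SID is missing, re-grant it using the GUID from Get-VM
$vmId = (Get-VM -Name "MyVM").VMId
icacls "D:\VMs\MyVM\MyVM.vhdx" /grant ("NT VIRTUAL MACHINE\{0}:(F)" -f $vmId)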
-
- Veeam Software
- Posts: 723
- Liked: 185 times
- Joined: Jun 05, 2013 9:45 am
- Full Name: Johan Huttenga
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Christine, I would venture to say that permissions would only affect whether a file is accessible at all. Permissions would not be a factor in situations with variable I/O performance (as in, the VHD is accessible, but reads and writes are performing poorly).
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Johan.h
You are right; I was thinking it might help with some odd inherited-permissions issue.
-
- Enthusiast
- Posts: 36
- Liked: 4 times
- Joined: Jun 14, 2016 9:36 am
- Full Name: Pter Pumpkin
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
If anyone has any Microsoft cases open, could you please PM the ticket numbers to me? The MS tech we're working with has acknowledged this thread and has asked for as many case numbers as possible.
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
No case number.
However, some more data points.
Some additional things that helped (a rough PowerShell sketch follows at the end of this post):
- Disable deduplication (helped some). The built-in jobs did not seem to be the trigger, but even with the built-in jobs disabled, a dedup job on low background settings (around 10% cores, 10% memory, StopIfBusy, low priority) could still hit the issue while it ran.
- NTFS instead of ReFS (helped some)
- Lower the number of columns from 8 (the auto value) to 2 (equiv. of 20, 1 TB NVMe / node) - this helped quite a bit
- Limit the number of virtual disks from 4 to 2 (on CSV, one assigned as file server) - this helped moderately
The above has made the issue minimal until there is a patch.
It still occurs occasionally during planned node Pause/Drain operations.
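For anyone wanting to replicate those two storage changes, here is a minimal sketch of the PowerShell involved. The volume path, pool name, and size are placeholders rather than my actual values, and the column count can only be chosen when a volume is created, not changed afterwards:
# Turn off deduplication on the volume that holds the VHDX files
Disable-DedupVolume -Volume "C:\ClusterStorage\Volume1"
# Column count is fixed at creation time, so the lower value means creating a new volume
New-Volume -StoragePoolFriendlyName "S2D Pool" -FriendlyName "CSV-2col" `
    -FileSystem CSVFS_NTFS -NumberOfColumns 2 -Size 2TB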
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Oct 17, 2018 11:32 am
- Full Name: Jesper Skovdal
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Hi, we have the same problem with disk latency, worked around by live-migrating the VM once in a while.
There is a Microsoft support case number; I will post it later.
The environment is a Hyper-V 2019 cluster on UCS blades and Infinidat storage.
I hope that by posting here we can collect as many cases as possible.
Regards
Jesper
-
- Enthusiast
- Posts: 76
- Liked: 16 times
- Joined: Oct 27, 2017 5:42 pm
- Full Name: Nick
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Wishr,
I did open a case with Microsoft Support on this last October but after 2 months of interacting with them (scores of Conversations, Reports, Logs & Remote Sessions, etc.) they – without any explanation of why – just stopped responding! The last I heard from them was in January when they said they would 'get back to me' ... but they never did and never even replied to my follow-up messages!
FWIW, I also had a case open with Dell Support (who was pretty accommodating) but that was essentially put 'on hold' when they concluded that it wasn’t a Hardware related issue.
Nick
-
- Enthusiast
- Posts: 76
- Liked: 16 times
- Joined: Oct 27, 2017 5:42 pm
- Full Name: Nick
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
DG-MC,
You referenced “Version 10 VMs”. Was that a typo? AFAIK 9.1 is currently the highest Version (on Win10/Server 1903).
Thanks,
Nick
-
- Veteran
- Posts: 3077
- Liked: 455 times
- Joined: Aug 07, 2018 3:11 pm
- Full Name: Fedor Maslov
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Hi Nick,
Sad to hear that.
I recently spoke to @pterpumpkin in private and provided him with the number of the ticket we had opened with Microsoft some time ago. Also, as far as I can tell from his post above, he has an ongoing conversation with a Microsoft engineer, and Microsoft needs as many support cases as possible to push a fix for this issue.
Thanks
-
- Enthusiast
- Posts: 36
- Liked: 4 times
- Joined: Jun 14, 2016 9:36 am
- Full Name: Pter Pumpkin
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Thank you for all the PM's with case numbers! Please keep them coming. I have a total of 6 now (including my own).
-
- Lurker
- Posts: 1
- Liked: 1 time
- Joined: Sep 07, 2020 7:12 am
- Full Name: Bartłomiej Kowalczyk
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Hello,
A very interesting topic.
We use a backup system other than Veeam, but the problem is the same.
I have registered a case with Microsoft, Service Request 120052425000074.
Unfortunately, it remains unanswered.
-
- Influencer
- Posts: 19
- Liked: 18 times
- Joined: Jul 06, 2020 2:31 pm
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Hi
For information, I spoke with the person handling our case at Microsoft, and he told me that he had grouped together 7 or 8 cases with the same problem.
I think they're trying to reproduce the problem in the lab.
Best Regards
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
[Possibly Solved, LONG] Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
OK... I have another update. I believe I now see what the combination of issues is on a Dell 12g/13g server; it's a combination of the causes listed below.
- Note this is an ALL PCIe NVMe Cluster so results may vary. I still need to test a non-cluster, but I expect similar results
- I will be doing additional testing later on a non-cluster machine and will add another post with those results
Causes (all conditions)
- Meltdown/Spectre microcode BIOS + OS patches (hinted at from some posts I found, but no real solution without leaving the server vulnerable)
- The New Hyper-V Core Scheduler vs Classic Scheduler (as a result of above)
- VM spans NUMA boundaries on processors/Logical cores (this affected ReFS Guest (integrity on or off) more than NTFS Guest)
The following seems to have gotten me the performance I expected (guest I/O no more than 5-10% slower than host I/O, with host and guest latency tracking each other, rather than the guest latency spiking and/or guests affecting each other).
How to Diagnose the problem
To track down the error, you can assign all the VMs any StorageQosPolicy (e.g. minimum IOPS 50, maximum IOPS 0, maximum bandwidth 0). This is mostly for monitoring; it DOES NOT solve the problem on its own, only the changes detailed below do.
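A rough sketch of creating and attaching such a monitoring-only policy (run where Storage QoS lives, i.e. on the cluster; the policy name is just an example):
# A minimal policy purely so Storage QoS starts tracking per-flow latency
$policy = New-StorageQosPolicy -Name "MonitorOnly" -MinimumIops 50 -MaximumIops 0 -MaximumIOBandwidth 0
# Attach it to every virtual hard disk of every VM on this host
Get-VM | Get-VMHardDiskDrive | Set-VMHardDiskDrive -QoSPolicyID $policy.PolicyId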
- Run the command below to track the latency of the VMs and compare it with the host hardware I/O latency; they should track each other (one VM shouldn't kill the other VMs' latency, which is what causes the storage latency errors from the original topic of this thread). Note that, because of averaging, this command will lag a few seconds behind depending on the number of I/Os issued, so it should "track/follow" host latency plus a few percent of overhead.
Get-StorageQoSFlow | Sort-Object InitiatorLatency | Select -Last 10
(The examples below are real numbers; the host volume is CSVFS_NTFS, the guest volume tested is NTFS, and both use default NTFS settings.)
I can't get into all the details of how to read diskspd results here, but there are some good articles if you google it. The important highlights follow.
When running the testing below with diskspd, I could see huge differences between host latency and guest latency. The command I used to put load on the host or guest and compare latency uses diskspd from Microsoft (a free download), in a horrible worst case meant to simulate SQL, where:
- It leaves the hardware cache enabled and software caching disabled (needed for SSD/NVMe to be measured correctly)
- It writes a 20 GB file to iotest.dat, so ensure that path points to your test volume (the example uses the F: drive)
- You can read the full breakdown of the command parameters using diskspd /?
- It runs for ~35 seconds (30 seconds of test time plus warm-up and cool-down)
- If you run the command a second time (output to a different file), the values should be similar between run 1 and run 2; if they aren't, check for other I/O on the volume and/or run a third time and average as appropriate
diskspd -b8K -d30 -o4 -t8 -Su -r -w25 -L -Z1G -c20G F:\iotest.dat > testResults.txt
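For anyone unfamiliar with the switches, here is the same command annotated (the redirect to testResults.txt is unchanged):
# -b8K  8 KB block size            -d30  run for 30 seconds
# -o4   4 outstanding I/Os/thread  -t8   8 worker threads
# -Su   disable software caching (hardware cache stays enabled)
# -r    random I/O                 -w25  25% writes / 75% reads
# -L    capture per-I/O latency    -Z1G  1 GB random write-source buffer
# -c20G create a 20 GB test file at the target path
diskspd -b8K -d30 -o4 -t8 -Su -r -w25 -L -Z1G -c20G F:\iotest.dat > testResults.txt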
For CPU usage, compare the following:
(CPU load, bad case; notice the low CPU load)
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 11.82%| 0.73%| 11.09%| 88.18%
1| 11.56%| 0.63%| 10.94%| 88.44%
2| 11.35%| 0.94%| 10.42%| 88.65%
3| 11.09%| 0.73%| 10.36%| 88.91%
4| 11.25%| 0.47%| 10.78%| 88.75%
5| 11.25%| 0.63%| 10.63%| 88.75%
6| 10.99%| 0.57%| 10.42%| 89.01%
7| 10.47%| 0.47%| 10.00%| 89.53%
8| 7.19%| 0.94%| 6.25%| 92.81%
9| 6.36%| 0.94%| 5.42%| 93.64%
10| 5.63%| 0.68%| 4.95%| 94.37%
11| 5.05%| 0.05%| 5.00%| 94.95%
12| 5.21%| 0.31%| 4.90%| 94.79%
13| 7.61%| 1.36%| 6.26%| 92.39%
14| 5.84%| 0.52%| 5.32%| 94.16%
15| 5.99%| 0.94%| 5.05%| 94.01%
16| 6.36%| 1.09%| 5.27%| 93.64%
17| 5.73%| 0.89%| 4.84%| 94.27%
18| 5.16%| 0.26%| 4.90%| 94.84%
19| 5.01%| 0.42%| 4.59%| 94.99%
20| 5.52%| 0.52%| 5.00%| 94.48%
21| 5.06%| 0.31%| 4.74%| 94.94%
22| 4.79%| 0.36%| 4.43%| 95.21%
23| 6.26%| 0.26%| 6.00%| 93.74%
-------------------------------------------
avg.| 7.61%| 0.63%| 6.98%| 92.39%
(CPU load, good case; notice it actually keeps some of the CPUs busy now)
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 95.42%| 3.07%| 92.35%| 4.58%
1| 96.46%| 2.92%| 93.55%| 3.54%
2| 96.56%| 2.55%| 94.01%| 3.44%
3| 95.52%| 3.12%| 92.40%| 4.48%
4| 95.26%| 2.86%| 92.40%| 4.74%
5| 95.00%| 3.49%| 91.51%| 5.00%
6| 95.16%| 2.71%| 92.45%| 4.84%
7| 95.58%| 2.86%| 92.71%| 4.42%
8| 35.64%| 0.47%| 35.17%| 64.36%
9| 35.17%| 0.31%| 34.86%| 64.83%
10| 32.43%| 0.42%| 32.01%| 67.57%
11| 31.86%| 0.47%| 31.39%| 68.14%
12| 30.14%| 0.26%| 29.88%| 69.86%
13| 31.39%| 0.21%| 31.18%| 68.61%
14| 27.11%| 0.16%| 26.95%| 72.89%
15| 27.33%| 0.26%| 27.07%| 72.67%
16| 25.51%| 0.47%| 25.04%| 74.49%
17| 27.54%| 0.21%| 27.33%| 72.46%
18| 28.32%| 0.21%| 28.11%| 71.68%
19| 25.86%| 0.10%| 25.75%| 74.14%
20| 28.58%| 0.05%| 28.53%| 71.42%
21| 26.69%| 0.21%| 26.48%| 73.31%
22| 25.98%| 0.16%| 25.82%| 74.02%
23| 26.65%| 0.16%| 26.50%| 73.35%
-------------------------------------------
avg.| 51.30%| 1.15%| 50.14%| 48.70%
Comparing the summary at the end of the output, you should see the values at the 95th and 99th percentiles and below "tracking" the latency from the Get-StorageQoSFlow command above.
(Summary example, bad VM guest, before the changes. See how the write latency goes horribly bad; during this time, Get-StorageQoSFlow showed horrible latency, which affected other volumes, yet the host I/O latency (in Windows Admin Center, for example) stayed low over the same period, proving the "overhead" was induced somewhere in the hypervisor.)
total:
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.033 | 0.466 | 0.033
25th | 0.137 | 83.024 | 0.196
50th | 0.331 | 91.370 | 0.498
75th | 0.606 | 100.709 | 4.642
90th | 0.962 | 113.450 | 94.638
95th | 1.504 | 170.363 | 103.408
99th | 4.117 | 226.077 | 178.741
3-nines | 78.729 | 1024.335 | 405.798
4-nines | 309.404 | 1901.500 | 1512.306
5-nines | 311.810 | 2077.842 | 2063.667
6-nines | 312.010 | 2077.842 | 2077.842
7-nines | 312.010 | 2077.842 | 2077.842
8-nines | 312.010 | 2077.842 | 2077.842
9-nines | 312.010 | 2077.842 | 2077.842
max | 312.010 | 2077.842 | 2077.842
(Summary example, good case; notice the latency.)
total:
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.124 | 0.375 | 0.124
25th | 1.418 | 1.601 | 1.462
50th | 2.038 | 2.232 | 2.088
75th | 2.944 | 3.139 | 2.996
90th | 4.298 | 4.549 | 4.361
95th | 5.492 | 5.911 | 5.598
99th | 9.346 | 9.946 | 9.506
3-nines | 23.907 | 26.661 | 24.948
4-nines | 69.912 | 96.569 | 80.284
5-nines | 99.576 | 100.424 | 99.909
6-nines | 101.410 | 106.454 | 106.129
7-nines | 106.453 | 106.454 | 106.454
8-nines | 106.453 | 106.454 | 106.454
9-nines | 106.453 | 106.454 | 106.454
max | 106.453 | 106.454 | 106.454
So now to the solution in my environment:
- Disabling SMT/HyperThreading in the BIOS
- this forces fallback to the core scheduler
- Ensure that, for all your VMs, you run the following and set HwThreadCountPerCore to 0 on a Windows Server 2019 host or 1 on a Windows Server 2016 host, where VMName is the VM (mine was previously set to 0, which means "follow the SMT setting"); see https://docs.microsoft.com/en-us/window ... nistrator. A sketch for applying this to all VMs follows after this list.
Set-VMProcessor -VMName <VMName> -HwThreadCountPerCore <0, 1, 2>
- Ensure no VM spans NUMA (per VM Logical Processor <= Physical core count of smallest physical processor)
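As referenced in the list above, a minimal sketch for applying the HwThreadCountPerCore setting across all VMs on a 2019 host (the VMs have to be off for processor settings to change; on a 2016 host the value would be 1 instead):
# 0 = inherit the host's SMT setting (per the docs link above); apply to every powered-off VM
Get-VM | Where-Object State -eq 'Off' | Set-VMProcessor -HwThreadCountPerCore 0
# Verify the resulting value per VM
Get-VM | Get-VMProcessor | Select-Object VMName, HwThreadCountPerCore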
*NOW*, CBT only occasionally causes some storage latency, ONLY FOR THE guest volume involved, and only if the guest volume is ReFS, during the backup after large data changes (I added 1.2 TB of partitions to CBT). With an NTFS guest partition this was not observed. Subsequent CBT backups did not cause latency issues.
This fix also worked for host CSVFS_ReFS (with or without integrity) with a guest NTFS volume, and with a guest ReFS volume (only tested without integrity) as well, though that needs more testing on my end.
Obviously CSVFS_ReFS is slower in my insane test (25-40% slower), but there are no I/O latency spiking issues; it's just "not as fast" in absolute numbers.
I still have more testing to do, but I'm hoping the above helps MS track this down, and helps others solve or work around the issue in their environments.
Thanks,
Christine
-
- Veteran
- Posts: 528
- Liked: 144 times
- Joined: Aug 20, 2015 9:30 pm
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
If NUMA spanning is actually part of the issue, you can disable/block NUMA spanning at the host level by doing "Set-VMHost -NumaSpanningEnabled $false" in PowerShell. I've always done this on my Hyper-V servers to improve performance. It would be interesting to see what it looks like with hyperthreading enabled but NUMA spanning disabled.
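For reference, a tiny sketch of checking the current setting before flipping it; my understanding is that running VMs generally need a restart to pick up the change:
# Check whether NUMA spanning is currently allowed on this host
Get-VMHost | Select-Object NumaSpanningEnabled
# Block NUMA spanning at the host level
Set-VMHost -NumaSpanningEnabled $false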
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Also, the more I look at this, I think we are all chasing two partially overlapping problems...
#1 The guest VM performance issue, which is what I've documented. This seems to be resolved for the guest VM, for many hours; the steps to get back to classic-scheduler behavior, with SMT/HyperThreading disabled in the BIOS, fix the performance now, until #2 happens.
#2 The I/O scheduler just seeming to get confused. When #1 is resolved, the improved performance seems to reduce the frequency of #2 because the guest I/Os complete more quickly, not because the cause of #2 has been resolved.
- NUMA spanning only seemed to upset CSVFS_ReFS and/or ReFS *directly*, and may be a red herring
- When on CSVFS_ReFS, the storage subsystem could bring itself to a near halt when migrating a 400 GB or larger ReFS guest with CBT (whether the guest was running or not), even after ensuring I had done a backup of the offline VM (so there shouldn't be any "changes" after that, right?!?!)
- When things slow down enough to produce the error from the original post ("Source: Microsoft-Windows-Hyper-V-StorageVSP"), I've seen it show up for target VMs irrespective of whether the path is DAS or CSV storage, and irrespective of clustering. MANY times the path is that of a file that no longer exists on the volume, and it appears irrespective of whether I've issued Flush-Volume commands against the volume. I can also see (at least on a cluster) that sometimes the QoS flow is duplicated after moving a volume between storage, and the QoS flow for the "old" location/path isn't dropped until you stop and restart the VM (even hours after the VM finished moving); a sketch for spotting duplicated flows follows below.
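A rough sketch of one way I've been spotting those stale/duplicated flows, grouping on the initiator and file path (property names as exposed by Get-StorageQosFlow, to the best of my knowledge):
# Flows that reference the same VM + VHDX path more than once usually point at a stale entry
Get-StorageQosFlow |
    Group-Object InitiatorName, FilePath |
    Where-Object Count -gt 1 |
    Select-Object Count, Name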
Some hypotheses:
I've also noticed that once you get the I/O "quiet" on the Hyper-V host/cluster *AND* shut down most or all VMs, the storage subsystem will catch back up... it seems to clear out whatever was hanging it, and it can stay that way even under load afterwards.
It is almost like the I/O scheduler and/or CBT gets "confused" and thinks an "old" I/O hasn't completed, and starts hanging subsequent I/O dependent on that read/write (even though it did occur)...
So for now, the host CSV is CSVFS_NTFS, and all but one partition (due to timing; I will convert it tonight) are NTFS. There were no I/O issues during any of that, moving everything at ~1.2 GB/s or more with no delay. It was moved from CSVFS_ReFS, and the entire time the latency on the CSVFS_NTFS destination was lower than the latency on the CSVFS_ReFS source.
So after tonight, in any scenario where there is a Guest VM, I am avoiding ReFS on the host or guest. I will leave CBT on for a few days, and if there are any issues, completely disable that to see if that finishes eliminating the issues.
I will continue to use ReFS for my backup storage, as that seems to work without any issues, as long as Guest VMs are not on the partition.
I'll give some more reports back on this, but to summarize
Problem 1 - Guest VM has poor I/O performance on a Windows Server 2019 host
- Turn off SMT/HyperThreading and let the classic scheduler work the way things used to, which means the Spectre/Meltdown patches aren't part of the problem (see the scheduler note after this list)
- Don't use RefS/CSVFS_ReFS for Host or Guest (overlaps with helping with Problem 2)
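Side note on the scheduler point above, offered as a pointer I have not validated on this hardware: the hypervisor scheduler type can reportedly also be inspected and switched without touching the BIOS, for example:
# Show which scheduler the hypervisor booted with (logged at boot as event ID 2)
Get-WinEvent -FilterHashtable @{ ProviderName='Microsoft-Windows-Hyper-V-Hypervisor'; Id=2 } -MaxEvents 1 | Format-List Message
# Switch to the classic scheduler (host reboot required)
bcdedit /set hypervisorschedulertype classic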
Problem 2 - The originally reported ("Source: Microsoft-Windows-Hyper-V-StorageVSP") error and I/Os hanging as a result
- Perform steps for Problem 1 to improve performance, which reduces the occurrence
- TBD, after I run with this configuration for a few days, disable CBT completely in the backups
I think it's the combination of these overlapping issues that makes Problem 2 so difficult to track down and reproduce 100% reliably.
Again, I hope all of this helps us, as a community, diagnose what the underlying bugs are; worst case, these are additional data points that may solve the problem for you.
Thanks,
Christine
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Ok.. I've done more testing, now on NON-Clustered servers.
All Servers were Dell 11g and 12g servers.
All are DAS against Both SAS SFF HDD (10K and 15K RPM) and SATA SSD (prosumer and consumer grade) on HW Raid (PERC H700p, PERC H710p (internal controllers), and PERC H800 (external) )
So that eliminates Storage Spaces, S2D, and networking (40 GbE) as factors in the performance.
This solution (disabling SMT/HyperThreading) has dramatically improved disk I/O for VMs in all scenarios I've tested.
Additionally (second test), converting the hosts' volumes back to NTFS had a minimal effect on performance, but enough to be worth converting them back.
I also did some NUMA testing and saw little to no change in performance (within the margin of error).
Can someone else (original poster?) try turning off SMT/HyperThreading on their host to see if this improves their performance as well?
Thanks,
Christine
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Apr 03, 2020 5:13 pm
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
pterpumpkin wrote: ↑Sep 02, 2020 8:20 pm Thank you for all the PM's with case numbers! Please keep them coming. I have a total of 6 now (including my own).
Peter, can you please PM me and I'll reply with my similar case number? The forum is not allowing me to send PMs yet because I have not been in discussions. Thanks!
-
- Chief Product Officer
- Posts: 31816
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Well, now you can
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
nmdange wrote: ↑Sep 09, 2020 2:33 am If NUMA spanning is actually part of the issue, you can disable/block NUMA spanning at the host level by doing "Set-VMHost -NumaSpanningEnabled $false" in PowerShell. I've always done this on my Hyper-V servers to improve performance. It would be interesting to see what it looks like with hyperthreading enabled but NUMA spanning disabled.
Just to directly follow up: NUMA was not part of the performance issue.
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Sep 16, 2020 12:36 pm
- Full Name: Nick Bennett
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Hi,
I'm seeing event ID 9 on a single VM in a customer environment. It's very intermittent in terms of occurrence; sometimes we run fine for days without an issue and then we get it twice in 2 days. The period of time the issue lasts also varies greatly.
From reviewing this thread, I get the impression that disabling CBT in Veeam has no effect and that the issue lies within the Microsoft RCT driver on the Hyper-V hosts, something we can't actually disable; the workaround is to migrate the VM to another host, interrupting the RCT process that is causing the issue. I'll try this the next time it occurs.
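In case it's useful when it next occurs, a minimal sketch of that live-migration workaround from PowerShell on a clustered host (the VM and node names are placeholders):
# Live migrate the affected VM to another cluster node to interrupt the hung I/O
Move-ClusterVirtualMachineRole -Name "ProblemVM" -Node "HV-NODE2" -MigrationType Live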
Is anyone with an MS ticket logged getting any sensible feedback?
Thanks
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
@nickthebennett
Look at my previous few replies for some details that may help (essentially, turning off SMT/HyperThreading/logical processors in the BIOS, and using NTFS instead of ReFS for both the host volume and the VM guest volume).
Also above is how I have been testing to show the large difference in performance (before and after). On all the servers I've tested (specs above), this solved the issue WITHOUT disabling CBT.
Let us know your machine specs and config, and whether your environment has positive results when trying my suggestions.
Thanks,
Christine
-
- Novice
- Posts: 3
- Liked: 1 time
- Joined: Jun 20, 2020 6:41 pm
- Full Name: Giovani Moda
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Hello.
First of all, Christine, absolutely amazing work. I really hope this dedicated work of yours helps MS track down this issue.
You know, I always thought that guest VM performance on Server 2019, especially when running on 14th-generation Dell servers, was just somewhat "off" compared to the same setup on Server 2016. I think you know what I'm saying: laggy screens, a few extra seconds to open an application or to list the contents of a folder, etc. But I could never really put my finger on it and, as long as everything was running, I just brushed it off and kept going. But now it has become an issue.
Well, to the point: a few weeks ago I got a call from an MS engineer who gave me two registry keys that supposedly address the CBT issue:
HKLM\System\CurrentControlSet\Services\vhdmp\Parameters\DisableResiliency = 1 (REG_DWORD)
HKLM\software\Microsoft\Windows nt\CurrentVersion\Virtualization\Worker\DisableResiliency = 1 (REG_DWORD)
I have been testing this in a very reduced lab environment for a week now, as I don't have access to high-end servers in my lab, and haven't noticed anything wicked going on. But since the issue, at least for me, is very hard to reproduce, I cannot say that it has indeed fixed anything. Backups are running, the guest VMs seem to be responding normally, and no alerts have been generated so far. I could not find any documentation about these keys, though, so that is something that is really bugging me.
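In case anyone wants to script them, a minimal sketch of setting those values from PowerShell (the keys may not exist yet, hence the New-Item; again, the values themselves appear to be undocumented):
# Create the keys/values the MS engineer suggested (undocumented; use at your own risk)
New-Item -Path "HKLM:\SYSTEM\CurrentControlSet\Services\vhdmp\Parameters" -Force | Out-Null
New-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Services\vhdmp\Parameters" `
    -Name DisableResiliency -Value 1 -PropertyType DWord -Force | Out-Null
New-Item -Path "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization\Worker" -Force | Out-Null
New-ItemProperty -Path "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization\Worker" `
    -Name DisableResiliency -Value 1 -PropertyType DWord -Force | Out-Null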
Anyway, I'm sharing this so that those of you who can reproduce the issue more easily can test it on a larger scale. Who knows, right?
Regards,
Giovani
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Giovani,
If you could, could you run the tests I gave above (and below, where F: is the volume), both outside a VM on the host's physical volume and from within a VM that has its VHDX on the same physical volume?
Run the tests both before the changes you just posted (and see if the host and guest results are drastically different, as described above), and then run them afterwards and compare to the before numbers. It's much quicker than waiting for the intermittent errors, since those only start showing once I/O delays reach 10 seconds or more.
diskspd -b8K -d30 -o4 -t8 -Su -r -w25 -L -Z1G -c20G F:\iotest.dat > testResults.txt
I'll have to test your changes when I get some free time next week, but so far, since I turned off HyperThreading and used NTFS on every host volume that holds guest VHDXs, as well as for the volumes inside the VHDXs, I haven't had any issues.
HTH,
Christine
-
- Enthusiast
- Posts: 36
- Liked: 4 times
- Joined: Jun 14, 2016 9:36 am
- Full Name: Pter Pumpkin
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Great news!!!! Just got off a call with Microsoft. They have confirmed that this has now been submitted as a bug. The product group is currently investigating the issue and working on a fix. They will review the severity of the issue and possibly release a fix in a round of Windows Updates. If they determine that the issue is not that severe or widespread and there is a sufficient workaround, they may not release a fix at all.
They are also 90% sure that the issue only occurs on Server 2016 VM's that have been migrated from Hyper-V 2016 to Hyper-V 2019. They're confident that if you build a fresh VM with Server 2016 or Server 2019 on a Hyper-V 2019 host/cluster, the issue will not reoccur.
They believe that deleting all the "RCT reference points" (which is just the RCT files associated with the VM) may also resolve the issue. They were not 100% confident about this, but it is possibly worth a try.
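For anyone who wants to try that, a hedged sketch for locating the RCT change-tracking files, which sit next to the VHDX files (extensions .rct and .mrt, as far as I'm aware); the path is a placeholder:
# List the RCT/MRT change-tracking files alongside the virtual disks (review before deleting anything)
Get-ChildItem -Path "C:\ClusterStorage\Volume1" -Recurse -Include *.rct, *.mrt | Select-Object FullName, Length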
-
- Enthusiast
- Posts: 76
- Liked: 16 times
- Joined: Oct 27, 2017 5:42 pm
- Full Name: Nick
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
“Microsoft. ... They are also 90% sure that the issue only occurs on Server 2016 VM's that have been migrated from Hyper-V 2016 to Hyper-V 2019. They're confident that if you build a fresh VM with Server 2016 or Server 2019 on a Hyper-V 2019 host/cluster, the issue will not reoccur.”
Our case was/is on a fresh Server 2019 Hyper-V Host with a fresh Server 2019 VM and a fresh Server 2016 VM – and I personally made that absolutely clear to MS Support... each & every time the case was escalated to the next tier support.
“They believe that deleting all the "RCT reference points" which is just the RCT files associated with the VM may also resolve the issue. They were not 100% confident on this though, but possibly worth a try.”
Tried that... No help...
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Nick-SAC, have you tried my solution yet?
-
- Enthusiast
- Posts: 76
- Liked: 16 times
- Joined: Oct 27, 2017 5:42 pm
- Full Name: Nick
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
No, I'm sorry to say that I haven't had a chance to try it yet Christine. I'm booked solid with other jobs right now...
-
- Novice
- Posts: 6
- Liked: never
- Joined: Sep 29, 2020 11:59 am
- Full Name: Casper Glasius-Nyborg
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
We are experiencing the same as you all. The performance improved greatly when we disabled HT on a few hosts, but we learned that a live migration was still needed to get the performance back. To me, that sounds like the RCT needs to be reset.
Like many of you, we also have an MS ticket on this and are not really getting anywhere. @pterpumpkin, could you help out with your own MS ticket number, which I could reference? Anybody else who has an MS ticket on this issue, please PM me as well.
Best regards
Casper
-
- Enthusiast
- Posts: 47
- Liked: 10 times
- Joined: Aug 26, 2019 7:04 am
- Full Name: Christine Boersen
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Casper,
Ok, this is confirming my observations as well, that there are two issues.
- The HT/Meltdown/Spectre performance problem
- The RCT getting messed up
Have you tried the registry keys (copied here) that Giovani posted above?
HKLM\System\CurrentControlSet\Services\vhdmp\Parameters\DisableResiliency = 1 (REG_DWORD)
HKLM\software\Microsoft\Windows nt\CurrentVersion\Virtualization\Worker\DisableResiliency = 1 (REG_DWORD)
Hope that helps. (I have *NOT* had a chance to try the registry keys yet; I'm in the middle of a deadline right now, 22 hours into this work day so far.)
Christine
-
- Influencer
- Posts: 12
- Liked: 3 times
- Joined: Aug 27, 2019 8:55 am
- Full Name: LeslieUC
- Contact:
Re: Windows Server 2019 Hyper-V VM I/O Performance Problem
Disabling CBT and HT/Meltdown/Spectre is not the solution. The problem is Microsoft RCT.
When doing an on-host backup with Veeam, an RCT file is created. Once the RCT file is on the disk and you test that disk with diskspd, the I/O performance problem appears from that point on and remains.
But if you change the backup job from on-host to an agent job (or delete it) and delete the RCT files from the disks, the I/O performance loss does not occur in diskspd testing. (We have our SQL Server virtualized, but we are doing an agent backup job, which solves our problems for now.)
Also, this is not a Veeam problem; all backup vendors that use RCT have the same issue and I/O performance loss. See the post from May 16, 2020 in this thread.