Exchange 2013 DAG guests fail after upgrading HyperV 2016

HenrikS. · Jul 13, 2018 6:17 am

Hello,

After upgrading two Hyper-V 2016 S2D clusters with the 2018-06 updates (KB4132216 and KB4284833), Windows 2012 R2 guests running Exchange CU21 DAGs fail to back up.
The backup log shows that:
The first DAG member started having this issue after the upgrade of the first Hyper-V cluster.
The second DAG member started having this issue after the upgrade of the second Hyper-V cluster.

Veeam reports the following error:
Processing EXCH-MBX03 Error: Error code: 0x80041032 Cannot query class instance from enumerator object [WMI] Failed get next object in collection.

System events show:
Log Name: System
Source: Service Control Manager
Date: 12.07.2018 19.54.35
Event ID: 7011
A timeout (30000 milliseconds) was reached while waiting for a transaction response from the VeeamVssSupport service.

The VeeamGuestHelper log within the guest shows these last lines, 30 seconds before the timeout:
12.07.2018 19:54:05 15720 Cleaning CRpcServer data.
12.07.2018 19:54:05 15720 INFO Clearing list of VSS snapshot jobs.
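
The timing can be cross-checked: the gap between the last VeeamGuestHelper line and the Event ID 7011 timestamp should match the 30000 ms service timeout. A minimal Python sketch, assuming both timestamps are in the same local time zone; the function name and the format strings are only for illustration and are not part of any Veeam tooling:

```python
from datetime import datetime

def gap_seconds(last_log_line: str, event_time: str) -> float:
    """Seconds between the last guest helper log line and the SCM event."""
    # The guest helper log writes the time with colons,
    # the event log export above writes it with dots.
    log_ts = datetime.strptime(last_log_line, "%d.%m.%Y %H:%M:%S")
    event_ts = datetime.strptime(event_time, "%d.%m.%Y %H.%M.%S")
    return (event_ts - log_ts).total_seconds()

gap = gap_seconds("12.07.2018 19:54:05", "12.07.2018 19.54.35")
print(f"gap: {gap:.0f} s, matches 30000 ms timeout: {gap * 1000 == 30000}")
```

If the gap matches, the helper most likely stopped responding right after its last log line rather than failing somewhere later.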

After the failures started, Veeam B&R was upgraded to 9.5U3a and received the 2018-07 MS patches, and the guests have received their 2018-07 patches as well.
The Hyper-V 2016 cluster has remained on 2018-06.

We have registered a Support case: ID# 3092339, but since this issue is pretty painful we are also looking to the community now.

Does anyone have a clue what could be going on here?
Is there any way to enable more debugging on the Veeam services on the guest interaction proxy, the Hyper-V hosts, or the guest?

-BR

Sep 23, 2018 6:18 pm

Will look into the support case.
Please check also: https://social.technet.microsoft.com/Fo ... Management

Sep 24, 2018 7:20 am

I see in your support case that the backup is now running again and that you are checking the root cause with Microsoft Support.

Overall, the following things were changed (based on the support log):
- ArbTaskMaxIdle was set to 3600000 https://support.microsoft.com/en-us/hel ... nd-2012-r2
- Host was rebooted
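
For readers who do not think in milliseconds, here is a rough sketch of what that value means. It treats the limit as a simple ceiling on how long a WMI query may run, which is a simplification, and the 45 and 61 minute query times are made-up examples, not measurements from the case:

```python
MS_PER_MINUTE = 60 * 1000

def ms_to_minutes(ms: int) -> float:
    """Convert a millisecond setting to minutes."""
    return ms / MS_PER_MINUTE

def query_survives(query_minutes: float, idle_limit_ms: int) -> bool:
    """True if a query of this length finishes before the limit (simplified model)."""
    return query_minutes < ms_to_minutes(idle_limit_ms)

new_limit_ms = 3600000  # ArbTaskMaxIdle value from the support log
print(ms_to_minutes(new_limit_ms), "minutes")

# Hypothetical query durations, for illustration only.
for minutes in (45, 61):
    print(minutes, "min query survives:", query_survives(minutes, new_limit_ms))
```

So the new value buys a 60 minute window, and a query that grows beyond an hour would run into the limit again.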

HenrikS. · Feb 08, 2019 12:49 pm

Just to follow up on this, as we now believe we have finally resolved the root cause.
Many thanks to both Veeam and Microsoft support, who both needed to be involved.

1: Veeam with CBT on versions prior to 9.5u4 converted snapshots to reference points in the .VMCX file.
Over time this resulted in a very large .VMCX file, for us 180 KB of real data in a file with a total size of 28 MB.

2: The Hyper-V service reads the .VMCX in 10-byte chunks, meaning that a huge .VMCX file takes a lot of time to read.

3: Veeam queries WMI of Hyper-V to generate the .VMCX file for backup, and in the case of a huge .VMCX with XXX reference points of XX disks, this query took longer than the ArbTaskMaxIdle limit, which defaults to 20 minutes.
Note: after some additional months we also exceeded 1 hour of query time.

4: After having identified the reason for the very large .VMCX file, we were able to delete OLD reference points with the newly released 9.5u4 version.
9.5u4 by default does not keep reference points anymore, but will also not delete OLDER reference points which are stored in the .VMCX.
To have 9.5u4 also delete OLD reference points, you must enable it by a registry setting that Veeam support can provide you with.

5: The result then was a bloated .VMCX file with a huge replay log and free space table. This, however, needed to be compacted, which we were able to do with the help of Microsoft Support.
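
To give a feel for why the file size matters with 10-byte reads, here is a back-of-the-envelope sketch in Python. It assumes 1 KB = 1024 bytes and that the whole file, including dead space, gets read in 10-byte chunks; neither is confirmed above, and the helper name is just for illustration:

```python
CHUNK_BYTES = 10  # read size mentioned in point 2

KB = 1024
MB = 1024 * KB

def read_calls(file_bytes: int, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Number of reads needed to walk the file once."""
    # Ceiling division: a partial final chunk still costs one read.
    return -(-file_bytes // chunk_bytes)

real_data = 180 * KB  # useful content in our .VMCX
on_disk = 28 * MB     # total size of the bloated file

print("reads for real data :", read_calls(real_data))
print("reads for whole file:", read_calls(on_disk))
print("bloat factor        :", round(on_disk / real_data))
```

Under these assumptions that is roughly 18 thousand reads for the real data versus about 2.9 million for the whole file, a factor of around 159.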

As a result, the backup window and the time spent changed as follows:
Time to query WMI to generate the .VMCX file: 1h->2m
Total time of backup window for this particular VM: 2h21m->1h
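
Expressed as percentages, a small Python sketch; the parser only handles the "XhYm" notation used above and is just a convenience for this post:

```python
import re

def parse_duration(text: str) -> int:
    """Convert strings like '2h21m', '1h' or '2m' to minutes."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?", text)
    if not text or match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = match.groups()
    return int(hours or 0) * 60 + int(minutes or 0)

def reduction_percent(before: str, after: str) -> float:
    """Percentage of time saved going from 'before' to 'after'."""
    b, a = parse_duration(before), parse_duration(after)
    return round(100 * (b - a) / b, 1)

print("WMI query    :", reduction_percent("1h", "2m"), "% shorter")
print("Backup window:", reduction_percent("2h21m", "1h"), "% shorter")
```

That works out to roughly 97% less time for the WMI query and about 57% less for the whole backup window of this VM.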
