ISSUE RAISED VIA SUPPORT PORTAL - CASE ID: 01000359
I’m wondering if anyone can help us with the issue below.
We are currently running around 12 HA VMs on a 2-node Windows Server 2012 R2 Hyper-V cluster. VM storage is housed on SMB 3.0 shares, which are served by a 2-node Windows Server 2012 Scale-Out File Server (SOFS) cluster.
Two SMB 3.0 shares have been provisioned from the two CSVs presented by the SOFS cluster. Each SOFS cluster node owns one CSV. Storage hardware is a Lenovo ThinkServer JBOD.
For VM backup, we are utilising Veeam Backup and Replication 8, with update 2b applied. The backup run begins at 10pm each evening, by means of a scheduled PowerShell script.
The backup completes successfully, but over the past week or so we have found, on arriving at the office the next day, multiple 1069 cluster events for VMs that rebooted overnight.
Which VMs reboot varies from one evening to the next.
In an effort to find the root cause of the problem, we disabled all Veeam VM backups one evening. The following morning, our Hyper-V cluster reported no issues for the entire period.
We then manually ran the backup script during office hours and watched for any issues. The backup ran without issue until it reached one of the last few VMs.
At that point, 3 of the VMs restarted: EMAIL, PRINTSRV and TS4. The VMs rebooted during the TS4 backup.
These restarted between 12.47pm and 12.48pm. All 3 came back up. There doesn’t appear to be any link between the three (apart from the fact that all 3 were running on the same HV node). What’s more odd is that the reboots occurred well after the backups of 2 of the VMs (EMAIL and PRINTSRV) had already completed.
I should add that there were other VMs running on that same HV node.
Backup completion times:
EMAIL – 10.49am
PRINTSRV – 12.08pm
TS4 – 12.55pm
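To put numbers on that gap, here is a rough Python sketch that works out how long each backup had been finished when the reboots started at 12.47pm. It assumes the two 12.xx completion times are pm (the run took place during office hours), and the date, variable names and helper functions are placeholders of mine, not anything taken from Veeam or the cluster logs.

```python
from datetime import datetime

# Placeholder date; only the time of day matters for this comparison.
DAY = "2000-01-01"

def at(hhmm: str) -> datetime:
    """Build a datetime on the placeholder day from a 24-hour HH:MM string."""
    return datetime.strptime(f"{DAY} {hhmm}", "%Y-%m-%d %H:%M")

# Completion times from the post (12.xx entries assumed to be pm).
backup_completed = {
    "EMAIL": at("10:49"),
    "PRINTSRV": at("12:08"),
    "TS4": at("12:55"),
}

# Start of the reboot window reported in the post.
reboot_window_start = at("12:47")

def minutes_before_reboot(completed, reboot):
    """Minutes between each backup finishing and the reboot.

    A negative value means the backup was still in progress at reboot time.
    """
    return {
        vm: int((reboot - done).total_seconds() // 60)
        for vm, done in completed.items()
    }

gaps = minutes_before_reboot(backup_completed, reboot_window_start)
for vm, gap in gaps.items():
    state = "still being backed up" if gap < 0 else f"finished {gap} min earlier"
    print(f"{vm}: {state}")
```

On those assumptions, EMAIL had been finished for nearly two hours and PRINTSRV for over half an hour, while TS4 was still mid-backup when all three restarted.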
The backup then proceeded until it reached the penultimate VM. At that point, I noticed that all VMs on all HV nodes had lost connection to their storage, and were either turning off or starting up on another node of the HV cluster. A few seconds later, I checked the logs for our SOFS cluster and saw that RHS (the cluster Resource Hosting Subsystem) had stopped unexpectedly, which caused the file cluster to restart and the VMs to bomb out.
Amazingly, the Veeam backup went on to back up the last VM once both clusters had returned to a normal running state.
Does anyone have any idea what is causing this problem? I keep reading that ODX (Offloaded Data Transfer) should be disabled in Server 2012 when the storage hardware doesn’t support it.
All I know is that running the backup causes problems.