-
- Service Provider
- Posts: 315
- Liked: 41 times
- Joined: Feb 02, 2016 5:02 pm
- Full Name: Stephen Barrett
- Contact:
Crippling VM Disk issue - only fixable by running a backup
I've been hit with a strange VM (Hyper-V) issue, where the VM loses the ability to write new data to its C:\ drive. It seems that it can overwrite existing data but not add new data. It presents itself in the following manner:
- OST files trigger out-of-disk-space errors in Outlook (there is 100 GB free on C:\)
- Chkdsk shows the C: volume as RAW instead of NTFS
- Disk Management cannot be opened.
As soon as I run a backup from Veeam (be it an ad hoc VeeamZIP or the main backup job for this VM), the VM suddenly kicks back into life: the disk space errors disappear, Resource Monitor shows a load of data being written to disk by various processes (catching up, I'd assume), and Chkdsk shows the volume as NTFS and completes with no errors.
I'd guess there is something not quite right with the re-merge of the last snapshot for the VM after the previous backup (these would be off-host 3PAR hardware VSS snapshots), but I've no idea where to look. I have 100 or so other VMs on the same storage (and 60 in the same backup job) that don't exhibit this behaviour, and there is nothing in the job logs for the nightly backup job.
Any ideas where I might begin digging?
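For anyone trying to pin down the same "can overwrite but cannot add" symptom from inside the guest, a minimal Python sketch along these lines might help separate the two cases. It is not from the original post: the function name `probe_volume`, the probe file name and the 4 KiB size are all hypothetical, and it assumes you already have a scratch file on the volume that is safe to rewrite.

```python
import os
import shutil
import uuid


def probe_volume(directory, existing_file):
    """Report whether a volume accepts in-place rewrites and brand-new files.

    `existing_file` should be a scratch file that already lives on the
    volume (hypothetical; create it while the VM is healthy).
    """
    report = {"free_bytes": shutil.disk_usage(directory).free}

    # Case 1: rewrite an existing file with its own bytes (no size change).
    try:
        with open(existing_file, "r+b") as fh:
            data = fh.read()
            fh.seek(0)
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())
        report["overwrite_existing"] = True
    except OSError as exc:
        report["overwrite_existing"] = False
        report["overwrite_error"] = str(exc)

    # Case 2: allocate a brand-new file, which needs fresh space on disk.
    new_path = os.path.join(directory, f"write-probe-{uuid.uuid4().hex}.tmp")
    try:
        with open(new_path, "xb") as fh:
            fh.write(b"\0" * 4096)
            fh.flush()
            os.fsync(fh.fileno())
        report["create_new"] = True
    except OSError as exc:
        report["create_new"] = False
        report["create_error"] = str(exc)
    finally:
        # Clean up the probe file if it was created.
        if os.path.exists(new_path):
            os.remove(new_path)

    return report
```

If `overwrite_existing` comes back `True` while `create_new` is `False` and `free_bytes` is still large, that would match the behaviour described above and points away from a genuinely full disk.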
-
- Veeam Software
- Posts: 712
- Liked: 168 times
- Joined: Nov 30, 2010 3:19 pm
- Full Name: Rick Vanover
- Location: Columbus, Ohio USA
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
I recommend opening a case with Veeam support. There may be some VSS issues in play, as you allude to, but our staff should review the logs to see whether there is an underlying cause here.
One idea comes to mind: restore this VM to a completely different host and storage (or as different as you can manage), disconnect the network, then try to reproduce the issue.
Where I'm going with that: if that fixes it, you could basically fail over to a new VM target. Sometimes that is easier than fixing the original. I used to do that when VMware snapshots would get messed up. If that appeals, consider a Veeam replication job and fail it over. To ensure zero loss of data, do the following:
1. Build a replication job
2. Run it
3a. Run it again (this picks up changes) and note roughly how long an increment takes with this VM
3b. Schedule a downtime
4. At the maintenance window:
4a. Run the replication job again
4b. Shut down the VM
4c. Run the replication job again (supported for powered-off VMs)
4d. Fail over and perform a check.
If all is good, make it a permanent failover.
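The timing from step 3a can be turned into a rough window estimate. The following is only an illustrative sketch: the function name `estimate_window`, the default shutdown and check durations, and the safety factor are assumptions made for the example, not Veeam guidance.

```python
from datetime import timedelta


def estimate_window(increment_runs,
                    shutdown=timedelta(minutes=5),
                    failover_check=timedelta(minutes=15),
                    safety_factor=1.5):
    """Estimate downtime and total window from observed increment durations.

    All defaults are hypothetical placeholders; replace them with your own
    measurements.
    """
    # Pad the slowest observed increment to stay on the conservative side.
    padded_increment = max(increment_runs) * safety_factor

    # Step 4a runs while the VM is still online, so it is not downtime.
    online_part = padded_increment

    # Steps 4b-4d: shut down, final increment, fail over and check.
    downtime = shutdown + padded_increment + failover_check

    return {"downtime": downtime, "total_window": online_part + downtime}
```

For example, with observed increments of 12, 9 and 15 minutes and the defaults above, this gives 42.5 minutes of downtime inside a 65-minute window. The final increment in step 4c is usually smaller than a normal one because step 4a ran just before it, so the figure leans pessimistic.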
-
- Service Provider
- Posts: 315
- Liked: 41 times
- Joined: Feb 02, 2016 5:02 pm
- Full Name: Stephen Barrett
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
I'd forgotten about this post. I'd pretty much settled on a version of what you suggest. In the end I could see that it wasn't re-merging the AVHDXs, the first AVHDX in the chain in particular, even with the VM powered down. Hyper-V would get to 3% on the re-merge and fail every time.
So I ran an ad hoc VeeamZIP with the intention of restoring it to a host off the main cluster. Veeam does not restore any snapshots and creates a full VHDX. During this time I'd shut down the original VM to prevent any changes. After about 2 hours the original VM re-merged the AVHDXs while I wasn't looking. The VM resource in the failover cluster did report that the VM had failed for some reason (it was already off) and started the VM up itself. All looks good so far.
If it recurs I'll do the restore I'd intended to do. If there is no joy there, I'll engage Veeam support.
EDIT: It was the manner in which this issue presented itself that I found interesting, though I'm glad it looks to have resolved itself.
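A quick way to spot differencing disks that were never merged back is to scan the VM's storage folder for lingering `.avhdx` files. This is a minimal sketch, assuming the disks sit under a single directory tree; the function name `find_differencing_disks` is hypothetical and the script only lists files, it does not inspect or merge the chain.

```python
from pathlib import Path


def find_differencing_disks(vm_storage_dir):
    """Return (path, size_bytes, mtime) for every .avhdx file, oldest first."""
    found = []
    for path in Path(vm_storage_dir).rglob("*"):
        # Match the extension case-insensitively, since NTFS is not case-sensitive.
        if path.is_file() and path.suffix.lower() == ".avhdx":
            stat = path.stat()
            found.append((path, stat.st_size, stat.st_mtime))
    # Oldest modification time first: a stale entry here is the one to look at.
    return sorted(found, key=lambda item: item[2])
```

An `.avhdx` that is still present, and still growing, long after the backup job has finished and with no checkpoints on the VM would be consistent with the failed re-merge described above.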
-
- Veeam Software
- Posts: 712
- Liked: 168 times
- Joined: Nov 30, 2010 3:19 pm
- Full Name: Rick Vanover
- Location: Columbus, Ohio USA
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
Good to hear, SB.
-
- Service Provider
- Posts: 315
- Liked: 41 times
- Joined: Feb 02, 2016 5:02 pm
- Full Name: Stephen Barrett
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
And... it recurred. No snapshots were present, just the VHDX. Users couldn't write to their PSTs, so I ran a quick backup; an AVHDX was created and the VM was up and running again. I suspect the main VHDX had become corrupted somehow. Fingers crossed a backup doesn't copy the same error.
-
- Veeam Software
- Posts: 712
- Liked: 168 times
- Joined: Nov 30, 2010 3:19 pm
- Full Name: Rick Vanover
- Location: Columbus, Ohio USA
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
I think you should open a case with Veeam Support.
-
- Product Manager
- Posts: 8191
- Liked: 1322 times
- Joined: Feb 08, 2013 3:08 pm
- Full Name: Mike Resseler
- Location: Belgium
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
Stephen,
If there is corruption (and especially this strange one you are talking about), you can be certain that it will be in the backup also. This does not seem like corruption on the disk level, I'm afraid.
Besides contacting support, do a restore of the backup (VHDX file only) and see if you can mount it on another server or workstation, or even create a new, empty VM, attach the disk to it and boot it (with no network or a quarantined network). In fact, use SureBackup from our solution to test whether you can restore successfully.
But this is certainly the time to test.
-
- Service Provider
- Posts: 315
- Liked: 41 times
- Joined: Feb 02, 2016 5:02 pm
- Full Name: Stephen Barrett
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
I keep forgetting about this thread. I restored the VM to another host over the weekend, and it has been functioning flawlessly since (I had a few test restores done during the week and a few SureBackup runs done before arranging the downtime for the actual restore). I believe, the more I look at it, that it is a Hyper-V or host-level issue. Something was preventing writes to the main VHDX on the original host side. This could have been something in Hyper-V, or VSS, or something as simple as a file lock not being released; we'll never know now.
I'm putting this one down to the stars aligning upside down for once. At least it's written here on the forum should any other poor sod run into something similar.
-
- Product Manager
- Posts: 8191
- Liked: 1322 times
- Joined: Feb 08, 2013 3:08 pm
- Full Name: Mike Resseler
- Location: Belgium
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
Do you still have other VMs running on that host?
One likely cause is that the Hyper-V VSS writer has issues, but then I would expect to see the problem on other VMs also.
-
- Service Provider
- Posts: 315
- Liked: 41 times
- Joined: Feb 02, 2016 5:02 pm
- Full Name: Stephen Barrett
- Contact:
Re: Crippling VM Disk issue - only fixable by running a backup
I should have said: that particular host was also rebooted as a precaution over the weekend. Before the reboot, the VSS writers on the host showed no issues, and none of the 20 other VMs on that host were affected.
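For repeating that VSS writer check on a schedule, the text printed by `vssadmin list writers` can be filtered for anything that is not healthy. The sketch below is an assumption-laden helper, not a Veeam or Microsoft tool: the function name `unstable_writers` is made up, and it assumes the English-language output layout with `Writer name:`, `State:` and `Last error:` lines.

```python
import re


def unstable_writers(vssadmin_output):
    """Return writers whose state is not Stable or whose last error is set.

    Assumes English `vssadmin list writers` output (layout is an assumption).
    """
    writers = []
    current = None
    for raw_line in vssadmin_output.splitlines():
        line = raw_line.strip()
        match = re.match(r"Writer name: '(.+)'", line)
        if match:
            # A new writer block starts here.
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()

    return [
        w for w in writers
        if not (w["state"] or "").endswith("Stable")
        or w["last_error"] != "No error"
    ]
```

On the host, the input could come from `subprocess.run(["vssadmin", "list", "writers"], capture_output=True, text=True).stdout` in an elevated session. An empty result would match what was seen here before the reboot, which is part of why the writer alone does not explain the problem.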