Host-based backup of Microsoft Hyper-V VMs.
AlexHeylin
Veteran
Posts: 563
Liked: 173 times
Joined: Nov 15, 2019 4:09 pm
Full Name: Alex Heylin

SureBackup - "missing" Hyper-V integration causes all tests to skip and job ends with only Warning

Post by AlexHeylin »

We've got SureBackup set up for two VMs. It runs the basic tests plus some test scripts we've written in-house. This has been running successfully for over a month. Today it ended with Warning and the message:

Warning    [DC_VM_NAME]: Skipping script test: no Hyper-V integration services are installed in the VM
It then skipped all the other tests and the whole job ended with Warning.

When we ran this job again using the same backup point it completed successfully.

This raises these questions:
  1. Why did this fail to detect the Hyper-V integration services (which are clearly installed & working, and always have been)?
  2. Why did it skip all the tests - which run externally to the VM, so don't need the Hyper-V integration services?
    If this is because the IP is ONLY obtained via the integration services - the DHCP is controlled by the vLab appliance (which VBR talks to) and the MAC can be read from the Hyper-V VM config (which VBR can also read), so the IP can be obtained that way too.
  3. Given this means NO configured tests were run against this machine, shouldn't the job end with Failed not Warning?
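
The fallback suggested in question 2 (read the VM's MAC from its config, then look that MAC up in the DHCP leases) can be sketched roughly as below. This is only an illustration of the idea, not how VBR or the vLab appliance actually works: the `leases` table, its format and the addresses in it are made up, and `normalize_mac` / `ip_for_mac` are hypothetical helper names.

```python
from typing import Optional


def normalize_mac(mac: str) -> str:
    """Reduce a MAC address to 12 lowercase hex digits so notations compare equal."""
    digits = "".join(ch for ch in mac.lower() if ch in "0123456789abcdef")
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def ip_for_mac(mac: str, leases: dict) -> Optional[str]:
    """Return the leased IP for a MAC, or None if there is no lease.

    leases maps MAC -> IP and stands in for a hypothetical DHCP lease table.
    """
    wanted = normalize_mac(mac)
    for lease_mac, ip in leases.items():
        if normalize_mac(lease_mac) == wanted:
            return ip
    return None


# Made-up example data; a real lease table would come from the DHCP server.
leases = {"00-15-5D-01-02-03": "192.168.255.10"}
print(ip_for_mac("00:15:5d:01:02:03", leases))
```

Normalising both sides first matters because the VM config and the DHCP server may not write the MAC with the same separators or case.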
Case #05926668

#MOD: added case number from second comment.
PetrM
Veeam Software
Posts: 3626
Liked: 608 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic

Post by PetrM »

Hi Alex,

1. There are many possible reasons and this is something that our support team needs to research. I have 2 ideas at the moment. Either there is an intermittent issue with WMI, which is used to query the status of Hyper-V integration services (for example, we don't get a response or we retrieve an incorrect one), or there is a problem with the backup repository performance and the services on the mounted VM are not started within the necessary period of time. Moreover, the fact that the issue does not occur every time the same restore point is used makes me think that it might be related to performance.
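
To illustrate the second idea: a status check that polls with a deadline reports "not detected" whenever the guest is slower than the deadline, even though the services are installed. The sketch below is generic and is not the actual SureBackup logic; `wait_until` and `probe` are hypothetical names, and `probe` stands in for whatever query reports the integration services status.

```python
import time
from typing import Callable


def wait_until(probe: Callable[[], bool], timeout: float, interval: float = 1.0) -> bool:
    """Poll probe() until it returns True or timeout seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            # The guest may simply be slow (e.g. slow repository), not broken.
            return False
        time.sleep(interval)
```

With a slow repository the same restore point can land on either side of the deadline from one run to the next, which would match a problem that disappears on a rerun.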

2. These services provide interaction between the Hyper-V host and the guest VM and I doubt that we can run any tests if this communication stage is not functioning.

3. Why should it be failed? :) The main goal of the SureBackup job is achieved: it checked that the VM is alive and the backup is not corrupted. When some custom scripts are skipped, we must warn you so that you can start the investigation.

Let's wait for what our support engineers can find out.

Thanks!

Post by AlexHeylin »

Thanks Petr

Just to pick up on a couple of these

2. The test scripts don't run IN the VM being tested - they run outside it. Thus, provided the VM IP is determined (see above), there's no need to have access to anything the VM's Hyper-V integration services provide. For example, testing an SMB connection to \\%vm_ip%\c$, or running a TCP connect to %vm_ip% port 53 to test that the DNS server started, doesn't require the integration services as long as %vm_ip% has been determined.
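
As a rough idea of what such an external test looks like, here is a minimal sketch of the TCP connect check. It is not one of our actual in-house scripts. It assumes the job passes %vm_ip% as the first argument and that a zero exit code is treated as a pass; the script name `check_port.py` and the functions `tcp_port_open` and `main` are made up for the example.

```python
import socket
import sys


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable or timed out: all count as "not up".
        return False


def main(argv: list) -> int:
    # Assumed invocation: check_port.py <vm_ip> [port]
    if len(argv) < 2:
        print("usage: check_port.py <vm_ip> [port]")
        return 2
    host = argv[1]
    port = int(argv[2]) if len(argv) > 2 else 53
    ok = tcp_port_open(host, port)
    print(f"{host}:{port} {'reachable' if ok else 'unreachable'}")
    # Assumption: exit code 0 means the test passed, anything else means it failed.
    return 0 if ok else 1
```

A real script would finish with `sys.exit(main(sys.argv))`. Nothing here touches the guest except over the network, so the only input it needs from the job is the IP.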

3. I don't think it "checked that the VM is alive and the backup is not corrupted". I think that is only proved when all the tests, including custom test scripts, pass. Once the integration services are deemed "not installed", all the other tests, including basics like HV heartbeat and IP ping, are skipped. To me, that doesn't meet the requirements for "checked that the VM is alive".

In terms of "the backup is not corrupted" - there are different levels of corruption that are possible. Also - I believe the purpose of SureBackup is to ensure that the VM can be reasonably expected to be restorable to a fully functioning state. That includes making sure the backup is not corrupted - but includes much more as well.

- There's a failure in the backup chain (etc) so significant that Instant Recovery cannot spin the VM up. This has been proved to be OK.

- There's data corruption that Veeam can see - for example CRC failure in the VBK / VIB. This potentially has been proved to be OK, especially if using the "validate entire virtual disk" option.

- There's corruption of the filesystem inside the volume(s) of the VM. We know that Veeam does not look for, and will not see, this. See cloud-connect-backup-f43/vbr-vaw-val-to ... 80878.html for more on that and my proposed enhancement to address this area where VBR is behind some competing solutions. This definitely has not been proved, even if all test scripts pass.

- There's corruption / damage to the OS which might cause the OS not to boot, or to boot with limited functionality. We've seen that where Hyper-V integration is not detected Veeam does not look for, and will not see, this so this has not been proved to be OK.

- There's application "corruption" / unhappiness which might cause the OS to boot and operate fine, but the application is non-functional or not functioning correctly. We've seen that where Hyper-V integration is not detected Veeam does not look for, and will not see, this so this has not been proved to be OK.

All that's been proved then is that during an Instant Recovery VBR did not generate an error that caused the Instant Recovery to fail, and if "validate entire virtual disk" was enabled then the virtual disks within the backup files are all valid too. That's very different to confirming that the VM can be properly restored into a fully functional state as defined by the configured tests. There are a number of circumstances in which an Instant Restore might complete without an error but the VM might not be properly restored / functional.

Thanks

Alex

Post by PetrM » 1 person likes this post

Hi Alex,

I appreciate your willingness to share your thoughts and feedback.

1. What's the point of running any scripts if Hyper-V integration services are not running? Let's say we run these scripts in your case and they are successful. As a result, you'll never know about the potential file system/OS level/application corruption which causes an issue with Hyper-V integration services and maybe with other services as well.

2. All arguments about corruption levels are valid. But how do these arguments justify an error instead of a warning? We show an error when we know for sure that the job has failed or if we know for sure that the backup is unusable, for example, when the CRC check has failed. But we cannot conclude that the backup is unusable due to a failed script test, therefore we warn you so that you can decide on further steps: to investigate the issue or to ignore the warning (e.g. you intentionally stopped these services before backup).

And the main question: how exactly does an error message help in this case? Why is a warning not enough?

Thanks!

Post by AlexHeylin » 1 person likes this post

Oh no....!! I wrote a reply to this but looks like I forgot to click Submit before the weekend, then closed the browser :cry:

In summary - I don't think any of the things you're saying are proved have actually been proved, except that an IR job doesn't error. Unless the test scripts have been run and passed, the job should end with failure because it did fail (to successfully complete the configured tests). A warning normally indicates "successful but..." - in this case I think not running the scripts means the result is "Failed, but the IR didn't error". Without the test scripts being successful, it should err on the side of caution and assume the VM is dead, which should be a Fail result.
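
The difference between the two positions comes down to one branch in how per-test outcomes are rolled up into a job result. The sketch below only illustrates that branch and is not Veeam's code; `Outcome`, `job_result` and the `skipped_is_failure` switch are hypothetical names.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"


def job_result(outcomes: list, skipped_is_failure: bool) -> str:
    """Roll per-test outcomes up into a single job status."""
    if any(o is Outcome.FAILED for o in outcomes):
        return "Failed"
    if any(o is Outcome.SKIPPED for o in outcomes):
        # This branch is the whole disagreement in the thread.
        return "Failed" if skipped_is_failure else "Warning"
    return "Success"


all_skipped = [Outcome.SKIPPED, Outcome.SKIPPED, Outcome.SKIPPED]
print(job_result(all_skipped, skipped_is_failure=False))  # Warning: the behaviour reported here
print(job_result(all_skipped, skipped_is_failure=True))   # Failed: the stricter policy argued for above
```

With `skipped_is_failure=True` a job where nothing was actually verified can never be mistaken for "successful but...".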
Thanks

Post by PetrM » 1 person likes this post

Hi Alex,

I'm so sorry to hear that you lost some text, I was awaiting your next reply with interest. It happens sometimes, but the summary is quite clear. I guess it's quite a complicated discussion and there are arguments for both errors and warnings; moreover, our "subjective" vision comes into play. Anyway, let's imagine we change the behavior now and show an error instead of a warning. What do we get? I don't see real valuable benefits in this change, and conversely, what about others who will get failed jobs instead of warnings? How many similar questions from people with another "subjective" vision will we have? It's a rhetorical question, of course.

But next time, when we are designing a new feature and the same question arises (should we show an error or a warning?), I'll definitely remember this topic and all the arguments you provided.

Thanks!

Post by AlexHeylin » 1 person likes this post

Thanks