Agentless, cloud-native backup for Microsoft Azure
RobMiller86
Service Provider
Posts: 142
Liked: 23 times
Joined: Oct 28, 2019 7:10 pm
Full Name: Rob Miller
Contact:

Failed State in Azure - No Alarm

Post by RobMiller86 »

We have now seen a couple of occurrences where a VM enters a failed state in Azure. The VM is still up and responding, so our other monitoring doesn't detect any issues. But in Azure, the VM shows as being in a failed state and you have to click the button to reboot and redeploy.

When this happens, Veeam can't see the VM. If you look at the sources in the backup policy, the VM in question has a blank ID/Value column. As soon as you redeploy in Azure, the column is populated and Veeam can back it up.

We haven't seen any alarms for this condition in our VSPC. Shouldn't Veeam trigger an alarm when a VM simply drops off like that? Is this a known issue, or do we perhaps have something misconfigured? It seems like Veeam considers the VM deleted, so it ignores the condition. Unfortunately, I don't have an example right now to open a ticket, but I will do so when this occurs again. I'm just wondering if anyone else has witnessed this condition.
nielsengelen
Product Manager
Posts: 5647
Liked: 1187 times
Joined: Jul 15, 2013 11:09 am
Full Name: Niels Engelen
Contact:

Re: Failed State in Azure - No Alarm

Post by nielsengelen »

Hey Rob,

Did you already contact support for this and do you have a case ID? Looks like we'll need to do some more troubleshooting to understand what is causing this.
Personal blog: https://foonet.be
GitHub: https://github.com/nielsengelen
RobMiller86

Re: Failed State in Azure - No Alarm

Post by RobMiller86 »

Hi Niels,

Unfortunately, I do not. This was addressed by some other staff members over the past couple of months. I've been told it has happened 3 times across our client base. We are in the process of migrating a lot of customers over to Veeam. I will open a ticket the next time it's brought to my attention and I have an example. I was just curious what the expected behavior of VBAZ is when an Azure VM enters a failed state like this, where VBAZ can no longer see the VM in Azure, but the VM is still up.
RobMiller86

Re: Failed State in Azure - No Alarm

Post by RobMiller86 »

I just found another one of these "The virtual machine is in a failed state" cases in Azure. Luckily for us, this deployment is still using our older method, where we manually added the VMs to the policy rather than adding them via tag. So the policy was failing, as it was manually configured with the VM and couldn't find it.

However, this is giving me more concern with our current method of adding VMs by tag. Can I get any confirmation from Veeam on what happens in the following scenario:

1. We are adding VMs to a policy with a specific tag. We then add the tag to the VMs we want to back up with the policy. We are not directly adding VMs to the policy, just the tag.
2. All VMs are discovered, and backups begin.
3. Sometime later, a VM enters a failed state in Azure. The VM is still up and running, so other monitoring tools don’t detect the failure. However, in Azure, if you pull up the VM overview, up top it says "The last operation performed on this VM failed. The VM is still running. View error details". Then if you click that, in Azure it says "The virtual machine is in a failed state. The fabric operation failed. Reapplying the virtual machine may resolve the issue." Error code: InternalExecutionError. Provision state: Failed. If you click reapply, the VM is reprovisioned in Azure, rebooted, and all is well again.
4. When you pull up the Veeam backup policy in this state, before fixing it as described in step 3, the VM is still listed in the policy in the Name/Key column, but the ID/Value (the Azure resource ID) is blank.

Will VBAZ still fail a policy in this state? If it previously found the VM to protect via tag, and now it can't find the VM via tag, can I have 100% confirmation that it will now fail the policy instead of thinking the VM was just removed and no longer needs to be backed up? Like I said before, I don't have a current example to open a ticket, but I can't recreate this issue to test, and it's very hard to find unless you are doing manual reviews. I'm considering reverting and not adding VMs via tags due to this issue that I have witnessed myself twice, and someone else witnessed it once. These were on V5, not V6. So I'm not sure if this has improved or not. Is it not safe to add a VM via tag due to this?

I do know the policy fails in this Azure failed state if the VM was directly added by "Resource types: virtual machine [name or id]". But I have seen it not fail the policy if the VM was added by "Resource types: tag [key] [value]", with the tag added to the VM. I know it's tough as I can't provide a current example. But it's a big concern, as we could go months or longer in this state without realizing it and have no backups. Confirmation of exactly how VBAZ handles this condition would be much appreciated; otherwise I guess we just shouldn't use the tag method at all, for peace of mind.
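
For anyone wanting an independent check in the meantime, here is a minimal sketch of flagging tagged VMs that sit in a Failed provisioning state. It is not part of VBAZ or VSPC. It assumes you already have a VM inventory exported as a list of dictionaries with `name`, `tags` and `provisioningState` fields (field names assumed to mirror an Azure inventory export), and the tag `backup=daily` and the VM names are purely hypothetical.

```python
def find_failed_vms(vms, tag_key, tag_value):
    """Return names of VMs carrying the backup tag whose provisioning state is 'Failed'."""
    failed = []
    for vm in vms:
        # 'tags' may be missing or None on untagged VMs.
        tags = vm.get("tags") or {}
        if tags.get(tag_key) != tag_value:
            continue  # not in the tag-based backup scope
        if vm.get("provisioningState") == "Failed":
            failed.append(vm["name"])
    return failed


# Hypothetical inventory; real data would come from your own Azure export.
inventory = [
    {"name": "vm-app-01", "tags": {"backup": "daily"}, "provisioningState": "Succeeded"},
    {"name": "vm-db-01", "tags": {"backup": "daily"}, "provisioningState": "Failed"},
    {"name": "vm-test-01", "tags": None, "provisioningState": "Failed"},
]

print(find_failed_vms(inventory, "backup", "daily"))
```

Any name this returns is a VM that should be protected by the tag but may currently be invisible to the policy, so it is worth alerting on separately from the backup job status.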
lyudmila.ezerskaya
Veeam Software
Posts: 111
Liked: 34 times
Joined: Oct 04, 2021 4:08 pm
Full Name: Lyudmila Ezerskaya
Contact:

Re: Failed State in Azure - No Alarm

Post by lyudmila.ezerskaya »

Hi Rob! We will investigate this issue and keep you updated on the results. Thank you!
lyudmila.ezerskaya

Re: Failed State in Azure - No Alarm

Post by lyudmila.ezerskaya »

Hi! By design, Veeam Backup for Microsoft Azure skips VMs with the Failed provisioning state during synchronization.

If a VM was manually added to the backup scope, the policy will attempt to process it during the backup session. Since the VM cannot be reached, the policy will fail.

However, when VMs are added to the backup scope using tags, resource groups, or subscriptions, the backup scope adjusts dynamically with each synchronization with Azure. If a VM is in the Failed state during synchronization, it will be skipped and excluded from the backup scope. Since this VM is no longer included in the backup scope, the policy will not attempt to protect it and therefore will not fail.
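
The difference between the two cases can be illustrated with a toy model. This is only a sketch of the behavior described above, not Veeam's implementation: the `VM` class, `synchronize`, `run_policy`, the tag `backup=daily` and the VM names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    tags: dict = field(default_factory=dict)
    provisioning_state: str = "Succeeded"


def synchronize(inventory):
    """Model of synchronization: VMs in the Failed provisioning state are skipped."""
    return [vm for vm in inventory if vm.provisioning_state != "Failed"]


def run_policy(visible, static_names=(), tag=None):
    """Return (status, protected_names) for one simplified policy session."""
    visible_names = {vm.name for vm in visible}
    # Static scope: every listed VM must be reachable, otherwise the session fails.
    missing = [n for n in static_names if n not in visible_names]
    protected = [n for n in static_names if n in visible_names]
    # Dynamic scope: rebuilt from whatever synchronization can currently see,
    # so a skipped VM silently drops out instead of causing a failure.
    if tag is not None:
        key, value = tag
        protected += [
            vm.name
            for vm in visible
            if vm.tags.get(key) == value and vm.name not in protected
        ]
    return ("Failed" if missing else "Success", protected)


inventory = [
    VM("vm-app-01", {"backup": "daily"}),
    VM("vm-db-01", {"backup": "daily"}, provisioning_state="Failed"),
]
visible = synchronize(inventory)

print(run_policy(visible, static_names=["vm-app-01", "vm-db-01"]))
print(run_policy(visible, tag=("backup", "daily")))
```

In this model the manually scoped policy reports a failure because `vm-db-01` is listed but unreachable, while the tag-scoped policy succeeds with `vm-db-01` missing from its protected set.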