Real-time performance monitoring and troubleshooting
Post Reply
rleon
Enthusiast
Posts: 76
Liked: 9 times
Joined: Jun 15, 2017 8:10 am
Full Name: RLeon
Contact:

VeeamONE Repository or Proxy "Connection Failure" alerts not instant

Post by rleon »

Hi all,

We recently tested some VeeamONE alerts and discovered some unexpected alert behaviors.
Setup:
1x VBR Server + Enterprise Manager Server in one
1x VM Proxy + VM Repository Server in one (under the control of the above VBR Server)
1x VeeamONE Server
VeeamONE is configured to point to EM, and not to the VBR directly. (This is the normal way if an environment has an EM server).

The test:
We physically unplugged the network cables from the Proxy+Repo Server, so that the VBR+EM and the VeeamONE server can no longer ping or connect to it.
The expectation was that after a little while, in the VeeamONE Client, inside "Data Protection View", we would get either or both of the following alerts:
"Backup repository connection failure", "Backup proxy connection failure".
But after around 15-20 minutes, no alerts show up anywhere.
Both alerts are enabled and the default setting of "5 min" was kept as is, meaning that after 5 minutes of disconnection from the proxy or repo, the alert should fire. (At least that's what we believe should happen).

Since no alerts appear after the wait, we decided to manually go in to VBR console > Backup Infrastructure > Backup Repositories, right click on the affected repository, then click "Rescan".
The rescan failed as expected, because VBR could not contact the proxy/repo server.
Only at this point, the proxy/server gets grouped under a newly appeared "Unavailable" sub-group, and at the same time in VeeamONE Client, the alerts finally appeared.

We expected the "connection failure" alerts would appear after 5 minutes (per the alerts' default setting) after a network failure at the proxy/repo server.
But the alerts only show up after a manual rescan of the repository.
Is this normal? If so, then what does "5 min" in those alerts do? And exactly how long does it really take VeeamONE to actually throw those alerts after a real connection failure?
Thanks in advance.
wishr
Veteran
Posts: 3077
Liked: 453 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: VeeamONE Repository or Proxy "Connection Failure" alerts not instant

Post by wishr »

Hi Rleon,

This is expected behavior.

VONE relies on the data from the VBR server to track the state of backup infrastructure components, so the state of a backup proxy or backup repository should change on the VBR side first before VONE will be able to capture this change and fire corresponding alarms.

Speaking about the "Delay" parameter in the alarm rules, it allows minimizing the amount of unnecessary alarm noise by defining the amount of time the target object should spend in the "faulty" condition. E.g. if the "Delay" is set to five minutes, the alarm will fire only if the backup repository has stayed in the unavailable state for five minutes. This capability becomes useful, for example, if frequent short network outages are quite common.

Thanks
rleon
Enthusiast
Posts: 76
Liked: 9 times
Joined: Jun 15, 2017 8:10 am
Full Name: RLeon
Contact:

Re: VeeamONE Repository or Proxy "Connection Failure" alerts not instant

Post by rleon »

Thanks for the info.
Then how long does it take for VBR to automatically detect the connection failure? What's its "detection time interval", if there is one, and how do you change it, perhaps to make it scan more frequently? Because in our test, we waited 15-20 minutes and VBR was still unaware.
We logically want to make sure that this alert fires BEFORE the next scheduled backup job starts (Ironically, the failed backup job would allow the VBR to become aware of the connection failure of the proxy/repo, but at that point it would be too late and also defeats the purpose of the alert.)
wishr
Veteran
Posts: 3077
Liked: 453 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: VeeamONE Repository or Proxy "Connection Failure" alerts not instant

Post by wishr »

You are welcome!

On a high level, the state of remote backup infrastructure components in VBR depends on multiple various rescan operations running under the hood of the VBR server. Each of these operations serves different and sometimes independent needs. Some of them run on schedule, others follow certain algorithms which are not trivial. VONE uses and is already capable of initiating one of these operations for tracking the state of the components. However, it turned out all these circumstances and specifics may not be the root cause in your particular case. I've found a known issue with the infrastructure component state alarms which is related to how VBR exposes the data, which in turn, impacts the VONE data collection capability. This issue is scheduled to be fixed in future releases as it requires complex and potentially cross-product changes. Before we dive into the technical details on how various rescan operations work, I would like to ask you to open a support case so we could determine what exactly causes the issue in your particular scenario: whether it's the aforementioned known issue or something else.

Thanks
rleon
Enthusiast
Posts: 76
Liked: 9 times
Joined: Jun 15, 2017 8:10 am
Full Name: RLeon
Contact:

Re: VeeamONE Repository or Proxy "Connection Failure" alerts not instant

Post by rleon »

Thanks for getting back. We will test again in the next version. Hopefully the documentation will be updated to reflect exactly how this works behind the scenes.
wishr
Veteran
Posts: 3077
Liked: 453 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: VeeamONE Repository or Proxy "Connection Failure" alerts not instant

Post by wishr »

Hi RLeon,

Regarding the documentation, what particular information you would like to be added there?

Also, I would still suggest opening a support case because there is a slight chance you have a different issue.

Thanks
Post Reply

Who is online

Users browsing this forum: No registered users and 5 guests