Monitoring and reporting for Veeam Data Platform
Fatboy40
Influencer
Posts: 17
Liked: 1 time
Joined: Mar 02, 2016 10:22 am
Full Name: Clive Arnold
Contact:

Veeam Stalled Backup Job / RPO Alerts

Post by Fatboy40 »

Where I work we currently use Backup and Replication, Enterprise edition, 6 sockets, and unfortunately a backup job was stuck for a couple of days due to a VSS issue, which also affected a tape replication job. As the job in question had not actually failed, no alert e-mail was sent.

I want to implement a fire and forget / supported solution now to alert us to such an event, and it seems like to achieve this I'd be going down an RPO based alert route.

If we were licensed for Availability Suite rather than Backup and Replication, would the additional features / Veeam ONE that come with it be able to alert us if a job takes longer than expected to complete (via an RPO being exceeded)?

Thanks.
wishr
Veteran
Posts: 3077
Liked: 455 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: Veeam Stalled Backup Job / RPO Alerts

Post by wishr »

Hi Clive,

Could you please let us know what you see in the job session log within the B&R UI? You correctly mentioned that there should be an "event" indicating that something went wrong. Such events are used to trigger the Backup Job State alarm in Veeam ONE, which should help you deal with such situations.

Thanks
Fatboy40

Re: Veeam Stalled Backup Job / RPO Alerts

Post by Fatboy40 »

Hi Fedor,

Is this what you need?

Preparing guest for hot backup
Unable to release guest. Error: Unfreeze error: [Cannot wait for m_FreezeFinishEvt: [258]]
Task failed unexpectedly
... and the job was just sat there waiting for someone to stop it.
wishr

Re: Veeam Stalled Backup Job / RPO Alerts

Post by wishr »

Hi Clive,

Thank you.

Do you see this text in the job session log in the UI? Is it a warning? I think our alarm engine should be able to catch it, since it's a task failure, which in turn should cause the session to finish with at least a warning.

Thanks
Fatboy40

Re: Veeam Stalled Backup Job / RPO Alerts

Post by Fatboy40 »

Hi Fedor,

I think the answer to your question is yes :?

The problem for me was that the job was still running as shown below (look at the duration)...

https://imgur.com/a/2N2MOow

... which is a little embarrassing, but no warning / failure e-mails had been received. So if I was using Availability Suite, I could set up RPOs and then be alerted to the job just sitting there, as no backup of this VM had completed for quite a few days?
wishr

Re: Veeam Stalled Backup Job / RPO Alerts

Post by wishr »

Clive,

It looks strange to me, because a failed task session should not leave the job stuck for so many hours. Do you have a support case ID to share regarding the error itself? I would like a bit more information to understand what exactly caused this, because it seems this behavior should be addressed on the B&R end, which would make the email notifications arrive as they should.

In addition to the aforementioned Backup Job State alarm, Veeam ONE offers a VM With No Backup alarm, which lets you configure an RPO threshold (24 hours by default). If no backup was created during the given period, you'll get an alarm.
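
To illustrate the idea behind an RPO-based check, here is a minimal Python sketch. It is not how Veeam ONE implements the alarm and does not call any Veeam API; the function name, VM names, and timestamps are all made up for the example. It only shows the logic: a VM is flagged when its last completed backup is older than the threshold, regardless of whether a job is still "running".

```python
from datetime import datetime, timedelta

# Assumption: mirrors the 24-hour default threshold mentioned above.
DEFAULT_RPO = timedelta(hours=24)

def vms_exceeding_rpo(last_backup, now, rpo=DEFAULT_RPO):
    """Return the sorted names of VMs whose newest backup is older than rpo.

    last_backup maps a VM name to the datetime of its last completed
    backup, or None if the VM has never been backed up.
    """
    overdue = []
    for vm, finished in last_backup.items():
        # A VM with no backup at all, or with a backup that is too old,
        # violates the RPO even if its job never reported a failure.
        if finished is None or now - finished > rpo:
            overdue.append(vm)
    return sorted(overdue)

# Hypothetical sample data, for illustration only.
now = datetime(2019, 3, 8, 9, 0)
last_backup = {
    "vm-stuck": datetime(2019, 3, 5, 22, 0),  # job hung, no new restore point
    "vm-ok": datetime(2019, 3, 7, 22, 0),     # backed up last night
    "vm-new": None,                           # never backed up
}
print(vms_exceeding_rpo(last_backup, now))
```

The point of checking the age of the last restore point, rather than the job result, is that a stalled job produces neither a success nor a failure, so only a time-based check will notice it.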

Thanks
Fatboy40

Re: Veeam Stalled Backup Job / RPO Alerts

Post by Fatboy40 »

Hi Fedor,

I've not logged a support ticket, as I'd assumed that a reboot of the VM would sort things, and it did. It's more that I'm after an alert for a stalled backup job.

Regarding the Veeam ONE "VM with no backups" alarm, can the RPO threshold be set differently / uniquely for each VM or a group of VMs?

Thanks.
wishr

Re: Veeam Stalled Backup Job / RPO Alerts

Post by wishr »

Hi Clive,

Sure, you may change the threshold value and modify the assignment. If the threshold value should differ between groups of VMs, you can simply copy the alarm and configure the assignment and value accordingly for each VM group.
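
As a rough illustration of the "one alarm copy per VM group" approach, the Python sketch below models each copy as an assignment (a set of VM names) plus its own threshold. It is only a conceptual model, not Veeam ONE configuration or API code, and the alarm names, VM names, thresholds, and timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical alarm copies: each has its own assignment and RPO threshold.
ALARM_COPIES = [
    {"name": "No backup (critical)", "vms": {"sql01", "dc01"}, "rpo": timedelta(hours=12)},
    {"name": "No backup (standard)", "vms": {"file01"}, "rpo": timedelta(hours=48)},
]

def triggered_alarms(alarm_copies, last_backup, now):
    """Return (alarm name, VM name) pairs for every VM past its group's RPO."""
    fired = []
    for alarm in alarm_copies:
        for vm in sorted(alarm["vms"]):
            finished = last_backup.get(vm)  # None if the VM has no backup
            if finished is None or now - finished > alarm["rpo"]:
                fired.append((alarm["name"], vm))
    return fired

# Hypothetical sample data, for illustration only.
now = datetime(2019, 3, 8, 9, 0)
last_backup = {
    "sql01": datetime(2019, 3, 7, 18, 0),   # 15 hours old, over the 12-hour RPO
    "dc01": datetime(2019, 3, 8, 1, 0),     # 8 hours old, within 12 hours
    "file01": datetime(2019, 3, 7, 18, 0),  # 15 hours old, within 48 hours
}
print(triggered_alarms(ALARM_COPIES, last_backup, now))
```

Note that the same 15-hour-old backup triggers the alarm for one group but not the other, which is the reason for keeping a separate copy per group.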

Thanks