Monitoring and reporting for Veeam Data Platform
johnlockie
Enthusiast
Posts: 53
Liked: 3 times
Joined: Apr 06, 2012 5:46 pm

Suppressing Errors for vCenter Outages...

Post by johnlockie »

Is there any way to tell Veeam ONE not to report all the vCenter-related errors when vCenter goes down? For example, the heartbeat alarms on VM tools, etc.

When the vCenter service goes down we get 100+ emails, which can be a bit annoying. Looking at the alerts in email alone, it can be alarming, when in reality a vCenter outage in our environment is not a huge problem. A SQL outage, on the other hand, is. These emails could be due to SQL failing or vCenter failing. One is insignificant, and the other is catastrophic.

Am I making sense?
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Post by Vitaliy S. »

Hi John,

So basically you're receiving 100+ alarms from all objects in your infrastructure when vCenter Server goes down, right? I agree that it does make sense to receive only one alarm in this case...

Thanks!

Post by johnlockie »

Precisely.

For example, "Heartbeat is missing for VM for VirtualMachineName"

Now, if our SQL server went down this could happen because vCenter would go offline (or better yet, if a datastore was offline and the VM died, etc.). But it also happens if vCenter goes down on its own. I really don't care if vCenter causes this error, but if SQL goes down we want to know.

To put it another way, picture an island with a bridge leading to it. Veeam ONE uses this bridge to check on the island. The island is my infrastructure, and the bridge is vCenter. When the bridge goes down, Veeam ONE assumes the island no longer exists, when it does. So I am left having to assume that maybe a bomb went off on the island and it is in fact down.

Now, if a bomb did go off, I would still get other alarms, for example if a datastore goes offline. In that event I do want all 100 emails about the VMs and their associated issues when a datastore kicks them offline. These two events (vCenter service crashing vs. datastore going offline) are completely different in priority and severity, yet they produce a very similar batch of errors.

Post by Vitaliy S. »

Awesome analogy :) I believe it shouldn't be too hard to suppress all alarms when vCenter Server goes down, thanks for sharing your feedback!

Post by johnlockie »

Ok, I take it there is no workaround for this then? Right now, when vCenter goes down we need to just deal with the emails?

It's not the end of the world. I just thought I might be missing a way to tweak it.

Post by Vitaliy S. »

Yes, currently it is not possible to change the email notification behavior.

Post by Vitaliy S. »

John,

Could you please provide additional information on the behavior you observe? Our QC team couldn't reproduce a massive email/alarm storm when vCenter Server connection goes down.

BTW - what version of Veeam ONE are you using?

Thanks!

Post by johnlockie »

6.1

That's very interesting! And most of the alerts we have are configured out of the box (we needed to do very little customization, only on disk latency alerts :))

Stay tuned, I'll post something

Post by johnlockie »

Ok here is a breakdown:

Due to a SQL bug, vCenter crashes while multiple snapshots are being taken. This happened when Veeam B&R was taking snapshots for a couple of backups around 4:30am. The vCenter Server service halted. The VM was still online, but vCenter crashed. Everything else was fine, including SQL (the bug is with a specific record in the vCenter database and does not impact other databases or the SQL instance at all).

Alerts arrived in the following manner:
4:30am: Topology collection failure for vcenterserver
4:30am: Event data collection failure for vcenterserver
4:35am: Virtual Server connection failure for vcenterserver
4:39am: Virtual Server connection failure for vcenterserver
4:40am: Heartbeat is missing for VM for VMNAME-1
4:40am: Heartbeat is missing for VM for VMNAME-2
4:40am: ...and so on for a ton of heartbeat errors

5:08am: Host vmhost03 is not responding (totally untrue)
...then it reports all the hosts are down, etc.

I can almost guarantee you that if I shut down vCenter right now we would have 100 emails from Veeam ONE.
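
For anyone who wants to thin this out on the mail side in the meantime, here is a minimal Python sketch of the idea discussed in this thread: once a connection failure for the vCenter server has been seen, fold the heartbeat and host alarms that follow within a time window into it. This is not a Veeam ONE feature or API. The names `suppress_dependent`, `is_dependent` and `at`, the 60-minute window, and the tuple layout are all assumptions; only the alarm texts and times come from the timeline above.

```python
from datetime import datetime, timedelta

# Alarm text taken from the timeline in this post.
ROOT = "Virtual Server connection failure"


def is_dependent(msg):
    """True for alarm texts that followed the vCenter outage in the timeline."""
    return msg.startswith("Heartbeat is missing for VM") or (
        msg.startswith("Host ") and msg.endswith("is not responding")
    )


def suppress_dependent(alarms, window=timedelta(minutes=60)):
    """Split time-sorted (timestamp, message) tuples into (kept, suppressed).

    The window length is an assumption, not a product setting.
    """
    kept, suppressed = [], []
    last_root = None
    for ts, msg in alarms:
        in_window = last_root is not None and ts - last_root <= window
        if msg.startswith(ROOT):
            # Keep the first connection failure, fold repeats into it.
            (suppressed if in_window else kept).append((ts, msg))
            last_root = ts  # each repeat extends the window
        elif is_dependent(msg) and in_window:
            suppressed.append((ts, msg))
        else:
            # Anything else (e.g. a datastore alarm) always gets through.
            kept.append((ts, msg))
    return kept, suppressed


def at(hour, minute):
    # The date is a placeholder; only the times come from the post.
    return datetime(2000, 1, 1, hour, minute)


timeline = [
    (at(4, 30), "Topology collection failure for vcenterserver"),
    (at(4, 30), "Event data collection failure for vcenterserver"),
    (at(4, 35), "Virtual Server connection failure for vcenterserver"),
    (at(4, 39), "Virtual Server connection failure for vcenterserver"),
    (at(4, 40), "Heartbeat is missing for VM for VMNAME-1"),
    (at(4, 40), "Heartbeat is missing for VM for VMNAME-2"),
    (at(5, 8), "Host vmhost03 is not responding"),
]

kept, suppressed = suppress_dependent(timeline)
print(len(kept), "kept,", len(suppressed), "suppressed")
```

With the seven sample lines above, three alarms are kept (the two collection failures and the first connection failure) and four are suppressed. Heartbeat or host alarms that arrive without a preceding connection failure are not touched, so a real outage on the "island" would still come through in full.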

Post by Vitaliy S. »

Ok, thanks for the info! I will update this topic once I have more details from our QC team.

Post by Vitaliy S. »

John, could you please clarify whether you get lots of emails when just the vCenter Server service halts, or only when the SQL Server hosting the vCenter Server database crashes?

Post by johnlockie »

Nothing happens with SQL Server. If I just reboot vCenter I get the errors. In this case, the vCenter Server service crashed. It was the only system to crash. Our core database was online and no other systems were actually impacted. Only vCenter was down.

Post by Vitaliy S. »

Hmm...that's weird, we do not receive lots of alarms when our vCenter Server goes down. Could you please contact our support team so we could take a look at your environment and review Veeam ONE debug logs?

Post by johnlockie »

Yeah. I will test this manually during a maintenance window. We have a few coming up. It would probably be good for me to know 100% and be able to replicate on the fly before I put a ticket in....

I am glad to hear you do not see this. Something is weird about our environment probably (?)

Post by Vitaliy S. »

Could be... Please let me know your ticket number, so I could explain to our support engineer what exactly we need to verify in your environment.

BTW - what kind of alarms do you receive? The ones which refer to failed heartbeats only?

Post by johnlockie »

Yes, heartbeat and also host down! So odd =/

Post by Vitaliy S. »

Do you receive heartbeat alarms for all VMs in the VI, or is there a specific VM scope that causes this behavior?

Post by johnlockie »

It seems to be a random group. Not tied to any specific host or datastore. But not ALL of the VMs. Maybe 30-40%
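
One way to check the "not tied to any specific host or datastore" impression before opening a ticket is to compute, per host (or per datastore), the share of VMs that raised a heartbeat alarm. Below is a rough Python sketch. The function name `affected_share_by_group` and all sample data are hypothetical placeholders; the VM-to-host mapping would have to come from your own inventory export.

```python
from collections import Counter


def affected_share_by_group(vm_to_group, affected_vms):
    """Return {group: fraction of that group's VMs found in affected_vms}."""
    total = Counter(vm_to_group.values())
    hit = Counter(g for vm, g in vm_to_group.items() if vm in affected_vms)
    return {g: hit[g] / total[g] for g in total}


# Hypothetical placeholder data, not taken from the thread.
vm_to_host = {
    "vm-a": "host1",
    "vm-b": "host1",
    "vm-c": "host2",
    "vm-d": "host2",
}
affected = {"vm-a", "vm-c"}

shares = affected_share_by_group(vm_to_host, affected)
for host, share in sorted(shares.items()):
    print(f"{host}: {share:.0%} of VMs raised a heartbeat alarm")
```

If the shares come out roughly equal across hosts and datastores, that supports the "random group" reading. If one host or datastore stands out, that is worth mentioning in the support ticket.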