-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Suppressing Errors for vCenter Outages...
Is there any way to tell Veeam ONE not to report all the vCenter-related errors when vCenter goes down? For example, missing heartbeats on VM tools, etc.
Because when the vCenter service goes down we get 100+ emails, which can be a bit annoying. When you only look at the alert emails it can be alarming, when in reality a vCenter outage in our environment is not a huge problem. A SQL outage, for example, is. These emails could be due to SQL failing or vCenter failing. One is insignificant, and the other is catastrophic.
Am I making sense?
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
Hi John,
So basically you're receiving 100+ alarms from all objects in your infrastructure when vCenter Server goes down, right? I agree that it does make sense to receive only one alarm in this case...
Thanks!
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
Precisely.
For example, "Heartbeat is missing for VM for VirtualMachineName"
Now, if our SQL server went down this could happen because vCenter would go offline (or better yet, if a datastore was offline and the VM died, etc.). But it also happens if vCenter goes down on its own. I really don't care if vCenter causes this error, but if SQL goes down we want to know.
To put it another way, picture an island with a bridge leading to it. Veeam ONE uses this bridge to check on the island. The island is my infrastructure, and the bridge is vCenter. When the bridge goes down, Veeam ONE assumes the island no longer exists, when in fact it does. So I have to assume that maybe a bomb went off on the island and it really is down.
Now, if a bomb did go off, I would still get other alarms, for example if a datastore goes offline. In that event I do want all 100 emails about the VMs and their associated issues when a datastore kicks them offline. These two events (vCenter service crashing vs. a datastore going offline) are completely different in priority and severity, yet they kick out a very similar batch of errors.
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
Awesome analogy! I believe it shouldn't be too hard to suppress all alarms when vCenter Server goes down. Thanks for sharing your feedback!
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
Ok, I take it there is no workaround for this then? Right now, when vCenter goes down we need to just deal with the emails?
It's not the end of the world. I just thought I might be missing a way to tweak it.
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
Yes, currently it is not possible to change the email notification behavior.
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
John,
Could you please provide additional information on the behavior you observe? Our QC team couldn't reproduce a massive email/alarm storm when the vCenter Server connection goes down.
BTW - what version of Veeam ONE are you using?
Thanks!
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
6.1
That's very interesting! And most of the alerts we have are configured out of the box (we needed to do very little customization, only on disk latency alerts).
Stay tuned, I'll post something
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
Ok, here is a breakdown:
Due to a SQL bug, vCenter crashes while multiple snapshots are being taken. This happened around 4:30am while Veeam B&R was taking snapshots for a couple of backups. The vCenter Server service halted. The VM was still online, but vCenter crashed. Everything else was fine, including SQL (the bug is with a specific record in the vCenter database and does not impact other databases or the SQL instance at all).
Alerts arrived in the following manner:
4:30am: Topology collection failure for vcenterserver
4:30am: Event data collection failure for vcenterserver
4:35am: Virtual Server connection failure for vcenterserver
4:39am: Virtual Server connection failure for vcenterserver
4:40am: Heartbeat is missing for VM for VMNAME-1
4:40am: Heartbeat is missing for VM for VMNAME-2
4:40am: ...and so on for a ton of heartbeat errors
5:08am: Host vmhost03 is not responding (totally untrue)
...then it reports all the hosts are down, etc.
I can almost guarantee you if I shut down vCenter right now we would have 100 emails from VeeamONE.
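A minimal sketch of the kind of suppression being asked for, using the timeline above. This is not a Veeam ONE feature or API; it assumes the alerts are available as simple records (the `Alert` class, the `kind` labels, and the one-hour window are all made up for illustration) and drops heartbeat/host alarms that follow a vCenter connection failure:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical alert record; Veeam ONE does not expose this structure.
@dataclass(frozen=True)
class Alert:
    time: datetime
    kind: str    # assumed labels, e.g. "connection_failure", "heartbeat_missing"
    target: str  # e.g. "vcenterserver", "VMNAME-1"

ROOT_KIND = "connection_failure"  # the "bridge is down" alarm
DEPENDENT_KINDS = {"heartbeat_missing", "host_not_responding"}

def suppress_dependent(alerts, window=timedelta(hours=1)):
    """Drop dependent alerts raised within `window` after a root alert."""
    kept = []
    last_root = None  # time of the most recent connection failure
    for alert in sorted(alerts, key=lambda a: a.time):
        if alert.kind == ROOT_KIND:
            last_root = alert.time
            kept.append(alert)
        elif (alert.kind in DEPENDENT_KINDS
              and last_root is not None
              and alert.time - last_root <= window):
            continue  # likely a side effect of the vCenter outage
        else:
            kept.append(alert)  # unrelated kinds always pass through
    return kept
```

Only the heartbeat and host alarms are filtered, so a datastore alarm (the "bomb on the island" case) would still come through. The obvious risk is that a real VM or host failure during the window is hidden too.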
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
Ok, thanks for the info! I will update this topic once I have more details from our QC team.
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
John, could you please clarify whether you get lots of emails when just the vCenter Server service halts, or whether it happens when the SQL Server hosting the vCenter Server database crashes?
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
Nothing happens with SQL Server. If I just reboot vCenter I get the errors. In this case, the vCenter Server service crashed. It was the only system to crash. Our core database was online and no other systems were actually impacted. Only vCenter was down.
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
Hmm...that's weird, we do not receive lots of alarms when our vCenter Server goes down. Could you please contact our support team so we could take a look at your environment and review Veeam ONE debug logs?
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
Yeah. I will test this manually during a maintenance window. We have a few coming up. It would probably be good for me to know 100% and be able to replicate on the fly before I put a ticket in....
I am glad to hear you do not see this. Something is weird about our environment probably (?)
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
Could be... Please let me know your ticket number, so I could explain to our support engineer what exactly we need to verify in your environment.
BTW - what kind of alarms do you receive? The ones which refer to failed heartbeats only?
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
Yes, heartbeat and also host down! So odd =/
-
- VP, Product Management
- Posts: 27055
- Liked: 2710 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Suppressing Errors for vCenter Outages...
Do you receive heartbeat alarms for all VMs in the VI, or is there a specific VM scope that causes this behavior?
-
- Enthusiast
- Posts: 53
- Liked: 3 times
- Joined: Apr 06, 2012 5:46 pm
- Contact:
Re: Suppressing Errors for vCenter Outages...
It seems to be a random group, not tied to any specific host or datastore. But not ALL of the VMs. Maybe 30-40%.