Monitoring and reporting for Veeam Backup & Replication, VMware vSphere and Microsoft Hyper-V in a single System Center Operations Manager Console
khitsgmbh
Novice
Posts: 4
Liked: never
Joined: Nov 07, 2011 5:22 pm

Unstable 5.7 Collectors

Post by khitsgmbh »

Hello,

We have serious stability issues with some of our collectors, and I would like to know how we can tune the system to prevent them in the future.

The environment:
- 1 x VirtualCenter (around 200 cores, 700 VMs)
- 1 x Veeam EM
- 6 x Veeam Collectors, divided into 2 failover groups (FGs) with 3 Collectors each
- Each Collector is a VM with 4 vCPUs and 4 GB RAM. CPU and RAM usage is 50-70%; nworks load on the collector is 30-50%
- Veeam Version 5.7
- VC event queue size: 350

The situation:
In one of the FGs, the VM guys have some serious problems with storage and NICs. As a result, those objects generate hundreds or even thousands of events. This causes a lot of trouble for the SCOM agent on the collector: many WMI events such as 10376 and 10378 (Module was unable to convert WMI setting) and steady heartbeat failures.

My question:
I cannot influence the stability of the VM hosts (i.e. the real root cause), nor can I make the VM guys put their machines into maintenance. How can we reliably configure the collectors and the underlying SCOM agents so that they keep working even with high event counts? IMHO our nWorks design should easily handle the load.

Any hints and suggestions are highly appreciated,

Dirk
Alec King
VP, Product Management
Posts: 1445
Liked: 362 times
Joined: Jan 01, 2006 1:01 am

Re: Unstable 5.7 Collectors

Post by Alec King »

Hi Dirk,

Well, that's an interesting problem. If you cannot immediately fix the root cause (the event storm in VC), then we can examine some options.

I'd like some more details:
- Approximately how many events do you see from these hosts?
- What are the exact events/alerts?

We can perhaps tune the rules that respond to those events: either disable them (for the problem Hosts), or maybe introduce something like event correlation, where multiple events are rolled up into one alert.
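
To illustrate the roll-up idea only, here is a minimal Python sketch. It is not how the management pack or SCOM implements correlation; the `Alert` type, the `roll_up` function, the 5-minute window, and the host and event names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    host: str
    event_type: str
    first_seen: float  # timestamp of the first event rolled into this alert
    count: int         # number of raw events represented by this alert


def roll_up(events, window_seconds=300):
    """Collapse (timestamp, host, event_type) tuples into one alert per
    host/event-type pair per time window (hypothetical helper)."""
    alerts = []
    open_alerts = {}  # (host, event_type) -> alert currently collecting events
    for timestamp, host, event_type in sorted(events):
        key = (host, event_type)
        current = open_alerts.get(key)
        if current is not None and timestamp - current.first_seen < window_seconds:
            # Same host and event type inside the window: count it, no new alert.
            current.count += 1
        else:
            current = Alert(host, event_type, timestamp, 1)
            open_alerts[key] = current
            alerts.append(current)
    return alerts


# Hypothetical sample data: four raw events become three alerts.
sample = [
    (0, "esx01", "StorageLost"),
    (10, "esx01", "StorageLost"),
    (20, "esx02", "NicDown"),
    (400, "esx01", "StorageLost"),
]
alerts = roll_up(sample)
```

The point of the sketch is that an event storm from one host raises the counter on an existing alert instead of creating a new alert for every raw event.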

I also wonder if these events are causing follow-on problems for our Collector, such as discovery thrashing. If storage and/or networking components are going online and offline rapidly, this could cause us to constantly re-discover the Hosts. That means very high CPU on the Collector, and other problems follow from that.
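
As a rough way to check for this kind of flapping in exported event data, a sketch like the following counts online/offline transitions per component in a sliding window. Everything here is an assumption for illustration: the `find_flapping` function, the thresholds, and the component names are hypothetical and not part of the product.

```python
from collections import defaultdict, deque


def find_flapping(state_changes, window_seconds=600, max_transitions=4):
    """Return the components whose state changed more than max_transitions
    times within any window_seconds interval (hypothetical helper).

    state_changes is an iterable of (timestamp, component, state) tuples.
    """
    recent = defaultdict(deque)  # component -> timestamps of recent transitions
    last_state = {}              # component -> last observed state
    flapping = set()
    for timestamp, component, state in sorted(state_changes):
        if component not in last_state:
            last_state[component] = state  # first observation is the baseline
            continue
        if last_state[component] == state:
            continue  # same state reported again, not a transition
        last_state[component] = state
        times = recent[component]
        times.append(timestamp)
        # Drop transitions that have fallen out of the sliding window.
        while times and timestamp - times[0] > window_seconds:
            times.popleft()
        if len(times) > max_transitions:
            flapping.add(component)
    return flapping


# Hypothetical sample: vmnic0 toggles every minute, vmhba2 changes once.
changes = [(t * 60, "vmnic0", "online" if t % 2 == 0 else "offline") for t in range(6)]
changes += [(0, "vmhba2", "online"), (1000, "vmhba2", "offline")]
result = find_flapping(changes)
```

Components flagged this way would be the first candidates to look at when the Collector keeps re-discovering the same Hosts.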

It might be best to open a case with our support team: send over the log files, and we can dive into a deeper analysis.

Thanks!
Alec
sepj12927
Enthusiast
Posts: 25
Liked: never
Joined: May 31, 2012 9:48 am
Full Name: Per J

Re: Unstable 5.7 Collectors

Post by sepj12927 »

Hi,

Did you resolve this issue?

I'm suffering from a similar issue.

/Per
Alec King
VP, Product Management
Posts: 1445
Liked: 362 times
Joined: Jan 01, 2006 1:01 am

Re: Unstable 5.7 Collectors

Post by Alec King »

Hi Per,

Can you share more details of your problem? Do you have an event flood from vCenter, or some other issue?

Cheers,
Alec
sepj12927
Enthusiast
Posts: 25
Liked: never
Joined: May 31, 2012 9:48 am
Full Name: Per J

Re: Unstable 5.7 Collectors

Post by sepj12927 »

Hi,

We receive multiple events like this on our EMS.

=================================
Module was unable to convert WMI setting .\timestamp


One or more workflows were affected by this.

Workflow name: nworks.VMware.VEM.VMGUESTVIRTUALDISK.Collect.freePct
Instance name: C:\
Instance ID: {8579BA65-3D36-E5EC-5EFF-14749032E061}
Management group: MGMT_Group
=================================

At the time of the above event in the OpsMgr log, there are on average 5-6 events per minute in the nworks log.

/Per
vBPav
Expert
Posts: 181
Liked: 13 times
Joined: Jan 13, 2010 6:08 pm
Full Name: Brian Pavnick

Re: Unstable 5.7 Collectors

Post by vBPav »

Per,

I would recommend submitting this to our support team @ http://cp.veeam.com. I recommend also exporting your OpsMgr event logs on each of your Veeam Collectors, zipping them, and submitting them with the case. Please post back once you get an answer from support. Good luck!
Brian Pavnick | Cireson | Solutions Architect

- Follow me on Twitter @ vbpav
- Reach me on e-mail @ brian.pavnick@cireson.com
thomaxx
Novice
Posts: 3
Liked: never
Joined: Nov 03, 2009 3:21 pm
Full Name: Thomas Loicht

Re: Unstable 5.7 Collectors

Post by thomaxx »

Hi Guys,

Is there any solution here? I have the same problem with one of my nworks Collectors.
In the nworks event log I see 10-15 entries like these:
[UserLoginSessionEvent] User username@Servername logged in
[UserLogoutSessionEvent] User username logged out
HealthService.exe is using 50% CPU, and in the Operations Manager event log on the Collector server I see events like this:
Module was unable to convert WMI setting .\timestamp


One or more workflows were affected by this.

Workflow name: nworks.VMware.VEM.VMHOSTDISK.Monitor.totalWriteLatency
Instance name: vmhba2:C0:T1:L22
Instance ID: {CF745E74-DAE5-E9FA-F937-0B85DD4122E9}
Management group: BRZ

Cheers,
Thomas
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Unstable 5.7 Collectors

Post by Vitaliy S. »

Hi Thomas,

Unfortunately, none of the posters above mentioned their support ticket number, so I cannot check the resolution for you. Could you please open a support ticket and post your case number here, so I can update this topic with the resolution for future readers?

Thanks!