Real-time performance monitoring and troubleshooting
stevenfoo
Expert
Posts: 116
Liked: 3 times
Joined: Jun 26, 2009 3:11 am
Full Name: Steven Foo
Contact:

VM Monitor 4.0 false alarm

Post by stevenfoo »

I just started using VM Monitor 4.0 on our Windows 2003 server.

However, last weekend (our time zone) we encountered the following message, which repeated every 10 minutes from 6/28/2009 09:38 PM until 6/29/2009 04:10 AM.
This was recorded in the VM Monitor 4.0 Alarm History.

There were no messages after 04:10 AM.

Target: vmware5.corp.XXXX.com
Status: Warning (Yellow)
Alarm: Hardware problems
Time: 6/29/2009 4:00:47 AM
Sensor "VMware Rollup Health State" equal Warning

So I logged in to our VM server and checked the events and logs. I did not see any unusual error messages, and all hardware components are doing fine.
I even logged in to our Dell server using the DRAC 5 and browsed the log files. No errors there either.

I also encountered the following error on the same server at 09:25 AM and 09:35 AM.
The same message appeared about 3 days ago, and then it was OK.

Node: datastore1
Level of Bus Resets is above 1

Again, checking in the DRAC browser, the VM host itself did not report any issue.

Is this just a false alarm, or something I need to worry about?

Thanks
Steven
Gostev
Chief Product Officer
Posts: 31804
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: VM Monitor 4.0 false alarm

Post by Gostev »

Steven, Veeam Monitor cannot really give false alarms, because it does not generate hardware alarms itself; it queries them from the ESX hosts. However, a bug in the SMASH implementation of a specific server or vendor would result in false alarms.

Concerning "VMware Rollup Health State", this issue was reported before on this forum, but I cannot seem to find it (the thread was probably deleted by the original poster). Anyway, as that poster indicated, there is a known SMASH bug with a specific server model from a specific vendor that causes this particular warning to be improperly reported by the CIM/SMASH API; that server vendor is planning to address this in a future update. Unfortunately, all the specifics were lost when the thread was deleted.

Concerning bus resets, if you do not have a lot of them, then I would not worry. When we had a real storage problem in our lab once, the disk issues graph was going crazy with multiple resets.
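To make the "a few resets versus a real problem" distinction concrete, here is a minimal Python sketch. It is not how Veeam Monitor evaluates the counter; the function name, the threshold of 1 (borrowed from the "above 1" alarm text) and the three-interval run length are all assumptions chosen for illustration.

```python
def is_sustained(reset_counts, threshold=1, min_intervals=3):
    """Return True if at least `min_intervals` consecutive samples exceed `threshold`.

    reset_counts: bus reset counts per sampling interval (hypothetical input).
    threshold and min_intervals are illustrative assumptions, not product defaults.
    """
    run = 0  # length of the current run of above-threshold samples
    for count in reset_counts:
        run = run + 1 if count > threshold else 0
        if run >= min_intervals:
            return True
    return False


# Isolated spikes, like two alerts ten minutes apart with quiet samples between them.
isolated = [0, 2, 0, 2, 0, 0]
# A sustained burst, like a graph "going crazy" during a real storage problem.
burst = [2, 3, 5, 4, 6]

print(is_sustained(isolated))
print(is_sustained(burst))
```

With these sample values the isolated spikes are not flagged and the burst is, which matches the rule of thumb above.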
stevenfoo
Expert
Posts: 116
Liked: 3 times
Joined: Jun 26, 2009 3:11 am
Full Name: Steven Foo
Contact:

Re: VM Monitor 4.0 false alarm

Post by stevenfoo »

Hi Gostev,

As of today, there are no alerts. It seems like a minor bug.

My ESXi servers are running 3.5 U4 on Dell 2950 hardware.

Maybe the issue is improper reporting by the CIM/SMASH API, as you mentioned.

Thanks very much for the explanation. Greatly appreciated.

Steven
sixth
Enthusiast
Posts: 52
Liked: never
Joined: Feb 05, 2009 11:57 am
Contact:

Re: VM Monitor 4.0 false alarm

Post by sixth »

I just wanted to post regarding the VM Health Rollup issue you are seeing. I get the same error on my Dell R710 and found that it has to do with the way Dell's health reporting agent (DRAC) reports to the ESXi/ESX CIM/SMASH agent. The error only occurs when the RAID controller battery is in charging mode, which happens every so many days and can last a while. I 'heard' that the most recent Dell PERC 6 firmware update (released on June 30th) may fix this problem. I have not tried the firmware yet, as I can't take my production host down, but if you have a PERC 6 card and the alerts happen while the battery is charging, this may be the issue.
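One way to check this theory is to compare the alarm window with the battery learn cycle window from the controller log. The Python sketch below is only an illustration: the alarm times are the ones Steven posted, while the learn cycle times are hypothetical placeholders that you would replace with values from your own PERC log.

```python
from datetime import datetime

FMT = "%m/%d/%Y %I:%M %p"


def windows_overlap(a_start, a_end, b_start, b_end):
    """Return True if the two closed time windows share at least one instant."""
    return a_start <= b_end and b_start <= a_end


# Alarm window reported earlier in this thread.
alarm_start = datetime.strptime("6/28/2009 09:38 PM", FMT)
alarm_end = datetime.strptime("6/29/2009 04:10 AM", FMT)

# HYPOTHETICAL learn cycle window; substitute the real times from your controller log.
learn_start = datetime.strptime("6/28/2009 09:30 PM", FMT)
learn_end = datetime.strptime("6/29/2009 04:15 AM", FMT)

print("Alarm window length:", alarm_end - alarm_start)
print("Overlaps learn cycle:", windows_overlap(alarm_start, alarm_end, learn_start, learn_end))
```

If the warnings consistently fall inside learn cycle windows, that points at the battery charging state rather than at a real hardware fault.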

Of course it could be an entirely different issue...:-)
stevenfoo
Expert
Posts: 116
Liked: 3 times
Joined: Jun 26, 2009 3:11 am
Full Name: Steven Foo
Contact:

Re: VM Monitor 4.0 false alarm

Post by stevenfoo »

Hi Sixth,

Thanks for the information. I have not tried the firmware patch yet.

We will have to schedule a downtime window to perform that.

Regards
Steven
dilberty
Influencer
Posts: 10
Liked: never
Joined: Aug 15, 2009 10:46 am
Full Name: Dan
Contact:

Re: VM Monitor 4.0 false alarm

Post by dilberty »

Hey guys,

Have you ever solved this? I'm getting this issue on my Dell R710 running the latest iDRAC6 firmware (1.10).


Dan
abonnell
Lurker
Posts: 1
Liked: never
Joined: Aug 19, 2009 3:27 pm
Full Name: adam bonnell
Contact:

Re: VM Monitor 4.0 false alarm

Post by abonnell »

We have an R710 as well and just experienced this issue. We need to look at updating our PERC firmware. Once that is done, it will take a while before the battery learn cycle runs again, so it may be some time before we can tell whether the update resolved it. The status we see is: Battery on Controller 0 discharging... (Learn Cycle Active).
caffeen
Lurker
Posts: 1
Liked: never
Joined: Nov 09, 2009 6:23 am
Full Name: Jonathan Attwell
Contact:

Re: VM Monitor 4.0 false alarm

Post by caffeen »

I think you will find that this error is caused by your ESX version being out of date.

I have ESX Server 3i, 3.5.0, 123629, and I believe the latest revision is 152875 or higher.
albertwt
Veteran
Posts: 941
Liked: 53 times
Joined: Nov 05, 2009 12:24 pm
Location: Sydney, NSW
Contact:

Re: VM Monitor 4.0 false alarm

Post by albertwt »

OK, so is this problem related to the Dell server firmware or to Veeam Monitor 4?
--
/* Veeam software enthusiast user & supporter ! */
Vitaliy S.
VP, Product Management
Posts: 27371
Liked: 2799 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: VM Monitor 4.0 false alarm

Post by Vitaliy S. »

That's an old thread, but let me answer your question. The status of hardware sensors is gathered from ESX hosts, as Anton described above. Veeam Monitor only translates the data it receives from VMware.

That said, if you get a lot of warning alarms, you can adjust the corresponding alarm in Veeam Monitor 5 so that you are notified only when a sensor goes red. That will make the alarm less verbose while you're trying to shed some light on the initial hardware problem.
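The idea behind that setting can be sketched in a few lines of Python. This is only an illustration of severity filtering, not Veeam Monitor's implementation or API; the `Severity` enum and `should_notify` function are hypothetical names.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical ordering of sensor states, from healthy to critical."""
    GREEN = 0
    YELLOW = 1
    RED = 2


def should_notify(sensor_states, min_severity=Severity.RED):
    """Return True if any sensor is at or above `min_severity`."""
    return any(state >= min_severity for state in sensor_states.values())


# The sensor name is taken from the alarm text quoted in the first post.
sensors = {"VMware Rollup Health State": Severity.YELLOW}

print(should_notify(sensors))                   # red-only rule
print(should_notify(sensors, Severity.YELLOW))  # warning-level rule
```

With the red-only rule, a sensor that is merely yellow no longer triggers a notification, while a warning-level rule would still fire.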