Monitoring and reporting for Veeam Backup & Replication, VMware vSphere and Microsoft Hyper-V in a single System Center Operations Manager Console
push3r
Enthusiast
Posts: 36
Liked: 6 times
Joined: May 17, 2013 11:54 pm

RAID Controller Hard Drive Failure Monitoring

Post by push3r »

I am evaluating version 6.5 for VMware with SCOM 2012 R2. Everything is configured and working as expected. Using a test ESXi 5 host (HP ProLiant G5), I pulled a power cable and got the alert. However, when I pulled a hard drive out of a local RAID 1 array, I did not get any alert. I checked the Health Explorer and saw that the Host HBA unit monitor is monitoring "vmhba1", which is the HP Smart Array P400 card as seen by ESXi.

Does this MP monitor for a RAID array in a degraded state? It would be very helpful to know when a hard drive in the RAID has failed.

Please advise. Thank you.
Alec King
VP, Product Management
Posts: 1445
Liked: 362 times
Joined: Jan 01, 2006 1:01 am

Re: RAID Controller Hard Drive Failure Monitoring

Post by Alec King »

Hi,

I would expect local RAID disk failure to be captured by our Hardware monitoring, not the HBA monitoring.

Do you have any alerts in SCOM for failure to reach the Host for hardware monitoring, or any alerts related to "Unable to get Host data via CIM"?
Hardware is a separate collection method from the VMware data: we get hardware info directly from the Host over a different port. It also requires one additional permission in vCenter, "CIM Interaction".

If you check the Installation Guide http://www.veeam.com/veeam_mp_for_vmwar ... 6_5_pg.pdf you'll see the firewall ports and additional permission required for CIM Hardware Data collection.
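
For anyone who wants to rule out the network side first, here is a minimal sketch of a TCP reachability check against a host's CIM port. It only shows whether the port accepts a connection from the machine running it (ideally the collector); it says nothing about permissions or about which sensors the host exposes. The port number 5989 is an assumption (the usual CIM-over-HTTPS port on ESXi), and the function name `port_is_reachable` is made up for this example. The Installation Guide remains the reference for the actual port list.

```python
import socket

# Assumption: 5989 is the usual CIM (WBEM over HTTPS) port on ESXi.
# Confirm the real port list in the Installation Guide.
ASSUMED_CIM_PORT = 5989


def port_is_reachable(host: str, port: int = ASSUMED_CIM_PORT,
                      timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling `port_is_reachable("esxi-test.example.local")` (a placeholder host name) returns `False` when a firewall blocks the port or the name does not resolve. A `True` result only rules out the firewall as the cause.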

Hope that helps, please post back any queries!

Thanks
Alec
Alec King
Vice President, Product Management
Veeam Software

Re: RAID Controller Hard Drive Failure Monitoring

Post by push3r »

Hi Alec,

I believe my environment is set up correctly: as I mentioned above, the power supply alert fired when I pulled a power cable out of one of the dual power supplies on the HP ProLiant. And yes, I have set up "CIM Interaction", with the appropriate ports opened on the ESXi host, as described in the installation guide.

The two hard drives in a RAID 1 connect to the HP Smart Array P400 locally. When I pulled one out to simulate a hard drive failure, I did not get any alert at all.

Re: RAID Controller Hard Drive Failure Monitoring

Post by Alec King »

Ah, good point about the Power Supplies! That's also hardware monitoring of course. :oops:

So, let's check which Sensors our hardware scan has discovered. If you expand the Host in the Compute Topology view, there is a hardware sensor container, and various sensor classes underneath.
You can select any sensor class (or the container itself), and an in-context Task named "List all xxxx sensors" (e.g. "List all power sensors") will be available.
If you list out all the Disk sensors, do you see the RAID volumes/disks there?

Re: RAID Controller Hard Drive Failure Monitoring

Post by push3r »

I ran the "List All Disk Sensors" task and it doesn't show anything. No, I do not see the RAID volumes/disks.

I also ran the "List All Disk Sensors" task on the production ESXi hosts, and there it did show the Disks, HP Smart Array Controller, and Logical Volume.

Maybe the firmware on the controller of the test ESXi host is not up to date?

Re: RAID Controller Hard Drive Failure Monitoring

Post by Alec King »

Was the same build of ESXi used in production as in the lab? There is a special HP/OEM build of ESXi that contains the built-in drivers for the hardware sensors. See here - http://h18004.www1.hp.com/products/serv ... image.html

Re: RAID Controller Hard Drive Failure Monitoring

Post by push3r »

Production ESXi hosts = standard build
Test ESXi host = I don't remember whether it was a standard build with manually added HP drivers or not, but it shows the "Health Status" of the HP ProLiant under the Configuration tab of the vSphere Client. Maybe this is causing the issue.

When I have time, I'll install another test box with the standard ESXi build and do more testing. If I find something that is out of the ordinary, I'll report back.

Thank you, Alec, for checking into this with me. You can close this issue: as confirmed above, the production ESXi local disks and Smart Array cards are being monitored.

Re: RAID Controller Hard Drive Failure Monitoring

Post by Alec King »

OK!
I do know that storage devices are one sensor class that doesn't work if the HP drivers are not loaded (as opposed to power supply sensors, which use a more generic, industry-standard driver). So I think the build of ESXi used could explain it.
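
The reasoning in this thread can be written down as a small decision helper. This is only a sketch: the function name, the dictionary keys (`"power"`, `"disk"`) and the message strings are made up for illustration and are not part of the Management Pack. The counts would be typed in by hand from the output of the "List all xxxx sensors" tasks.

```python
from typing import Mapping


def diagnose_sensor_gap(sensor_counts: Mapping[str, int]) -> str:
    """Interpret per-class sensor counts copied by hand from the task output."""
    power = sensor_counts.get("power", 0)
    disk = sensor_counts.get("disk", 0)
    if power == 0 and disk == 0:
        # Nothing discovered at all: more likely a collection problem.
        return "no hardware sensors: check CIM connectivity and permissions"
    if power > 0 and disk == 0:
        # The pattern seen on the test host in this thread.
        return "generic sensors only: vendor storage drivers are probably missing"
    return "disk sensors present: storage hardware should be monitored"
```

For the test host described above (power sensors listed, no disk sensors), `diagnose_sensor_gap({"power": 2, "disk": 0})` falls into the second branch, which matches the explanation that the ESXi build lacked the HP drivers. The count of 2 is an arbitrary example value.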

Anyway - if you have any further queries - let us know!
Cheers
Alec

Re: RAID Controller Hard Drive Failure Monitoring

Post by push3r »

Hi Alec,

Just letting you know that I set up another test ESXi host and the RAID/disk alert is working fine. I got alerts for the Smart Array battery being partially charged and for a simulated disk failure in a RAID 5.

It looks like, for ProLiant servers, the ESXi build should come from HP, or the HP drivers have to be added to the hosts manually.

Nice Management Pack! Thanks!

Re: RAID Controller Hard Drive Failure Monitoring

Post by Alec King »

Great news! Glad it's all working now. Any more questions, please let us know.

Thanks, and I'm glad you're liking our Management Pack! Enjoy 8)

Re: RAID Controller Hard Drive Failure Monitoring

Post by push3r »

Hi Alec,

I PMed you about the licensing. Not sure if you have taken a look at it yet.

thanks