nworks vCenter Storage redundancy

Post by **krowczynski** » Jun 21, 2010 5:09 am

Hello,

I get some warnings on one of my 5 ESX hosts because of a "Storage redundancy issue".
If I look in my vCenter in the properties of the LUN I can see two paths, one Active with I/O and the other also marked as Active.

I am using nworks 5.5 RC1.

Thanks for some help here.

Post by **Alec King** » Jun 22, 2010 6:24 pm

Hi Arkadiusz,

Our MP is responding to a specific event from VC, that says the storage redundancy has been lost (i.e. one of the paths is down).

If Virtual Center is throwing this event, then maybe you have some intermittent problem with one of those LUN connections...?
Especially if this always happens on the same host, and the same path....perhaps you have a bad cable or other issue with that storage link?

If you look in VC event view for this Host, you should see those 'storage redundancy lost' events. That's what causes our alert to fire....do you see many of those events?

Cheers,
Alec

Post by **keithkleiman** » Sep 20, 2012 3:43 pm

Alec,

The "nworks vCenter: Storage redundancy issue on ESX host" monitor was raised for several hosts because the backend storage had failed over. Once the incident was resolved, this monitor never updated its health status to healthy. I tried to recalculate the health from the SCOM Health Explorer, but it still appears to be in a warning state. I suppose that it is reading the nworks log to check for a healthy expression logged as an event and has not found it. Checking the VC, it appears that all paths on the ESX host are active. Can you think of a reason why it did not detect the healthy expression? I suppose that I can reset the monitor myself since I know it is not an issue, but I was just trying to understand why it did not detect the healthy expression. I am using nworks 5.7. Thanks in advance.

Keith

Post by **Alec King** » Sep 20, 2012 4:19 pm

Hi Keith,

What I suspect is that there was a re-discovery synchronisation issue.
When there is a storage outage, followed by re-scan, vCenter can destroy and re-create those vmhba paths to the storage. This destruction/recreation can mean that our MP in turn destroys and recreates the matching MP vmhba objects in SCOM...and if the timing is bad, when the healthy event arrives - it arrives at mid-discovery time when there is no valid vmhba target. So it gets dropped, but the red status is cached in SCOM. And you get a Monitor stuck on Red.
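To make the timing problem above concrete, here is a rough Python sketch of it. This is only a toy model, not the actual MP or SCOM code: `MonitorCache`, `simulate` and the path name `vmhba1` are all made up for illustration. It assumes that an event for an object that does not currently exist is simply dropped, and that the last known state stays cached across rediscovery.

```python
from dataclasses import dataclass, field


@dataclass
class MonitorCache:
    """Toy stand-in for cached monitor state (hypothetical, not a real API)."""
    targets: set = field(default_factory=set)    # objects that currently exist
    state: dict = field(default_factory=dict)    # last known state per object
    dropped: list = field(default_factory=list)  # events that had no valid target

    def discover(self, path):
        # (Re)create the object; an already cached state is kept as-is.
        self.targets.add(path)
        self.state.setdefault(path, "healthy")

    def undiscover(self, path):
        # The object goes away, but its last state stays in the cache.
        self.targets.discard(path)

    def deliver(self, path, new_state):
        # An event for a missing target is dropped, not queued.
        if path not in self.targets:
            self.dropped.append((path, new_state))
            return False
        self.state[path] = new_state
        return True

    def reset(self, path):
        # Manual reset: force the cached state back to healthy.
        self.state[path] = "healthy"


def simulate(healthy_event_mid_discovery):
    cache = MonitorCache()
    cache.discover("vmhba1")
    cache.deliver("vmhba1", "warning")      # redundancy lost
    cache.undiscover("vmhba1")              # rescan destroys the path object
    if healthy_event_mid_discovery:
        cache.deliver("vmhba1", "healthy")  # arrives too early, gets dropped
        cache.discover("vmhba1")
    else:
        cache.discover("vmhba1")
        cache.deliver("vmhba1", "healthy")  # arrives after recreation
    return cache


if __name__ == "__main__":
    unlucky = simulate(healthy_event_mid_discovery=True)
    print(unlucky.state, unlucky.dropped)
    unlucky.reset("vmhba1")
    print(unlucky.state)
```

In the unlucky ordering the healthy event lands in `dropped` and the cached state stays at `warning` until `reset` is called, which is the same effect as resetting the monitor by hand.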

This is one reason that our next MP update has a re-designed model for storage monitoring! We have a more stable model, while still monitoring performance and availability for all paths. We've also made huge improvements in discovery latency - meaning updates (such as storage rescans) are captured almost instantly. I'll be dropping you a line pretty soon, hopefully you'll be able to participate in our beta program for nworks vNext!

In summary, I'd agree that you should do a manual reset of the monitor in this case. The above scenario is not common, usually we track healthy status quite accurately - you've just been unlucky with timings on this occasion.
Hope the above covers your question - let me know.

Cheers
Alec

Post by **keithkleiman** » Sep 20, 2012 5:30 pm

Alec,

Thank you for your quick response. This makes sense to me. I would concur that it is not a common issue and usually health does get calculated accurately by nworks. We have since manually reset the monitors and should be OK now.

Thanks again,

Keith
