nWorks and vCenter

felyjos · Feb 07, 2013 6:54 pm

Hello,

I have an alert in SCOM, "nWorks vCenter: Host cannot connect to storage Alarm", which is sent almost every morning after 2:00 AM.
I could not locate it in vCenter. Why? Could it have been closed? Is any historical data available in vCenter to trace this issue?
I have the names of the ESX hosts with the error.

Thanks,
Dom

agolubnichy · Feb 08, 2013 1:14 pm

Hi Dom,

Yes, it could have been closed; the entry shown in the VI Client could also have been overwritten by later events. You can check the events created in vCenter by running the query below against the vCenter DB:

select * from VPXV_EVENTS where EVENT_TYPE like '%host%' and CREATE_TIME
between 'yyyy-mm-dd hh:mm:ss.mss' and 'yyyy-mm-dd hh:mm:ss.mss'
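The two 'yyyy-mm-dd hh:mm:ss.mss' placeholders in the query above need concrete timestamps with millisecond precision. As a minimal sketch (the helper names `format_vcenter_timestamp` and `build_host_event_query` are hypothetical, not part of any vCenter or nWorks tooling), the query text could be generated like this:

```python
from datetime import datetime


def format_vcenter_timestamp(moment: datetime) -> str:
    """Render a datetime as 'yyyy-mm-dd hh:mm:ss.mss' (millisecond precision)."""
    # %f yields six digits of microseconds; dropping the last three leaves milliseconds.
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


def build_host_event_query(start: datetime, end: datetime) -> str:
    """Fill the two placeholders of the query above with concrete timestamps."""
    return (
        "select * from VPXV_EVENTS where EVENT_TYPE like '%host%' and CREATE_TIME\n"
        f"between '{format_vcenter_timestamp(start)}' and '{format_vcenter_timestamp(end)}'"
    )


# Example window: the first three hours of 6 February 2013.
print(build_host_event_query(datetime(2013, 2, 6, 0, 0), datetime(2013, 2, 6, 3, 0)))
```

Only datetime values are interpolated here, so the resulting text is meant to be pasted into a SQL console rather than used as a general query builder.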

Cheers,
Alexey

felyjos · Feb 08, 2013 5:58 pm

Hello,

I am getting only 76 events, which seems really low:
select * from VPXV_EVENTS where EVENT_TYPE like '%host%' and CREATE_TIME
between '2013-02-06 00:00:00.000' and '2013-02-06 03:00:00.000'

and nothing seems to be linked to my alert! I am checking the ones in category 'error':
vim.event.HostComplianceCheckedEvent
vim.event.HostCompliantEvent

Thanks,
Dom

felyjos · Feb 13, 2013 5:42 pm

Hello,

Which Event am I looking for to match the SCOM Alert?

Thanks,
Dom

felyjos · Feb 13, 2013 7:40 pm

select * from VPXV_EVENTS
where CREATE_TIME between '2013-02-13 00:00:00.000' and '2013-02-13 20:00:00.000'
and HOST_NAME like '%mbesx4%'
and EVENT_TYPE like '%storage%'

This seems to do the trick... thanks.

felyjos · Feb 13, 2013 7:51 pm

Hello,

Will "esx.problem.storage.redundancy.degraded 2/13/13 10:33 mbesx13.ad" send this SCOM alert?
Alert: nworks vCenter: Host cannot connect to storage Alarm
Resolution state: New
Source: DISK
Path: mbesx13.ad
Last modified by: System
Last modified time: 2/13/2013 2:55:20 AM
Alert description: Alarm 'Cannot connect to storage' on mbesx13.ad changed from Gray to Gray

Thanks,
Dom

felyjos · Feb 13, 2013 7:55 pm

Hello,

Or is it rather "vim.event.HostConnectionLostEvent" that generates the alert in SCOM?
Thanks,
Dom

sergey.g · Feb 14, 2013 6:07 am

Hi Dom,

"nworks vCenter: Host cannot connect to storage Alarm" will be triggered by "Alarm*cannot connect to storage*"
"nworks vCenter: Storage connectivity lost on ESX host" will be triggered by "vprob.storage.connectivity.lost" or "esx.problem.storage.connectivity.lost"
"nworks vCenter: Storage redundancy issue on ESX host" will be triggered by "vprob.storage.redundancy.degraded" or "vprob.storage.redundancy.lost" or "esx.problem.storage.redundancy.degraded" or "esx.problem.storage.redundancy.lost"

There are also the following event monitors, which use the corresponding vCenter alarms:
"nworks vCenter: Host storage status Alarm changed to Red"
"nworks vCenter: Host storage status Alarm changed to Yellow"

So it looks like some host in your infrastructure is losing its storage path connections, one by one (or possibly all at once), and you receive the whole spectrum of redundancy and connection-lost events. The redundancy alerts are closed when the storage comes back online. The "cannot connect to storage" alarm has no corresponding closure event: it is a timer-based monitor that resets after 24 hours, but since you hit this issue every night, it never closes.
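The alert-to-trigger mapping listed above can be written down as a small lookup, which makes it easier to predict which SCOM alert a given vCenter event should raise. This is only a sketch: the function name `expected_alert` is hypothetical, and matching the "Alarm*cannot connect to storage*" pattern case-insensitively against the alarm description is an assumption, not documented nWorks behaviour.

```python
from fnmatch import fnmatchcase
from typing import Optional

CONNECTIVITY_ALERT = "nworks vCenter: Storage connectivity lost on ESX host"
REDUNDANCY_ALERT = "nworks vCenter: Storage redundancy issue on ESX host"
ALARM_ALERT = "nworks vCenter: Host cannot connect to storage Alarm"

# Event-driven alerts, copied from the list above.
EVENT_TYPE_TO_ALERT = {
    "vprob.storage.connectivity.lost": CONNECTIVITY_ALERT,
    "esx.problem.storage.connectivity.lost": CONNECTIVITY_ALERT,
    "vprob.storage.redundancy.degraded": REDUNDANCY_ALERT,
    "vprob.storage.redundancy.lost": REDUNDANCY_ALERT,
    "esx.problem.storage.redundancy.degraded": REDUNDANCY_ALERT,
    "esx.problem.storage.redundancy.lost": REDUNDANCY_ALERT,
}

# Alarm-driven alert: the pattern comes from the list above, lowercased here
# because this sketch assumes a case-insensitive match.
ALARM_PATTERN = "alarm*cannot connect to storage*"


def expected_alert(event_type: str, description: str = "") -> Optional[str]:
    """Return the SCOM alert name this sketch expects, or None if nothing matches."""
    alert = EVENT_TYPE_TO_ALERT.get(event_type)
    if alert is not None:
        return alert
    if fnmatchcase(description.lower(), ALARM_PATTERN):
        return ALARM_ALERT
    return None


print(expected_alert("esx.problem.storage.redundancy.degraded"))
print(expected_alert(
    "vim.event.AlarmStatusChangedEvent",
    "Alarm 'Cannot connect to storage' on mbesx13.ad changed from Gray to Gray",
))
```

Under this reading, a redundancy-degraded event raises the redundancy alert, while the "Host cannot connect to storage Alarm" alert comes only from the alarm itself.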

It looks like something is happening with one datastore. You can check the redundancy and connectivity-lost monitors for the HBAs on the mbesx13.ad host in Health Explorer to see which path is failing, then check which storage is connected via this path. I think some maintenance is scheduled for this storage each night, and that is why you are receiving all these errors. If it is planned maintenance, you can schedule maintenance mode for the DISK object on the corresponding host; there is a Microsoft article on how to use maintenance mode and how to schedule it:

http://support.microsoft.com/kb/2704170

Hope this helps.
Thanks.

felyjos · Feb 14, 2013 8:05 pm

Hello,

I have run several queries; some return results and some do not.
I could not find the "cannot connect to storage" alarm with either of these:
select * from VPXV_EVENTS
where CREATE_TIME between '2013-02-01 00:00:00.000' and '2013-02-15 23:59:00.000'
and EVENT_TYPE like 'Alarm*cannot connect to storage*'

OR

select * from VPXV_EVENTS
where CREATE_TIME between '2013-02-01 00:00:00.000' and '2013-02-15 23:59:00.000'
and EVENT_TYPE like '%cannot connect%'

Both return 0 rows.
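One likely reason both queries return nothing: in SQL `LIKE`, only `%` and `_` act as wildcards, so `*` is matched literally, and the phrase "cannot connect" would have to appear in EVENT_TYPE itself. The sketch below illustrates this with an in-memory SQLite table as a stand-in for the vCenter database (an assumption; the real VPXV_EVENTS view has many more columns), filled with event types quoted in this thread:

```python
import sqlite3

# Simplified stand-in for VPXV_EVENTS: only the EVENT_TYPE column is modelled.
conn = sqlite3.connect(":memory:")
conn.execute("create table VPXV_EVENTS (EVENT_TYPE text)")
conn.executemany(
    "insert into VPXV_EVENTS values (?)",
    [
        ("vim.event.AlarmStatusChangedEvent",),
        ("esx.problem.storage.connectivity.lost",),
        ("esx.problem.storage.redundancy.degraded",),
    ],
)


def count_like(pattern: str) -> int:
    """Count the rows whose EVENT_TYPE matches the given LIKE pattern."""
    row = conn.execute(
        "select count(*) from VPXV_EVENTS where EVENT_TYPE like ?", (pattern,)
    ).fetchone()
    return row[0]


for pattern in (
    "Alarm*cannot connect to storage*",  # '*' is an ordinary character for LIKE
    "%cannot connect%",                  # the phrase is not part of any event type
    "%AlarmStatusChangedEvent%",
    "%storage%",
):
    print(pattern, "->", count_like(pattern))
```

With these sample rows the first two patterns match nothing, while the event-type patterns do, which suggests filtering on the event type and then reading the alarm name from the event details.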

and SCOM sent the same alerts again today, 02/14/2013...
Any idea on this one?

I did find the storage connectivity and storage redundancy degraded problem events, thanks...

Thanks,
Dom

sergey.g · Feb 14, 2013 8:24 pm

Hi,
could you try something like this?

select * from VPXV_EVENTS
where CREATE_TIME between '2013-02-01 00:00:00.000' and '2013-02-15 23:59:00.000'
and EVENT_TYPE like '%AlarmStatusChangedEvent%'

Among all triggered alarms, there should be an alarm about lost connection to storage.

I'll try to double-check in my lab.

Thanks.

felyjos · Feb 14, 2013 8:33 pm

Hello,

This query gives me 10,000+ rows... let me filter for one ESX server...

This alarm
vim.event.AlarmStatusChangedEvent 2013-02-14 11:23:07.997 info sopdesx3.ad VIOPP-DMZ OPP

seems to match

Alert: nworks vCenter: Host cannot connect to storage Alarm
Resolution state: New
Source: DISK
Path: sopdesx3.ad
Last modified by: System
Last modified time: 2/14/2013 3:23:09 AM
Alert description: Alarm 'Cannot connect to storage' on sopdesx3.ad.medctr.ucla.edu changed from Gray to Gray
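The database timestamp (11:23:07.997) and the SCOM timestamp (3:23:09 AM) differ by almost exactly eight hours. That would be consistent with the event table holding UTC and the SCOM console showing US Pacific Standard Time, though both are assumptions here. A minimal sketch for checking such a match, with a hypothetical helper name and an arbitrary five-second tolerance:

```python
from datetime import datetime, timedelta, timezone

# Assumption: SCOM shows local time in PST (UTC-8); the database value is UTC.
PACIFIC_STANDARD = timezone(timedelta(hours=-8))


def same_occurrence(db_time_utc: str, scom_time_local: str,
                    tolerance_seconds: float = 5.0) -> bool:
    """Return True if the two timestamps fall within the tolerance of each other."""
    event = datetime.strptime(db_time_utc, "%Y-%m-%d %H:%M:%S.%f")
    event = event.replace(tzinfo=timezone.utc)
    alert = datetime.strptime(scom_time_local, "%m/%d/%Y %I:%M:%S %p")
    alert = alert.replace(tzinfo=PACIFIC_STANDARD)
    return abs((alert - event).total_seconds()) <= tolerance_seconds


print(same_occurrence("2013-02-14 11:23:07.997", "2/14/2013 3:23:09 AM"))
```

A fixed UTC-8 offset is used instead of a named time zone to keep the sketch dependency-free; it would be off by an hour during daylight saving time.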

and I could make the same match for several ESX hosts... but the vim.event entry is not clear: how do I identify the storage?
It is difficult to tell from the vim.event.xxx type alone that it concerns storage.
Any idea?

Thanks,
Dom

felyjos · Feb 15, 2013 5:00 pm

Hello,

Any luck on this issue? I need more information from the event in vCenter; it still sent the alerts to SCOM last night...
I am missing the name of the affected datastore and the exact 'Host cannot connect to storage Alarm' message. I have something like 'vim.event.AlarmStatusChangedEvent', but this does not say which alarm it is...
DATASTORE_ID and DATASTORE_NAME have a "NULL" value in the event table.

Thanks,
Dom

sergey.g · Feb 19, 2013 9:24 am

Hello Dom,

Previously you mentioned that you were able to find the "nworks vCenter: Storage connectivity lost on ESX host" events in the vCenter database; these events should have information about the affected datastore. When vSphere receives such an event, it triggers the alarm and you see 'vim.event.AlarmStatusChangedEvent'.

Hope this helps.

felyjos · Feb 19, 2013 4:43 pm

Let me check this alarm again...

felyjos · Feb 19, 2013 8:44 pm

Hello Sergey,

EVENT_ID: 18917078
CHAIN_ID: 18917078
EVENT_TYPE: esx.problem.net.connectivity.lost
EXTENDED_CLASS: 2
CREATE_TIME: 2/2/13 7:06
USERNAME: NULL
CATEGORY: NULL
VM_ID: NULL
VM_NAME: NULL
HOST_ID: 98499
HOST_NAME: mbesx12.ad
COMPUTERESOURCE_ID: 176
COMPUTERESOURCE_TYPE: 3
COMPUTERESOURCE_NAME: VIOPP-PROD
DATACENTER_ID: 2
DATACENTER_NAME: OPP
DATASTORE_ID: NULL
DATASTORE_NAME: NULL
NETWORK_ID: NULL
NETWORK_NAME: NULL
NETWORK_TYPE: NULL
DVS_ID: NULL
DVS_NAME: NULL

The "DATASTORE" columns are always filled with "NULL".
Any idea why?
Thanks,
Dom

sergey.g · Feb 20, 2013 10:38 am

Hm, that's esx.problem.net.connectivity.lost

Could you locate esx.problem.storage.connectivity.lost events? They should contain the name of the datastore.

Thanks.

felyjos · Feb 20, 2013 4:33 pm

Hello,

Nothing except "NULL" in the datastore name...
EVENT_TYPE CREATE_TIME HOST_NAME COMPUTERESOURCE_NAME DATASTORE_ID DATASTORE_NAME
esx.problem.storage.connectivity.lost 2/16/13 18:42 bopesx16.ad.medctr.ucla.edu VIOPP-MCO NULL NULL
esx.problem.storage.connectivity.lost 2/16/13 18:42 bopesx16.ad.medctr.ucla.edu VIOPP-MCO NULL NULL
esx.problem.storage.connectivity.lost 2/16/13 18:42 bopesx15.ad.medctr.ucla.edu VIOPP-MCO NULL NULL
esx.problem.storage.connectivity.lost 2/16/13 18:42 mbesx11.ad.medctr.ucla.edu VIOPP-PROD NULL NULL
esx.problem.storage.connectivity.lost 2/16/13 18:42 mbesx5.ad.medctr.ucla.edu VIOPP-PROD NULL NULL

Is a setting missing to capture the datastore's name?

Thanks,
Dom

felyjos · Feb 20, 2013 5:56 pm

Could it also be coming from "esx.problem.storage.redundancy.degraded"?
This morning I don't have any loss of connectivity, only this one.

sergey.g · Feb 21, 2013 12:13 pm

Hi,

Yes, I would check them too, but it's strange that there is no datastore name in connectivity lost events.

felyjos · Feb 21, 2013 2:02 pm

Hello,
I checked the lost-connectivity events again this morning, and none of them have the datastore filled in...
Thanks,
Dom
