Disk command aborts

Real-time performance monitoring and troubleshooting

by Daveyd » Thu Oct 13, 2011 8:34 pm

I am trying to figure out why we are getting sporadic disk command aborts on each of our ESX hosts. In Veeam Monitor, if I choose the Datacenter and look at the Disk tab, I see numerical values for the disk command abort counter for each of the Datastores. However, if I go into one of the Datastores in Veeam Monitor and open the Disk Issues tab, all VMs are listed, but no aborts are shown for the same time period. Is there a reason for that?
Daveyd
Expert
 
Posts: 272
Liked: 10 times
Joined: Thu May 20, 2010 4:17 pm
Full Name: Dave DeLollis

Re: Disk command aborts

by Vitaliy S. » Thu Oct 13, 2011 9:13 pm

Hi Dave, it would help if you could post some screenshots, as I don't have any disk command aborts in my lab. I am a lucky guy ;) Thanks!
Vitaliy S.
Veeam Software
 
Posts: 19566
Liked: 1104 times
Joined: Mon Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Disk command aborts

by Daveyd » Mon Oct 17, 2011 6:05 pm

Here is the past day of disk aborts viewed at the Host level...

Image



And at the Datastore level


Image

Re: Disk command aborts

by Daveyd » Mon Oct 17, 2011 6:41 pm

Also, I think the numbers are a little off on this screenshot. My maximums range from 6600 to 9281 ms, but the axis numbers show tens of thousands?

Image

Re: Disk command aborts

by Vitaliy S. » Tue Oct 18, 2011 10:31 am

Dave, the only reason I can think of is that this host was previously connected to this datastore and you have since removed the datastore from the host's storage inventory.

Basically, if you choose a particular datastore, it will scan the existing connections to the hosts and display the graph based on the results you currently have. In other words, historical data is retained within the host object, not the datastore object.

As for the Disk I/O tab, datastore latency is displayed as a stacked graph, because an arithmetic mean cannot be used as an indicator of average disk latency combined from multiple hosts.

Hope this makes sense.
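
A small sketch of why a stacked latency graph can show axis values far above any single host's maximum. This is one plausible reading of the screenshot question, not something confirmed in the thread. The host names and latency values below are hypothetical, and `stacked_top` and `per_host_max` are illustrative helpers, not part of Veeam Monitor.

```python
# Hypothetical per-host maximum latency samples (ms) for one datastore.
# Host names and values are illustrative, not taken from the thread.
latency_ms = {
    "esx01": [6600, 7200, 9281],
    "esx02": [6900, 8100, 7000],
    "esx03": [7100, 6800, 8800],
}


def stacked_top(series_by_host):
    """Return the top edge of a stacked graph: the per-sample sum across hosts."""
    return [sum(sample) for sample in zip(*series_by_host.values())]


def per_host_max(series_by_host):
    """Return the largest single value seen on each host."""
    return {host: max(values) for host, values in series_by_host.items()}


top = stacked_top(latency_ms)
print("per-host maximum:", per_host_max(latency_ms))
print("stacked top edge:", top)
```

With three hosts each peaking below 10,000 ms, the top edge of the stack already sits above 20,000 ms, so the axis has to scale to the sum rather than to the largest individual value.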

Re: Disk command aborts

by Daveyd » Fri Oct 21, 2011 6:53 pm

When I run a Trend Report to view all SCSI aborts that happened during the past week, the report only shows daily totals. I wanted to see at what times of day, over a one-week period, I was receiving the most aborts. Is that possible?

Also, any chance a real-time zoom feature will be incorporated? I am demoing other products and I LOVE the ability some have to take a graph with a day's or week's worth of data, hold down the left mouse button, highlight a section of the graph, and zoom into that specific time frame.

Re: Disk command aborts

by Vitaliy S. » Sat Oct 22, 2011 12:24 pm

Hi Dave,

Currently, this is not possible. Could you please help me to understand the use case for that? Will the corresponding alarm for SCSI aborts (with the exact time and number of aborts) be what you're looking for?

Real-time zoom would be possible only with a limited set of counters. As you may know, Veeam Monitor stores more than 60 performance metrics for each and every object, so keeping real-time data for every object and every counter might make your database unmanageable.
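
A back-of-the-envelope sketch of the storage concern above. Only the figure of 60 counters per object comes from the post. The object count (200) and the sampling intervals (20 seconds for real-time, one hour for rolled-up data) are assumptions chosen for illustration, and `samples_per_day` is a hypothetical helper.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(objects, counters_per_object, interval_seconds):
    """Number of stored data points per day at a fixed sampling interval."""
    per_counter = SECONDS_PER_DAY // interval_seconds
    return objects * counters_per_object * per_counter


# 60 counters is from the post; 200 objects and both intervals are assumptions.
realtime = samples_per_day(objects=200, counters_per_object=60, interval_seconds=20)
hourly = samples_per_day(objects=200, counters_per_object=60, interval_seconds=3600)

print(f"real-time samples per day: {realtime:,}")
print(f"hourly samples per day:    {hourly:,}")
print(f"growth factor:             {realtime // hourly}x")
```

Under these assumptions, keeping every counter at real-time resolution stores 180 times as many rows per day as hourly roll-ups, which is why limiting zoom to a few counters is the cheaper option.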

I would love to hear which counters you're most interested in, so we could incorporate this functionality into the next releases.

Thanks!

Re: Disk command aborts

by joergr » Mon Oct 24, 2011 1:17 pm

Wow, Dave, that topic seems interesting to me. Please give details about the ESXi version and patch level, the SAN system/vendor/firmware, the connection, the HBAs (it doesn't matter whether iSCSI or FC, just provide vendor, model and firmware), and the switches used for the SAN (again, it doesn't matter whether iSCSI or FC, just provide vendor, model and firmware).

If there is no HBA and you are using SW iSCSI or SW FCoE, please also report the card and the driver (for example Intel X520, dual, driver 2.0.84.9).

Please also check whether the hosts were encountering CPU peaks or very high network load during the storage latency/blackout phases.

If you don't have everything at hand, please at least tell me what you know off the top of your head.

Best regards,
Joerg
joergr
Expert
 
Posts: 377
Liked: 39 times
Joined: Tue Jun 08, 2010 2:01 pm
Full Name: Joerg Riether

Re: Disk command aborts

by Daveyd » Mon Oct 24, 2011 3:59 pm

We are currently running ESX 4.0U3. Each server is an HP DL380 G6 with 2x HP FC1142SR 4Gb FC HBAs. They are all on the latest BIOS (2.15) and latest EFI (2.2.0). The HBAs are using VMware's driver 8.02.01-k1-vmw48-4vmw. We have a pair of EMC CX4s (1 in Prod, 1 in DR) running FLARE30, I believe. We are also using EMC RecoverPoint appliances.

During the events there are no abnormal spikes in CPU or network utilization on the Hosts. Working with EMC, we think we have isolated the issue to RecoverPoint. When I see SCSI aborts in Veeam Monitor, I see corresponding disconnect alerts on the RecoverPoint appliances.

Vitaliy, I was looking for a trend report that shows hourly data for each day over one week. I wanted a report that shows, for example, that on Monday I had xxx SCSI aborts between 8-9am and xxx aborts between 7-8pm, on Tuesday I had xxx aborts between 4-5pm, and so on for the entire week. That would make it easy to see whether issues are happening during specific hours every day, specific hours every other day, etc. The trend report that the Monitor produces now just shows me that I had xxx aborts that day.

The zoom feature... On one product I am demoing, SolarWinds Virtualization Manager, I can take an active graph of, say, 7 days' worth of a specific metric such as CPU utilization and zoom in on that 7-day graph down to a specific hour on a specific day without having to open new graphs.
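
The hourly breakdown described above could be approximated outside the product if abort samples can be exported with timestamps. The sketch below assumes such an export exists; the `samples` list, its format, and the `aborts_by_hour` helper are hypothetical and are not a Veeam Monitor feature.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical exported samples: (timestamp, aborts counted in that sample).
samples = [
    ("2011-10-17 08:15", 3),
    ("2011-10-17 08:45", 2),
    ("2011-10-17 19:30", 4),
    ("2011-10-18 16:05", 1),
]


def aborts_by_hour(rows):
    """Sum abort counts into (weekday index, hour) buckets."""
    buckets = Counter()
    for stamp, count in rows:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        buckets[(when.weekday(), when.hour)] += count
    return buckets


# Print one line per non-empty hour, ordered by weekday and then hour.
for (day, hour), total in sorted(aborts_by_hour(samples).items()):
    end = (hour + 1) % 24
    print(f"{DAYS[day]} {hour:02d}:00-{end:02d}:00  {total} aborts")
```

Bucketing by weekday and hour rather than by calendar date makes a pattern such as "every day between 8 and 9am" stand out once several weeks of samples are fed in.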

Re: Disk command aborts

by joergr » Mon Oct 24, 2011 4:55 pm

Hi Dave,

Thanks. Do you see these aborts when replication takes place, or randomly? I vaguely remember a case years ago where aborts were occurring during replication, but on the DR site... let me google it...

...found, this is the old thread http://communities.vmware.com/thread/78 ... 5&tstart=0
Maybe not your scenario, but it is all that comes to mind; I am much more familiar with EqualLogic ;-)

Best regards,
Joerg

Re: Disk command aborts

by Vitaliy S. » Mon Oct 24, 2011 7:14 pm

I see... but would you like to have this ability (real-time zoom) for all performance metrics we have or only for a few of them (like CPU Usage, Memory Usage etc.)?

Re: Disk command aborts

by Daveyd » Tue Oct 25, 2011 1:37 pm

Vitaliy S. wrote:I see... but would you like to have this ability (real-time zoom) for all performance metrics we have or only for a few of them (like CPU Usage, Memory Usage etc.)?

As a customer, the more the better...as a developer, whatever is realistic :)

Re: Disk command aborts

by Daveyd » Wed Oct 26, 2011 9:32 pm

Another feature request... I ran an HTML report on all my VMs to show their read and write rates during a specific time period. While it did produce a nice report, it would be nice if the report were sortable. I would have liked to see the VMs with the highest MBps listed first, then the rest in descending order. The report shows all the VMs but in no particular order, and it's hard to look at 50 VMs and see which ones hit the highest throughput during that timeframe, since it lists both KBps and MBps.
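
Until the report is sortable, the ordering described above can be reproduced by hand once the rows are copied out of the HTML report. A minimal sketch, assuming each row carries a VM name, a rate, and a unit; the VM names and values are hypothetical, and the 1 MBps = 1024 KBps conversion is an assumption about how the report scales its units.

```python
# Hypothetical report rows: (VM name, rate value, unit). Values are illustrative.
rows = [
    ("vm-app01", 850.0, "KBps"),
    ("vm-db01", 12.5, "MBps"),
    ("vm-web01", 2.0, "MBps"),
    ("vm-file01", 300.0, "KBps"),
]

# Assumption: 1 MBps = 1024 KBps; use 1000.0 if the report uses decimal units.
KBPS_PER_UNIT = {"KBps": 1.0, "MBps": 1024.0}


def to_kbps(value, unit):
    """Convert a rate to KBps so mixed-unit rows can be compared."""
    return value * KBPS_PER_UNIT[unit]


def sort_by_rate(report_rows):
    """Return rows ordered from highest to lowest throughput."""
    return sorted(report_rows, key=lambda r: to_kbps(r[1], r[2]), reverse=True)


for name, value, unit in sort_by_rate(rows):
    print(f"{name:<10} {value:>8.1f} {unit}")
```

Normalizing to a single unit before sorting is the important step: sorting on the raw numbers would rank 850 KBps above 12.5 MBps.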

Re: Disk command aborts

by Vitaliy S. » Wed Oct 26, 2011 9:52 pm

Makes perfect sense, I agree. I hope you will be really impressed with our new performance reports that will be shipped with v6. Anyway, thanks for the feedback!

