HPE 3PAR Inform OS 3.3.1 Storage Snapshot Bug


by david.buchanan » Tue Sep 12, 2017 10:51 pm

Hi All,

Just posting this to make everyone aware that there is a bug in the current Inform OS 3.3.1 in which processes on the 3PAR get stuck/frozen during a snapshot removal and cause the system to panic.

This isn't specifically a Veeam issue; the bug can happen when any system tries to remove a snapshot.

I'm simply posting this here to make people aware of the issue in case they are looking at updating to 3.3.1. The issue does not affect versions prior to 3.3.1.

A fix is currently in the works, but there is no ETA on when it will be released. The current workaround is to switch back to hotadd backups.

Thanks,
David.
david.buchanan
Enthusiast
 
Posts: 41
Liked: 8 times
Joined: Tue Jun 02, 2015 12:44 am
Full Name: David

Re: HPE 3PAR Inform OS 3.3.1 Storage Snapshot Bug

by foggy » Thu Sep 14, 2017 4:33 pm

Hi David, thanks for sharing. I would like to add though that this is not a 100% reproducible issue, since we have 3.3.1.215 (GA) deployed in our lab and do not see such behavior. So there should be some specific circumstances in which the issue shows up.
foggy
Veeam Software
 
Posts: 15094
Liked: 1111 times
Joined: Mon Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: HPE 3PAR Inform OS 3.3.1 Storage Snapshot Bug

by david.buchanan » Thu Sep 14, 2017 10:43 pm

Hi Foggy,

Yes, you are correct. We were running storage snapshots for 3 weeks without issue before this popped up.

I'm waiting on additional details from HPE, but as I understand it, there are 3 bugs they are aware of.

Two are resolved by P7 and P11, and the 3rd, which hit us, is resolved by P12, which is not out of QA as of this post.

Below is what HPE outlined to us in our ticket with them:

The System manager (sysmgr) process is one of the main kernel processes running on our InServ and runs on the master node. We allow only one system manager process to run on the master node.
There are a couple of child processes attached to the sysmgr process, for example the pdscrubber process, responsible for chunklet relocation (servicemag process), or the tpdtcl process, responsible for running CLI commands on the InServ.

When sysmgr becomes unresponsive and no specific tasks or processes are running on the InServ (i.e., no tuning or servicemag process running) and no IOCTL blocks are pending between the controller nodes, it is usually safe to restart the system manager process.

Cause: Automation (Veeam) incorrectly issues snapshot delete requests out of order and attempts to delete RO snaps (normally hidden). This causes the two snaps to be merged from an exceptions-table point of view, but this can't be done because one of the snaps is stuck in "pending delete", which results in multiple node panics.
The snapshot removal process has been enhanced so as not to allow out-of-order snapshot removal with a pending delete, and the fix is available in the upcoming 3.3.1.MU1 with Patch 12. Patch 12 is CURRENTLY NOT available, but is undergoing Software QA testing and is expected to be available soon (subject to change). (Update from LAB)


This is still ongoing with HPE support so I'll update the post with additional details as I get them.

Thanks,
David.

Re: HPE 3PAR Inform OS 3.3.1 Storage Snapshot Bug

by Massamb » Tue Sep 19, 2017 7:37 am

Hi David, we also have 3.3.1.215 (GA)+P01,P02,P04.
HPE upgraded our production 3PAR on August 12th and so far we do not see such behavior.
From the HPE ticket description you posted, it is not clear to me which two snaps are to be merged (in a Veeam backup there is only one snap involved, right?).
Have you got more details?

Thanks.
Massimo
Massamb
Lurker
 
Posts: 2
Liked: never
Joined: Wed Mar 08, 2017 5:25 pm
Full Name: Massimo

Re: HPE 3PAR Inform OS 3.3.1 Storage Snapshot Bug

by znabela » Mon Sep 25, 2017 11:26 am

We are currently having a similar issue with Inform OS 3.3.1.215 (GA)+P01,P02 ... a lot of hung Veeam storage snapshots and a CPG that has run out of space, and we have been unable to allocate more space since Sunday morning.

Awaiting call-back from 3PAR 2nd-tier support.
znabela
Influencer
 
Posts: 13
Liked: 5 times
Joined: Thu Dec 04, 2014 7:09 am
Full Name: Robert Christiansen

Re: HPE 3PAR Inform OS 3.3.1 Storage Snapshot Bug

by Massamb » Sun Oct 15, 2017 5:32 pm

HPE has released a patch (3.3.1 MU1 P14) that may be related to this issue.
Here is the link to the release notes:

https://support.hpe.com/hpsc/doc/public ... 27034en_us

