Nimble Integration Issues

by paul.hugill » Thu Jul 20, 2017 5:42 pm

Hi Everyone,
I am interested to hear if anyone has had any issues with their Nimble integration.
The issue I am having is that once a week (or so) one of my Veeam servers will lose its storage.

In general it seems to work really well: I love how quickly the VM snapshots get removed, and the speed is much better.

We are using a bunch of Dell physical Windows servers (R720xd and NX3200) which provide both proxy and repository functions for Veeam.
At some point the server will lose access to its internal storage, and a PERCSAS2 error appears every 30 seconds in the Windows System Log:
Source: percsas2
EventID: 129
Description: Reset to device, \Device\RaidPort0, was issued.
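
A rough way to confirm that 30-second cadence from an exported copy of the System log might look like the sketch below. It assumes the events have already been loaded into `(timestamp, source, event_id)` tuples by some other means; that tuple layout, the function names, and the 5-second tolerance are all hypothetical and not part of any Veeam, Dell, or Windows tool.

```python
from datetime import datetime, timedelta


def reset_gaps(events, source="percsas2", event_id=129):
    """Return the seconds between consecutive matching events, oldest first.

    `events` is an iterable of (timestamp, source, event_id) tuples.
    This layout is an assumption made for illustration only.
    """
    times = sorted(
        t for t, s, e in events if s.lower() == source and e == event_id
    )
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def looks_periodic(gaps, period=30.0, tolerance=5.0):
    """True if there is at least one gap and every gap is near `period`."""
    return bool(gaps) and all(abs(g - period) <= tolerance for g in gaps)


if __name__ == "__main__":
    # Made-up sample data, not taken from a real log.
    start = datetime(2017, 7, 20, 12, 0, 0)
    sample = [(start + timedelta(seconds=30 * i), "percsas2", 129) for i in range(4)]
    sample.append((start, "disk", 7))  # unrelated event, ignored by the filter
    gaps = reset_gaps(sample)
    print(gaps, looks_periodic(gaps))
```

If the gaps stay close to 30 seconds for the whole outage, that would be consistent with a repeated retry of the same reset rather than unrelated errors.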

Essentially this seems to hang all the storage on the server (DAS and iSCSI LUNs) and all the jobs that are using that server as a proxy will just get stuck.
The only way to recover is to reboot the affected server, which has the effect of terminating the running tasks (they won't gracefully terminate though).

I started off by checking the server firmware, given the RaidPort0 errors in the logs. We have updated all of it, but the issue still happened on a server with the updated firmware.
Ten days ago I disabled 'Backup from Storage Snapshots' on the jobs that had it enabled, and I have not had the issue since.
Although I can't claim this has definitely fixed it, it does point me towards the Nimble integration somehow.

My current thinking is that it could be related to having Compellent LUNs presented to the same servers for DirectSAN backups, so I am looking into that.
I have a case open with Nimble but have not opened one with Veeam yet, though I may do.

Anyone else seen anything similar?

Thanks
Paul

Re: Nimble Integration Issues

by EugeneK » Thu Jul 20, 2017 6:03 pm

Hi Paul,

I haven't experienced it, not yet anyways.
Considering the whole system gets halted, I'm not sure it is due to any particular storage integration, but PERCSAS2 errors generally imply driver issues with the storage. That said, I'd check whether the Nimble integration package is the latest available for your OS and compatible with the firmware version used on the Nimble array.
The mass storage hang may also be the result of a malfunctioning network connection, where all outstanding I/O has to be queued up, which leads to additional resource consumption and an eventual crash of the services. It might be just another area to check.
Eugene K
Product Architect @ SingleHop - Veeam Platinum Service Provider
http://www.singlehop.com
VCAP-DCD, VCAP-DCA, VCP-NV
Veeam Certified Architect