baremetalbrisbane
Service Provider
Posts: 5
Liked: never
Joined: Dec 15, 2016 1:36 am
Full Name: Mark Pollard

Multiple backup copy jobs stalling

Post by baremetalbrisbane »

Hi

Support ID 02622569

We have a single repository server with 20 SSD extents in a SOBR config and 6 GFS HDD extents, also in a SOBR config.

The issue we are having is that at some point during the run of our 50 BCJs (around 10-15 are active at any one time), every active job goes into "waiting for backup repository availability".

On the repository there is no CPU activity or disk I/O, and the jobs stay in this state until they are disabled and re-enabled (we waited 15 hours).

This doesn't seem to be a task concurrency issue, as it has happened when only 3 jobs were running. SSD extents are set to 2 tasks each and HDD extents to 2 tasks each. This only happens on GFS jobs. Both storage tiers are locally connected to the same VM (using mount points).

8 vCPUs and 20 GB of RAM are provisioned (neither of which is overutilised).

Mark
Dima P.
Product Manager
Posts: 14396
Liked: 1568 times
Joined: Feb 04, 2013 2:07 pm
Full Name: Dmitry Popov
Location: Prague

Re: Multiple backup copy jobs stalling

Post by Dima P. »

Hello Mark.

What policy was configured for this scale-out backup repository? Thank you.
baremetalbrisbane

Re: Multiple backup copy jobs stalling

Post by baremetalbrisbane »

Both scale-out repositories are set to the data locality policy.
baremetalbrisbane

Re: Multiple backup copy jobs stalling

Post by baremetalbrisbane »

I should also add that the volumes are mounted via mount points to NFS.
veremin
Product Manager
Posts: 20271
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: Multiple backup copy jobs stalling

Post by veremin »

With the data locality policy, the backup server tries to place all restore points on one extent. It may be that two slots had already been occupied on the given extent, either by other jobs or by a different operation such as a restore, so when the backup copy job ran, it didn't have a spare slot to proceed. Thanks.
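
To make the slot reasoning above concrete, here is a minimal toy model in Python. It is only a sketch and not Veeam's actual scheduler: the `Extent` class, the job names and the extent names are all hypothetical, and the only facts taken from this thread are the limit of 2 tasks per extent and the idea that, under data locality, a job is tied to one extent. It shows how a job can sit in "waiting for backup repository availability" while other extents in the same scale-out repository still have free slots.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    """Hypothetical stand-in for one SOBR extent with a task limit."""
    name: str
    max_tasks: int = 2                      # per-extent limit quoted in the thread
    active: list = field(default_factory=list)

    def has_free_slot(self) -> bool:
        return len(self.active) < self.max_tasks


def try_start(job: str, home: Extent) -> str:
    """Assumed data-locality behaviour: the job may only use its home extent."""
    if home.has_free_slot():
        home.active.append(job)
        return "running"
    return "waiting for backup repository availability"


def simulate():
    # Three illustrative HDD extents (the real setup in this thread has six).
    extents = {name: Extent(name) for name in ("hdd1", "hdd2", "hdd3")}

    # Hypothetical requests: (task name, extent that already holds its restore points).
    requests = [
        ("copy-A", "hdd1"),
        ("restore-X", "hdd1"),   # a restore also occupies a slot
        ("copy-B", "hdd1"),      # third task on hdd1: no slot left
        ("copy-C", "hdd2"),
    ]

    states = {job: try_start(job, extents[home]) for job, home in requests}

    # Slots still free across the whole repository, even though copy-B is waiting.
    free_slots = sum(e.max_tasks - len(e.active) for e in extents.values())
    return states, free_slots


if __name__ == "__main__":
    states, free_slots = simulate()
    for job, state in states.items():
        print(f"{job}: {state}")
    print(f"free slots across all extents: {free_slots}")
```

In this model copy-B waits although three slots are free on the other extents, which is the situation described above. It accounts for an individual job waiting on a busy extent; it does not by itself explain every active job stalling while the repository shows no CPU or disk activity.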