baremetalbrisbane
Service Provider
Posts: 4
Liked: never
Joined: Dec 15, 2016 1:36 am
Full Name: Mark Pollard
Contact:

Multiple backup copy jobs stalling

Post by baremetalbrisbane » Mar 08, 2018 6:31 am

Hi

Support ID 02622569

We have a single repo server with 20 SSD extents in a SOBR config and 6 GFS HDD extents, also in a SOBR config.

The issue we are having is that at some point during the run of our 50 BCJs (around 10-15 are active at any one time), every active job goes into "waiting for backup repository availability".

On the repository there is no CPU activity or disk I/O, and the jobs stay like this until they are disabled and re-enabled (we waited 15 hours).

This doesn't seem to be a task concurrency issue, as it has happened when only 3 jobs were running. SSD extents are set to 2 tasks each and HDD extents to 2 each. This only happens on GFS jobs. Both storage tiers are locally connected to the same VM (using mount points).
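As a rough check on the concurrency figures in this post, the aggregate task limits per tier can be worked out with simple arithmetic. This is only a sketch in Python: the variable and function names are illustrative, not Veeam settings, and it multiplies the numbers quoted above without modelling how slots are actually handed out.

```python
# Figures quoted in the post; names are illustrative, not Veeam options.
SSD_EXTENTS, SSD_TASKS_PER_EXTENT = 20, 2
HDD_EXTENTS, HDD_TASKS_PER_EXTENT = 6, 2


def total_slots(extents: int, tasks_per_extent: int) -> int:
    """Aggregate concurrent-task limit across all extents of one tier."""
    return extents * tasks_per_extent


ssd_slots = total_slots(SSD_EXTENTS, SSD_TASKS_PER_EXTENT)
hdd_slots = total_slots(HDD_EXTENTS, HDD_TASKS_PER_EXTENT)

print(f"SSD tier: {ssd_slots} slots, HDD tier: {hdd_slots} slots")
```

With only 3 jobs running, the aggregate limit on either tier is nowhere near exhausted, which is why the total alone does not explain the stall. The per-extent limit of 2 is the tighter constraint.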

8 vCPUs and 20 GB of RAM are provisioned (neither of which is over-utilised).

Mark

Dima P.
Product Manager
Posts: 10643
Liked: 874 times
Joined: Feb 04, 2013 2:07 pm
Full Name: Dmitry Popov
Location: Prague
Contact:

Re: Multiple backup copy jobs stalling

Post by Dima P. » Mar 08, 2018 3:31 pm

Hello Mark.

What policy was configured for this scale-out backup repository? Thank you.


Re: Multiple backup copy jobs stalling

Post by baremetalbrisbane » Mar 08, 2018 10:16 pm

Both scale-out repositories are set to data locality.


Re: Multiple backup copy jobs stalling

Post by baremetalbrisbane » Mar 09, 2018 2:47 am

I should also add that the volumes are mounted via mount points to NFS.

veremin
Product Manager
Posts: 17017
Liked: 1462 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Multiple backup copy jobs stalling

Post by veremin » Mar 09, 2018 1:33 pm

With the data locality policy, the backup server tries to place all restore points on one extent. Maybe two slots had already been occupied on the given extent, either by other jobs or by a different operation such as a restore, so when the backup copy job ran, it didn't have a spare slot to proceed. Thanks.
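The scenario described above can be illustrated with a toy model. This is a minimal sketch in Python, not Veeam's scheduler: the `Extent` class, the `start_job` function, the extent names and the job-to-extent mapping are all hypothetical, and the real product applies more rules than this. It assumes each job is tied to the one extent that already holds its restore points, and that each HDD extent allows 2 concurrent tasks as stated in the thread.

```python
from dataclasses import dataclass

WAITING = "waiting for backup repository availability"


@dataclass
class Extent:
    name: str
    task_limit: int
    busy: int = 0

    def try_acquire(self) -> bool:
        """Take one task slot if any is free on this extent."""
        if self.busy < self.task_limit:
            self.busy += 1
            return True
        return False


def start_job(job: str, pinned: dict, extents: dict) -> str:
    """Toy data-locality placement: a job may only use its pinned extent."""
    extent = extents[pinned[job]]
    return "running" if extent.try_acquire() else WAITING


# Six HDD extents with 2 task slots each, as in the thread.
extents = {f"hdd{i}": Extent(f"hdd{i}", task_limit=2) for i in range(1, 7)}

# Hypothetical mapping: three jobs whose chains all live on the same extent.
pinned = {"bcj-a": "hdd1", "bcj-b": "hdd1", "bcj-c": "hdd1"}

states = {job: start_job(job, pinned, extents) for job in pinned}
free_slots = sum(e.task_limit - e.busy for e in extents.values())

for job, state in states.items():
    print(job, "->", state)
print("free slots across the tier:", free_slots)
```

In this model the third job waits even though most slots across the tier are still free, because none of them are on the extent it is tied to. That matches the observation that the stall can appear with only 3 jobs running.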
