Discussions related to using object storage as a backup target.
poulpreben
Certified Trainer
Posts: 1024
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

v10 Cloud Tier questions

Post by poulpreben » 1 person likes this post

As part of case #03893045, we upgraded one of our customers to v10. It was an opportunity for us to enable the capacity tier in copy mode.

It has generated some questions:
  1. Is there a way to control the initial synchronization? There is currently 100 TB of data in the SOBR, and the objective is a full 1:1 relationship between the cloud and on-prem storage tiers (3-2-1 rule and all...). Initially, one tiering job started, including all objects in the SOBR. After the daily backup, a plethora of additional tiering jobs started. It seems to start one offload job per OIB? Considering it will take 5-10 days to complete the initial sync, could you please explain how this is supposed to work? Is it expected that for a 1,000 VM environment, 1,000 offload jobs will start on every day that the offload copy still has a backlog?
  2. Since last night, it has copied 5 TB of data, and there are currently about 22.5M objects in the bucket. Extrapolated to the full data set, this particular customer is approaching 1B objects in a single bucket. What are your thoughts on bucket sharding? Is it planned that a SOBR can write to multiple buckets in order to maintain optimal bucket performance?
Thanks,
Preben.
HannesK
Product Manager
Posts: 14314
Liked: 2889 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: v10 Cloud Tier questions

Post by HannesK »

Hi Preben,
I just checked the case number and it looks like an RMAN case? Does that have anything to do with your question?

1) Upload is controlled by the repository task settings. One repository task is consumed per concurrent VM upload to object storage. Backup has higher priority than object storage upload, so uploads are paused if a backup starts and requires all task slots; upload always waits for free task slots (see the sketch at the end of this post). The second way to control upload is through the network traffic rules.

2) No, it's not planned for the near future. Do you see performance degradation with many objects in one bucket? If yes, which service provider are you using?
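
To illustrate point 1, here is a minimal Python sketch of how I would picture the task slot behaviour: backups and offloads draw from the same pool of repository task slots, and an offload simply waits until a slot is free. It does not model the priority of backup tasks over offload tasks, and all names and numbers are made up for the example.

Code:
import threading
import time

# Purely illustrative: backups and offloads compete for the same pool of
# repository task slots, and an offload waits until a slot is free.
# The real scheduler also prioritises backup tasks over offload tasks,
# which this simple semaphore does not model.
MAX_TASKS = 4  # example value for the repository's concurrent task limit
task_slots = threading.BoundedSemaphore(MAX_TASKS)

def backup_vm(name):
    with task_slots:  # a backup task holds a slot for its whole duration
        print(f"backup  {name}: running")
        time.sleep(2)
    print(f"backup  {name}: done, slot freed")

def offload_vm(name):
    with task_slots:  # offload also needs a free slot; otherwise it waits
        print(f"offload {name}: uploading to capacity tier")
        time.sleep(1)

threads = [threading.Thread(target=backup_vm, args=(f"vm{i}",)) for i in range(4)]
threads += [threading.Thread(target=offload_vm, args=(f"vm{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()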

Best regards,
Hannes
poulpreben
Certified Trainer
Posts: 1024
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: v10 Cloud Tier questions

Post by poulpreben »

HannesK wrote: Does that have anything to do with your question?
Not at all. I just wanted to note that this is a customer deployment that was upgraded prior to GA, as instructed by support.

1) Understood. Can you please explain what triggers a new SOBR Offload job? Is it one per object in a backup job?
1a) We currently see >80 active SOBR offload sessions. Will we see an additional 80 tomorrow, assuming the backlog of the initial copy has not yet completed? Is this a concern? Again, I am just trying to wrap my head around the 1,000+ VM scenario. I guess 1,000 jobs could easily consume a couple of hundred GB of RAM on the VBR server.

2) We are the provider, and our backend is Ceph, S3-compatible with object locking, etc. I've read examples of Ceph users running buckets with >100M objects, but for >1B objects I have only seen suggestions for optimizations. We also have customers using other on-prem object storage platforms, including Scality and IBM Spectrum Scale; both vendors give rather vague recommendations for the maximum number of objects per bucket, in the ballpark of 100M.
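
For reference, this is roughly how one could keep an eye on per-bucket object counts from the S3 side. It is only a rough boto3 sketch: the endpoint URL, bucket and prefix are placeholders, and credentials are assumed to come from the usual boto3 sources. For buckets of this size a full listing becomes slow, so on Ceph something like radosgw-admin bucket stats is the cheaper way to get the count; the sketch just shows per-prefix counting.

Code:
import boto3

# Rough monitoring sketch: count objects under a prefix via ListObjectsV2.
# Endpoint URL and bucket name are placeholders; credentials are taken from
# the usual boto3 sources (environment variables, ~/.aws/credentials, ...).
s3 = boto3.client("s3", endpoint_url="https://s3.example.internal")

def count_objects(bucket: str, prefix: str = "") -> int:
    """Page through ListObjectsV2 and sum up the key counts."""
    total = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        total += page.get("KeyCount", 0)
    return total

# Example call; the bucket and prefix names are hypothetical.
print(count_objects("veeam-capacity-tier", prefix="Veeam/"))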
HannesK
Product Manager
Posts: 14314
Liked: 2889 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: v10 Cloud Tier questions

Post by HannesK »

1) As soon as one VM finishes, the offload starts (if task slots are free). So it's per object; it does not wait until the whole job finishes.
1a) In general, I don't expect issues from the number of concurrent offload sessions. A repository task should be backed by 2 GB of memory, so with ~80 tasks I would expect up to 160 GB of RAM usage on the repository (see the back-of-the-envelope calculation at the end of this post). If you see higher usage, that would be interesting for a support case.

2) Thanks for explaining the per-bucket object count recommendations for these systems. That feature request makes sense to me (although we also expect object storage providers to improve their capabilities, and new hardware generations will help as well).
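
For completeness, the sizing rule of thumb from 1a as a quick back-of-the-envelope calculation in Python. It assumes, as described above, that roughly 2 GB applies per task slot actually in use; queued offload sessions are waiting for a free slot and are not counted. The task counts are just examples.

Code:
# Back-of-the-envelope repository RAM estimate: ~2 GB per concurrent task slot.
# Assumption: queued offload sessions wait for a free slot and do not add to
# this figure. The task counts below are just examples.
GB_PER_TASK = 2

def repo_ram_estimate(concurrent_tasks: int) -> int:
    return concurrent_tasks * GB_PER_TASK

for tasks in (40, 80, 160):
    print(f"{tasks:>4} concurrent tasks -> ~{repo_ram_estimate(tasks)} GB RAM on the repository")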