We are attempting to benchmark restore operations from our Cloud Object Storage. During our tests, we disable all Performance Tier extents within a SOB. As expected, Veeam issues a warning that it "Cannot find backup files in performance tier" and indicates that it will be "Restoring from capacity tier". After firing off multiple concurrent restore tasks, we start seeing Veeam indicate "Resource not ready: Backup repository".
9/8/2020 10:53:32 AM Warning Cannot find backup files in performance tier
9/8/2020 10:53:32 AM Restoring from capacity tier
9/8/2020 10:54:21 AM Preparing backup from capacity tier
9/8/2020 10:54:21 AM Preparing backup from capacity tier
9/8/2020 10:54:32 AM Starting restore job
9/8/2020 10:54:33 AM Restoring from DAL10 Archive to US East
9/8/2020 10:54:33 AM Locking required backup files
9/8/2020 10:54:38 AM Queued for processing at 9/8/2020 10:54:38 AM
9/8/2020 10:54:38 AM Waiting for the next task
9/8/2020 10:54:38 AM Resource not ready: Backup repository
I checked the properties on the COS repository and on the SOBR, but I don't see anywhere that the number of concurrent tasks can be specified for a COS repository.
Is this configurable?
What is the default number of supported concurrent tasks?
That is correct; however, in this case only an object storage repository extent is involved, which does not have task slot settings. @veremin, any idea what might be going on here?
The registry values posted by Mark are irrelevant here; they operate at a much deeper level than resource scheduling and task slots. Essentially, they are settings used within each task, while the status above is specifically about waiting for a task slot.
Number of VMs restored from Capacity Tier simultaneously = Number of parallel tasks specified on all Performance extents combined
And if you wonder what happens if there is no Performance Tier or Scale-Out Backup Repository at all (the Import Backup from object storage scenario), then this number is always equal to 4. It is not configurable at the moment, but we will add a corresponding registry key in the next product release.
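To illustrate the behavior described above, here is a minimal sketch (my own model, not Veeam code; the constant name and the use of a semaphore are assumptions) of how a hard-coded limit of 4 task slots caps concurrency no matter how many restore tasks are started:

```python
import threading
import time

# Hypothetical model of the fixed concurrency limit in the
# "Import Backup from object storage" scenario described above.
IMPORT_TASK_SLOTS = 4
slots = threading.BoundedSemaphore(IMPORT_TASK_SLOTS)

active = 0
peak = 0
lock = threading.Lock()

def restore_task(disk_id):
    global active, peak
    with slots:  # blocks while all 4 slots are taken
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # stand-in for the actual restore work
        with lock:
            active -= 1

# Fire off 10 concurrent restore tasks; only 4 run at a time.
threads = [threading.Thread(target=restore_task, args=(i,)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)  # peak concurrency can never exceed 4
```

The remaining tasks sit in the queue, which matches the "Waiting for the next task" status seen in the log above.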
I think it should be a task slot setting on the object storage repository instead? Just like we have it on all other repository types.
I think this setting is equally important for all storage devices. For example, we have it for a NAS with an SMB interface, but not for the same NAS with an object storage interface, even though it is the same NAS with the same need in both cases: making sure it's not overloaded with too many concurrent tasks.
We can make this setting optional, behind a check box, since cloud object storage is pretty much infinitely scalable (at least from the perspective of one particular customer). If the limit is not selected, we can treat the repository as having unlimited slots. This is safe because, in any case, we can never exceed the sum of the task slots of all Performance Tier extents (for offload/download), or the task slots of all available proxy servers (for restores).
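The proposal above can be sketched as a small helper (hypothetical names; purely illustrative of the reasoning, not a real Veeam setting): the effective concurrency is the optional repository limit, if set, capped by the upstream task slots that bound it anyway.

```python
import math

# Hypothetical: effective concurrency against an object storage repository
# when the repository-level task limit is optional (checkbox unticked = None).
def effective_limit(repo_limit, upstream_slots):
    # upstream_slots = sum of Performance Tier extent slots (offload/download)
    # or sum of proxy server task slots (restores)
    cap = repo_limit if repo_limit is not None else math.inf
    return min(cap, upstream_slots)

print(effective_limit(None, 8))  # 8: "unlimited" is still bounded upstream
print(effective_limit(4, 8))     # 4: explicit limit wins when it is lower
```

This shows why leaving the limit unset is safe: the upstream task slots always provide a finite ceiling.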
A task is always a disk throughout the product, but @veremin posted above that in this particular case it is a VM, which, if true, is another odd inconsistency that we will need to address.
You can't improve performance by playing with this key, if that is what you're asking, as the default value is already very high. This key exists to reduce performance in case a storage device can't keep up with the load. That's not usually a problem with public clouds, but some on-prem object storage devices are fairly weak. If you noticed, that is exactly why it was shared in the topic you picked it up from.
I apologize for providing misleading information; unfortunately, there was a misunderstanding between me and the QA team. To simplify the example, the VM was assumed to have 1 virtual disk, and at some point VM and virtual disk got mixed up, which led me to say VM instead of virtual disk.
Later on, the development team chimed in and clarified that it's not the number of parallel tasks specified on all Performance Tier extents, but rather the number of parallel tasks specified on the Performance Tier extent where the original restore point resides, i.e. the one selected in the restore wizard. That extent may hold the restore point plus metadata (copy mode) or just metadata (move mode).
Number of disks restored from the Capacity Tier = Number of task slots available on the Performance Tier extent where the original restore point resides
Example:
VM1 and VM2 each have 2 virtual disks
VM1 and VM2 were moved to Capacity Tier
Performance Extents were disabled
VM1 and VM2 were selected for restoration
Restore point selected for VM1 resides on Performance Extent 1 (with 2 task slots available)
Restore point selected for VM2 resides on Performance Extent 2 (with 1 task slot available)
VM1 virtual disk 1, virtual disk 2, VM2 virtual disk 1 will be restored in parallel
VM2 virtual disk 2 will wait until VM2 virtual disk 1 is restored
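The example above can be modeled with a few lines (a hypothetical sketch of the scheduling rule as I understand it, not Veeam's actual scheduler): each disk restore consumes a task slot on the Performance Tier extent that holds the selected restore point.

```python
# Task slots per Performance Tier extent, as in the example above.
extent_slots = {"Extent1": 2, "Extent2": 1}

# (vm, disk, extent holding the selected restore point)
disks = [
    ("VM1", "disk1", "Extent1"),
    ("VM1", "disk2", "Extent1"),
    ("VM2", "disk1", "Extent2"),
    ("VM2", "disk2", "Extent2"),
]

running, waiting = [], []
in_use = {e: 0 for e in extent_slots}
for vm, disk, extent in disks:
    if in_use[extent] < extent_slots[extent]:
        in_use[extent] += 1  # slot free: this disk restores immediately
        running.append(f"{vm}/{disk}")
    else:
        waiting.append(f"{vm}/{disk}")  # no slot on that extent: queued

print(running)  # ['VM1/disk1', 'VM1/disk2', 'VM2/disk1']
print(waiting)  # ['VM2/disk2']
```

Note that VM2's second disk waits even though Extent 1 still has no spare capacity to lend it; the slot count of the extent holding its restore point is what matters.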
Any reason not to make it unlimited? You should be able to start as many restore tasks as you like, since the source is infinitely scalable in the case of public cloud...
Let's say we lose a datacenter. At an alternate location, we spin up a new Veeam server. We create a new SOBR and add the old COS endpoint plus some new Performance Tier extents. Then we rescan the SOBR to pick up the COS backups. If we kick off a bunch of restores, what will Veeam use as the concurrent task limit? Will it be 4, because there are no extents on which the original restore points resided? Or will it be the sum of the concurrent task settings from the new extents?
It will be the sum of the concurrent task settings from the new extents of the new SOBR. Rescan creates backup file stubs in the new extents, so at that point they hold the "original restore point", as you called it.