Discussions related to using object storage as a backup target.
edh
Veeam Legend
Posts: 368
Liked: 115 times
Joined: Nov 02, 2020 2:48 pm
Full Name: Manuel Rios
Location: Madrid, Spain
Contact:

s3.PutObjectRetention millions request per hour.

Post by edh »

Hi,

As we moved to S3 storage, we ran into another issue: millions of "s3.PutObjectRetention" requests per hour.

It looks like a metadata lock-in refresh for immutability, but these small requests (around 254 B each) saturate the whole configuration.

Is there any way to delay or throttle them? They can prevent backups from being uploaded during these "nightmare hours"...

Our S3 backend is a distributed MinIO cluster; we tested PUTs with normal objects at up to 10-13 Gbps...

Regards
Manuel
Service Provider | VMCE
Gostev
Chief Product Officer
Posts: 32230
Liked: 7592 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: s3.PutObjectRetention millions request per hour.

Post by Gostev »

Hi, I recommend you check with MinIO on this.

Only the actual storage can know if it's overloaded, and it's trivial for the storage to delay/throttle processing of "offending" requests to address this temporary overload. Alternatively, the object storage should respond with a 503 Slow Down HTTP error code, which will make Veeam engage its exponential backoff algorithm for retries. All of this is industry-standard S3 stuff, and most object storage solutions use one of these approaches or a combination of both.
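The 503 Slow Down handling described here can be sketched as a small client-side retry loop (a hypothetical illustration of the general exponential-backoff pattern, not Veeam's actual implementation; `do_put` is a made-up stand-in for the real S3 call):

```python
import random
import time

def put_with_backoff(do_put, max_retries=6, base_delay=1.0, cap=60.0):
    """Retry a PUT-style call with capped exponential backoff and jitter.

    do_put is a hypothetical callable returning an HTTP status code.
    Retries on 503 (Slow Down); any other status is returned as-is.
    """
    for attempt in range(max_retries):
        status = do_put()
        if status != 503:
            return status
        # Backoff window doubles each attempt (1s, 2s, 4s, ...), capped,
        # with full jitter to avoid synchronized retry storms.
        time.sleep(random.uniform(0, min(cap, base_delay * 2 ** attempt)))
    return 503  # still throttled after exhausting retries

# Simulated storage that throttles twice, then accepts the request.
responses = iter([503, 503, 200])
print(put_with_backoff(lambda: next(responses), base_delay=0.01))  # 200
```

The point is exactly what's described above: the storage signals overload and the client paces itself. The delays and retry count here are made-up parameters.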

Just to be clear, what do you mean by "normal objects"?

Re: s3.PutObjectRetention millions request per hour.

Post by edh »

Thanks Gostev.

For us, "normal size" means 1 MB to 4 MB objects :), but these PUT requests are only 254 bytes. Hmm, if Veeam handles the retries... maybe I can implement something around Nginx to force a throttle...
Service Provider | VMCE

Re: s3.PutObjectRetention millions request per hour.

Post by Gostev »

Got it. This PUT request does not actually put any data; it instructs the object storage to extend immutability on existing objects, so it only needs to pass metadata.
I've just asked the devs for the full list of HTTP error codes that trigger the exponential backoff algorithm; maybe it will help with your workaround idea.
Mildur
Product Manager
Posts: 10316
Liked: 2754 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: s3.PutObjectRetention millions request per hour.

Post by Mildur »

Hello

I moved this topic to our object storage subforum.

Best,
Fabian
Product Management Analyst @ Veeam Software

Re: s3.PutObjectRetention millions request per hour.

Post by Gostev » 1 person likes this post

The following HTTP error codes trigger the exponential backoff algorithm in Veeam:

Code:

Error 400 with one of the following statuses:
Request Timeout, ExpiredToken, TokenRefreshRequired

Error 402 /* Service Quotas Exceeded */

Error 403 with one of the following statuses:
ConnectionLimitExceeded, Throttle, RequestTimeTooSkewed

Error 408 /* Request Timeout */

Error 409 with one of the following statuses:
OperationAborted

Error 429 with one of the following statuses:
SlowDown, CloudKmsQuotaExceeded

Error 500 /* Internal Server Error */

Error 502 /* Bad Gateway */

Error 503 /* Service Unavailable, Slow Down */

Error 504 /* Gateway Time-out */
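The list above can be rendered as a small lookup table (an illustrative sketch, not Veeam's code). One detail worth noticing: per the list, a plain 503 qualifies on its own, while a 429 only counts with a SlowDown or CloudKmsQuotaExceeded status.

```python
import random

# (HTTP code, status) pairs that trigger backoff, per the list above.
# A status of None means the code alone is enough.
RETRYABLE = {
    (400, "Request Timeout"), (400, "ExpiredToken"), (400, "TokenRefreshRequired"),
    (402, None),
    (403, "ConnectionLimitExceeded"), (403, "Throttle"), (403, "RequestTimeTooSkewed"),
    (408, None),
    (409, "OperationAborted"),
    (429, "SlowDown"), (429, "CloudKmsQuotaExceeded"),
    (500, None), (502, None), (503, None), (504, None),
}

def should_back_off(code, status=None):
    return (code, None) in RETRYABLE or (code, status) in RETRYABLE

def backoff_delay(attempt, base=1.0, cap=60.0):
    # Capped exponential backoff with full jitter (assumed shape, for illustration).
    return random.uniform(0, min(cap, base * 2 ** attempt))

assert should_back_off(503)
assert should_back_off(429, "SlowDown")
assert not should_back_off(429)   # bare 429 without a listed status
assert not should_back_off(404)
```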

Re: s3.PutObjectRetention millions request per hour.

Post by edh » 1 person likes this post

Hi,

Just an update: after several hours, we're rolling back to immutable block storage. There's no elegant way to accomplish this without moving to QLC SSD drives and spending several hundred thousand euros (~€200K) on new storage.

Stats of a distributed cluster of over 1 PB, fully saturated by small requests. The same cluster without immutable buckets can handle 10 Gbps, as tested with Veeam VBO 365.

Code:

Duration: 11m20s ▱▱▱
RX Rate:↑ 5.2 GiB/m
TX Rate:↓ 107 MiB/m
RPM    :  5026.2
-------------
Call                      Count          RPM     Avg Time  Min Time  Max Time  Avg TTFB  Max TTFB  Avg Size     Rate /min    Errors
s3.PutObject              25037 (44.0%)  2209.5  33.152s   2.866s    4m36s     33.152s   4m36s     ↑2.4M ↓1B    ↑5.2G ↓3.2K  0
s3.PutObjectRetention     23733 (41.7%)  2094.4  15.131s   3.026s    38.324s   15.131s   38.324s   ↑254B        ↑520K        0
s3.HeadObject             3012 (5.3%)    265.8   111.1ms   952µs     24.522s   111ms     24.522s   ↑121B        ↑31K         0
s3.GetObject              2011 (3.5%)    177.5   725.5ms   543µs     19.232s   691.2ms   19.232s   ↑121B ↓606K  ↑21K ↓105M   0
s3.DeleteObject           1397 (2.5%)    123.3   1.195s    962µs     30.508s   1.195s    30.508s   ↑121B        ↑15K         0
s3.ListObjectVersions     1022 (1.8%)    90.2    4.18s     1.9ms     35.586s   4.18s     35.586s   ↑121B ↓8.4K  ↑11K ↓753K   0
s3.ListObjectsV2          411 (0.7%)     36.3    9.76s     1.9ms     5m5s      9.76s     5m5s      ↑121B ↓38K   ↑4.3K ↓1.3M  0
s3.GetBucketLocation      197 (0.3%)     17.4    799µs     589µs     3.8ms     780µs     3.8ms     ↑121B ↓128B  ↑2.1K ↓2.2K  0
s3.DeleteMultipleObjects  87 (0.2%)      7.7     3.342s    4.1ms     30.318s   3.342s    30.318s   ↑1.6K ↓116B  ↑13K ↓890B   0
s3.HeadBucket             22 (0.0%)      1.9     696µs     607µs     1.2ms     656µs     1.1ms     ↑121B        ↑234B        0
s3.ListObjectsV1          18 (0.0%)      1.6     3.051s    1.202s    10.067s   3.051s    10.067s   ↑121B ↓315B  ↑192B ↓500B  0
s3.ListBuckets            8 (0.0%)       0.7     1.7ms     1.3ms     3ms       1.6ms     3ms       ↑159B ↓3.7K  ↑112B ↓2.6K  0

We tried a workaround with Nginx map and limit_req zone configs, returning a 429 error to force throttling at the Veeam agents / VBRs, but it doesn't work.
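For reference, the kind of Nginx workaround described here could look roughly like the sketch below (hypothetical and untested; zone names, rates, and the upstream address are made up). Per the error-code list earlier in the thread, a plain 503 triggers Veeam's backoff on its own, whereas a bare 429 without a SlowDown status does not, which may explain why returning 429 had no effect:

```nginx
# Hypothetical sketch: throttle only PutObjectRetention calls at an Nginx
# reverse proxy in front of MinIO. Requests with an empty key are not
# rate-limited by limit_req, so only the ?retention subresource is throttled.
map $request_uri $retention_key {
    default          "";
    "~[?&]retention" $binary_remote_addr;
}

limit_req_zone $retention_key zone=retention:10m rate=100r/s;

upstream minio_backend {
    server 10.0.0.1:9000;   # hypothetical MinIO node
}

server {
    listen 9000;
    location / {
        limit_req zone=retention burst=200;
        limit_req_status 503;   # plain 503 is on Veeam's backoff list
        proxy_pass http://minio_backend;
    }
}
```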

From our perspective, the main problem comes when a task finishes and Veeam starts applying the retention logic, updating the metadata of every object in the backup.
These small sub-1 KB requests saturate I/O at the physical layer and prevent other customers from processing their s3.PutObject calls to the repository, creating a global stall across the repository.
Just 1 TB at 4 MB objects is about 250,000 requests, but since we can't set or force the customer block size, it's normally 1 MB, i.e. about 1 million requests per TB stored in immutable S3. At the proposed scale, 1 PB would be around 1,000,000,000 requests, overloading any available I/O; by simple math, 1 PB would take about 55 hours to have its metadata fully updated.
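The arithmetic above checks out under a simple per-object assumption (one PutObjectRetention call per stored object; the sustained 5,000 requests/s rate below is an assumed figure chosen to reproduce the ~55-hour estimate, not a measured number):

```python
# Back-of-the-envelope check of the numbers above.
TB = 1000 ** 4          # decimal terabyte, in bytes
PB = 1000 * TB

def retention_requests(stored_bytes, object_size_bytes):
    # One PutObjectRetention call per stored object (assumption).
    return stored_bytes // object_size_bytes

# 1 TB at 4 MB objects -> 250,000 requests; at 1 MB -> 1,000,000.
assert retention_requests(TB, 4 * 1000**2) == 250_000
assert retention_requests(TB, 1 * 1000**2) == 1_000_000

# 1 PB at 1 MB objects -> 1e9 requests; at an assumed 5,000 req/s
# that is 1e9 / 5000 / 3600 ≈ 55.6 hours to refresh all metadata.
reqs = retention_requests(PB, 1000**2)
assert reqs == 1_000_000_000
hours = reqs / 5000 / 3600
assert 55 < hours < 56
```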

Of course, this doesn't happen with NVMe options, but NVMe still isn't affordable in €/GB terms, at least for our market.

I checked how VBO works, and it looks like retention is implemented at the repo level, applied at a scheduled time.

We tried this setup to offer our customers a more scalable solution with "instant options" for restore, but it doesn't work, at least at our (small) scale.

How could it be improved? I don't know, but since Veeam already creates system and capacity XML files with SOSAPI, maybe instead of using s3.PutObjectRetention it could create a hidden XML file that updates the full object locking in a single PUT.
Or it could let SOBR admins schedule when the PutObjectRetention calls should run; or, just as Veeam has a registry key to limit the number of objects deleted per request, add a key to limit the number of s3.PutObjectRetention requests per second on the proxy.

I'm pretty sure I'm not the only sysadmin hitting these headaches with S3 storage... now I know why Dell discontinued ECS...

Regards,
Manuel
Service Provider | VMCE

Re: s3.PutObjectRetention millions request per hour.

Post by Gostev »

This is not about S3 storage in general, just the one you're testing (MinIO). Every object storage has different architecture as vendors prioritize different use cases.

Some vendors put a big focus on performance and scale with large numbers of objects, and we even use a couple of such vendors in our performance testing labs.

Ironically, Dell ECS was actually one of the best in the early days. I didn't know it was discontinued.

Re: s3.PutObjectRetention millions request per hour.

Post by edh »

Yeah, we tested Dell ECS too, but they use PostgreSQL to store metadata; not bad, but it doesn't scale up to billions of entries :(. The main problem with MinIO is that it doesn't use OS buffers, just direct I/O, and it doesn't support LVM cache or other cache layers... And their answer is always: fork the project and code whatever you need :)
Service Provider | VMCE

Re: s3.PutObjectRetention millions request per hour.

Post by Gostev »

LOL!! This made my day.
karsten123
Service Provider
Posts: 575
Liked: 141 times
Joined: Apr 03, 2019 6:53 am
Full Name: Karsten Meja
Contact:

Re: s3.PutObjectRetention millions request per hour.

Post by karsten123 »

What about using RAID 0 per disk to make use of the controller cache?

Re: s3.PutObjectRetention millions request per hour.

Post by edh » 1 person likes this post

They have a warning in their documentation about not using intermediate caches or RAID controllers, typical warnings like Ceph's... that's why we didn't try RAID 0. Our servers have 8 GB of cache, which maybe isn't much, but it would absorb those 254-byte requests pretty easily... not today, but maybe some other day we'll test with some Dell demo servers. For now we've finished our new deployment to solve this issue, and we're getting 8 GBps of incoming traffic with 0.3 I/O wait... Veeam Transport + Linux + XFS + RAID 60 works like a charm.
Service Provider | VMCE

Re: s3.PutObjectRetention millions request per hour.

Post by Gostev » 2 people like this post

edh wrote: Jan 08, 2025 5:51 pm: "Veeam Transport + Linux + XFS + RAID 60 works like a charm."
https://www.youtube.com/watch?v=Fs72G3fIlog