Veeam and Ceph: a real cool story

Dec 07, 2022 10:29 am

Hi everyone, I'm Fabio. I have been working with Veeam and Ceph for about two weeks, and I found something interesting that I suppose could interest everyone.

Let me start from the beginning!

I have a 2.7 PB Ceph cluster (Pacific release) that exposes object storage through the integrated RadosGW, so I can use it as the offload storage for Veeam's backup jobs.
As the radosgw frontend I tried both civetweb and beast, with the same results.

I'm using the immutability API function. At the very beginning I recompiled Ceph to fix the issue with the timestamp format myself, but my merge request is not the one the community integrated into the official Ceph release. Same issue, similar solutions, but mine was maybe not so elegant.

Almost everything works fine, even though Ceph and its S3 API implementation are not certified by Veeam, apart from one issue that Veeam support could not solve and that I'm trying to solve by myself.

Let me explain better:
- offload to S3 works great, so I can put data in buckets with the right expiration date
- listing files in buckets works great, so I can recover data from buckets

The only problem is related to the multiple object delete request that Veeam sends at the end of the job. If I run the job manually, it can delete (better: tag for deletion) the oldest files, but when the scheduler runs the same job, it can't delete the oldest files because of an "unknown error". During my investigation I found that the "delete" operation is an API PUT request that contains a list of at most 1000 paths (the Veeam default) to mark as deletable after the immutability has expired.

I tried with fewer files in the bulk request (playing with Veeam's registry keys) but the error is always the same: unknown error (and the same timeout).
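For readers who want to reproduce the batching outside Veeam, here is a minimal Python sketch of how a multi-object delete payload could be built and split into smaller batches. It is an illustration under assumptions, not Veeam's or Ceph's implementation: the function names `batched` and `build_delete_body` and the sample keys are hypothetical, the XML layout follows the S3 multi-object delete request as AWS documents it (where the call is a POST to the bucket's `?delete` subresource), and the `xmlns` attribute is omitted for brevity. The block only builds request bodies; signing and sending them to radosgw is left out.

```python
from typing import Iterable, Iterator, List
from xml.etree import ElementTree as ET

# Upper bound per request: 1000 keys, the figure mentioned in the post
# and the limit of the S3 multi-object delete call.
MAX_KEYS_PER_REQUEST = 1000


def batched(keys: List[str], size: int = MAX_KEYS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of `keys`, each holding at most `size` entries."""
    if not 1 <= size <= MAX_KEYS_PER_REQUEST:
        raise ValueError(f"size must be between 1 and {MAX_KEYS_PER_REQUEST}")
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def build_delete_body(keys: Iterable[str], quiet: bool = True) -> bytes:
    """Serialize one batch of keys as a <Delete> XML document."""
    root = ET.Element("Delete")
    if quiet:
        # Quiet mode asks the server to report only the keys that failed.
        ET.SubElement(root, "Quiet").text = "true"
    for key in keys:
        obj = ET.SubElement(root, "Object")
        ET.SubElement(obj, "Key").text = key
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical object names, for illustration only.
    sample = [f"backups/block-{i:05d}" for i in range(2500)]
    for batch in batched(sample, size=500):
        body = build_delete_body(batch)
        print(len(batch), "keys,", len(body), "bytes")
```

Replaying such bodies against a test bucket with decreasing batch sizes might help show whether the failure depends on the request size or on the gateway itself.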

I know that this is a Ceph issue (better: a rados gateway issue) and I'm pretty sure that someone has hit the same issue and maybe solved it a long time ago. Well, I'm here to ask for your help.

I can produce a lot of logs and metrics, but I think the issue is related to some settings of beast (or civetweb, but I'm using beast at the moment).

Thanks to everyone, and sorry for my bad English.

Fabio

Post by **sfirmes** » Dec 07, 2022 1:15 pm

@fabio.pasetti welcome to the forums and thanks for the question.

I have tested Veeam working with Ceph Octopus, Pacific, and now Quincy (stable). I have used the "official" releases of Ceph running on Ubuntu.

We also have several alliance partners whose solutions use Ceph with Veeam and who have passed the Veeam Ready Object and Veeam Ready Object with Immutability testing. The Veeam Ready Object testing includes the deletion of over 4 million objects from the object storage target, so what you are trying to do should work without issues.

What version of VBR are you using?

Also, you mentioned you contacted Veeam Support. Do you have a case number that we can look at?

Dec 07, 2022 1:26 pm

Hello Fabio.
Are you running Ceph 14.2.22 or later, 15.2.14 or later, or 16.2.5 or later?
There is also a bug related to the wrong date format used by default in Ceph:
https://tracker.ceph.com/issues/51327

Post by **fabio.pasetti** » Dec 07, 2022 1:49 pm

Hi @sfirmes thank you for your reply!

Oh, you wrote an interesting thing about the tests! You're right: if there is a tool to test the storage, I can use it. I don't know how I can obtain that test, but I'm writing to my contact at Veeam to ask whether it's possible to test my Ceph with them.

About the case, it's #05630213 (it's closed right now because my colleague preferred to start from scratch in another case) and it contains almost the entire story.

Hi @Andreas Neufert, I'm using Ceph 16.2.10, and the bug related to the wrong date format is already fixed in my environment. I'm sure that the issue is related to something else, but thank you very much for the link!

Thank you!

Dec 07, 2022 2:21 pm

@fabio.pasetti the testing I referred to is the Veeam Ready Program, which is offered to our Technical Alliance Partners.

I am doing some testing today with v11 and v12 beta3 using the Quincy build (17.2.5) and will let you know if I see the same issue you are encountering. I only have a 4TB cluster, but that will be enough to test the deletion of millions of objects both manually and automatically.

If I have any issues, I will update this thread.

Post by **fabio.pasetti** » Dec 07, 2022 2:30 pm

Great, thank you very much Steve! If you need to know something else about my setup, I can send you the radosgw config or anything else.

Thank you a lot

Post by **fabio.pasetti** » Dec 13, 2022 8:34 am

Hi everyone,
just an update with some considerations:

- I'm looking for errors in the radosgw debug logs, and I can see that the "beast" frontend sometimes crashes and the requests disappear with the process

- the OSDs didn't report errors

I suppose that my issue is related to some modifications I made to the radosgw to enhance performance, so starting today I'm moving half of my radosgw instances back to the default configuration so that we can use them tonight.

Thank you,
Fabio

Post by **sfirmes** » Dec 13, 2022 1:56 pm

Thanks for the update @fabio.pasetti. My setups always use the base code that I get from the Linux distros, and so far they have been solid.

Looking forward to your next update.

Post by **randyodonnell** » Mar 26, 2023 8:56 pm

Sorry to jump on this thread, but we are having the exact same issue with Veeam 12 (RTM) and Ceph Quincy 17.2.5. Everything seems to go fine until it needs to clean up restore points, and then it throws errors on the multiple object delete. Was it ever figured out what causes this?

Thanks,
Randy

Post by **Mildur** » Mar 27, 2023 6:46 am

Hi Randy

Welcome to the forum.

The case mentioned above was closed, so I can't see what the solution was.
Please open your own case if you want to have it analyzed.

Best,
Fabian
