Discussions related to using object storage as a backup target.
mcz
Veteran
Posts: 948
Liked: 223 times
Joined: Jul 19, 2016 8:39 am
Full Name: Michael
Location: Rheintal, Austria
Contact:

object storage integration needs to become more resilient

Post by mcz »

Hello everybody,

don't get me wrong: I love the object storage integration and we have been using it since day one, when Veeam integrated it, but I've had too many issues over the last weeks and months and they should be fixed / wiped out in future releases. I don't want to post endless stories about what happened and why, so I'll just sum it up briefly, and if someone has questions, I will answer them later.

What I've observed so far:
  • rescans can take days (fix should be released in v10)
  • downloading files (vm export) can also take a huge amount of time (I'm not talking about the bandwidth)
  • Doing a replica failback with the option where the original vm isn't present anymore messes everything up - offloaded restore points will become unavailable
  • rescan won't rebuild local metadata if it isn't on prem anymore - next restore would fail
  • changing vcenter while preserving morefid of VMs deletes offloaded restore points like in case 03971889
I've already posted about almost every one of these topics here on the forums, but the last point is something I'd like to elaborate on a little more. The support engineer wrote me the following:
I've managed to reproduce the issue in my lab. Here's a quick example on how to do that so that you'll be aware of its gist:
1. Back up a VM from a vCenter added using its DNS name;
2. Offload it;
3. Restore will be working OK at this point;
4. Remove that VM from the backup job, add its copy from the same vCenter added using its IP address and back it up;
5. Offload it;
6. Restore from the same restore point used in step #3 is not working anymore.

To conclude, the very first offload of the new object is deleting data on the old object in the cloud.
As I said, I've had many, many cases over the last months and I just have the feeling that there are too many bugs/issues at the moment. As an admin, you don't get warnings or error messages when something on the offload side goes wrong or Veeam decides to delete archives, just like in our case. The dangerous thing here is that you won't realize it until you do a restore and get the error message. Please, please change this in the future so that one can rely on offloaded backups - at the moment I do not really trust the console output...
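
Since the console does not flag this, one way to notice silent deletions before a restore is to compare object-key listings taken before and after an offload. Below is a minimal Python sketch, assuming the listings are already available as plain lists of keys. The function name `missing_keys` and the key names are made up for illustration and are not Veeam's actual bucket layout.

```python
from typing import Iterable, Set


def missing_keys(before: Iterable[str], after: Iterable[str], prefix: str = "") -> Set[str]:
    """Return keys under `prefix` that existed before but are gone afterwards."""
    before_set = {key for key in before if key.startswith(prefix)}
    return before_set - set(after)


# Hypothetical listings taken before and after an offload run.
before = ["vm-a/point1.blk", "vm-a/point2.blk", "vm-b/point1.blk"]
after = ["vm-a/point2.blk", "vm-b/point1.blk", "vm-b/point2.blk"]

lost = missing_keys(before, after, prefix="vm-a/")
print(sorted(lost))  # ['vm-a/point1.blk']
```

Retention also removes objects legitimately, so a non-empty result is only a prompt to investigate, not proof of a bug.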

Maybe the points I've listed sound quite silly or easy, but I tell you I've worked days and weeks with the engineers and it wasn't just "ok, let's do a db update and everything runs well"; it was often very, very time-consuming and painful (when you realize that backups have been deleted).

Don't get me wrong, I don't want to point my finger at anybody or the product - bugs and failures happen, no question. All I'd like to achieve is to show the Veeam product managers that this area needs to be improved - if they don't know it yet.

Thanks!
veremin
Product Manager
Posts: 20736
Liked: 2403 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: object storage integration needs to become more resilient

Post by veremin »

mcz wrote: rescans can take days (fix should be released in v10)
OK, this is something v10 addresses.
mcz wrote: downloading files (vm export) can also take a huge amount of time (I'm not talking about the bandwidth)
Are you talking about the Export Backup functionality? If so, it's not directly related to our object storage integration and should be discussed in a separate thread.
mcz wrote: changing vcenter while preserving morefid of VMs deletes offloaded restore points like in case 03971889
mcz wrote: Doing a replica failback with the option where the original vm isn't present anymore messes everything up - offloaded restore points will become unavailable
These seem like similar issues - as soon as the original VMs are no longer processed (and both vCenter migration and failback to a different location make the backup server track VMs as new ones), offloaded restore points become unavailable. We will investigate this more carefully internally and see whether we can improve the current experience.
mcz wrote: rescan won't rebuild local metadata if it isn't on prem anymore - next restore would fail
Not sure whether I follow you on this - if the performance extents are unavailable, where exactly do you want to download the metadata to?

Thanks!
Gostev
Chief Product Officer
Posts: 32761
Liked: 7971 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: object storage integration needs to become more resilient

Post by Gostev »

I am also confused about "replica failback" and how it is related to object storage integration, which is for backups only.

Re: object storage integration needs to become more resilient

Post by mcz »

Hi Vladimir,
veremin wrote: Not sure whether I follow you on this - if the performance extents are unavailable, where exactly do you want to download the metadata to?
Yes, you did follow on this: object-storage-f52/is-offloaded-backup- ... 64390.html
I just wanted to add it to the list; as you can see, it's already marked as a future request.
veremin wrote: These seem like similar issues - as soon as the original VMs are no longer processed (and both vCenter migration and failback to a different location make the backup server track VMs as new ones), offloaded restore points become unavailable. We will investigate this more carefully internally and see whether we can improve the current experience.
When you say "restore points become unavailable", do you really mean unavailable in terms of console output (backup properties)? In my case, the restore point wasn't shown as unavailable, and we only noticed the disaster during the restore...

One thing I forgot to mention is that the "folder" name on S3 is equal to a hash in the database for the VM. In one case we had two database objects for one VM (old and new vCenter), but the hash itself was identical, which means offloads would have been written to the same directory. Not sure if this is intended, but the support engineer made a DB update to prevent any collision...
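
To illustrate the kind of collision described here, a toy Python sketch follows. It is an assumption for illustration only and not Veeam's actual hashing scheme: `folder_key`, `folder_key_scoped`, and the vCenter and moRef values are hypothetical. It shows that a key derived from the moRef alone is identical for both database objects, while a key that also includes the managing vCenter keeps them apart.

```python
import hashlib


def folder_key(moref: str) -> str:
    """Hypothetical folder name derived from the VM reference alone."""
    return hashlib.sha256(moref.encode("utf-8")).hexdigest()[:16]


def folder_key_scoped(vcenter_id: str, moref: str) -> str:
    """Same idea, but the managing vCenter is part of the hashed input."""
    return hashlib.sha256(f"{vcenter_id}|{moref}".encode("utf-8")).hexdigest()[:16]


# Two database objects for the same VM: the moRef was preserved,
# only the vCenter changed (made-up identifiers).
old = ("vcenter-old", "vm-1234")
new = ("vcenter-new", "vm-1234")

# Keyed by moRef only, both objects map to the same folder.
collides = folder_key(old[1]) == folder_key(new[1])

# Keyed by vCenter plus moRef, the folders differ.
scoped_collides = folder_key_scoped(*old) == folder_key_scoped(*new)

print(collides, scoped_collides)
```

Whether sharing one folder is intended (for example, to continue an existing chain) is exactly the open question above; the sketch only shows why two records can end up pointing at the same place.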

Re: object storage integration needs to become more resilient

Post by mcz »

Gostev wrote: Feb 17, 2020 1:49 pm I am also confused about "replica failback", and how is it related to object storage integration, which is for backups only.
Yeah correct, sounds misleading - please have a look at this thread:

vmware-vsphere-f24/keeping-same-morefid ... ml#p341748

Re: object storage integration needs to become more resilient

Post by veremin »

mcz wrote: changing vcenter while preserving morefid of VMs deletes offloaded restore points like in case 03971889
mcz wrote: Doing a replica failback with the option where the original vm isn't present anymore messes everything up - offloaded restore points will become unavailable
The QA team has just checked a similar scenario and found several issues related to it. We're planning to fix them in the first update for version 10.

So, thank you for raising it, much appreciated!

Re: object storage integration needs to become more resilient

Post by mcz »

Sounds good, Vladimir, thank you! I'm a little bit curious: did they test these scenarios anyway, or was it due to my post/request?

Re: object storage integration needs to become more resilient

Post by veremin »

Due to your post, I asked QC to re-test this and similar scenarios - and they spotted some issues there. So, should you think about switching to Veeam QA team, let me know - apparently you're quite good at finding scenarios with problems :)

Thanks!

Re: object storage integration needs to become more resilient

Post by mcz »

Thanks Vladimir for this fantastic cooperation! And thanks for the job offer ;)

Re: object storage integration needs to become more resilient

Post by veremin »

You're welcome - we will keep it open for a while, so you have time to decide :)

The issue is being investigated at the moment - does not look that good, actually, but at least it occurs only under very specific conditions. Anyway, we're planning to fix it in the first v10 update.

Thanks again!