Host-based backup of oVirt KVM-based VMs (Red Hat Virtualization, Oracle Linux KVM)
mdws
Influencer
Posts: 17
Liked: never
Joined: Nov 03, 2011 6:14 pm
Contact:

My Actual Summary

Post by mdws »

Over the last few months I ran some tests in a small environment (single RHV host, 10 VMs, VBR on Windows Server 2019, repository on CentOS with XFS).
All VMs reside on block storage (FC).
So far, here are my experiences with Veeam for RHV:

- Backing up VMs with CBT enabled requires twice the disk space for as long as the backup is running
- Backing up VMs without CBT enabled freezes the VM as soon as the checkpoint is full (I think the size is fixed at 1 or 10 GB)
- When the oVirt storage domain is not big enough to create the checkpoints, the backups fail but their entries remain in the Engine's PostgreSQL database; all subsequent backups fail until you delete those entries manually from the Engine database

The points above are under Development for 4.5 and I hope they will be fixed soon.

- The backup speed is good, even without CBT enabled
- The restore speed is much too slow (about 500 Mbit/s in a single restore session; it scales with parallel restores) -> this should be solved by Veeam
- If the VBR server fails during a backup, the backup session stays stuck forever in state "Starting" on the VBR server. This causes future jobs to fail. To work around it you need to create a new backup job (but the old backups cannot be deleted because they are locked), or you can save and restore the VBR configuration -> there needs to be a better solution for handling such situations

Nice to have would be:

- Instant recovery (NFS should be no problem with RHV)
- Recovery of VMware / Hyper-V VMs to oVirt (for migration scenarios)
- More than one proxy (by design, the proxy or the host the proxy is running on will be a bottleneck)
- Support for a proxy on bare metal (then you could use one machine with high bandwidth, 40 or 100 Gbit, and eliminate the proxy / proxy host bottleneck)

I think the restore speed is the showstopper for now; the other issues can be worked around.
HannesK
Product Manager
Posts: 14287
Liked: 2877 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: My Actual Summary

Post by HannesK »

Hello,
and welcome to the forums.

Thanks for your feedback. For the issues on the VBR side, we will see what we can do (I might come back and ask you for a support case and logs for the restore performance).

For the feature requests: they make sense, and let's see when things can be implemented.

Best regards,
Hannes

Re: My Actual Summary

Post by HannesK »

Hello,
could you please open a support case on the performance issue and post the case number here? From a software perspective, we expect at least around 200-300MByte/s on a single restore.

So the question is whether it is an infrastructure issue or a software issue that we would need to fix.
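Note that the two figures in this thread use different units (Mbit/s for the observed rate, MByte/s for the expected one). A minimal sketch of the conversion, assuming decimal units and 8 bits per byte; the function and variable names are only illustrative:

```python
def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert Mbit/s to MByte/s (decimal units, 8 bits per byte)."""
    return mbit_per_s / 8.0


# Single restore session as reported earlier in the thread: about 500 Mbit/s.
observed = mbit_to_mbyte(500)

# Expected range for a single restore, as stated above: 200-300 MByte/s.
expected_low, expected_high = 200.0, 300.0

# How many times higher the expectation is than the observed rate.
gap_low = expected_low / observed
gap_high = expected_high / observed

print(f"observed: {observed} MByte/s")
print(f"expected is {gap_low:.1f}x to {gap_high:.1f}x higher")
```

Under these assumptions the observed single-session rate works out to 62.5 MByte/s, roughly a third to a fifth of the expected range.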

Thanks,
Hannes

Re: My Actual Summary

Post by mdws »

Hello,

I'm using VBR Community Edition. Can I open a case regardless of the edition used?

Re: My Actual Summary

Post by HannesK »

Hello,
normally you need a valid support contract for VBR. But in this case we would make an exception to see whether it's something we need to take care of. Could you maybe also tell us what kind of fibre channel storage you have (vendor, model, number of disks / SSDs), to get an idea of what the storage should be capable of?

And what is the repository server? A physical machine with what kind of storage attached?

Once the case number is posted, I will take care of the case. Please ensure that logs are included from the RHV proxy and from VBR (full logs is the easiest way).

Thanks,
Hannes

Re: My Actual Summary

Post by mdws »

Hello,

the case is #05244819.
I uploaded the logs from the proxy and from the VBR server + repository.
For testing I created two new VMs, VM1 and VM2, after upgrading to oVirt 4.4.10.
I restored VM1 and took screenshots of the repository server's network usage + top, and of iotop on the node.
Then I restored VM1 and VM2 at the same time and took the same screenshots.

The node does not really have an FC HBA; it is a SAS HBA (ServeRAID M5110) with 6 Samsung EVO SSDs in a RAID 5. For oVirt this is "FC" storage.
The ServeRAID M5110 has 2 GB write cache enabled and is capable of writing ~1 GB/s.
The proxy is running as a VM on this node.
For Engine, VBR server and repository there is one physical PC:

Intel(R) Core(TM) i5-4210U
8 GB Ram
Hoodisk SSD 128GB
Samsung SSD 870 2TB
Intel Corporation I211 Gigabit Network Connection

The OS on this hardware is CentOS Stream 8, with oVirt Engine installed.
The VBR server is a Windows Server 2019 KVM VM running on the same hardware.
The OS and the VBR VM are both installed on the Hoodisk; the Samsung SSD is formatted with XFS and is used only as the VBR repository.
The link between this hardware and the node is 1 Gbit/s.

I know the hardware for the Engine / VBR server is a little undersized, but if you check the screenshots + logs you will see that there is no bottleneck (as far as I can see).

Re: My Actual Summary

Post by mdws »

Hello,

the backup job for VM1 + VM2 is BU07. The disks for VM1 + VM2 are 20 GB preallocated QCOW2 disks with incremental backups enabled; the OS is CentOS Stream 8.
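For a sense of scale, here is a rough sketch of how long one of these 20 GB disks would take to transfer at the roughly 500 Mbit/s seen in a single restore session, and at the full rate of the 1 Gbit/s link between the repository and the node. It assumes decimal units, that the whole preallocated disk is transferred, and it ignores protocol overhead as well as any compression or deduplication on the wire, so the real numbers may differ. The function name is only illustrative.

```python
def transfer_seconds(size_gb: float, rate_mbit_per_s: float) -> float:
    """Seconds to move size_gb (decimal GB) at rate_mbit_per_s, ignoring overhead."""
    size_mbit = size_gb * 1000 * 8  # GB -> MB -> Mbit
    return size_mbit / rate_mbit_per_s


disk_gb = 20  # one test VM disk, as described above

at_observed = transfer_seconds(disk_gb, 500)    # single restore session
at_line_rate = transfer_seconds(disk_gb, 1000)  # 1 Gbit/s link fully used
link_ceiling_mbyte = 1000 / 8                   # link rate in MByte/s

print(f"at 500 Mbit/s: {at_observed:.0f} s")
print(f"at 1 Gbit/s:   {at_line_rate:.0f} s")
print(f"link ceiling:  {link_ceiling_mbyte} MByte/s")
```

Under these assumptions a single restore uses about half of the link, and the link itself tops out at 125 MByte/s of raw traffic.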

Re: My Actual Summary

Post by HannesK »

Hello,
thanks for the logs. While it is an unsupported environment (we currently only support RHEL), we will have a look at it.

Best regards,
Hannes

Re: My Actual Summary

Post by mdws »

Thanks,

we plan to use RHV in production in the future (currently we use vSphere) as soon as a backup solution is available.
Until then we can only try to help improve the product in a test environment.
As we don't want to pay for a subscription for the test environment, we will stick with oVirt until we go into production.
I hope that's OK for testing and feedback for Veeam RHV.