Host-based backup of KVM-based VMs (Red Hat Virtualization, Oracle Linux Virtualization Manager and Proxmox VE)
crowsprofiles
Novice
Posts: 8
Liked: 1 time
Joined: Jul 21, 2020 8:48 am
Contact:

Proxmox infrastructure no longer visible or available

Post by crowsprofiles »

Hi,

I have a weird situation. We're experimenting with VBR (12.2.0.334) and a Proxmox cluster (8.2.4). We set up VBR, added the Proxmox nodes and added workers on all the nodes. We ran some test jobs of small VMs to the local repository of the VBR server, and they worked fine.

We then added a hardened Linux repository as a test, and all looked fine. But when we started another job, this time with a bigger VM to see how it would handle it, the job started and the VM size was reported, but then the job failed with "The remote certificate was rejected by the provided RemoteCertificateValidationCallback". We tried searching the forum but didn't find anything relevant. We generated a new certificate, but it didn't help.

We then went into the infrastructure, to see about identifying another vm to test with. All the Proxmox nodes were listed under the cluster, but no VMs and nothing refreshing.

At this point, we restarted VBR. Now the Proxmox infrastructure, with the cluster and all nodes, has disappeared. When pressing Add server under Virtual Infrastructure, the options are VMware, Hyper-V, Red Hat and Oracle. It's like VBR just forgot about Proxmox (and Nutanix).

We'll do more testing and then probably reinstall the VBR server to see if we can work around it or even replicate it. We've tried searching for "Proxmox gone", "Proxmox no longer there", "not available" etc., but didn't find any other reports.

Anyone seen anything like it?
rovshan.pashayev
Veeam Software
Posts: 443
Liked: 94 times
Joined: Jul 03, 2023 12:44 pm
Full Name: Rovshan Pashayev
Location: Czechia
Contact:

Re: Proxmox infrastructure no longer visible or available

Post by rovshan.pashayev »

Hello,

That sounds like a technical issue. Please provide a support case ID for this issue, as requested when you click New Topic.
PS: Support can only help if you upload logs: https://www.veeam.com/kb1832

Regarding the disappearance of the Proxmox infrastructure, please also check this KB: https://www.veeam.com/kb4687
Rovshan Pashayev
Analyst
Veeam Agent for Linux, Mac, AIX & Solaris

Re: Proxmox infrastructure no longer visible or available

Post by crowsprofiles »

Hi!

Thanks - you were right, it had something to do with the plugins. After another restart only VMware and Hyper-V were left; all the other options were missing. That KB article helped us troubleshoot the issue further.

The underlying reason for the plugins failing was different from the one in the KB, though. It was caused by issues with the Proxmox node certificates. The certificates are issued by a trusted internal CA on our airgapped network. We had initially forgotten to add the trusted root certificate for the internal CA to the VBR server. We're assuming that the certificates were initially accepted when the nodes were added manually (this seems confirmed by further testing, see below).

Secondly, our certificates have a very short lifetime and are renewed every couple of days. It seems that only the initial certificates, the ones in use when the nodes were added, were trusted without a CRL. When these were automatically replaced with renewed certificates, VBR panicked: the plugins failed, the nodes were removed from the console, and the plugins were prevented from starting.
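
For anyone who wants to confirm that a node's certificate has been swapped since it was added, here is a minimal Python sketch. It is only an illustration and not anything Veeam provides: it assumes the node answers on the default Proxmox VE web/API port 8006, and the function names are made up for this example. The thumbprint is the SHA-1 hash of the DER-encoded certificate, which is what Windows displays as the thumbprint.

```python
import hashlib
import ssl


def normalize(thumbprint: str) -> str:
    """Strip spaces and colons and uppercase, so copied thumbprints compare equal."""
    return thumbprint.replace(" ", "").replace(":", "").upper()


def thumbprint_from_pem(pem: str) -> str:
    """Return the SHA-1 thumbprint (uppercase hex) of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha1(der).hexdigest().upper()


def current_thumbprint(host: str, port: int = 8006) -> str:
    """Fetch the certificate the node presents right now.

    Assumption: 8006 is the node's web/API port (the Proxmox VE default).
    The certificate is fetched without validation, only to hash it.
    """
    pem = ssl.get_server_certificate((host, port))
    return thumbprint_from_pem(pem)


def was_renewed(known_thumbprint: str, host: str, port: int = 8006) -> bool:
    """True if the node now presents a different certificate than the known one."""
    return normalize(known_thumbprint) != current_thumbprint(host, port)
```

Calling `was_renewed("<thumbprint noted when the node was added>", "<node hostname>")` after a renewal cycle should return True if the node is now presenting a new certificate.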

The trusted CA was added to the VBR server, allowing the certificates to be trusted in Windows. This brought back the nodes in the console, and the plugins are now starting, allowing new servers to be added, including Proxmox etc.

However, the nodes are still not working and the logs keep recording errors. No VMs became visible under the nodes. Trying to rescan a node in the console gives the error "Failed to rescan the Proxmox VE server.".

This led us to the second issue. Even with certificates and certificate chains trusted by Windows, VBR also requires the certificates to have a designated CRL. Our certificates do not have a CRL (our network is airgapped, and we have an internal CA that issues certificates with a very limited lifetime). So the VBR console would still not populate the nodes with VMs or allow any backups to proceed. Note that the certificates are trusted in Windows, Edge, Chrome etc. without a CRL.
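
To check whether a node certificate carries a CRL distribution point at all, something like the following Python sketch can help. It is an illustration under assumptions, not a description of how VBR performs its check: the port 8006 default, the `cafile` path to the internal root CA and the function names are all hypothetical choices for this example. Python only returns the decoded certificate fields when validation succeeds, so the internal root CA has to be supplied.

```python
import socket
import ssl
from typing import Optional


def has_crl_distribution_point(cert: dict) -> bool:
    """True if the decoded certificate lists at least one CRL URL."""
    return bool(cert.get("crlDistributionPoints"))


def fetch_decoded_cert(host: str, port: int = 8006,
                       cafile: Optional[str] = None) -> dict:
    """Connect to the node and return its certificate as a decoded dict.

    Assumptions: 8006 is the node's web/API port, and `cafile` points to the
    internal root CA in PEM format so that the chain validates.
    """
    context = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

`has_crl_distribution_point(fetch_decoded_cert("<node hostname>", cafile="<path to root CA PEM>"))` returning False would match the situation described above: a chain that validates, but with no CRL published for the certificate.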

For reference and searches, the Veeam.PVE.Platform.Svc.log contains repeated instances of:

[CertificateValidationUtils]: Certificate summary information: Certificate errors: None 
Certificate chain errors:
[CertificateValidationUtils]: Remote certificate <thumbprint> has the following validation issues
[CertificateValidationUtils]: The revocation function was unable to check revocation for the certificate.

This causes the rescan and other connections to fail.
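
To see how often this occurs, the log can be scanned for the two markers quoted above. This is a minimal Python sketch: the marker strings are taken from the excerpt in this post, while the function name and the idea of counting per marker are assumptions made for illustration. The log location is not stated in this thread, so the path is left to the caller.

```python
from collections import Counter
from typing import Iterable

# Strings copied from the log excerpt above.
MARKER = "[CertificateValidationUtils]"
REVOCATION_MESSAGE = "The revocation function was unable to check revocation"


def summarize(lines: Iterable[str]) -> Counter:
    """Count certificate validation lines and revocation check failures."""
    counts: Counter = Counter()
    for line in lines:
        if MARKER not in line:
            continue  # not written by the certificate validation component
        counts["validation_lines"] += 1
        if REVOCATION_MESSAGE in line:
            counts["revocation_check_failed"] += 1
    return counts
```

To use it, open Veeam.PVE.Platform.Svc.log with `open(path, encoding="utf-8", errors="replace")` and pass the file object to `summarize`. A growing `revocation_check_failed` count after each renewal would point to the same CRL issue.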

Selecting Properties on the node and proceeding past the Credentials page gives a Certificate Security Alert, where it's possible to view the certificate, continue or cancel. View shows the certificate as trusted by Windows. If we continue and then apply the snapshot storage settings, the console refreshes the node and the VMs are listed again. This must be the same behavior as when the nodes were initially added, with the certificates being put on a trusted list (but it is unfortunately not feasible for us to do this for each node every time the certificate is renewed).

After doing this, it is possible to run a rescan and work with the node again without issue. However, forcing a certificate renewal on a node that has gone through the workaround steps above makes it untrusted again, and a rescan gives the same "Failed to rescan the Proxmox VE server." error.

Based on this, we have two Veeam requests.
1. More graceful handling of certificate issues with a node. It would be better if the user were notified that there is a certificate issue, instead of the plugins dying, the nodes disappearing and the Add server dialog being almost empty. :)
2. Allow trusted certificates, with a complete certificate chain but without a CRL.

On our side, before thinking about taking this into production, we will have to initiate a project to rebuild our internal airgapped CA to provide CRLs going forward. For us, it means that the Veeam/Proxmox project will have to be postponed until after the CA upgrade project.

Re: Proxmox infrastructure no longer visible or available

Post by rovshan.pashayev »

Hello,

Thank you for your detailed explanation of the case. We will review your requests.

Re: Proxmox infrastructure no longer visible or available

Post by rovshan.pashayev »

Hello @crowsprofiles,

Could you please open a support case and upload the logs so we can review your scenario? This would be very helpful for us.

Even a free case would be sufficient. Once you open the case, kindly share the case number here.