Host-based backup of KVM-based VMs (Red Hat Virtualization, Oracle Linux Virtualization Manager and Proxmox VE)
tgx
Enthusiast
Posts: 57
Liked: 62 times
Joined: Feb 11, 2019 6:17 pm
Contact:

After successful Proxmox Worker deployment cannot connect errors

Post by tgx »

I was able to successfully deploy the Proxmox worker to my Proxmox 8.2.2
environment. When I attempt to do anything with the worker, communication with
the worker fails. I note the worker does power up on command and shuts down so
apparently Veeam can sometimes communicate with it.

I note I am unable to ping the worker from any machine. The worker is resolvable via
DNS both by hostname and FQDN.

As I cannot log into the worker (tried the combination of hostname password to no avail),
I am unable to troubleshoot from that side of the equation.

Case opened: Case #07453532
rovshan.pashayev
Veeam Software
Posts: 568
Liked: 113 times
Joined: Jul 03, 2023 12:44 pm
Full Name: Rovshan Pashayev
Location: Czechia
Contact:

Post by rovshan.pashayev »

Hello,

Since you have already submitted the case, please wait for the support team to contact you.
Rovshan Pashayev
Analyst
Veeam Agent for Linux, Mac, AIX & Solaris
PTide
Product Manager
Posts: 6575
Liked: 772 times
Joined: May 19, 2015 1:46 pm
Contact:

Post by PTide »

Hi,
tgx wrote: When I attempt to do anything with the worker

For example? By 'do anything', do you mean in the VBR UI, or via SSH login?

Thanks!

Post by tgx »

"Do anything"

1. Run a 'Test Worker' from VBR GUI. It stops on 'Obtaining an IP address for the worker VM.'

2. I cannot log in to the console of the worker from the Proxmox GUI because there is no known user/password combination.
I am able to deploy a worker, but I cannot use it in any fashion.

3. I created a backup job for a VM on the Proxmox node but as you would expect it fails.

What I believe is happening is that I have assigned a static IP and DNS servers to the worker and, for whatever reason, those IPs are ignored. Thus it
sits on 'Obtaining an IP address for the worker VM.' Why do I believe this? I have seen many different Linux variants ignore or have major issues
with static IPs, as it is assumed that everyone uses DHCP, so static configuration never gets tested. I could be wrong of course, but that would be my first guess.

Veeam has basically told me to contact Proxmox. Always nice being 'monkey in the middle'

Post by PTide »

tgx wrote: I have seen many different Linux variants ignore or have major issues
with static IP's as it is assumed that everyone uses DHCP so it never gets tested.

How long does the 'Obtaining an IP address for the worker VM.' stage run? Does it fail eventually?

We did test static IPs, and they are supported. Have you tried assigning a DHCP address? Does that work?

What happens if you start the worker manually in the PVE UI and wait? Does the PVE UI show the worker's IP?

Note that you need to wait for a while (sometimes up to 10 minutes) to let the worker finish testing.
The worker is not considered ready to work until the testing is done.

The fact that VBR is able to shut down and start the worker does not mean that VBR is able to communicate with the VM, as those operations are carried out via the Proxmox API.

Thanks!

Post by tgx »

Yes. It fails after a lonnnngg time. Like 5 minutes.
There are no DHCP servers on the network so that makes testing DHCP difficult.

The PVE does not show an IP address. It says, "No Network Information".
If I click on "More", I can see

lo MAC Address and Loopback information and IP.

For eth0, all I see is MAC address.

Yes, I understand now how it communicates with the Proxmox server.
FWIW, I have another guest VM (different OS) on the Proxmox node which
has no communication issues. That is the VM I am trying to back up with Veeam.

Post by tgx »

Here's an observation. I nuked the worker VM on Proxmox. Then I removed it from Veeam.
When I went to add the worker back, I noticed that the IP address I assigned to the worker appeared in the
Proxmox GUI for the VM. I had never seen that happen before on previous attempts. Also the line:
"The following IP address was successfully obtained:", I had never seen.

Here is the most recent chain that I saw in the Veeam GUI:

Testing the worker mytest
The worker VM was deployed successfully
Configuration settings of the worker mytest were synchronized successfully
The worker VM was powered on successfully on the host myhost
The following IP address was successfully obtained: <ipaddress redacted>
Connection to worker service was established successfully.
Updates are Available
The snapshot was created successfully
Updates were installed successfully
The snapshot was deleted successfully
Connection to worker service was established successfully.
Connection to worker core service was established successfully.
Connection between the worker and backup server was established successfully.
Failed to establish connection between the worker and cluster: Unknown socket error

Post by PTide »

OK, that means the first deployment of the worker was faulty for some reason.
The good thing is that on the second deployment the worker was assigned the IP address, and it seems that VBR was able to update the worker.

However, there is now a problem in communication between the worker and the cluster.

Please wait for support to reach out to you in the same support case.

Thanks!

Post by tgx »

I thought I would fill everyone in on what caused this scenario as you may run into it as well.

When you add your Proxmox server in Veeam, it asks for 'DNS name or IP address'. It does not
say 'DNS FQDN or IP Address'. IF you enter just the DNS hostname like 'server' and not 'server.domain.com',
AND you assign a static IP configuration, your worker will fail to install correctly. Interestingly, when you create the
worker you CANNOT enter an FQDN but must use an IP address or a simple hostname.

This took multiple days of remote access from Veeam to track down but, as suspected initially, there is an assumption
that DHCP is being used. When a static configuration is used, /etc/resolv.conf does not receive the correct 'search domain.com' entry
if you entered just the hostname when you configured the Proxmox server, so the worker cannot communicate with the cluster.
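
For anyone checking a worker for this condition, here is a minimal sketch, assuming you can get a shell on the worker at all. The function name has_search_domain is made up for illustration and is not part of Veeam or Proxmox. It only tests whether a resolv.conf-style file contains a 'search' or 'domain' line, which is what lets a short name like 'server' be expanded to 'server.domain.com'.

```shell
#!/bin/sh
# Hypothetical helper (not a Veeam or Proxmox tool): succeed if the given
# resolv.conf-style file defines a search list or a local domain.
has_search_domain() {
    conf="${1:-/etc/resolv.conf}"
    # In resolv.conf the keyword must start the line and be followed
    # by at least one domain name.
    grep -Eq '^(search|domain)[[:space:]]+[^[:space:]]+' "$conf"
}
```

Called with no argument it reads /etc/resolv.conf, so something like has_search_domain || echo "no search domain configured" is enough for a quick check. A missing search line is only a hint, though: the resolver can still derive a default domain from the machine's own hostname if that hostname is fully qualified.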

To me, the fix is to ask for a search domain during static IP address configuration, when the DNS servers are added. Alternatively, the
Proxmox server creation field could require an FQDN and reject a bare hostname. There are a number of ways to avoid this scenario. At least there
is now a documented case of what is going on.

Post by PTide »

Thanks for sharing the outcome! We will consider adjusting the label in the UI accordingly.

Cheers
Scar_UY
Novice
Posts: 4
Liked: 1 time
Joined: Apr 13, 2023 6:33 pm
Contact:

Post by Scar_UY »

tgx wrote: Oct 23, 2024 3:29 pm I thought I would fill everyone in on what caused this scenario as you may run into it as well.

Just to add my two cents: I came across the same problem, and what made the worker tests pass was removing the Proxmox server from the inventory and adding it again using the FQDN. Only after that did the worker finish the tests successfully.
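
A quick way to catch this before adding the server is to check that the name you are about to enter is fully qualified. This is only a sketch: is_fqdn is a hypothetical helper, not something Veeam ships, and it treats any name with at least two non-empty dot-separated labels as qualified (so a dotted IP address passes too, which fits a field that accepts either).

```shell
#!/bin/sh
# Hypothetical pre-flight check (not a Veeam tool): succeed only if the
# name has at least two non-empty, dot-separated labels.
is_fqdn() {
    name="${1%.}"                    # ignore a single trailing root dot
    case "$name" in
        ''|.*|*..*) return 1 ;;      # empty name or empty label
        *.*)        return 0 ;;      # e.g. server.domain.com
        *)          return 1 ;;      # bare hostname such as 'server'
    esac
}
```

With this, is_fqdn server.domain.com succeeds, while is_fqdn server fails.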
mikeg5
Influencer
Posts: 12
Liked: never
Joined: Apr 03, 2021 10:43 pm
Contact:

Post by mikeg5 »

If anyone has swapped out the certificate on any of the nodes after they were added to Veeam, that could lead to a certificate error similar to this as well.

Post by mikeg5 »

Additionally, I had to boot the VM into single-user mode (rd.break) and change the root password. Once it is changed, reboot while Veeam is waiting. Log in at the first full boot, download the CA cert, and place it under:

Code:

/etc/pki/ca-trust/sources/anchors
then run

Code:

update-ca-trust
All of this has to happen before the worker node software is installed.
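
Put together, the two steps above could look like the sketch below. The function name install_ca_cert and the ANCHOR_DIR and TRUST_CMD overrides are assumptions added for illustration, so that the steps can be dry-run against a scratch directory. On the worker the defaults are the path and command given above, and it would need to run as root.

```shell
#!/bin/sh
# Hypothetical helper: copy a CA certificate into the trust anchors
# directory and rebuild the trust store.
# ANCHOR_DIR and TRUST_CMD are illustrative overrides for dry runs.
install_ca_cert() {
    cert="$1"
    anchor_dir="${ANCHOR_DIR:-/etc/pki/ca-trust/sources/anchors}"
    trust_cmd="${TRUST_CMD:-update-ca-trust}"

    # Stop early if the certificate file is missing or unreadable.
    if [ ! -r "$cert" ]; then
        echo "cannot read certificate: $cert" >&2
        return 1
    fi

    # Place the certificate under the anchors directory, then refresh.
    cp -- "$cert" "$anchor_dir/" || return 1
    "$trust_cmd"
}
```

On the worker this would be called as install_ca_cert /path/to/my-ca.crt, where the certificate path is a placeholder for wherever you downloaded your CA cert.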

This is the only way I have been able to get things working using a custom CA UI cert. (I know this isn't directly related but this is where I ended up when I went searching.)

Thanks