-
- Enthusiast
- Posts: 65
- Liked: 45 times
- Joined: Feb 14, 2018 1:47 pm
- Full Name: Chris Garlington
- Contact:
VEEAM B&R - Cluster Agent Issues
Unsure if this should be here or in the Agent subforum, however it seems to be primarily an issue with B&R so I figured I'd start here.
We're currently running B&R 9.5.0.1536, with a pair of agents installed on an MSCS failover cluster (two nodes, both VMware VMs, both with physical RDM disks, which necessitates agents). Thankfully, this is now a supported configuration. Cluster integration went well, and aside from a registry tweak or two, we've had no significant issues. We do have one minor problem, though: occasionally B&R seems to 'lose' connectivity to one of the cluster hosts. Backups to that host fail with 'failed to connect to hostname.fqdn:6160'. If I then rescan the hosts/cluster in the 'Physical and Cloud Infrastructure' section of the inventory, I end up with hostname1.fqdn and hostname2.fqdn (one of which shows 'last seen' as a day or two ago), plus a 'hostname1/2' entry without an FQDN and with a connectivity error. Rescans of the cluster return the error 'Cannot connect to HOSTNAME (no FQDN) Error: The format of the specified computer name is invalid'.
If I fail over the roles from the host that seems to be having connectivity issues, restart it, and then fail the roles back once it's up, B&R is able to rescan it fine. So far, that's been my fix. This happens roughly twice a week, for no reason I can identify, and which of my two cluster hosts is affected seems random. Those VMs are pretty static and don't tend to get touched unless I'm fixing VEEAM's connectivity to them.
One last note that might add a wrinkle: the two cluster hosts (as well as VEEAM itself) sit in a disjoint namespace, so they're joined to 'domain.org' but the connection-specific DNS suffix is 'subdomain.domain.org'. Every once in a great while, we run into something that goes pear-shaped when confronted with a disjoint namespace, so I wasn't sure if this was a case of that. All servers have a properly configured DNS suffix search order that includes the aforementioned disjoint domain name, so 'HOSTNAME' resolves fine whether or not you include the FQDN (assuming the 'correct' FQDN is used, anyhow).
I appreciate any insights, thank you!
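For anyone seeing the same symptom, it can help to check from the B&R server whether the short name and each suffixed name resolve, and whether the port from the error message (6160) accepts a TCP connection. The following is a minimal Python sketch, not something from this thread or from Veeam tooling: the function names are hypothetical, and the suffixes you pass in are whatever your own DNS search order contains. It only tests resolution and reachability; it says nothing about authentication.

```python
import socket

# Port taken from the 'failed to connect to hostname.fqdn:6160' error above.
AGENT_PORT = 6160


def candidate_names(short_name, suffixes):
    """Return the short name followed by one FQDN per DNS suffix."""
    names = [short_name]
    for suffix in suffixes:
        names.append(f"{short_name}.{suffix.strip('.')}")
    return names


def can_connect(host, port=AGENT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe(short_name, suffixes, port=AGENT_PORT, timeout=3.0):
    """Map each candidate name to (resolved IP or None, port reachable)."""
    results = {}
    for name in candidate_names(short_name, suffixes):
        try:
            ip = socket.gethostbyname(name)
        except OSError:
            # Name did not resolve, so there is nothing to connect to.
            results[name] = (None, False)
            continue
        results[name] = (ip, can_connect(ip, port, timeout))
    return results
```

For example, `probe("hostname1", ["domain.org", "subdomain.domain.org"])` would report, for each of the three names, the address it resolved to (if any) and whether port 6160 answered. If every name resolves and connects while B&R still reports a failure, the problem is more likely above the network layer.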
-
- Product Manager
- Posts: 14726
- Liked: 1707 times
- Joined: Feb 04, 2013 2:07 pm
- Full Name: Dmitry Popov
- Location: Prague
- Contact:
Re: VEEAM B&R - Cluster Agent Issues
Hello Chris,
This behavior is unexpected, so please open a support case and share the case ID. I'll ask the R&D team to review your logs. Thanks in advance.
-
- Enthusiast
- Posts: 65
- Liked: 45 times
- Joined: Feb 14, 2018 1:47 pm
- Full Name: Chris Garlington
- Contact:
Re: VEEAM B&R - Cluster Agent Issues
Done, thank you.
-
- Product Manager
- Posts: 14726
- Liked: 1707 times
- Joined: Feb 04, 2013 2:07 pm
- Full Name: Dmitry Popov
- Location: Prague
- Contact:
Re: VEEAM B&R - Cluster Agent Issues
Chris,
Can you please update this thread with your case ID? Thanks!
-
- Enthusiast
- Posts: 65
- Liked: 45 times
- Joined: Feb 14, 2018 1:47 pm
- Full Name: Chris Garlington
- Contact:
Re: VEEAM B&R - Cluster Agent Issues
Yep, Case ID 02616761
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Mar 13, 2018 1:48 pm
- Full Name: Vaibhao W
- Contact:
Re: VEEAM B&R - Cluster Agent Issues
Could you please share the resolution on this issue?
-
- Product Manager
- Posts: 14726
- Liked: 1707 times
- Joined: Feb 04, 2013 2:07 pm
- Full Name: Dmitry Popov
- Location: Prague
- Contact:
Re: VEEAM B&R - Cluster Agent Issues
Hello vsw.
The case was closed, as it looked like a cross-domain authentication bug on the OS side. If you see similar behavior, please open a case and share debug logs with the support engineer; you may use this topic as a reference. Thank you!
-
- Enthusiast
- Posts: 65
- Liked: 45 times
- Joined: Feb 14, 2018 1:47 pm
- Full Name: Chris Garlington
- Contact:
Re: VEEAM B&R - Cluster Agent Issues
vsw wrote: Could you please share the resolution on this issue?
To clarify, it looks like we've run into a very specific bug, cited here:
https://blogs.technet.microsoft.com/joh ... -failures/
Specifically, our MSCS cluster hosts have been entering the 'failed auth state' referenced in that blog post, which (based on my understanding) prevented them from properly authenticating certain types of Netlogon/RPC sessions. The DCs they talk to are 2012 R2 systems, so I'm not sure whether MS has patched or will patch them to resolve this.
At the moment, we're basically coasting until we finalize a migration we're going through, at which point there won't be any cross-domain traffic to fail.