Standalone backup agent for Microsoft Windows servers and workstations (formerly Veeam Endpoint Backup FREE)
FECV
Enthusiast
Posts: 41
Liked: 7 times
Joined: Mar 24, 2016 2:23 pm
Full Name: Frederick Cooper V
Contact:

Timeout To Start Agent Error

Post by FECV »

I have an issue that I am looking for some help with. My current case number is 05276077. I have two Agent jobs on my VBR server that fail with the error "Error: Timeout to start agent". The target being backed up has only a single network connection, which corresponds to PUBLIC_IP_1 in my log excerpt below (it's a LAN connection). The job has been working fine for years, but recently I added a NIC with PRIVATE_IP_3 to my VBR server for a SAN iSCSI connection. All IPs are on completely different subnets, by the way. As soon as I added PRIVATE_IP_3 to my VBR server, this error started. Disabling the NIC in Windows allows the job to succeed, and re-enabling it brings the error back. With this NIC enabled, the Agent inventory scan completes without issues, so there is no network-level problem connecting over PUBLIC_IP_1, which is the only network these machines can talk to each other on. Using the Priority Network option in VBR is not an option. I have tried setting the ConnectByIPsTimeoutSec registry key to 1200, but it does not seem to be the issue, nor does it affect the job in any way; the job always fails in about 5-6 minutes. I have also tried setting the VBR IP in the hosts file, but that has no effect either.

MODERATOR EDIT: removed logs according to forum rules
HannesK
Product Manager
Posts: 14314
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Timeout To Start Agent Error

Post by HannesK »

Hello,
"As soon as I added the PRIVATE_IP_3 to my VBR server this error started"
This seems to be a good isolation of the error 👍

It sounds like a routing issue: maybe default route vs. static routes, or maybe asymmetric routing (the routes there and back differ, with a firewall in between). If preferred networks were configured, that could also be a reason, but my guess is that this is a general network issue outside Veeam.

Best regards,
Hannes
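
One quick way to check the routing theory from the VBR server is to ask the operating system which local address it would use to reach the agent machine. The Python sketch below is only an illustration and not anything Veeam ships; the function name `local_source_ip` and the placeholder address are assumptions. It relies on the fact that connecting a UDP socket sends no packets and only makes the OS select a route.

```python
import socket


def local_source_ip(destination_ip: str, port: int = 9) -> str:
    """Return the local IP the OS would use as source to reach destination_ip.

    Connecting a UDP socket does not send any packet; it only asks the OS
    to pick a route. An OSError is raised if there is no route at all.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((destination_ip, port))
        return sock.getsockname()[0]


if __name__ == "__main__":
    # 127.0.0.1 is a stand-in; replace it with the agent machine's address.
    print(local_source_ip("127.0.0.1"))
```

If the address printed for the agent machine belongs to one of the SAN NICs rather than the public LAN NIC, that would point toward a routing or metric problem on the server.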

Re: Timeout To Start Agent Error

Post by FECV »

This is not a routing error. The server being backed up has only one network connection to the VBR server, on PUBLIC_IP_1. From my understanding of the logs, the issue is that the VBR server is broadcasting 4 IP addresses to the Agent. The Agent is trying all of them but seems to have issues when the valid connection to the server is the 4th one it tries, or fails to try. It never seems to try it, as far as I can see. As I said in my post, most people get around this by setting a priority network in the VBR server. However, for several reasons that was not an option for me. I really wish Veeam would do two things here. The first is being able to pick which networks are advertised to agents, or to exclude some of them. If this option exists, I have yet to see it, and support is not offering it as an option. It would also be nice if someone else could test this to see if it is a bug. Not sure how many folks have their Agent cycling through 4 networks to reach the VBR server. Again, each network is a completely separate subnet with no connection or routing to the other networks. But as long as a timeout is not reached, it should keep trying. It only seems to try 2 or 3 in the logs before failing.

I seem to have worked around this issue by manually setting Interface Metrics for my network connections. I don't really view this as a solution, as Veeam should work without them being set. I hate having to add one-off weird fixes like this, because they are often forgotten about and come back to haunt you in 6 years.

Re: Timeout To Start Agent Error

Post by HannesK »

A bit off-topic, but maybe there is an alternative once I have a better understanding... What is the problem you are trying to solve by adding so many NICs to the backup server?

Re: Timeout To Start Agent Error

Post by FECV »

The other NICs are my iSCSI connections to my SANs for off-host backups. Not sure why I need an alternative if the product works like it should.

Re: Timeout To Start Agent Error

Post by FECV »

Moderator, your link to the rules seems to be bad.
o%09https%3A//forums.veeam.com/veeam-ba ... -t755.html

Re: Timeout To Start Agent Error

Post by HannesK »

Hello,
I just checked the case number and it was closed in February as the issue "went away on its own". Is that the right case number?
"The other NICs are my iSCSI connections to my SANs for off-host backups."
Just to clarify: is this for Backup from Storage Snapshot for Veeam Agent for Windows, or is that SAN connection for VMware backup?
"Also would be nice if someone else could test this to see if it is a bug."
Unfortunately, I still don't understand the network setup. I got lost on why two NICs (probably more, because PRIVATE_IP_3 sounds like at least three NICs) return 4 IP addresses. Do you maybe have a network diagram?

Best regards,
Hannes
PS: I fixed the link above.

Re: Timeout To Start Agent Error

Post by FECV »

Yes, I put the wrong case number above. I opened a case for this, but then closed it when the error went away. I opened a new case referencing the previous one when the issue reappeared and I had more time to dig into it. The second case number is 02470746.

While I make use of the Veeam Agent Hardware Snapshots, that is not what is in play on this job. The jobs I am having issues with are simple Veeam Agent for Windows backup jobs of a small physical server.

So not sure how I am supposed to upload a network diagram when the 10 lines of sanitized logs I was asking about were removed per Veeam forum policy. I can try to explain it a bit differently. My host being backed up has 1 network connection on my main public LAN. My VBR server, which is also my proxy and repository, has a connection to this public LAN, but also has connections to 3 other private LANs used for SAN traffic. Each is a separate subnet, so networking is not the issue here. The VBR server is broadcasting all 4 IP addresses to Veeam Agent. I cannot use preferred networks, so please don't recommend that.

My understanding is that Veeam Agent should try all 4 networks to connect to my VBR server. I realize there is a timeout, but as far as I can see I am not hitting it, and adding the registry keys to increase the timeout did not have any effect. From the logs it looks like Veeam Agent is testing the first 3 IP addresses, which are my private LAN IP addresses, and failing as I would expect. But I never see it try the 4th IP address, which would be my public LAN IP and would work. It seems to stop after 3 and error out. This is where I feel there may be a bug. I have manually changed the interface metrics of my NICs so that the public LAN has a lower metric, meaning it is preferred. I guess Windows assigned it a higher one because it has a public IP address. Not sure on that. This has resolved my issue, but I don't like manual settings like this. They tend to cause issues in the future when they are forgotten about. Plus, from my understanding of how it's supposed to test all the IP addresses, it should not be needed.

I was hoping to get an understanding of what I am seeing in the logs: why I saw a failure connecting on the first two private IPs and the start of an attempt on the 3rd private IP, but did not see a failure of the 3rd private IP, nor the start or failure of an attempt on my public LAN IP, which would have been successful. It has been beyond difficult to get someone to explain this to me.
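
The behavior expected in this post can be sketched as follows: try every advertised address in order and stop only on success or when an overall timeout expires, recording each attempt so that a skipped address is visible. This is a hypothetical Python illustration of that expectation, not Veeam's actual connection logic; the function name `first_reachable`, the `port` argument, and both timeout values are assumptions made for the example.

```python
import socket
import time


def first_reachable(candidates, port, per_attempt_timeout=5.0, overall_timeout=60.0):
    """Try each candidate IP in order and return (ip, attempts).

    ip is the first address that accepted a TCP connection, or None.
    attempts is a list of (ip, outcome) tuples, one per candidate, so an
    address that was never tried shows up explicitly as skipped.
    """
    deadline = time.monotonic() + overall_timeout
    attempts = []
    for ip in candidates:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            attempts.append((ip, "skipped: overall timeout reached"))
            continue
        try:
            timeout = min(per_attempt_timeout, remaining)
            with socket.create_connection((ip, port), timeout=timeout):
                attempts.append((ip, "connected"))
                return ip, attempts
        except OSError as exc:
            # Covers refused connections, unreachable networks, and timeouts.
            attempts.append((ip, f"failed: {exc}"))
    return None, attempts
```

With three unreachable private addresses listed first, a loop like this would only reach the fourth address if the three per-attempt timeouts together stay below the overall timeout, which is one way the last address could end up never being tried.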

Re: Timeout To Start Agent Error

Post by HannesK »

02470746 looks like a support ID...

If my understanding is right, then your setup is as shown in the picture below? If yes, that should work. To help you, I need the right case number (starting with 05...) with logs attached to that case.

[Image: network diagram]

Re: Timeout To Start Agent Error

Post by FECV »

Yes, I screwed that up again... The case number is 05369141. Yes, your diagram is accurate, but note that by default the Windows-assigned Interface Metric (priority) favored Private LAN 1, Private LAN 2, and the iSCSI SAN at "10" and set the Public LAN to "15", so it was always the last interface tried.

I let the case close this morning. The support engineer refused to help me understand the logs, and talking to a manager or my Sales Account Manager is just a joke these days. I would be happy to recreate the issue and upload new logs or whatever if you or someone else would like to seriously look at them. I am just tired of wasting my time. All I really wanted is for someone to explain the snippet of logs where it does not try the Public LAN and tell me why that is happening, if that is the way it's supposed to work.
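
To make the metric point concrete, here is a small Python sketch of the ordering described above. It assumes, as observed in this thread but not confirmed by Veeam, that addresses are tried in ascending interface-metric order. The metrics 10 and 15 come from the post; the adjusted value of 5 and the function name `attempt_order` are made up for illustration.

```python
def attempt_order(metrics):
    """Return interface names sorted by ascending metric (lowest first).

    sorted() is stable, so interfaces with equal metrics keep the order
    in which they appear in the input mapping.
    """
    return [name for name, _ in sorted(metrics.items(), key=lambda item: item[1])]


# Metrics Windows assigned by default, as reported in the post.
default_metrics = {
    "Private LAN 1": 10,
    "Private LAN 2": 10,
    "iSCSI SAN": 10,
    "Public LAN": 15,
}
print(attempt_order(default_metrics))
# ['Private LAN 1', 'Private LAN 2', 'iSCSI SAN', 'Public LAN']

# Hypothetical manual override giving the public LAN the lowest metric.
adjusted_metrics = {**default_metrics, "Public LAN": 5}
print(attempt_order(adjusted_metrics))
# ['Public LAN', 'Private LAN 1', 'Private LAN 2', 'iSCSI SAN']
```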

Re: Timeout To Start Agent Error

Post by HannesK »

As you work for an entity with restrictions on log file access, it's a bit complicated from here (EMEA). From what we saw, the software did attempt to connect to the correct IP, but that also failed.

I have the feeling that there was some tension in the case, and I can understand why you stopped with it.

The only way I see to get down to the root cause is opening a new case with logs again. Once you post the case number, I will talk to support about assigning a different engineer to that case.