Error: Failed to Call RPC function 'StartAgent'


by DDIT » Fri Jul 28, 2017 8:04 am

Support Case #02261750

Hello,

I have just logged this new case, but in the meantime thought I might post it here in case anyone has any suggestions. Since yesterday I have been unable to back up any of my VMs running on Hyper-V. The Veeam backup server and Hyper-V host are both Windows Server 2012 R2 with all the latest patches up until last weekend. My backups are targeted to either of two iSCSI targets, each presenting a single LUN. I am running Veeam B&R 9.5 U2.

My backups were running fine on Monday, Tuesday and Wednesday this week, but stopped working on Thursday 27th.

The full error I get is 28/07/2017 08:24:59 :: Error: Failed to call RPC function 'StartAgent': An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full. Cannot connect to socket.

This exact error is mentioned in https://www.veeam.com/kb2289, except I do not have KB4015553 installed as the article suggests (I do have KB4015550 installed however). Anyway, the suggested fix of installing Microsoft KB4025335 on the Veeam server has not helped. Neither has the workaround of removing any unused iSCSI targets.

I removed the iSCSI targets, rebooted the Veeam server, then re-attached the targets, but this has not fixed the issue.

There is a similar discussion going on at veeam-backup-replication-f2/veeam-server-100-cpu-rpc-errors-and-backup-failures-t40810.html with talk of installing and uninstalling KB4025335 and KB4025336 (both are installed on my server). That post also seems related to Server 2016, ReFS and CPUs stuck at 100%, none of which apply to my situation. I also have the latest NIC drivers installed for my system.

Any suggestions?
DDIT
Enthusiast
 
Posts: 28
Liked: 5 times
Joined: Thu Oct 29, 2015 5:58 pm
Full Name: Michael Yorke

Re: Error: Failed to Call RPC function 'StartAgent'

by Robvil » Fri Jul 28, 2017 8:17 am

Since the 27th I'm getting these errors on my Hyper-V 2012 R2:

+ Failed to index guest file system
+ Failed to call RPC function 'Vss.GetShadowVolumesListing': Error code: 0x80004005

Don't know if it's related.

I have a Hyper-V Integration Services guest update waiting to be installed ... maybe it's related to this?
Robvil
Enthusiast
 
Posts: 61
Liked: 6 times
Joined: Mon Oct 03, 2016 12:41 pm
Full Name: Robert

Re: Error: Failed to Call RPC function 'StartAgent'

by kubimike » Fri Jul 28, 2017 2:25 pm

I'm patched to the latest. A few nights ago I got:
'7/27/2017 4:59:40 AM :: Processing XXXXXXXXXXXX Error: Failed to call RPC function 'StartAgent': Timed out requesting agent port for client sessions.'

The job retried, and it hasn't occurred since.
kubimike
Expert
 
Posts: 244
Liked: 24 times
Joined: Fri Feb 03, 2017 2:34 pm
Full Name: MikeO

Re: Error: Failed to Call RPC function 'StartAgent'

by dalbertson » Fri Jul 28, 2017 2:59 pm

So I was able to reproduce this in my lab by coincidence. In reviewing the logs I saw that it would not start the agent on the repo server, so I rebooted the server and then tested the job again, and it was successful. I'm glad you opened a ticket to upload logs. Try rebooting the server running the repo and test again.
dalbertson
Veeam Software
 
Posts: 11
Liked: never
Joined: Tue Jul 21, 2015 12:38 pm
Full Name: Dustin Albertson

Re: Error: Failed to Call RPC function 'StartAgent'

by DDIT » Fri Jul 28, 2017 3:07 pm

@Robvil - I'm not sure this is the same issue as you are getting. Your errors are different, although still related to RPC. Perhaps you need to open a support case.

@kubimike - I recommend you keep an eye on this. I fully patched my Hyper-V hosts and Veeam server last weekend and gave them all a reboot, installing all available updates up to 22/23 July 2017. My backups were working fine on Monday, Tuesday and Wednesday and just started breaking on Thursday without any changes to the setup. I read somewhere that this can be due to a port/memory leak with iSCSI so it is possible it takes a while to show up.
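
For anyone wanting to check the port-leak theory: the "lacked sufficient buffer space or because a queue was full" text is the message Windows gives for Winsock error 10055 (WSAENOBUFS), which is often seen when a machine runs short of ephemeral ports. Below is a rough Python sketch, not anything Veeam or Microsoft supplies, that counts TCP connections using a local port in the default Windows dynamic range (49152-65535), grouped by state. It assumes the text comes from `netstat -ano -p tcp`; the sample addresses are made up for illustration.

```python
from collections import Counter

# Default Windows dynamic (ephemeral) port range since Vista / Server 2008.
# It can be changed with netsh, so treat this as an assumption for your host.
DYNAMIC_PORT_RANGE = range(49152, 65536)


def summarize_dynamic_ports(netstat_text):
    """Count TCP rows whose local port is in the dynamic range, grouped by state."""
    by_state = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State, PID
        if len(fields) < 4 or fields[0] != "TCP":
            continue
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port.isdigit() and int(local_port) in DYNAMIC_PORT_RANGE:
            by_state[fields[3]] += 1
    return by_state


# Made-up sample in the shape of `netstat -ano -p tcp` output.
SAMPLE_NETSTAT = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING       884
  TCP    192.168.1.10:49731     192.168.1.20:3260      ESTABLISHED     4
  TCP    192.168.1.10:49800     192.168.1.20:3260      TIME_WAIT       0
  TCP    192.168.1.10:50001     192.168.1.30:6160      ESTABLISHED     2120
"""

if __name__ == "__main__":
    for state, count in summarize_dynamic_ports(SAMPLE_NETSTAT).items():
        print(state, count)
```

Running it a few times over a day or two and watching whether the totals keep climbing is one way to tell a slow leak from a one-off spike.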

I had a Veeam engineer contact me and ask me to uninstall the following Windows updates: KB4015553, KB4019215, KB4025335 and KB4025336.

Of these, only KB4025335 was showing up in the list of installed updates, so I went ahead and removed it, then rebooted. I then noticed KB4025336 had appeared in my list of installed updates with today's date (it was not there previously), so I uninstalled that and rebooted again. Guess what? KB4019215 then showed up as an installed update, again with today's date. My guess is that when a superseding update is installed, it hides the superseded updates from the installed list. Only uninstalling it (effectively a rollback to how the server was pre-update) makes the superseded update show again, which then needs uninstalling itself. I kept doing this until all the suspect updates were removed.
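
The uninstall, reboot, re-check cycle described above can be sketched as a loop. This is only an illustration in Python: `list_installed` and `uninstall` are hypothetical callables you would have to supply yourself (for example, wrappers around whatever you use to list and remove updates, with a reboot in between), and the fake helpers simply mimic the "only the newest update is visible" behaviour guessed at in the post.

```python
# The four updates named by the Veeam engineer in this thread.
SUSPECT_KBS = ["KB4015553", "KB4019215", "KB4025335", "KB4025336"]


def next_to_uninstall(installed_kbs, suspects=SUSPECT_KBS):
    """Return the suspect updates that currently show as installed."""
    installed = {kb.strip().upper() for kb in installed_kbs}
    return [kb for kb in suspects if kb in installed]


def uninstall_until_clean(list_installed, uninstall, suspects=SUSPECT_KBS, max_rounds=10):
    """Re-check the installed list after every removal, since removing one
    update may reveal an older one that it had superseded."""
    removed = []
    for _ in range(max_rounds):
        pending = next_to_uninstall(list_installed(), suspects)
        if not pending:
            return removed
        uninstall(pending[0])  # in real life: uninstall, then reboot
        removed.append(pending[0])
    raise RuntimeError("suspect updates still present after max_rounds")


# Hypothetical stand-ins: only the last item in the list is "visible".
_hidden = ["KB4019215", "KB4025336", "KB4025335"]


def fake_list_installed():
    return _hidden[-1:]


def fake_uninstall(kb):
    _hidden.remove(kb)


if __name__ == "__main__":
    print(uninstall_until_clean(fake_list_installed, fake_uninstall))
```

The point of the loop is simply that a single check of the installed list is not enough; the list has to be re-read after each removal.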

However, this did not fix the issue. A lot of the articles suggest this problem can happen when old/stale/rogue iSCSI connections are left in the MS iSCSI initiator. This was not the case with my Veeam server. I did have two iSCSI targets, but these were used by various jobs.

The Veeam engineer then started looking at my Hyper-V host. Not only did this have some of the updates mentioned above, it also had two very old iSCSI connections listed in the MS initiator, which we had forgotten about. One was stuck at 'reconnecting...' but the target had long been removed.

My next step is to remove these iSCSI connections, uninstall any of the updates mentioned above, then reboot the Hyper-V host, which I will attempt tonight. This is almost certainly the issue, as we could successfully back up a VM running on another Hyper-V host from my Veeam server. Oddly, this other Hyper-V host has all the same patches as my main Hyper-V host. It just doesn't have the iSCSI connections.
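
For spotting forgotten targets like these on other hosts, here is a small Python sketch. It assumes you have exported the initiator's target list to JSON on the host (for instance with `Get-IscsiTarget | ConvertTo-Json`) and that each entry carries `NodeAddress` and `IsConnected` fields; both the export step and the field names are assumptions to verify on your own system, as is whether a target stuck at 'reconnecting...' actually reports as not connected. The IQNs in the sample are invented.

```python
import json


def stale_targets(targets_json):
    """Return the NodeAddress of every target that is not currently connected."""
    data = json.loads(targets_json)
    if isinstance(data, dict):
        # A single exported object arrives as a bare dict rather than a list.
        data = [data]
    return [t["NodeAddress"] for t in data if not t.get("IsConnected", False)]


# Invented sample data in the assumed export format.
SAMPLE_TARGETS = """
[
  {"NodeAddress": "iqn.2000-01.com.example:backup-lun1", "IsConnected": true},
  {"NodeAddress": "iqn.2000-01.com.example:old-nas", "IsConnected": false},
  {"NodeAddress": "iqn.2000-01.com.example:retired-san", "IsConnected": false}
]
"""

if __name__ == "__main__":
    for address in stale_targets(SAMPLE_TARGETS):
        print("not connected:", address)
```

Anything the script lists still needs a human decision: a disconnected target may be deliberately offline rather than stale.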

Having re-read the Veeam KB2289 article, it would have been helpful if the 'solution' mentioned which server it was referring to: the Veeam backup server or the Hyper-V host. I had assumed the Veeam backup server, and judging by the amount of time the engineer spent troubleshooting there, I guess he thought so too. However, I believe my issue will prove to be related to the Hyper-V host we are backing up. Just my two pence.

Re: Error: Failed to Call RPC function 'StartAgent'

by DDIT » Mon Jul 31, 2017 10:55 am

Just wanted to confirm that removing the unused iSCSI connections from the Hyper-V host, uninstalling the 4 updates mentioned earlier and rebooting the host has fixed this issue.

I may try reinstalling the Windows updates one at a time on both the Veeam server and the Hyper-V host to see if these affect anything going forward. Hopefully not, and it was just the unused iSCSI connections on the host which were causing this.

Re: Error: Failed to Call RPC function 'StartAgent'

by kubimike » Mon Jul 31, 2017 6:42 pm

I'm not on Hyper-V; this was VMware. I found a Veeam article that says to expand the number of ports Veeam uses. I see the default range is already quite wide, so I left the setting alone.

Re: Error: Failed to Call RPC function 'StartAgent'

by Robvil » Wed Aug 02, 2017 6:16 am

Yep, I opened a case and got a hotfix which solved part of my issue. The other part of the solution was to remove a Windows 2016 server as a proxy in the backup job (I'm running a 2012 R2 Hyper-V cluster).

Re: Error: Failed to Call RPC function 'StartAgent'

by signal » Wed Oct 18, 2017 2:25 pm

dalbertson wrote: So I was able to reproduce this in my lab by coincidence. In reviewing the logs I saw that it would not start the agent on the repo server, so I rebooted the server and then tested the job again, and it was successful. I'm glad you opened a ticket to upload logs. Try rebooting the server running the repo and test again.


Did you find the root cause for this? Rebooting the server is only going to postpone the error's return. One of my customers is experiencing this issue with only two concurrent backup jobs, each containing one VM. There are only 7 VMDKs in total, and the transfer is done using storage snapshots.
signal
Influencer
 
Posts: 21
Liked: never
Joined: Thu Oct 06, 2016 1:19 pm

