Host-based backup of VMware vSphere VMs.
SKratzTS
Enthusiast
Posts: 70
Liked: 16 times
Joined: Jan 06, 2017 7:23 pm
Full Name: Steve Kratz
Contact:

Failing job - Veeam support not responding

Post by SKratzTS »

Case 5162624.

We have two datacenters, each with its own VBR server with storage on an ExaGrid appliance. Starting last Wednesday, BOTH of the VBR jobs started getting errors: "Error: Exception of type 'VeeamBackupAgentProvider.AgentClosedException' was thrown"

I've seen KB articles saying that having IPv6 enabled can cause problems, so I've switched it off everywhere. I'm not seeing any correlation between specific proxies or jobs failing. I've even uninstalled and re-installed the proxy apps just to be sure they're OK.

I opened a severity 2 case last Wednesday, and I've had one response requesting logs, which I had already included in the original ticket. I even added another set of logs. I requested updates on the ticket on Thursday, Friday, and today, without any response from Veeam.
Mildur
Product Manager
Posts: 8678
Liked: 2276 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Failing job - Veeam support not responding

Post by Mildur » 1 person likes this post

Hi Steve

Have you tried to escalate the case to a manager?
How to Escalate a Support Case or Talk to a Support Manager

For me, that normally works to get an immediate answer (within 1 hour).
Product Management Analyst @ Veeam Software

Re: Failing job - Veeam support not responding

Post by SKratzTS »

I did not know that was a thing. Thanks!

Re: Failing job - Veeam support not responding

Post by SKratzTS »

Continuing my thread: Veeam support has been engaged for several days, but we're hitting a brick wall. So far, I've updated the 10.0 servers to 11a, verified there are no AV issues impacting the backups, and tried a handful of other things that Veeam has suggested.

I still have several servers that continue to throw that "AgentClosedException", and all Veeam services stop. There are no errors in the event logs, just behavior like someone went into Services.msc and stopped them. The three main servers that are failing all happen to be pretty large file servers. VMware with SimpliVity is the back-end infrastructure. None of these servers have existing snapshots. VMware shows no errors. Storage has plenty of space. Proxies have been completely re-installed.

At a loss. Any thoughts?
PetrM
Veeam Software
Posts: 3262
Liked: 527 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Failing job - Veeam support not responding

Post by PetrM »

Hi Steve,

I believe that the best action plan is to wait for the outcome of debug logs analysis from our engineers.

Any unsuccessful attempt to fix the issue allows us to narrow down the scope of possible reasons and to be some steps closer to the solution. As far as I see, the case was opened 6 days ago but a lot of useful work has already been done. Also, I've just asked our support team leaders to pay attention to your case and perhaps to escalate it to a higher level.

Thanks!

Re: Failing job - Veeam support not responding

Post by SKratzTS »

Thanks, Petr. I'm just getting very stressed since this whole thing landed on my plate, and I'm literally running and re-running the failed jobs manually, over and over. (All day, and into the evening, for over a week... ugh.)

Re: Failing job - Veeam support not responding

Post by PetrM »

By the way, if jobs start to work on retry, you may increase the number of automatic retries in job settings. It might be used as a temporary workaround until the permanent solution is found.

Thanks!

Re: Failing job - Veeam support not responding

Post by SKratzTS »

Good tip. It might help in some cases, but in about 90% of the failures, the proxy/data mover service turns itself off. We have 4 VMware hosts and 4 proxies, so there are 3 extra retries with NBD transport it can try...

Re: Failing job - Veeam support not responding

Post by PetrM »

Hello,

Obviously, it's not ideal at all, but it's better than nothing. I see that the case has been escalated, so let's wait for the outcome of the research.

Thanks!

Re: Failing job - Veeam support not responding

Post by SKratzTS »

Petr:

Falling into another no-response period. Two days without response. Any activity happening behind the scenes?

Re: Failing job - Veeam support not responding

Post by PetrM »

Hi Steve,

Yes, the case is being researched by our senior engineers, and I guess you should get an update soon. I've just asked our support leads to accelerate the process if possible, though that always depends on the complexity of the issue.

Thanks!

Re: Failing job - Veeam support not responding

Post by SKratzTS » 1 person likes this post

Thank you. I added the comments about trying a fresh B&R server to the ticket. This really has me stumped. I wouldn't be surprised if some storage or network issue turned out to be at fault, but EVERYTHING we've looked at shows no problems other than the backup jobs failing.

Re: Failing job - Veeam support not responding

Post by SKratzTS » 3 people like this post

The issue has been resolved. Support confirmed a Veeam service dependency on the Windows WMI service. One of our engineers had put a script in place that added WMI permissions to servers and restarted the WMI service. It was intended to run once, but it was set to run every time Group Policy updated. When it ran, WMI restarted, and Veeam services on various servers stopped.

Re: Failing job - Veeam support not responding

Post by PetrM » 1 person likes this post

Hi Steve,

Glad to hear that it's sorted out!

Thanks!
sasilik
Expert
Posts: 104
Liked: 13 times
Joined: Jun 12, 2014 11:01 am
Full Name: Markko Meriniit
Contact:

Re: Failing job - Veeam support not responding

Post by sasilik »

SKratzTS wrote: Dec 15, 2021 4:45 pm
Issue has been resolved. Support confirmed a Veeam service dependency on the Windows WMI service. One of our engineers had put a script in place that was adding WMI permissions to servers, and restarting the WMI service. It was intended to run once, but it was set to run every time Group Policy updated. When it ran, WMI restarted, and Veeam services on various servers stopped.

The mention of GP updates reminded me of a similar problem I had more than a year ago, when long-running tape jobs were failing. The solution was to turn off automatic GP updates on the tape server, because whenever a background GP update ran on the server, something happened and Veeam failed. There was no specific cause like the WMI service restart in your case: whenever a GP update started, either automatically in the background or from the command line (gpupdate /force), the tape job was ruined. Supposedly the tape driver crashed or stopped responding. It seems that a GP update is a dangerous thing. The topic itself is here: tape-f29/tape-restore-failed-with-error ... 63753.html
soncscy
Veteran
Posts: 643
Liked: 312 times
Joined: Aug 04, 2019 2:57 pm
Full Name: Harvey
Contact:

Re: Failing job - Veeam support not responding

Post by soncscy »

@Markko,

Ah, I can actually answer this one for you: it's because gpupdate may also force a rescan of the SCSI devices, which takes them offline. IBM even has an article on it: https://www.ibm.com/support/pages/anr83 ... sing-drive

As far as I understand this MSDN document, when the update runs, it tries to rebuild the list of devices depending on your settings, and that likely takes them offline momentarily in Windows, hence killing the tape operations.

Re: Failing job - Veeam support not responding

Post by sasilik »

Good to know. If this is the case, maybe Veeam could mention in a KB article or the documentation that GP updates may cause problems on tape servers.

Re: Failing job - Veeam support not responding

Post by SKratzTS » 1 person likes this post

The issue in my case was caused by another admin sneaking in a scheduled task that ran repeatedly throughout the day and included PowerShell commands to restart WMI. WMI is required for Veeam services. Without it, VBR stops about 3/4 of the services it has running.
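
For anyone trying to reason about this kind of failure, here is a minimal Python sketch of why restarting one service can take others down with it. It is only an illustration under stated assumptions: the dependency map is hypothetical, `Winmgmt` is the Windows service name for WMI, and the other names are placeholders, not actual Veeam service names. It does not query a live system; it just walks a dependency map to list everything that depends, directly or indirectly, on the service being restarted.

```python
from collections import deque

# Hypothetical dependency map: service -> services it depends on.
# "Winmgmt" is the Windows WMI service; every other name is a placeholder
# invented for this example, not a real Veeam service name.
DEPENDS_ON = {
    "BackupServiceA": ["Winmgmt"],
    "DataMoverB": ["Winmgmt"],
    "CatalogC": ["BackupServiceA"],
    "UnrelatedD": [],
}


def stopped_by_restart(service, depends_on):
    """Return every service that depends on `service`, directly or indirectly."""
    # Invert the map so we can look up "who depends on X".
    dependents = {}
    for svc, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(svc)

    # Breadth-first walk over the dependents.
    affected = set()
    queue = deque([service])
    while queue:
        current = queue.popleft()
        for svc in dependents.get(current, []):
            if svc not in affected:
                affected.add(svc)
                queue.append(svc)
    return sorted(affected)


print(stopped_by_restart("Winmgmt", DEPENDS_ON))
# -> ['BackupServiceA', 'CatalogC', 'DataMoverB']
```

With a real dependency list filled in, the same walk would show which services to check (and restart) after anything touches WMI.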