-
- Expert
- Posts: 127
- Liked: 22 times
- Joined: Feb 18, 2015 8:13 pm
- Full Name: Randall Kender
- Contact:
All Backup Copy Jobs Not Automatically Running
Really strange issue we're seeing. For some reason all backup copy jobs in our environment have stopped running automatically. They are all set to "Immediate copy (mirroring)", but none of them run until you right-click on them and choose sync now. When doing that most of them seem to work, but 5 of our 39 backup copy jobs won't even start when doing a sync now.
There are no errors on the jobs. They don't start and then fail, they simply do not start at all. We even have some jobs that go to NAS units that get disabled and re-enabled on a daily basis, and even cycling between enabled -> disabled -> enabled doesn't trigger any data sync. All the Veeam servers were restarted as well, and that hasn't resolved the issue. Services came back online and no data syncs started.
We've had a case open for a few days without any resolution yet, and it has already been escalated to tier 2. Case number 02218137.
-
- Product Manager
- Posts: 14835
- Liked: 3082 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
Hello,
that case number looks very old, it should be something starting with 04...
Anyway: please remember that the forums are run by product management and not by support. As you say that the case was escalated, I assume that it is ongoing and that support is working on a solution. (Posting on the forums doesn't influence the speed of case resolution.)
Best regards,
Hannes
-
- Expert
- Posts: 127
- Liked: 22 times
- Joined: Feb 18, 2015 8:13 pm
- Full Name: Randall Kender
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
Ah yes, sorry, the correct case number is 04882991.
Yes, I know that the forums are not for support, but I was hoping that by posting this perhaps someone else has seen the issue before and knows a resolution. I haven't even been able to figure out which log file to look at, since the jobs aren't starting in the first place.
It's a really strange issue, though.
-
- Product Manager
- Posts: 14835
- Liked: 3082 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
yes, that sometimes works with misconfigurations or well-known bugs. I checked the "well known bugs" list before posting, and found nothing on it.
Looks like support found some RPC errors in the logs.
-
- Expert
- Posts: 127
- Liked: 22 times
- Joined: Feb 18, 2015 8:13 pm
- Full Name: Randall Kender
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
So something actually tells me that this has something to do with a bug in V11 at this point.
History wise, when we updated to V11 we started having a strange issue with backup copy jobs constantly failing and reporting timeout issues (previous case 04696697). Eventually the tech recommended editing the broken job, clicking next through all the prompts without changing settings, and then saving it, and that would fix the job. But we were having jobs getting "corrupted" about 2-3 times a week for almost a month until the first patch for V11 came out. I never saw a fix in the patch notes for that specific problem, but the tech confirmed it would resolve the issue, and sure enough after installing the patch the backup copy job corruption stopped happening.
But something tells me the corruption was somewhere else in the database and all we did was move it somewhere else, and now that corruption is wherever the job triggers are stored or something else related to core backup copy job function. Not really sure, but at this point we've tried everything we could, duplicating jobs, creating brand new jobs, everything just doesn't seem to be working. At this point we have someone manually kicking off every backup copy job every day, and while that somewhat works for the jobs that do run, there are multiple jobs that don't run even with triggering the sync manually.
The case supposedly is set to be escalated to tier 3 or R&D, but we really are getting desperate at this point. From our point of view support has been abysmal the whole case, with multiple instances of scheduling remote sessions days out or reviewing logs for extended periods of time. If it was a smaller issue it wouldn't be that big of a deal, but at this point the case has been open for 2 full weeks (9 business days if you exclude July 4th) and we are no closer to a solution or even workarounds. We have data that is more than 2 weeks out of our RPO for offsite and there's been no way to get the data updated.
If we were to have a disaster requiring offsite recovery right now we'd only have nearly 3 week old restore points to recover with for 1/4th of our critical production SQL servers.
I guess the question is where do we go at this point, is there something we should be trying to do without support? I know the forum posts are not designed to speed up case resolution but we're not sure where else to go from here. It took a few days to escalate from tier 1 to 2, and now we're not sure how long the escalation from tier 2 to 3 will be.
-
- Product Manager
- Posts: 14835
- Liked: 3082 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
Hello,
I'm sorry to hear that.
I used the "talk to manager" button (available in my.veeam.com) for your case now. If something like that happens again, I suggest using that button.
I recommend continuing with support instead of trying to manually dig through log files or the database.
Best regards,
Hannes
-
- Expert
- Posts: 127
- Liked: 22 times
- Joined: Feb 18, 2015 8:13 pm
- Full Name: Randall Kender
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
So figured I'd give an update on this.
We're nearing a full month of incomplete offsite backups. Regarding the case, still no escalation to tier 3, no solution, no workarounds provided. Since the last post the only thing the support tech had us do is stop all Veeam services, wait 20 minutes, then start them. When that didn't fix the problem they asked for more logs.
We're looking into a way to copy the files manually between the sites outside of Veeam at this point. The problem is things like ViceVersa or Robocopy don't really have a way to resume a file copy if it gets interrupted midway through, and many of the backup files are larger than 2TB.
The only thing that seems to have been confirmed by the support reps is that it does look like a bug in V11, as I suspected. But again, that's only based on a few things the support rep mentioned, with nothing else substantial. No word of a hotfix, no interaction with dev, still not any closer to a resolution.
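On the interrupted-copy problem mentioned above: Robocopy does document a restartable mode (`/Z`) that may be worth testing first. Failing that, resuming a copy by hand is simple in principle, and the sketch below shows the idea in Python. It is only an illustration, not anything Veeam provides or supports. The function name `resume_copy` and the chunk size are made up for this example, and it assumes the source file is not modified while the copy is in progress and that whatever already sits at the destination is a correct prefix of the source (it does not verify this).

```python
import os

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per read; an arbitrary choice for this sketch


def resume_copy(src_path, dst_path, chunk_size=CHUNK_SIZE):
    """Copy src_path to dst_path, continuing after the bytes already at dst_path.

    Returns the number of bytes written by this call.
    Assumption: the existing destination content is an intact prefix of the
    source. Nothing here checks that, so verify the result separately.
    """
    src_size = os.path.getsize(src_path)
    done = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    if done > src_size:
        raise ValueError("destination is larger than source; refusing to resume")

    written = 0
    # "ab" creates the file if needed and always appends at the end.
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(done)  # skip what a previous attempt already copied
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            written += len(chunk)
    return written
```

Calling `resume_copy(source, target)` again after an interruption picks up at the current size of the target file. For multi-terabyte backup files it would be sensible to compare checksums of source and target afterwards, since a torn final write would otherwise go unnoticed.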
-
- Expert
- Posts: 127
- Liked: 22 times
- Joined: Feb 18, 2015 8:13 pm
- Full Name: Randall Kender
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
So another update to this issue.
Still no help from support, we ended up fixing some of the issues by ourselves.
Here's what we did:
- Disabled and renamed all broken jobs (jobs that wouldn't even run when issuing a sync now)
- Removed the data for each job from the Veeam configuration (right-click -> remove from configuration)
- Created brand new backup copy jobs with the same settings as the broken jobs
- Ran a rescan against all the repositories to re-import the backup copy data
- Mapped the new backup copy jobs to the data that was imported
- Edited every other backup copy job and clicked next through all options without changing any settings
Now our 5 jobs have data that is going offsite. It also seems to have helped with the autostart issue, as about 90% of the jobs now seem to autostart correctly after the parent backup jobs run. However, other backup copy jobs that were running before are now failing with timeout issues. It seems that when we fix something, it completely breaks some other jobs.
-
- Expert
- Posts: 127
- Liked: 22 times
- Joined: Feb 18, 2015 8:13 pm
- Full Name: Randall Kender
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
So another update on this.
Since we did these changes here's what else ended up happening:
- We still have two jobs that won't autostart and only work if we manually sync them
- Several jobs got corrupted a few days later. For all of them the jobs won't start and are giving timeout errors. "Job XYZ cannot be started. Timeout: 903.4601035 sec"
- We found some backup copy jobs that were running and reporting as successful, but were only copying some of the source job data. For instance, we have one backup copy job that has data from 6 different backup jobs. However, data from three of those jobs was being copied and data from the other three wasn't. We only found this out by chance: after a restart of the server, the job fixed itself and started copying the missing data. The only way we could have figured this out ourselves (since Veeam B&R as well as VeeamONE did not send any alerts or warnings whatsoever) would have been to go under the "Disk (Copy)" section and manually review the data.
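The manual review described in the last point comes down to one check: for every source job a copy job is supposed to cover, is the newest copied restore point recent enough? A rough Python sketch of that check follows. Everything in it is hypothetical: the function name, the job names and the timestamps are invented for illustration, and how the timestamps get collected (for example by reading them off the "Disk (Copy)" view) is left open, since nothing in this thread describes an API for it.

```python
from datetime import datetime, timedelta


def find_stale_sources(latest_copied, expected_sources, now, max_age):
    """Return the source jobs whose newest copied restore point is missing
    or older than max_age.

    latest_copied: dict of source job name -> datetime of newest copied point
    expected_sources: source job names the copy job should include
    """
    stale = []
    for source in expected_sources:
        newest = latest_copied.get(source)
        if newest is None or now - newest > max_age:
            stale.append(source)
    return stale


# Made-up example: three expected source jobs, one current, one old, one absent.
latest = {
    "SQL-A": datetime(2021, 7, 20, 2, 0),
    "SQL-B": datetime(2021, 6, 30, 2, 0),
}
stale = find_stale_sources(
    latest,
    expected_sources=["SQL-A", "SQL-B", "SQL-C"],
    now=datetime(2021, 7, 20, 12, 0),
    max_age=timedelta(days=1),
)
print(stale)  # ['SQL-B', 'SQL-C']
```

The point of checking per source job rather than per copy job is that a copy job can report success while some of its sources quietly stop being copied, which is the failure described above.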
-
- Product Manager
- Posts: 14835
- Liked: 3082 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
Hello,
sorry again for the issues.
I just talked to support and they are in contact with R&D as it looks like a bug. It also looks like there is a lack of resources.
Best regards,
Hannes
-
- Influencer
- Posts: 23
- Liked: 6 times
- Joined: Mar 22, 2021 11:18 pm
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
Has this issue been fixed with 11a?
We also run into this issue. The backup copies will be running just fine for weeks or months, and then they fail.
We have noticed that this failure seems to be triggered when we clone a backup copy or backup job. We only need to clone or set up jobs infrequently, so this correlation looks relevant. To be clear, not every clone causes this, but whenever this issue has surfaced it has been straight after we cloned a job.
The fix I have just found is to delete the cloned job and re-create it from scratch, and then all other backup copy jobs will run again.
Really hoping 11a has fixed this.
-
- Veeam Software
- Posts: 21138
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
If I'm correct in reading Randall's case notes, v11a should fix this. I'll let @HannesK confirm.
-
- Product Manager
- Posts: 14835
- Liked: 3082 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
Yes, bug #324847 was fixed in 11a. If you still see that issue, please open a case and post the case number for reference.
-
- Influencer
- Posts: 23
- Liked: 6 times
- Joined: Mar 22, 2021 11:18 pm
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
This is excellent news. I have just updated our server to 11a. I will chance a few clone jobs over the next week or so.
-
- Influencer
- Posts: 23
- Liked: 6 times
- Joined: Mar 22, 2021 11:18 pm
- Contact:
Re: All Backup Copy Jobs Not Automatically Running
I can confirm after creating multiple cloned backup jobs and backup copy jobs that this problem is fixed for us.