Discussions related to exporting backups to tape and backing up directly to tape.
brf
Influencer
Posts: 18
Liked: 1 time
Joined: Nov 06, 2017 8:02 pm
Contact:

Issues with tape jobs that copy workstation agent data

Post by brf »

We've recently expanded our VBR installation to about 40 workstation agents (both Linux and Windows, latest version) and have started running into major issues getting everything copied to tape each week. We've had a support case open, #03174016, for nearly 2 months now and have gone through 4 different support people, with still no real answers. One of the support people handling the ticket suggested a forum post at one point, so I'm giving that a try.

The tape job in question is a "Backups to tape" job, the source is set to the VBR repository itself, it is set to back up only fulls (not incrementals), "If some linked backup jobs are still running, wait for up to" is set to 99 hours, and "Prevent this job from being interrupted by source backup jobs" is checked. When the tape job runs, it takes about half the week (3-4 days) as it is copying about 40TB of full backups to LTO5 tapes.
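As a rough sanity check on those numbers, here is a minimal Python sketch of the arithmetic. The LTO-5 figures (1.5 TB native capacity, 140 MB/s native transfer rate) are the published native specs for that generation, not something stated in this post, and the sketch assumes decimal units, no compression, and a single drive streaming the whole time. The number of drives in the library isn't given here, so treat the result as a ballpark only.

```python
# Rough throughput check for the tape job described above.
# Assumptions (not from the post): 1 TB = 10**12 bytes, LTO-5 native
# capacity of 1.5 TB per cartridge, no compression, one drive.
import math

TB = 10**12
MB = 10**6
SECONDS_PER_DAY = 86_400


def avg_throughput_mb_s(total_tb: float, days: float) -> float:
    """Average sustained rate (MB/s) needed to move total_tb in `days`."""
    return total_tb * TB / (days * SECONDS_PER_DAY) / MB


def tapes_needed(total_tb: float, native_tb: float = 1.5) -> int:
    """Number of cartridges needed at native (uncompressed) capacity."""
    return math.ceil(total_tb / native_tb)


for days in (3, 4):
    print(f"{days} days: {avg_throughput_mb_s(40, days):.0f} MB/s average")
print("cartridges:", tapes_needed(40))
```

Taking the 40TB and 3-4 day figures at face value, this works out to an average of roughly 116-154 MB/s and 27 cartridges, which is in the neighborhood of one LTO-5 drive's native speed.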

All our agents are workstation-licensed, so when we set them up through VBR they are set to "Managed by agent", which is the only option you are allowed to select when your agents' licenses are workstation (not server). All our backups are "forever forward incremental" style.

While the tape job runs, it doesn't seem to actually stop any agents from running, even while that agent is being copied to tape at the exact same time - the "linked backup jobs" and "Prevent this job from being interrupted" options seem to have no effect.

We run into two different unexpected issues:

(1) Due to agents running while that agent is being copied to tape, some individual systems fail inside the tape job with errors such as these 3:
- 9/1/2018 12:07:27 AM :: Backup files for full source job (VeeamAgentUser44454c4c-5000-1054-804c-b5c04f374d32)XXX - XXX were changed, retry is required...
- 9/8/2018 4:25:34 AM :: Full backup: Failed to lock backup file
  Item [XXX_2018-08-28T233301.vbk] is locked by running session [XXX - XXX] [Agent Backup]
- 10/7/2018 3:59:55 AM :: Full backup: Restore point was not found for backup file Client Backups - XXX2018-10-07T035516.vib

(2) Sometimes a system will be copied to tape early on in the tape job, then copied to tape again later in the same job, if the agent in question ran a new backup during the duration of the tape job. So, that system ends up taking up double the space on tape, which can be a huge issue when it's a large 10TB+ system.
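To make issue (2) concrete, here is a small Python sketch of how one might spot systems that were written to tape more than once in a single tape session. The record format, system names, timestamps, and sizes are all made up for illustration; this is not a Veeam report format or API, just a way to check a list of written fulls that you have exported by whatever means is available.

```python
from collections import defaultdict

# Hypothetical records of fulls written in one tape session:
# (system name, restore point timestamp, size in TB). Made-up values.
written = [
    ("ws-01", "2018-10-01T2300", 0.4),
    ("ws-02", "2018-10-01T2315", 10.0),
    ("ws-03", "2018-10-02T0100", 0.8),
    ("ws-02", "2018-10-03T2315", 10.1),  # same system, newer full
]


def duplicated_systems(records):
    """Return {system: extra_tb} for systems written more than once."""
    sizes = defaultdict(list)
    for system, _timestamp, size_tb in records:
        sizes[system].append(size_tb)
    # Everything after the first copy counts as extra tape usage.
    return {s: sum(v[1:]) for s, v in sizes.items() if len(v) > 1}


print(duplicated_systems(written))
```

With the sample data, only `ws-02` is reported, with its second full counted as the extra space used on tape.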

Regarding #1 above, the first support person handling our case seemed fairly sure that it was normal for agents that are set to "Managed by agent" to experience this problem, and that fixing it would be a "feature request" for the forums. The 2nd person handling the case wasn't so sure but seemed to be leaning towards the same conclusion. The latest person handling our case doesn't think any of that is true and believes these jobs should work as expected, but is waiting on answers from someone else. If it is true that workstation agents can't be reliably moved to tape without blacking out their schedule during the tape job, then that's a massively crippling limitation in Veeam, especially in our situation where the tape job takes half the week.

Regarding #2, I've been told this is possibly normal for tape jobs where the repository itself is set as the source (the manual seems to contain a small blurb that could be saying that), and that if I instead select by job (selecting all jobs), it won't do this, but I haven't been able to get real confirmation of this from anyone I've talked to. Selecting the repository is far more desirable than selecting all jobs, because that way new agents are automatically included in my tape job when I add them.
Dima P.
Product Manager
Posts: 14396
Liked: 1568 times
Joined: Feb 04, 2013 2:07 pm
Full Name: Dmitry Popov
Location: Prague
Contact:

Re: Issues with tape jobs that copy workstation agent data

Post by Dima P. »

Hello brf.

Reviewing your case, stay tuned.
Dima P.

Re: Issues with tape jobs that copy workstation agent data

Post by Dima P. »

brf,

I see that the case is now being handled by experienced engineers, so you are in good hands. QA will check the reported problem with regular media pools; meanwhile, the workaround suggested by the support team (a GFS media pool and a GFS tape job) sounds like a good idea. Will it work for your case? Thanks!
brf

Re: Issues with tape jobs that copy workstation agent data

Post by brf »

From what I've read about GFS so far, I think it would work for us, if that's what's required to fix the problem. I'm actually running my first GFS backup job now - in the middle of the week to ensure that plenty of agents are running while it's running.
Dima P.

Re: Issues with tape jobs that copy workstation agent data

Post by Dima P. »

brf,

Great, thanks! Please let us know how it goes. On our side, we will investigate the original issue you reported. Cheers!
brf

Re: Issues with tape jobs that copy workstation agent data

Post by brf »

Update on this...

For problem #1 from my original post, I've been told it's a bug in the current release of VBR and will be fixed in either Update 4, or a future release. So far, using GFS instead of regular tape jobs seems to have worked around the problem, but I wouldn't say it's 100% confirmed yet.

But for problem #2, I've been told that it's "a feature, not a bug" that Veeam may copy a full backup of a system to tape twice during a single tape job, i.e., if a given system (that's already been copied to tape) gets another incremental backup while the tape job is working on other systems, it may come back to that previous system, and copy an entirely new full backup of that system to tape. Our tape job copies about 40 systems to tape, and a few of these systems are about 10-15TB in size. If we get double backups of those 10-15TB systems, we're usually hosed: not only does it eat up too many tapes in the tape library, but it also makes the tape job take too long, possibly exceeding its time window. The only solution I've been given so far is upgrading our hardware to make tape jobs take less time (for example, adding tape drives or upgrading LTO generations), so that we can get the tape job finished before new backups occur. This seems ridiculous to the point where I feel I must be missing something, but I'm told I'm not. Surely it would make sense for Veeam to have a feature in its tape jobs where you can specify that each system should be copied to tape only once. (I'm pretty sure most people would just assume that this is what it does.)
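For a sense of scale, here is a back-of-the-envelope Python sketch of what one duplicated 10-15TB full costs. It assumes LTO-5 native figures (1.5 TB per cartridge, 140 MB/s per drive), decimal units, no compression, and a single drive writing the duplicate at full native speed; real jobs will differ, so these are best-case numbers.

```python
import math

TB = 10**12
MB = 10**6


def duplicate_cost(extra_tb, drive_mb_s=140.0, tape_tb=1.5):
    """Return (hours, cartridges) needed to write extra_tb of duplicate data.

    drive_mb_s and tape_tb default to LTO-5 native specs (an assumption,
    not a figure from the post).
    """
    hours = extra_tb * TB / (drive_mb_s * MB) / 3600
    tapes = math.ceil(extra_tb / tape_tb)
    return hours, tapes


for extra in (10, 15):
    hours, tapes = duplicate_cost(extra)
    print(f"{extra} TB extra: about {hours:.0f} h and {tapes} cartridges")
```

Under those assumptions, a single duplicated full of that size adds roughly 20-30 hours and 7-10 cartridges to a job that already takes 3-4 days.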
Dima P.

Re: Issues with tape jobs that copy workstation agent data

Post by Dima P. »

Hello brf.
brf wrote: "if a given system (that's already been copied to tape) gets another incremental backup while the tape job is working on other systems, it may come back to that previous system, and copy an entirely new full backup of that system to tape"
I've already discussed your case, and specifically this behavior, with our tape team. We noted an improvement request based on your comments, thank you!