sdeath
Influencer
Posts: 20
Liked: 2 times
Joined: Feb 11, 2010 12:54 pm
Full Name: Stuart Death
Contact:

Scale Out Repository out of space issue

Post by sdeath »

Hi,

I am a bit confused about what I am reading and what I am being told by support.

- I have a Scale Out repository with 2 extents.
- Extent 1 capacity is 14.4TB and extent 2 is 7TB.
- I have created one Backup Copy job, containing 111 VMs and pointed it to the Scale Out repository.
- The Scale Out repository is configured as Data Locality and Per-VM backup files.

Initially the job appeared to distribute the backup files evenly across the 2 extents, but the 7TB extent has now run out of space and the Backup Copy job is failing.

Veeam support is telling me that this is as designed: the Data Locality policy writes the incremental backups to the same extent as the full backup file until that extent runs out of space, and then the job fails. I asked what would happen if I added a new extent and was told that only VMs newly added to the job would be written to that extent. I asked what the benefit of Scale Out is then, and I received a very long pause! I believed that Veeam would handle the space issue. Am I wrong?

If this is the case then Scale Out is actually wasting disk space as the other extent still had over 3TB free. Apparently there is no way to back out of a Scale Out repository so I will have to delete the data, create 2 Backup Copy jobs and point them to separate repositories. Not too happy about that.

Can anyone advise if this is correct?

Thanks.
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: Scale Out Repository out of space issue

Post by DaveWatkins »

I've seen this manifest in a number of situations myself, so much so that I don't use Scale Out anymore at all. Initially it sounded great, but between its poor placement decisions and jobs failing when an extent gets full, I've given up.

The best (and by best I mean worst) situation I saw was a scale out with a 2TB and a 7TB extent and an Exchange backup of a server that was over 3TB. It failed because it put the backup on the 2TB extent and filled it before completing the job. Scale out seems to be missing a LOT of logic to determine the best place to put jobs, and what to do if an extent fills up.

Based on all the marketing, it was presented as a simple way to aggregate all your spare space to make a large, usable backup repo, but it's missing so much logic to catch simple and expected scenarios that I've found it completely unusable so far. That's a shame because I love the idea of it; it just needs to make smarter decisions about job placement and what happens when the policy can't be fulfilled. It also doesn't seem to automatically move existing per-VM chains off an extent to free up space on it so other VM chains on that extent can continue to back up, which is, again, something I expected it to do.
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Scale Out Repository out of space issue

Post by tsightler »

Can you please share your support case?
sdeath
Influencer
Posts: 20
Liked: 2 times
Joined: Feb 11, 2010 12:54 pm
Full Name: Stuart Death
Contact:

Re: Scale Out Repository out of space issue

Post by sdeath »

Case # 01754266

But support closed the case as having no way forward!

Thanks
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

sdeath wrote:Veeam support is telling me that is as designed, using the Data Locality policy will write the increment jobs to the same extent as the full backup file until it runs out of space and will then fail the job.
This is not actually correct: the data locality policy is not that strict and should place the backup on another extent if it has available space. There's an issue, however, where the job will fail if there's not enough space to even update the metadata file, and chances are this is exactly what you're experiencing. I recommend re-opening the case and escalating it to a higher tier for a closer investigation.
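A minimal sketch of the "not that strict" locality behavior described above. It is an illustrative model only, not Veeam's actual placement code: the `Extent` class and `place_increment` function are hypothetical names, and the rule "prefer the extent that already holds the chain, otherwise use whichever extent has the most free space" is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Extent:
    name: str
    free_bytes: int
    chains: set = field(default_factory=set)  # VM chains with files on this extent


def place_increment(vm: str, size: int, extents: List[Extent]) -> Optional[Extent]:
    """Place an incremental file, preferring the extent that holds the VM's chain."""
    # Data locality: extents that already hold this chain are tried first.
    home = [e for e in extents if vm in e.chains]
    # Fallback: all other extents, the one with the most free space first.
    others = sorted((e for e in extents if vm not in e.chains),
                    key=lambda e: e.free_bytes, reverse=True)
    for extent in home + others:
        if extent.free_bytes >= size:
            extent.free_bytes -= size
            extent.chains.add(vm)
            return extent
    return None  # only when no extent has room should the job fail


TB, GB = 1024**4, 1024**3
big = Extent("extent1", 3 * TB, {"vm-a"})   # roughly the free space from the first post
small = Extent("extent2", 0, {"vm-b"})      # the 7TB extent, now full
print(place_increment("vm-b", 50 * GB, [big, small]).name)  # prints: extent1
```

In this model the increment for `vm-b` breaks locality and lands on `extent1` instead of failing the job. The metadata problem mentioned above is a separate failure path that this sketch does not cover.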
sdeath
Influencer
Posts: 20
Liked: 2 times
Joined: Feb 11, 2010 12:54 pm
Full Name: Stuart Death
Contact:

Re: Scale Out Repository out of space issue

Post by sdeath »

Thanks foggy, I have to say that makes more sense. I will re-open and escalate.
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Scale Out Repository out of space issue

Post by tsightler »

Exactly why I was asking for the support case. Can you please update this post with the new case # once it is opened? I know there are several issues with data placement that were fixed in U1, but I would not expect a case of "job fails". Of course, there can always be cases where an extent is completely exhausted of space, but we should break policy to do backups to other extents in most of those cases. I just want to make sure we fully understand all of the issues and failure cases.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

DaveWatkins wrote:It also doesn't seem to automatically move existing per VM chains off an extent to free up space on it so other VM chains on that extent can continue to backup, which is again, something i expected it to do.
Dave, could you please elaborate on this scenario?
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: Scale Out Repository out of space issue

Post by DaveWatkins » 1 person likes this post

If you have data locality set as the policy and an extent fills up, my expectation would be that an entire backup chain would be moved to another extent to free up space on the full one, so that further incremental data for the remaining chains can be placed on that extent and still conform to policy. That didn't seem to happen when I saw it.
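The expectation above can be sketched as a small selection routine. This is a hypothetical illustration of the idea, not something the product is known to do: `pick_chain_to_move` and its parameters are made-up names, and "move the smallest chain that frees enough space" is just one possible policy.

```python
from typing import Dict, Optional, Tuple


def pick_chain_to_move(chains_on_full: Dict[str, int],
                       free_elsewhere: Dict[str, int],
                       needed: int) -> Optional[Tuple[str, str]]:
    """Return (chain, target_extent) for the smallest chain whose move frees
    at least `needed` bytes and fits on another extent, or None."""
    if not free_elsewhere:
        return None
    # Candidate target: the other extent with the most free space.
    target, target_free = max(free_elsewhere.items(), key=lambda kv: kv[1])
    for chain, size in sorted(chains_on_full.items(), key=lambda kv: kv[1]):
        if size < needed:
            continue  # moving this chain would not free enough space
        if size <= target_free:
            return chain, target
    return None


# Sizes in GB for readability; all values are invented for the example.
print(pick_chain_to_move({"vm1": 500, "vm2": 200, "vm3": 50}, {"extent1": 300}, 100))
```

With these invented numbers the routine picks `vm2`: it is the smallest chain that frees the required 100 GB and still fits on `extent1`.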
sdeath
Influencer
Posts: 20
Liked: 2 times
Joined: Feb 11, 2010 12:54 pm
Full Name: Stuart Death
Contact:

Re: Scale Out Repository out of space issue

Post by sdeath »

The case number is the same, seems it hadn't closed completely. It is now awaiting an escalation response.

By the way, I agree with Dave. My expectation was that an existing VM chain would be moved off in order to free up space.

Just to confirm, we are on Update 1.
Marten_med_e
Enthusiast
Posts: 47
Liked: 4 times
Joined: Sep 26, 2013 9:31 am
Full Name: Mårten Edelbrink
Contact:

Re: Scale Out Repository out of space issue

Post by Marten_med_e »

Our backup copy jobs also stalled when space was exhausted on one disk in the scale-out repository, even though there was over 4TB free on the other three disks. We are running v. 9.0.0.1491.

2016-04-07 09:28:58 :: Error: There is not enough space on the disk.
Failed to write data into file [E:\Backups\definition.erm].
--tr:Error code: 0x00000070
--tr:Failed to call DoRpc. CmdName: [FcWriteFileEx] inParam: [<InputArguments><FilePath value="E:\Backups\definition.erm" /><DesiredAccess value="1073741824" /><ShareMode value="0" /><CreationDisposition value="3" /><FlagsAndAttrs value="0" /><Offset value="0" /><BytesToWrite value="5400" /></InputArguments>].
There is not enough space on the disk.
Failed to write data into 
I guess the disk got so full that the backup copy job couldn't update the files it needs in order to continue, or to move the backup to one of the disks with free space.
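A small sketch for reading the log excerpt above. Error code 0x00000070 is decimal 112, the Windows system error `ERROR_DISK_FULL`, whose message is the "There is not enough space on the disk." text seen in the log. The parser itself (`parse_write_failure`) is a hypothetical helper written for this thread and assumes the log layout shown in the excerpt; it is not a Veeam tool.

```python
import re

ERROR_DISK_FULL = 0x70  # Windows system error code 112

# Lines copied from the log excerpt in the post above.
SAMPLE = r'''Failed to write data into file [E:\Backups\definition.erm].
--tr:Error code: 0x00000070
--tr:Failed to call DoRpc. CmdName: [FcWriteFileEx] inParam: [<InputArguments><FilePath value="E:\Backups\definition.erm" /><DesiredAccess value="1073741824" /><ShareMode value="0" /><CreationDisposition value="3" /><FlagsAndAttrs value="0" /><Offset value="0" /><BytesToWrite value="5400" /></InputArguments>].'''


def parse_write_failure(log_text: str) -> dict:
    """Extract the error code, target file and write size from a failed write."""
    code = re.search(r"Error code: (0x[0-9A-Fa-f]+)", log_text)
    path = re.search(r'<FilePath value="([^"]+)"', log_text)
    size = re.search(r'<BytesToWrite value="(\d+)"', log_text)
    return {
        "error_code": int(code.group(1), 16) if code else None,
        "file_path": path.group(1) if path else None,
        "bytes_to_write": int(size.group(1)) if size else None,
    }


info = parse_write_failure(SAMPLE)
print(info["error_code"] == ERROR_DISK_FULL, info["bytes_to_write"])  # prints: True 5400
```

The failed write was only 5400 bytes into `definition.erm`, which fits the guess that the disk was too full even for a small metadata-sized update.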

Case #01759224, just opened due to excessive time-out issues when uploading log files.

Cheers
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

Mårten, this looks like exactly the case I was talking about. Thanks for contacting support and sharing the case ID.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

Dave, thanks for clarifying the use case, got it now.
Marten_med_e
Enthusiast
Posts: 47
Liked: 4 times
Joined: Sep 26, 2013 9:31 am
Full Name: Mårten Edelbrink
Contact:

Re: Scale Out Repository out of space issue

Post by Marten_med_e »

foggy wrote:Mårten, this looks like exactly the case I was talking about. Thanks for contacting support and sharing the case ID.
If this is the case, I would suggest that B&R create an "empty" *.vbm/def file that could be used for writing in case the disk gets full. If I remember correctly, MS does/did this with the Exchange transaction log files. I haven't been messing around with Exchange for a while, so my memory may be way off. Sorry if it is.
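The reserve-file suggestion above could look roughly like this. It is a sketch of the general technique only, under stated assumptions: the file name `metadata.reserve` and both functions are hypothetical, and nothing here reflects how B&R actually handles its metadata files.

```python
import errno
import os

RESERVE_NAME = "metadata.reserve"  # hypothetical file name, not a Veeam file


def create_reserve(directory: str, size_bytes: int) -> str:
    """Pre-allocate a placeholder file that can be given up when the disk is full."""
    path = os.path.join(directory, RESERVE_NAME)
    with open(path, "wb") as handle:
        # Write real bytes instead of creating a sparse file, so space is used.
        handle.write(b"\0" * size_bytes)
    return path


def write_metadata(directory: str, name: str, data: bytes) -> str:
    """Write a metadata file; on a disk-full error, free the reserve and retry once."""
    target = os.path.join(directory, name)
    try:
        with open(target, "wb") as handle:
            handle.write(data)
    except OSError as exc:
        reserve = os.path.join(directory, RESERVE_NAME)
        if exc.errno != errno.ENOSPC or not os.path.exists(reserve):
            raise
        os.remove(reserve)  # release the reserved space
        with open(target, "wb") as handle:
            handle.write(data)
    return target
```

Two caveats for anyone adapting this: on volumes with compression or deduplication a file of zeros may not actually hold space, and a production version would write to a temporary file and rename it, so a failed write cannot truncate the existing metadata.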

Cheers,
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

We are looking for possible ways of addressing this behavior.
Marten_med_e
Enthusiast
Posts: 47
Liked: 4 times
Joined: Sep 26, 2013 9:31 am
Full Name: Mårten Edelbrink
Contact:

Re: Scale Out Repository out of space issue

Post by Marten_med_e »

I did free up some space on the disk, but B&R just kept writing to the vib/vbk files and ran out of space again. I had to free up enough space to let the jobs finish and then add a new repository set to incremental backups only to get the jobs working again.

Cheers,
nunciate
Expert
Posts: 247
Liked: 39 times
Joined: May 21, 2013 9:08 pm
Full Name: Alan Wells
Contact:

Re: Scale Out Repository out of space issue

Post by nunciate »

I have noticed this same issue. I have 5 extents in a scale out repository. One of the extents is somewhat small (about 3TB). The others are large (18TB). The smaller one filled up to zero free space and no jobs that used that drive would run. I removed some backups from disk and reran the active fulls. The active fulls were written to a new extent, but the jobs kept running incremental backups to the small drive and kept filling it up to zero free space. The only way I was able to fix it was to put the extent into maintenance mode and evacuate it completely. Fortunately I had the space to do this on the other extents, otherwise I would have had a real problem.

There has to be some logic to determine that a drive is X% full and to stop backing up to that drive. Is there anything like that? Because if there is, it isn't working.

One other big problem I have noticed is that the jobs don't always pick the best extent that matches the size of the VM. I have a large 10TB VM that I am trying to back up to this scale out repository for the first time. I have 12TB free on one extent, 7TB free on another and less than 2TB on the others. I ran the job and it tried to put the backup on the extent with 7TB free. I killed the job, put that extent into maintenance mode and reran it. Now it tried to back up to an extent with less than 2TB free, all the while ignoring the one extent where it could actually succeed, the one with 12TB free. Ridiculous. I literally had to put all extents into maintenance mode to force it to the last one and then re-enabled them all.

My scale out is set to performance mode. I do not have per-vm backups enabled on any of the extents as of yet but plan on doing that soon.

I have to agree with what was already said. The scale-out repository is a great idea but I am not sure it is ready for prime time. Not sure how these issues could not have been identified in beta and RTM.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

nunciate wrote:There has to be some logic to determine drive is X% full and to stop backing up to that drive. Is there anything like that because if there is it isn't working.
There's a heuristic mechanism that estimates the backup size and doesn't allow the extent to be used if there's not enough free space.
nunciate wrote:One other big problem I have noticed is that the jobs don't always pick the best extent that matches the size of the VM. I have a large 10Tb VM I am trying to backup to a this scale out repository for the first time. I have 12Tb free on 1 extent, 7Tb free on another and less than 2 on the others. I run the job and it tries to put the backup on the extent with 7Tb free. I kill the job, put that extent into maintenance mode and reran. Now it tries to backup to an extent with less than 2Tb free all the while ignoring the one extent it can actually be successful at using with 12Tb free. Ridiculous. I literally had to put all extents into maintenance mode to force it to the last one and then re-enabled them all.
Alan, do you have a case opened for this one? Any chance the 12TB one is specified for increments only (Performance policy) or does not have free repository slots?
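For reference, the size-based check discussed in this post can be sketched as follows, using Alan's numbers. This is an assumption-laden illustration, not the product's heuristic: `pick_extent_for_full` is a hypothetical function, the estimated backup size is taken as a given input, and eligibility rules such as "increments only" roles or free repository slots are deliberately left out.

```python
from typing import Dict, Optional

TB = 1024**4


def pick_extent_for_full(estimated_size: int,
                         free_by_extent: Dict[str, int]) -> Optional[str]:
    """Return the extent with the most free space that can hold the estimate."""
    candidates = {name: free for name, free in free_by_extent.items()
                  if free >= estimated_size}
    if not candidates:
        return None  # nothing fits: better to report this than to fill an extent
    return max(candidates, key=candidates.get)


free = {"extent-12tb": 12 * TB, "extent-7tb": 7 * TB, "extent-2tb": 2 * TB}
print(pick_extent_for_full(10 * TB, free))  # prints: extent-12tb
```

Under this simple rule only the extent with 12TB free qualifies for a 10TB estimate, which is the outcome Alan expected.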
sdeath
Influencer
Posts: 20
Liked: 2 times
Joined: Feb 11, 2010 12:54 pm
Full Name: Stuart Death
Contact:

Re: Scale Out Repository out of space issue

Post by sdeath »

I have a response from tier 2 support.

"As of now the current SOBR design still requires extents to have some space for VBM files.
VBM is written to every extent where we have backup files related to the job.

However, when backup job has backup files only on one extent, and that extent doesn't have free space at all (not even couple of MB for VBM file) then job will fail and won't switch to another extent.
The workaround is to remove a couple of MB so VBM could be written.

When job has backups files on 2 extents, ext1 and ext2, you won't experience any issue: VBM file will be written to both ext1 and ext2, and if ext1 ran out of space completely, then we will fail to write VBM to ext1 but we will be able to write VBM to ext2 and in this scenario job won't fail - it will detect that it could write at least one copy of VBM.

We have already discussed this limitation with RND, and now RND is thinking about ways to improve our workflow/logic here, probably in one of the upcoming patches for v9, but as of now - unfortunately, this is current design, limitation."


Now, we have one Backup Copy job writing to two extents so just confirming if it should be working as suggested in the third paragraph....
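The rule quoted from tier 2 support (a VBM copy goes to every extent holding the job's files, and the job fails only if no copy can be written) can be modelled like this. It is a simplified sketch of the quoted description, not Veeam code: `update_metadata` is a hypothetical function and free space is passed in as plain numbers.

```python
from typing import Dict, List

MB = 1024**2


def update_metadata(vbm_size: int, free_by_extent: Dict[str, int]) -> List[str]:
    """Try to write the VBM to every extent holding the job's backup files.

    Returns the extents where the write fits; raises if none of them has room.
    """
    written = [name for name, free in free_by_extent.items() if free >= vbm_size]
    if not written:
        raise RuntimeError("no extent has room for the VBM file, so the job fails")
    return written


# Job with files on two extents: ext1 is full, ext2 still has space.
print(update_metadata(2 * MB, {"ext1": 0, "ext2": 500 * MB}))  # prints: ['ext2']
```

With files on a single, completely full extent the same call raises, which matches the failure case described in the second paragraph of the support response.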
nunciate
Expert
Posts: 247
Liked: 39 times
Joined: May 21, 2013 9:08 pm
Full Name: Alan Wells
Contact:

Re: Scale Out Repository out of space issue

Post by nunciate »

My feature request would be to add the ability to set a limitation on the extents. Have an option in there that says don't write to this extent if it is X % full. That way people can set their own value and when the volume gets to 10% free space left for example it will automatically stop backing up to that extent. That might be much harder to implement so even if you set a static value that the job checks for would be something good.
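A rough sketch of what such a per-extent threshold check could look like. The function names and the 10% default are illustrative assumptions taken from the example in this post; `shutil.disk_usage` is the standard-library call for reading a volume's total and free space.

```python
import shutil


def has_headroom(total_bytes: int, free_bytes: int,
                 min_free_percent: float = 10.0) -> bool:
    """True if at least `min_free_percent` of the volume is still free."""
    return free_bytes * 100 >= total_bytes * min_free_percent


def extent_is_writable(path: str, min_free_percent: float = 10.0) -> bool:
    """Check the volume that holds `path` against the free-space threshold."""
    usage = shutil.disk_usage(path)
    return has_headroom(usage.total, usage.free, min_free_percent)


# A 7TB extent with 500GB free is below a 10% threshold.
TB, GB = 1024**4, 1024**3
print(has_headroom(7 * TB, 500 * GB))  # prints: False
```

A job scheduler could call `extent_is_writable` before choosing an extent and skip any extent that returns False, instead of discovering the problem when a write fails.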
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

sdeath wrote:When job has backups files on 2 extents, ext1 and ext2, you won't experience any issue: VBM file will be written to both ext1 and ext2, and if ext1 ran out of space completely, then we will fail to write VBM to ext1 but we will be able to write VBM to ext2 and in this scenario job won't fail - it will detect that it could write at least one copy of VBM.

Now, we have one Backup Copy job writing to two extents so just confirming if it should be working as suggested in the third paragraph....
That's correct, if the VBM is stored on both extents, the job shouldn't fail in case there's not enough space to update it on one of them.
sdeath
Influencer
Posts: 20
Liked: 2 times
Joined: Feb 11, 2010 12:54 pm
Full Name: Stuart Death
Contact:

Re: Scale Out Repository out of space issue

Post by sdeath »

Our backup copy job has created a VBM on both extents so we shouldn't be seeing this issue then?
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

In case Veeam B&R is able to update at least one copy of metadata, the job shouldn't fail.
sdeath
Influencer
Posts: 20
Liked: 2 times
Joined: Feb 11, 2010 12:54 pm
Full Name: Stuart Death
Contact:

Re: Scale Out Repository out of space issue

Post by sdeath »

Support have confirmed a bug and are compiling a cumulative hotfix.
A.J.
Service Provider
Posts: 6
Liked: 6 times
Joined: Jul 26, 2016 6:19 am
Contact:

Re: Scale Out Repository out of space issue

Post by A.J. »

We have a similar problem with our Scale Out Repositories when using Windows Dedup on those repositories, especially with Backup Copy jobs (SOBR with per-VM backup files and data locality policy).
Our latest investigations showed that the copy jobs will fill up a repository extent until it has 0 bytes free. The next incremental file will be placed on another extent, but the merge process will fail because the primary extent of the job has 0 bytes free.
The only explanation we have found so far is that the dedup algorithm causes a "pumping" effect on the extent. That is, merge processes expand the deduped files in a way that Veeam can't recognize or take into consideration while placing backup files on the extents. Maybe Veeam thinks there must be enough space on the extent, but there isn't because of the dedup behaviour. Maybe this happens while the jobs on the extents are running and the space isn't checked just in time during job processing. Who knows...
As mentioned in the first posts, there should be a way for the Veeam admin to set aside spare space on each extent to prevent this out-of-space situation. There should also be an automated process that evacuates the full backup file to an extent with enough space, to keep the jobs running and the backup chain intact.
For us, at the moment the only solution seems to be to disable deduplication on all repositories and clean up all of the deduped files over time.
We are also planning to upgrade our repositories to Windows 2016 with ReFS (dedup is not supported) to reduce the time for merge processes. We hope that this will solve our problems in the near future, knowing that we will need more extents on our SOBRs.
Gostev
Chief Product Officer
Posts: 31458
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Scale Out Repository out of space issue

Post by Gostev »

A.J. wrote:Our latest investigations showed that the copy jobs will fill up a repository extent until 0 bytes free. The next incremental file will be placed on another extent but the merge process will fail because the primary extent of the job has 0 bytes free.
Hi, I recommend you upgrade to 9.5 since it has additional new logic to prevent this from happening. Thanks!
A.J.
Service Provider
Posts: 6
Liked: 6 times
Joined: Jul 26, 2016 6:19 am
Contact:

Re: Scale Out Repository out of space issue

Post by A.J. »

Hi, we are already on 9.5 because of this problem. No recognizable change in logic but that could be because of the dedup effect.
I read in another forum entry that Veeam v9.5 reserves 1% of storage space to prevent the repository from running out of space. We will see if the problem occurs on our repository that never had dedup enabled.
Jeff M
Enthusiast
Posts: 34
Liked: 3 times
Joined: Jan 13, 2015 4:31 am
Full Name: Jeffrey Michael James
Location: Texas Tech Univ. TOSM Computer Center, 8th Street & Boston Avenue, Lubbock, TX 79409-3051
Contact:

Re: Scale Out Repository out of space issue

Post by Jeff M »

Veeam Support - Case # 02282755. I can confirm that whatever was done to resolve this issue in Ver. 9.5 did not work. I still have all of the previously posted issues with SOBR.
Jeff M
Data Center Operations
Technology Operations & Systems Management
Texas Tech University System
jeff.james@ttu.edu
frankive
Service Provider
Posts: 1092
Liked: 134 times
Joined: May 14, 2013 8:35 pm
Full Name: Frank Iversen
Location: Norway
Contact:

Re: Scale Out Repository out of space issue

Post by frankive »

Same issue here. One extent fills up, and backup copy jobs from customers are failing. We are on 9.5 U3.
Case # 02673368.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Scale Out Repository out of space issue

Post by foggy »

Switching to per-VM backup chains helped in Jeff's case, as far as I can see from the case notes.