DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

File Maintenance on ReFS

Post by DaveWatkins »

I've been watching a backup to tape job run slowly today and I'm wondering if it's because of fragmentation from using Fast Clone on the ReFS repository on a reverse incremental chain.

I'm assuming it is, which in itself isn't a huge problem, but that led me to wonder what, if anything, the file maintenance operation does on a ReFS volume. On an NTFS repository it creates a new copy of the VBK as a mostly contiguous single lump, removing fragmentation.

However, does it do the same on a ReFS repo, or does it just use the Fast Clone API to build a new copy of the file without actually moving any data?

Thanks
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: File Maintenance on ReFS

Post by Gostev »

Correct, you are not alone - we had a few similar support cases created by other reversed incremental backup mode users already. Reverse incremental is the absolute worst backup mode in terms of both fragmentation and I/O load, I would not use it with ReFS myself - there are no benefits.

The maintenance operation essentially only does a "compact" operation, which releases no-longer-used backup file blocks to the file system, so it still makes great sense to run. As for defragmentation, it makes no sense on ReFS volumes. Here's my explanation from another topic:

You should treat a ReFS volume where block cloning was used as a sort of deduplication storage. You can never have it NOT be "heavily fragmented", due to the nature of the process (different files sharing the same physical blocks). As such, there's absolutely no point in performing defragmentation - just like there's no point in defragmenting deduplicating storage. In fact, the only way to actually defragment any given file would be "inflating" it, by recreating the file without leveraging block cloning.
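To put that in code, here is a toy model of the situation - files as lists of extent IDs, with all names and numbers purely illustrative, nothing to do with real ReFS internals:

```python
# Toy model of ReFS block cloning (illustrative only, not real ReFS).
# Files are lists of physical extent IDs; cloned files share extents.

def clone_file(extents):
    # Fast Clone: the new file references the same physical extents;
    # no data moves, so any "fragmentation" is inherited, not created.
    return list(extents)

def inflate(extents, next_free):
    # The only true defrag: rewrite every block to fresh contiguous
    # space, duplicating the data and losing the cloning space savings.
    return list(range(next_free, next_free + len(extents)))

vbk = [10, 3, 57, 8, 42]            # physically scattered extents
synthetic_full = clone_file(vbk)    # shares every extent with the VBK

assert synthetic_full == vbk        # same physical blocks, "fragmented"
inflated = inflate(synthetic_full, next_free=100)
assert inflated == [100, 101, 102, 103, 104]  # contiguous, but duplicated
```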
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: File Maintenance on ReFS

Post by DaveWatkins »

Hi Anton

We originally went reverse incremental because backing up a forward chain to tape was taking too long (for much the same reason as we're now facing). I have no issues switching back to a forward incremental chain, but with ReFS I'm not sure any mode will keep the fragmentation under control other than doing regular active full backups, and we really don't have the bandwidth for that. Synthetic fulls won't actually help the fragmentation, although I guess they would help from a data management point of view, as the tape job wouldn't have to go through the whole chain working out what to put to tape, but I'm not sure that's any better than our current reverse incremental from a performance-to-tape perspective.

Virtual fulls used to be the solution, but they were I/O intensive and will again suffer the fragmentation penalty (and are they really different from a synthetic full at this point?).

Any suggestions?

Thanks
Dave
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: File Maintenance on ReFS

Post by tsightler »

Out of curiosity, how slow is this tape backup? Certainly using ReFS will lead to fragmentation; however, we are cloning Veeam blocks, which are relatively large overall. Assuming you are using default settings, a typical Veeam block is around 350-500KB. I would think almost any reasonable file system could support enough I/O to keep a typical LTO drive busy at this block size, unless perhaps there's something non-optimal about the underlying block device layout.
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: File Maintenance on ReFS

Post by DaveWatkins »

It sat at 20-30MB/s for the better part of 3 hours, then picked up to normal LTO6 speed. Underlying block sizes and stripe sizes are all good. It's a 20 disk RAID6 volume with NL-SATA disks attached via FC. So probably not the hardware :)
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: File Maintenance on ReFS

Post by tsightler »

By any chance did you grab the queue depth on the source repo while you were doing this? I'm guessing we're just not keeping enough requests in the queue to properly keep the disks busy, but that is truly just a guess at this point. 20-30MB/s would imply single queue depth with about 10ms latency per read, so it would be interesting to see this data during a tape job. I'm assuming the tape job shows Source as the bottleneck? What percentage?
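A back-of-the-envelope check of that queue-depth theory - the ~10ms random-read latency and the effective block size are assumed numbers typical for NL-SATA, not measured from this setup:

```python
# Throughput model: reads/sec = queue_depth / latency, times block size.
# (Assumed numbers for illustration: ~10 ms per random read on NL-SATA,
# ~250 KB effective Veeam block after compression.)

def throughput_mb_s(block_kb, latency_ms, queue_depth=1):
    reads_per_sec = queue_depth * 1000.0 / latency_ms
    return reads_per_sec * block_kb / 1024.0

# One outstanding ~250 KB read at a time lands right in the observed range.
qd1 = throughput_mb_s(block_kb=250, latency_ms=10)
assert 20 <= qd1 <= 30

# Keeping, say, 8 requests in flight would comfortably feed an LTO6
# drive (~160 MB/s native).
qd8 = throughput_mb_s(block_kb=250, latency_ms=10, queue_depth=8)
assert qd8 > 160
```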
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: File Maintenance on ReFS

Post by DaveWatkins »

I'm not sure the source statistics would be useful, as it was just part of a single VM (a 2TB one) that was slow to get to tape; the rest of the job (maybe another 2TB) was all fine, per-VM chains to a TL-4000 library with two LTO6 drives. Job stats are 29/13/14/97, and it was running slow for maybe 4 hours out of the total run time of 10 hours; the rest of the time it was running at normal speed. It also wasn't running slow for the entire time it was backing up that VM.

I didn't get any volume stats unfortunately, but I did check to make sure nothing else was happening on the volume. There was a health check running on a different job, but that backup chain is on a different volume.
mkretzer
Veeam Legend
Posts: 1140
Liked: 387 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: File Maintenance on ReFS

Post by mkretzer »

We also have the issue that tape backup sometimes drops from the typical 160 MB/s down to 50-60 MB/s for quite some time - and still shows the target as the bottleneck. We had this on NTFS, backed by 96 disks, and on ReFS with 24 disks as well.

Is there a good way to troubleshoot that?
masonit
Service Provider
Posts: 325
Liked: 23 times
Joined: Oct 09, 2012 2:30 pm
Full Name: Maso
Contact:

Re: File Maintenance on ReFS

Post by masonit »

Gostev wrote:Correct, you are not alone - we had a few similar support cases created by other reversed incremental backup mode users already. Reverse incremental is the absolute worst backup mode in terms of both fragmentation and I/O load, I would not use it with ReFS myself - there are no benefits.
Hi Gostev

What backup mode would you recommend with refs?

\Masonit
ChrisGundry
Veteran
Posts: 258
Liked: 40 times
Joined: Aug 26, 2015 2:56 pm
Full Name: Chris Gundry
Contact:

Re: File Maintenance on ReFS

Post by ChrisGundry »

Also interested in the reply
m.novelli
Veeam ProPartner
Posts: 504
Liked: 84 times
Joined: Dec 29, 2009 12:48 pm
Full Name: Marco Novelli
Location: Asti - Italy
Contact:

Re: File Maintenance on ReFS

Post by m.novelli »

Gostev wrote:Correct, you are not alone - we had a few similar support cases created by other reversed incremental backup mode users already. Reverse incremental is the absolute worst backup mode in terms of both fragmentation and I/O load, I would not use it with ReFS myself - there are no benefits.
Hi Gostev, I agree with you, but I love the reverse incremental mode because it allows me to export a single backup file containing all VMs to an RDX drive every night, and I've standardised most of my customers on that config

Cheers,

Marco
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: File Maintenance on ReFS

Post by Gostev »

Yes, certainly there are use cases for the reversed incremental mode - we got our hands slapped quickly by sales engineers when we started discussing potentially removing it from the product to reduce the number of test scenarios ;)

@Magnus @Chris regular (forward) incremental backup, which is the default.
ITP-Stan
Service Provider
Posts: 201
Liked: 55 times
Joined: Feb 18, 2013 10:45 am
Full Name: Stan (IF-IT4U)
Contact:

Re: File Maintenance on ReFS

Post by ITP-Stan »

Just a thought, but with ReFS you could actually do a synthetic full every night without much space loss because of the block cloning right?
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: File Maintenance on ReFS

Post by veremin »

Correct, but what's the idea behind such a scenario? To have a full backup each day without stressing the production environment in the meantime?
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: File Maintenance on ReFS

Post by Gostev »

Exactly.
ChrisGundry
Veteran
Posts: 258
Liked: 40 times
Joined: Aug 26, 2015 2:56 pm
Full Name: Chris Gundry
Contact:

Re: File Maintenance on ReFS

Post by ChrisGundry »

I thought that with the way ReFS works, and the fact that it allows the re-mapping of blocks, it would be the ideal candidate for Reverse Incremental? When the merge parts of the job happen, instead of re-writing all the merged blocks it just re-maps them?

My understanding is that Forward Incremental will use 1 write I/O during the backup, then 1 read and 1 write after the backup to do the merge, whereas RI will use 3 I/Os during the backup itself: 1 write of the new block, 1 read of an existing block, and another write to put that existing block into the new file. My understanding was that with ReFS the 'existing block' I/O is just re-mapped instead of actually read/written. So this improves the speed of the process and takes the actual I/O load away from the disk array, doing it at the software level instead. I understand that would cause fragmentation of the files, but that would happen anyway with ReFS, no?
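My reading of the I/O accounting in code form - a rough sketch of how I understand the thread, not official Veeam bookkeeping:

```python
# Rough per-changed-block I/O bookkeeping for the two backup modes
# (my interpretation of this thread, not official Veeam accounting).

def forward_incremental_ios(block_cloning):
    backup = 1                            # sequential write into the VIB
    merge = 0 if block_cloning else 2     # clone, or 1 read + 1 write
    return backup + merge

def reverse_incremental_ios(block_cloning):
    backup = 1                            # random write into the VBK
    rollback = 0 if block_cloning else 2  # old block to VRB: read + write
    return backup + rollback

# With Fast Clone, both modes drop to one physical write per changed
# block, so the remaining difference is the write *pattern*:
# sequential into a VIB vs random into the VBK.
assert forward_incremental_ios(True) == reverse_incremental_ios(True) == 1
assert forward_incremental_ios(False) == reverse_incremental_ios(False) == 3
```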

I thought that Reverse Incremental would give the benefit of the latest backup always being a full, but that the I/O load issue normally associated with it would be removed thanks to the benefits of ReFS functionality.
Our problem with using Forward Incremental is that we rely on the full chain to restore back to last night; if there is an issue with any part of the chain, the backup is no good. 99% of the time we only want to restore from last night. With Reverse Incremental we always have last night as a full backup.

Any comments on why Reverse Incremental is the worst for I/O with ReFS? To me it seems like the ideal candidate to use with ReFS. Why is Forward Incremental so much better with ReFS?

Thanks!
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: File Maintenance on ReFS

Post by foggy »

You're correct, ReFS allows saving 2 of those 3 I/Os thanks to block cloning. Here's a good thread regarding this, btw.
ChrisGundry
Veteran
Posts: 258
Liked: 40 times
Joined: Aug 26, 2015 2:56 pm
Full Name: Chris Gundry
Contact:

Re: File Maintenance on ReFS

Post by ChrisGundry »

OK thanks Foggy. I thought that was the case...

I guess my question was why Gostev said he wouldn't use it on ReFS - I wondered what the reason was? :)
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: File Maintenance on ReFS

Post by foggy »

Another disadvantage of this method is that it heavily fragments the full backup file.
ChrisGundry
Veteran
Posts: 258
Liked: 40 times
Joined: Aug 26, 2015 2:56 pm
Full Name: Chris Gundry
Contact:

Re: File Maintenance on ReFS

Post by ChrisGundry »

Whereas Forward Incremental would be more sequential and less fragmented? I figured that, as it also uses block cloning, it would be similarly fragmented to Reverse Incremental?
Thanks
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: File Maintenance on ReFS

Post by DaveWatkins »

I'm also confused as to why Forward Incremental would cause less fragmentation, since at the end of the day the VIBs are still merged into the VBK - it just happens at the other end of the chain. I'd love to know the rationale, since I've got no problem switching to forward at this point, as long as we can sustain our backup speed to tape.
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: File Maintenance on ReFS

Post by Gostev »

Just think about how increments are created in the two modes. They are sequential writes into a contiguous VIB file in the case of forward incremental mode, and random writes into the VBK in the case of reversed incremental mode (plus VRB files pointing at random existing blocks which were previously part of the VBK).
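A toy fragment counter makes the difference concrete - assuming, purely for illustration, that each changed block becomes its own extent unless it is physically adjacent to the previous one:

```python
# Toy extent counter (illustrative assumption: a new extent starts
# wherever two written blocks are not physically adjacent).
import random

def count_fragments(block_positions):
    fragments = 1
    for prev, cur in zip(block_positions, block_positions[1:]):
        if cur != prev + 1:   # gap in physical placement => new extent
            fragments += 1
    return fragments

# Forward incremental: the increment is one contiguous VIB on disk.
vib = list(range(1000, 1100))                # 100 sequential blocks
assert count_fragments(vib) == 1

# Reverse incremental: the same 100 changed blocks are random writes
# scattered across the existing VBK.
random.seed(0)
vbk_writes = sorted(random.sample(range(100000), 100))
assert count_fragments(vbk_writes) > 90      # nearly every block its own extent
```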
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: File Maintenance on ReFS

Post by DaveWatkins »

Ahh, of course. As VIBs are rolled into a VBK in FI, the entire VIB file stays as a single chunk on the file system, so while you're still fragmenting the VBK, it's only by adding 2 fragments (the VIB and the remaining VBK). That's not the same extreme as RI, which could add hundreds of fragments just by writing out a single VRB, because they could be spread all over the existing VBK.

Maybe a warning on job creation for RI jobs pointing to a ReFS volume?
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: File Maintenance on ReFS

Post by Gostev »

There's one in the label itself - it says (slower) next to its name ;) naturally, reversed incremental is a few times slower on any file system at all, and the fragmentation issue is not specific to ReFS.
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: File Maintenance on ReFS

Post by DaveWatkins »

But the fragmentation impact is, since it's usually kept under control by File Maintenance, whereas on ReFS File Maintenance basically does nothing other than create a new VBK pointing to exactly the same blocks as the old one
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: File Maintenance on ReFS

Post by Gostev »

That is correct.
Stephan23
Enthusiast
Posts: 50
Liked: 4 times
Joined: Jun 03, 2015 8:32 am
Full Name: Stephan
Contact:

Re: File Maintenance on ReFS

Post by Stephan23 »

Gostev wrote: Mar 10, 2017 1:52 pm Maintenance operation essentially only does "compact" operation which releases the no longer used backup file blocks to the file system
What exactly are maintenance operations?
Does it include "Defrag and compact full backup file"?
If yes, do I understand it correctly that, if I use this option, it does not require additional space on the repository for this task?

I am currently in the process of scheduling health checks and defrags for all our jobs, but don't want to risk filling up the repository with temporary files.

Sorry to dig up this old thread, but it showed up on google and fits perfectly for my scenario.

Regards
Stephan

Edit: I think I found the answer myself in another thread from foggy:
veeam-backup-replication-f2/needs-maint ... ml#p256048

Although Gostev basically already said the same thing. I just wanted to be sure.
Maybe you should clarify this behavior in the help documentation, where it is stated that additional space is required for the compact operation, without mentioning the file system:
https://helpcenter.veeam.com/docs/backu ... l?ver=95u4
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: File Maintenance on ReFS

Post by veremin »

It's mentioned in the ReFS integration section:
Veeam Backup & Replication leverages Fast Clone for the following synthetic operations:

In backup jobs:

Merge of backup files
Synthetic full backup
Reverse incremental backup transformation
Compact of full backup file
nikpolini
Enthusiast
Posts: 33
Liked: 2 times
Joined: May 28, 2015 3:23 am
Contact:

Re: File Maintenance on ReFS

Post by nikpolini »

Hi DaveWatkins

Did you ever get to the bottom of the slow tape issue?
I am having this too on ReFS and putting files to TAPE.
Thanks, Nick
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: File Maintenance on ReFS

Post by DaveWatkins »

No, it hasn't got any worse, but on an LTO6 library with 2 drives we only average about 150MB/s

Honestly, I don't think there is a solution. It's just the price we pay for having the other benefits of ReFS. I guess if you could afford 10K or 15K RPM (or SSD) repositories you'd be able to offset it somewhat, but that's not an option for us.