Feature request: sparse file support to avoid the need for compact operations

Post by **DonZoomik** » Feb 18, 2019 3:53 pm

Most file systems (NTFS, ReFS, XFS, ext4, ZFS...) support sparse files, and most operating systems provide functions to "punch holes" in files (also over SMB).
If Veeam supported punching holes in VBKs, it would remove one of the main reasons for the compact operation: forever-growing VBK files.

Use case:
A VM has a VBK of 1TB.
500GB of data is added to the VM, resulting in a 500GB VIB (VIB1).
750GB of data is deleted within the VM, which is captured in the next VIB (VIB2).
When VIB1 is merged, the VBK grows to 1,5TB.
When VIB2 is merged, the VBK stays at 1,5TB, wasting 750GB of disk space.

Recovering the wasted space requires either an expensive compact with buffer disk space (if not using ReFS) or a new full backup followed by retention expiring the old chain.
If Veeam were to use sparse files, these 750GB could be "punched out" of the file, leaving it with a logical size of 1,5TB but only 750GB of physical disk space used.
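The arithmetic of the use case above can be sketched in a few lines. This is only a toy model: the `Vbk` class and its fields are hypothetical names, sizes are in decimal GB, compression and deduplication are ignored, and it assumes new data reuses freed blocks before the file grows.

```python
from dataclasses import dataclass


@dataclass
class Vbk:
    """Toy model of a full backup file (hypothetical, for illustration only)."""
    file_size_gb: int   # logical size of the VBK file
    live_data_gb: int   # data still referenced by restore points

    def merge(self, added_gb: int = 0, deleted_gb: int = 0) -> None:
        # Merging an increment changes the amount of live data...
        self.live_data_gb += added_gb - deleted_gb
        # ...but without compact the file itself can only grow, never shrink.
        self.file_size_gb = max(self.file_size_gb, self.live_data_gb)

    def physical_gb(self, sparse: bool) -> int:
        # With hole punching, only live data occupies disk space.
        return self.live_data_gb if sparse else self.file_size_gb


vbk = Vbk(file_size_gb=1000, live_data_gb=1000)
vbk.merge(added_gb=500)     # VIB1 merged: file grows to 1500 GB
vbk.merge(deleted_gb=750)   # VIB2 merged: file stays at 1500 GB

wasted_gb = vbk.physical_gb(sparse=False) - vbk.physical_gb(sparse=True)
print(vbk.file_size_gb, vbk.physical_gb(sparse=True), wasted_gb)  # 1500 750 750
```

The logical size is the same in both cases; only the physical allocation differs.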

In my real-life case I had a few huge (10TB+) file servers that were refactored into smaller ones. Disk space on backup target was not released as data was migrated so there had to be a lot of hand-holding and careful timing just to not run out of disk space.

I found an old thread with some discussion about it: veeam-backup-replication-f2/remove-clie ... t8947.html
It mentions that tape drives will still see the full logical size. However, tape compression should mitigate this, as the punched-out regions are now just zeroes that should compress to nothing (naive impression). I'm not sure whether deduplication appliances and other more exotic devices support this.
It was also mentioned that most large customers do regular fulls. I've seen it mentioned in other threads that nowadays most customers just use forever-incremental, so this should be something to reconsider after ~6-7 years.
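As a rough illustration of the "zeroes compress to nothing" point, here is a minimal sketch using Python's `zlib`. Tape drives use their own hardware compression, so zlib is only a stand-in and the exact ratio on tape will differ; the 1 MiB region size is an arbitrary assumption.

```python
import zlib

# A punched-out region reads back as zeroes; model 1 MiB of it.
hole = bytes(1024 * 1024)

compressed = zlib.compress(hole)

# The compressed form is a tiny fraction of the input.
print(len(hole), len(compressed))

# Round trip to confirm nothing is lost.
restored = zlib.decompress(compressed)
```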

Post by **ejenner** » Feb 19, 2019 10:18 am

In your real-life case didn't you want the historical restore points?

If something had gone wrong with your migration to the new servers and you wanted to go back to the old data?

Just saying this as I'm trying to work out what would cause the large empty sections in the first instance. I'm thinking it's a fairly exceptional occurrence rather than something which happens all the time?

Post by **DonZoomik** » Feb 19, 2019 10:39 am

I simplified my case a bit. Of course there are more than one or two restore points, let's say 14. But the point remains: once the big delete gets merged into the VBK after 14 new restore points, I'm still stuck with a VBK that is as big as it was originally.
This is an extreme case, but generally I restart chains once in a while. As VMs live, new data gets written to new, previously unused blocks (which causes VBK growth) and deleted from others (with no reclamation), for example through log rotation, patching etc. So even if VM size stays relatively constant, VBKs keep growing. I have no exact measured data but I'd say maybe 10% per quarter. Growth slows down over time as blocks already described in the VBK get reused, but the point remains: there is no way to reclaim space without an expensive compact or restarting the chain (both require buffer space and may be unfeasible for huge VMs).

Post by **ejenner** » Feb 19, 2019 12:02 pm

I suppose that, with it being an occasional thing rather than something which always happens, you could avoid running out of disk space by copying your old backup chain off the repository and starting a new chain. It would not really have to be automated, as it's occasional and you're aware that it is happening, so you can manage the process. It can be a bit frustrating, I know; I've had to do similar things with large file servers... but I can't see the case for automating it for the reasons stated above.

Post by **DonZoomik** » Feb 19, 2019 12:44 pm

Copying data off the repository again comes down to buffer space.
I still see sparse files as a silver bullet in these cases. Considering that on the Windows side, files on ReFS/NTFS deduplicated volumes are already sparse, it's an easy win. On Windows it shouldn't be that hard to implement either: set the file as sparse with FSCTL_SET_SPARSE, then on merge clear nonexistent data with FSCTL_SET_ZERO_DATA (information on cleared blocks has to be in the chain, or compact wouldn't know what to skip). I'm not sure about Linux, but I presume there are similar syscalls/IOCTLs/commands (fallocate?). Maybe as an experimental feature controlled by a registry flag (like ReFSDedupeBlockClone)?
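For the Linux side, the `fallocate(2)` syscall with `FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE` is the usual way to deallocate a byte range while keeping the file's logical size. The sketch below calls it through `ctypes`; it assumes 64-bit Linux, the helper name `punch_hole` is made up for illustration, and it says nothing about how Veeam itself would track which blocks are free.

```python
import ctypes
import errno
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02

# Resolve fallocate() from the C library already loaded into the process.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                            ctypes.c_int64, ctypes.c_int64]
_libc.fallocate.restype = ctypes.c_int


def punch_hole(fd: int, offset: int, length: int) -> bool:
    """Deallocate [offset, offset + length) without changing the file size.

    Returns True if the hole was punched, False if the file system does not
    support the operation. Other errors are raised as OSError.
    """
    mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE
    if _libc.fallocate(fd, mode, offset, length) == 0:
        return True
    err = ctypes.get_errno()
    if err in (errno.EOPNOTSUPP, errno.ENOSYS):
        return False  # unsupported here: caller keeps the blocks allocated
    raise OSError(err, os.strerror(err))
```

A caller would open the backup file read-write and call `punch_hole(f.fileno(), offset, length)` for each freed range; the punched range then reads back as zeroes. On a file system without hole-punching support the function returns False and the file is left untouched.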

Post by **Gostev** » Feb 19, 2019 9:57 pm

You can just back up to ReFS with synthetic fulls enabled?

Exactly the same benefits with the GA version of the product, using by-now well-stabilized functionality delivered 3 years ago!

Sounds so much better than betting your data integrity on some rarely used extended file system controls that only God knows how many bad data corruption bugs they may have due

Post by **DonZoomik** » Feb 19, 2019 11:40 pm

ReFS does help, but it can't be used everywhere. On one site I have to use a ZFS box, on another NTFS, backup copy to a NAS etc... Also, some clients just hate Windows with a passion (barely putting up with the VBR server but storing data on Linux).
Sparse file support has existed for nearly 20 years on Windows alone. I can't find any markers on XFS (but being extent-based, likely from inception), so I doubt that there are a huge number of bugs left.

Post by **HannesK** » Feb 20, 2019 6:58 am

"so I doubt that there are a huge number of bugs left"

I believed the same in various situations. I worked a lot with Linux in the past and asked at Veeam from time to time why we do specific things. I often got the answer "we tested it and it broke or created several thousand support cases". A deeper integration into specific file systems costs a lot of R&D resources (especially QA) that could be better spent on new features.

Post by **DonZoomik** » Feb 20, 2019 8:14 am

I can't argue with the risk of hidden bugs, but this implementation seems like such low-hanging fruit.
This integration is not file system specific, as these syscalls are abstracted by the kernel. If the file system doesn't support the syscall, you get an error back and life goes on. Two implementations (Windows and Linux) should cover all use cases where the underlying file system supports sparse files.
