MrGrim
Novice
Posts: 6
Liked: never
Joined: Oct 29, 2024 5:39 pm
Full Name: Michael Kreitzer
Contact:

XFS and Thin LVM Volumes

Post by MrGrim »

Hello,

This is my first post and bold red text is telling me to have a case ID, so here it is: #07483853

I think this is worth having on the forum as well, so that Google can index it. Has anyone else tried this combo? I need to describe our setup before I describe what I'm seeing, so please bear with me. :)

We are using hardened repositories with XFS reflinking backed by LVM thin provisioned volumes. We have about 15 repositories, each in the 10-18TB range. We try to keep around 1-2TB free per repository to accommodate any unexpected bursts in incremental size. The storage is from a cloud provider and quite expensive, so we use LVM thin provisioned volumes to pool the available burst space: instead of paying for 20+TB of unused space, we only have <5TB.

We originally started out on plain virtual disks. XFS was formatted with the following parameters:

mkfs.xfs -b size=4096 -m reflink=1,crc=1,bigtime=1 -L <volname> <dev>

This used the default sunit/swidth values of 0. When we decided to convert to LVM thin volumes, we settled on a 1MB chunk size. To migrate the volumes we used xfs_copy, which auto-detected the correct values for sunit and swidth (256 each) and set them on the destination. However, I don't believe the existing data was rearranged in any way, so its layout does not take that setting into account. On the Veeam side we are using copy jobs, which do not have a configurable block size. The parent job is using a block size of 512KB.
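As a quick sanity check on those numbers, here is a minimal Python sketch. It assumes the reported sunit/swidth of 256 is expressed in 4096-byte filesystem blocks (the way xfs_info displays it) and that "1MB" and "512KB" mean binary units; the constant names are only illustrative.

```python
# Values taken from the post; the unit interpretation is an assumption.
FS_BLOCK_SIZE = 4096              # mkfs.xfs -b size=4096
SUNIT_BLOCKS = 256                # sunit after xfs_copy, assumed to be in filesystem blocks
THIN_CHUNK_BYTES = 1024 * 1024    # thin pool chunk size ("1MB", assumed 1 MiB)
VEEAM_BLOCK_BYTES = 512 * 1024    # parent job block size ("512KB", assumed 512 KiB)

# Stripe unit in bytes: should equal the thin pool chunk size if detection worked.
stripe_unit_bytes = SUNIT_BLOCKS * FS_BLOCK_SIZE

# How many uncompressed Veeam source blocks fit into one thin pool chunk.
veeam_blocks_per_chunk = THIN_CHUNK_BYTES // VEEAM_BLOCK_BYTES

print(f"stripe unit: {stripe_unit_bytes} bytes")
print(f"matches thin chunk: {stripe_unit_bytes == THIN_CHUNK_BYTES}")
print(f"Veeam blocks per chunk (before compression): {veeam_blocks_per_chunk}")
```

Under these assumptions the stripe unit equals one thin pool chunk, and each chunk holds at least two Veeam blocks, so a chunk only becomes fully free once every block stored in it has been released.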

We have one repository that holds the vast majority of our VMs, 95 to be exact. We are using per-machine backup files with 31-point retention. This repository is 13TB in size with 2.1TB free space.

What we're seeing is the majority of the free space not being released to the pool:

Code:

$ df -h /repos/foo
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/foo                     13T   11T  2.1T  85% /repos/foo
$ fstrim -v /repos/foo
/repos/foo: 536.3 GiB (575894441984 bytes) trimmed
$ lvs -a repos/foo
LV            VG    Attr       LSize  Pool      Origin Data%  Meta%  Move Log Cpy%Sync Convert
foo           repos Vwi-aotz-- 13.00t repo-pool        96.47

This is actually an improvement: it was only 17GB a few days ago. This problem may slowly resolve with time...
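To put a number on the gap, here is a rough Python sketch using only the figures from the df and lvs output above. It assumes both tools report binary units (TiB), that lvs was run after the fstrim, and it ignores df's rounding, so treat the result as a ballpark.

```python
# Figures copied from the df and lvs output; df -h rounds, so these are approximate.
lv_size_tib = 13.00      # LSize from lvs
data_percent = 96.47     # Data% from lvs: share of the thin LV backed by pool chunks
fs_avail_tib = 2.1       # Avail from df -h

# Space the thin LV currently holds in the pool, and space it has given back.
allocated_tib = lv_size_tib * data_percent / 100
unallocated_tib = lv_size_tib - allocated_tib

# Free filesystem space that is still pinned in allocated chunks.
stranded_tib = fs_avail_tib - unallocated_tib

print(f"allocated in pool:       {allocated_tib:.2f} TiB")
print(f"returned to pool:        {unallocated_tib:.2f} TiB")
print(f"free but still allocated: {stranded_tib:.2f} TiB")
```

With these inputs roughly 0.46 TiB of the volume is unallocated in the pool, which leaves about 1.6 TiB of filesystem free space still backed by pool chunks.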

To try to figure out the problem, I checked fragmentation on this volume:

Code:

# xfs_db -r /dev/repos/foo
xfs_db> frag -f
actual 45993872, ideal 3178, fragmentation factor 99.99%
Note, this number is largely meaningless.
Files on this filesystem average 14472.58 extents per file

I suspect the high fragmentation and the bulk of the data being written when sunit/swidth were 0 are the reasons why so little data can be trimmed. This will likely resolve to a degree as time goes on, but a lot of data remains fairly static.
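The xfs_db figures above can be reproduced with simple arithmetic, which also gives a feel for how small the extents are. This is a sketch, not a description of xfs_db internals: the formulas are simply the ones that reproduce the printed numbers, and the 11 TiB used figure is df's rounded value. Because reflinked extents can be counted once per file, the average extent size below is only a rough floor.

```python
# Numbers copied from the xfs_db "frag -f" output and from df.
actual_extents = 45_993_872
ideal_extents = 3_178
used_bytes = 11 * 1024**4        # "11T" used according to df -h (rounded)

# These two expressions reproduce the values xfs_db printed.
frag_factor = (actual_extents - ideal_extents) / actual_extents * 100
extents_per_file = actual_extents / ideal_extents

# Rough average extent size if used space were split evenly over all extents.
avg_extent_kib = used_bytes / actual_extents / 1024

print(f"fragmentation factor: {frag_factor:.2f}%")
print(f"extents per file:     {extents_per_file:.2f}")
print(f"average extent size:  about {avg_extent_kib:.0f} KiB")
```

An average extent of a few hundred KiB is well below the 1MB thin pool chunk size, which is consistent with freed extents rarely covering a whole chunk.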

XFS does have a defragment tool, but in order to function it requires contiguous free space of the size of the largest file. It's also hard to tell if it supports reflink as it is common for some XFS tools not to (e.g. xfsdump does not). This would require adding around 8TB to each repo, and since XFS does not support shrinking we would be stuck with that.

I've considered using the Veeam defragment option, but I'm a little unclear on the details. How much free space does it need when using per-machine backup files? Is it the full space of the backup job, or just the largest VM in the job? How does this combine with immutability?

What is the block size of a copy job? Is it fixed? Is it inherited from the parent? Is it configurable via CLI?

What other considerations would you recommend to ensure data is written and contained in appropriately sized and aligned blocks? E.g. would the XFS mount option "swalloc" make any sense?

Thanks!
mkretzer
Veeam Legend
Posts: 1289
Liked: 464 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: XFS and Thin LVM Volumes

Post by mkretzer »

Do you have discard enabled in the filesystem? I am not sure but discarding unused blocks might be necessary to regain LVM thin space.

Re: XFS and Thin LVM Volumes

Post by MrGrim »

fstrim and the discard mount option are two means to the same end. One does not require the other. E.g. the fstrim command I ran did reclaim the space it was able to. The difference is whether discards are issued in scheduled full passes (fstrim) or on demand as blocks are freed (discard), with different performance trade-offs for each. For example, here is the output for a repo that contains only 1 very large VM that has very little churn:

Code:

$ df -h /repos/bar
Filesystem                         Size  Used Avail Use% Mounted on
/dev/mapper/bar                     14T   13T  1.7T  89% /repos/bar
$ fstrim -v /repos/bar
/repos/bar: 1.7 TiB (1865565347840 bytes) trimmed

Edit: Actually, now that I check, this one has a parent job with a block size of 1MB. Do copy jobs inherit the parent's block size?
tdewin
Veeam Software
Posts: 1856
Liked: 669 times
Joined: Mar 02, 2012 1:40 pm
Full Name: Timothy Dewin
Contact:

Re: XFS and Thin LVM Volumes

Post by tdewin »

When you use XFS with reflinks, we will use "fast cloning" to make synthetic fulls. This means that a new full will refer to chunks of data in the old vbk files. Whenever that previous file is deleted, you inherently get fragmentation. The problem with defragmenting in this situation is always: for which file? If you have 10 VBKs, should it be the first one, the second, ... the nth? Which one should XFS optimise for?
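A toy Python model of that effect, assuming a file is just a list of physical block numbers and that a synthetic full reuses every unchanged block of the previous full. The file names and block numbers are made up for illustration; this is not how Veeam or XFS represent anything internally.

```python
def blocks_freed_by_delete(files, victim):
    """Return the physical blocks released when `victim` is deleted,
    i.e. the blocks no other file still references."""
    still_referenced = set()
    for name, blocks in files.items():
        if name != victim:
            still_referenced.update(blocks)
    return sorted(set(files[victim]) - still_referenced)

# The old full occupies physical blocks 0..11 contiguously.
old_full = list(range(12))

# The synthetic full shares unchanged blocks and writes new ones
# (numbered 100+) only for the blocks that changed: 3, 4 and 8.
changed = {3, 4, 8}
new_full = [100 + b if b in changed else b for b in old_full]

files = {"old.vbk": old_full, "new.vbk": new_full}
freed = blocks_freed_by_delete(files, "old.vbk")
print("freed blocks:", freed)
```

Deleting old.vbk releases only the three blocks that changed, as small holes inside a region that otherwise stays in use, while new.vbk is now spread across two areas of the disk.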

Re: XFS and Thin LVM Volumes

Post by MrGrim »

Indeed, I suspected the fragmentation was unavoidable. With reflink the very nature of what is "contiguous" becomes murky. So I think the goal should be to work with the various alignment and block size settings to still allow effective discards. My focus is on repair options for when those get mismatched and you don't have a lot of space to throw around for fresh full backups. My secondary goal is to simply ask what others' experience is and whether they have other gotchas I should be looking out for. :) My tertiary goal is to get as much of this info as possible in a place indexable by search engines.

What I need to learn is:

* How do you set the block size for a copy job?
* Does the Veeam maintenance/compaction/defrag operation work with per-machine chains in a way that minimizes the balloon space required? If I have 50 VMs in a job, do I need room for a full backup of all 50, or just the largest of the 50?

Appreciate the feedback!

Re: XFS and Thin LVM Volumes

Post by tdewin »

Not R&D here, just pre-sales, so take it with a grain of salt. This is just my experience analyzing XFS.

It is hard to say what the blocks will align to. Since we use compression, a 1MB source block will usually not occupy 1MB on disk. E.g. it can be compressed to a smaller size, sometimes maybe half, but the result does not follow a fixed alignment (just like a zip file is not necessarily 1/2 the size of your original file). The block size you select in the job is how we read the source (the vmdk if you use vSphere). We then take that block and compress it. So of course the bigger the block size you select at the source, the less fragmentation you get at the target, but also the bigger your incrementals will be and the less sharing you potentially get. 1MB seems to be an optimal point between savings and performance.
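A small Python illustration of why compressed blocks do not land on fixed boundaries. It uses zlib purely as a stand-in (the thread does not say which compressor is involved) and synthetic 1 MiB blocks with different amounts of incompressible data; `make_block` and the fractions are made up for the example.

```python
import random
import zlib

MIB = 1024 * 1024
rng = random.Random(0)   # fixed seed so the run is repeatable

def make_block(random_fraction):
    """Build a 1 MiB source block: incompressible random bytes followed by zeros."""
    n_random = int(MIB * random_fraction)
    noise = rng.getrandbits(8 * n_random).to_bytes(n_random, "little")
    return noise + bytes(MIB - n_random)

offsets, sizes = [], []
position = 0
for fraction in (0.10, 0.50, 0.25, 0.90):
    compressed = zlib.compress(make_block(fraction))
    offsets.append(position)          # where this block would start if packed back to back
    sizes.append(len(compressed))
    position += len(compressed)

for off, size in zip(offsets, sizes):
    print(f"offset {off:>8}  size {size:>8}  on 1 MiB boundary: {off % MIB == 0}")
```

Every source block is exactly 1 MiB, yet each compressed block has a different size, so when they are stored back to back only the first one starts on a 1 MiB boundary.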

The block size for a copy job is inherited from the source job, because otherwise we would have to repack blocks (e.g. reading multiple blocks from the source chain, potentially from different files) to create new blocks.

The compacting maintenance job was originally made for reverse incremental and forever incremental jobs, where data could get stale (we cannot shrink files). I don't think it does a lot for forward chains with weekly fulls, as we build a full only with the blocks we need (presumably if you use XFS, you have weekly fulls enabled). This is also mentioned in the helpcenter ("If you schedule periodic full backups, the Defragment and compact full backup file check box does not apply." https://helpcenter.veeam.com/docs/backu ... ml?ver=120)

Ultimately, what we store is already quite optimised, so thin provisioning might only get you so far.
tsightler
VP, Product Management
Posts: 6040
Liked: 2867 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: XFS and Thin LVM Volumes

Post by tsightler » 2 people like this post

MrGrim wrote: When we decided to convert to LVM thin volumes, we settled on a 1MB chunk size.
It feels like this is the core of the problem. If you provisioned 1MB chunks, then you need entire 1MB aligned chunks to be free for them to be returned to the pool. If even 1 byte is used from a 1MB chunk, then it must stay allocated. Because of the way block cloning works, and the overall used space on the volume, it is quite unlikely that there are significant amounts of 1MB aligned chunks to be freed. I'm actually surprised it's as high as you are seeing.
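The rule described above (only chunks that are entirely free and aligned can go back to the pool) can be sketched in a few lines of Python. The function name and the sample extents are hypothetical, and free extents are assumed to be non-overlapping and already merged.

```python
KIB = 1024
MIB = 1024 * KIB

def reclaimable_chunks(free_extents, chunk_size):
    """Count chunk_size-aligned chunks that lie entirely inside a free extent.

    free_extents is a list of (start_byte, length_bytes) tuples."""
    total = 0
    for start, length in free_extents:
        first = -(-start // chunk_size) * chunk_size          # round start up to a boundary
        last = (start + length) // chunk_size * chunk_size    # round end down to a boundary
        if last > first:
            total += (last - first) // chunk_size
    return total

# Three layouts, each with exactly 2 MiB of free space.
scattered = [(i * MIB + 256 * KIB, 256 * KIB) for i in range(8)]  # 8 holes of 256 KiB
contiguous = [(4 * MIB, 2 * MIB)]                                 # one aligned 2 MiB hole
misaligned = [(4 * MIB + 4 * KIB, 2 * MIB)]                       # same hole, shifted by 4 KiB

for name, extents in (("scattered", scattered),
                      ("contiguous", contiguous),
                      ("misaligned", misaligned)):
    print(f"{name:>10}: {reclaimable_chunks(extents, MIB)} x 1 MiB chunks reclaimable")
```

The same 2 MiB of free space yields two, one, or zero reclaimable chunks depending only on how it is laid out relative to the 1 MiB boundaries.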

To have a higher chance of having free chunks which can be returned to the pool, you would need a significantly smaller chunk size, probably something like 64K. However, this would likely have a negative impact on fragmentation at the LVM layer, adding a second layer of fragmentation with the potential for a significant performance penalty, especially for restores which require granular access, such as FLR.
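To illustrate the chunk size trade-off on the reclaim side only, here is a crude Python model. It treats the volume as a row of 64 KiB units with alternating used and free runs of random length; the run lengths, seed, and volume size are arbitrary assumptions, and the model says nothing about the performance cost mentioned above.

```python
import random

KIB = 1024
UNIT = 64 * KIB   # model granularity; 64 KiB is also the smallest thin pool chunk size

def build_usage_map(n_units, rng):
    """Alternate used and free runs of random length; True means the unit is in use."""
    usage = []
    in_use = True
    while len(usage) < n_units:
        run = rng.randint(1, 40) if in_use else rng.randint(1, 24)
        usage.extend([in_use] * run)
        in_use = not in_use
    return usage[:n_units]

def reclaimable_bytes(usage, units_per_chunk):
    """Bytes in aligned chunks whose every 64 KiB unit is free."""
    freed = 0
    for i in range(0, len(usage) - units_per_chunk + 1, units_per_chunk):
        if not any(usage[i:i + units_per_chunk]):
            freed += units_per_chunk * UNIT
    return freed

usage = build_usage_map(16_384, random.Random(42))   # 1 GiB of modelled space
free_total = usage.count(False) * UNIT

for label, units in (("64 KiB", 1), ("1 MiB", 16)):
    share = reclaimable_bytes(usage, units) / free_total
    print(f"{label:>6} chunks: {share:.0%} of free space reclaimable")
```

In this model every free unit is reclaimable with 64 KiB chunks, while with 1 MiB chunks only free runs that cover a whole aligned chunk count, so the reclaimable share can never be higher and is typically much lower.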
dejan.ilic
Enthusiast
Posts: 48
Liked: 5 times
Joined: Apr 11, 2019 11:37 am
Full Name: Dejan Ilic
Contact:

Re: XFS and Thin LVM Volumes

Post by dejan.ilic »

Also check this (from the documentation) regarding the discards mode of the LVM thin pool:

Create a thin pool with a specific discards mode:
$ lvcreate --type thin-pool -n ThinPool -L Size \
    --discards ignore|nopassdown|passdown VG

Change the discards mode of an existing thin pool:
$ lvchange --discards ignore|nopassdown|passdown VG/ThinPool

mkretzer wrote: Oct 29, 2024 7:41 pm Do you have discard enabled in the filesystem? I am not sure but discarding unused blocks might be necessary to regain LVM thin space.