-
- VP, Product Management
- Posts: 7098
- Liked: 1517 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Product Manager
- Posts: 14881
- Liked: 3098 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
DonZoomik wrote: But no generic enablement without filesystem filter (btrfs, non-ReFS SMB)?
It's generic enablement for block storage. BTRFS reflink works, but was never tested by QA. I also expect bcachefs to work, but I did not test that.
NFS reflink needs changes in the gateway server role (different component).
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: May 23, 2024 5:35 pm
- Full Name: zbe
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
I have a stupid question perhaps: how/where can this be enabled and tested, exactly? At the moment I have an SMB repository on a Proxmox VM (Linux), but apparently this won't work? If anyone would be so kind as to tell me the basic steps to make this work from Veeam B&R on a Windows Server VM (Hyper-V), I'd be grateful.
Cheers.
-
- Product Manager
- Posts: 14881
- Liked: 3098 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
For SMB on top of ZFS, please see Andreas' answer above. That's very likely to fail.
1. A Linux machine is required (which you have, as far as I can see). We did not test/check ZFS on Windows (but I think that's not your question).
2. Create the /etc/veeam/EnableZFSBlockCloning file on the Linux machine (a quick sketch follows below).
3. Add the Linux machine as a Linux repository and keep the XFS checkbox checked.
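A minimal sketch of step 2, run as root on the repository host (the mkdir is an assumption on my part, in case the directory does not exist yet on a fresh transport install):
Code: Select all
mkdir -p /etc/veeam                          # create the directory if it is missing
touch /etc/veeam/EnableZFSBlockCloning       # an empty flag file is enough per step 2
ls -l /etc/veeam/EnableZFSBlockCloning       # confirm it exists before adding the repository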
Best regards,
Hannes
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Product Manager
- Posts: 14881
- Liked: 3098 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
Fully understand that 8u3a/8u3 isn't supported yet, but just wondered if Veeam has come across any of these types of issues in their labs before I dig deeper?
In the meantime, we've switched off synthetic fulls and now run active fulls once per week and the performance is great.
-
- Product Manager
- Posts: 14881
- Liked: 3098 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
I would assume that the vCenter version is completely irrelevant for synthetic fulls. Or are you saying that these problems only exist since 8.0U3?
If you see general issues with synthetic fulls, then I recommend reporting that to the OpenZFS bug tracker.
To answer your question: no, we did not test that combination.
Doing active full makes reflink irrelevant...
Best regards,
Hannes
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
As you can imagine, before I can engage with the OpenZFS team I'm going to need to precisely identify when the issue first occurred.
The only reason I've switched back to active fulls is that we can continue to get the performance benefits out of OpenZFS while it buys us time to investigate and isolate the problem. I'll confirm here when I've had a chance to run some tests to identify the source of the issue. Thanks.
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
https://github.com/openzfs/zfs/releases/tag/zfs-2.2.5
Yet another solid release from the OpenZFS team.
It hasn't hit the el8 zfs-testing repos yet (Rocky8/RHEL8): https://github.com/openzfs/zfs/issues/16430 , but I'm assuming it will in the next few days.
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: May 23, 2024 5:35 pm
- Full Name: zbe
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
I'm getting the following error when the job tries to perform a synthetic full:
Code: Select all
Agent: Failed to process method {Transform.CompileFIB}: Operation not permitted Failed to clone file extent. Source offset: 1835008, target offset: 1703936, length: 131072
Processing finished with errors at <time>
Not sure what is not permitted; for example
Code: Select all
cp file1 file2
creates a clone just fine.
-
- Product Manager
- Posts: 14881
- Liked: 3098 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
zbe wrote: cp file1 file2 creates a clone just fine.
Hmm, not really... it should be:
Code: Select all
cp --reflink=always file1 file2
Please remember that reflink is still disabled by default on ZFS and needs to be enabled manually.
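For anyone checking their own setup, there are two separate switches to look at; this is a quick sketch based on the checks that come up later in this thread (the pool name VeeamBackup is only an example):
Code: Select all
zpool get feature@block_cloning VeeamBackup          # pool feature should be enabled or active
cat /sys/module/zfs/parameters/zfs_bclone_enabled    # should print 1; the 2.2 releases discussed here ship with 0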
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: May 23, 2024 5:35 pm
- Full Name: zbe
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
But using `cp file1 file2` does create a clone. (It defaults to --reflink=auto, which uses copy_file_range())
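One way to settle whether a given cp actually cloned blocks rather than rewriting them is to compare the pool's block-cloning counters before and after; a sketch, assuming OpenZFS 2.2+ where these pool properties exist (pool name is an example):
Code: Select all
zpool get bcloneused,bclonesaved,bcloneratio VeeamBackup   # note the values
cp file1 file2                                             # the copy under test
zpool get bcloneused,bclonesaved,bcloneratio VeeamBackup   # bcloneused should grow if the copy was reflinked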
-
- Veeam Software
- Posts: 1818
- Liked: 655 times
- Joined: Mar 02, 2012 1:40 pm
- Full Name: Timothy Dewin
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: May 23, 2024 5:35 pm
- Full Name: zbe
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
zbe wrote: ↑Aug 17, 2024 7:57 pm I'm getting following error when job tries to perform synthetic full:
Code: Select all
Agent: Failed to process method {Transform.CompileFIB}: Operation not permitted Failed to clone file extent. Source offset: 1835008, target offset: 1703936, length: 131072 Processing finished with errors at <time>
Not sure what is not permitted, for example
Code: Select all
cp file1 file2
creates a clone just fine.
Sorry for the late reply - I've been stuck on other stuff recently.
I can confirm we are seeing the same issue with synthetic fulls on OpenZFS. In our case, on one of our main jobs, we see, say, 266 succeed and 80 fail. The retry picks up the failures and processes them correctly. Surprisingly, most of the 80 failures occur at exactly the same source/target offsets as the ones you are seeing.
Code: Select all
Agent: Failed to process method {Transform.CompileFIB}: Resource temporarily unavailable
Failed to clone file extent. Source offset: 1835008, target offset: 1703936, length: 131072
Agent: Failed to process method {Transform.CompileFIB}: Resource temporarily unavailable
Failed to clone file extent. Source offset: 2228224, target offset: 2097152, length: 131072
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: May 23, 2024 5:35 pm
- Full Name: zbe
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Influencer
- Posts: 12
- Liked: 3 times
- Joined: Jun 19, 2019 9:55 am
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
zbe wrote: ↑Aug 17, 2024 7:57 pm I'm getting following error when job tries to perform synthetic full:
Code: Select all
Agent: Failed to process method {Transform.CompileFIB}: Operation not permitted Failed to clone file extent. Source offset: 1835008, target offset: 1703936, length: 131072 Processing finished with errors at <time>
Not sure what is not permitted, for example
Code: Select all
cp file1 file2
creates a clone just fine.
I'm getting the same error. No containers, just Ubuntu Server 24.04.1 LTS. It's kind of hard to troubleshoot right now as I don't have a good way of forcing a synthetic full.
I tried setting
Code: Select all
Set-VBRJobAdvancedBackupOptions -Job 'test' -Algorithm 'Incremental' -TransformFullToSyntethic $True
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: May 23, 2024 5:35 pm
- Full Name: zbe
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
As your veeam user you can try to block clone a testfile directly on your server:
Code: Select all
cp --reflink=always testfile testfile2
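If you are logged in as root, one way to run that exact check as the service account is via sudo; a sketch (the account name and the repository path are examples and depend on how your transport was deployed):
Code: Select all
sudo -u veeam dd if=/dev/urandom of=/path/to/repo/testfile bs=1M count=16   # create a test file with some data
sudo -u veeam cp --reflink=always /path/to/repo/testfile /path/to/repo/testfile2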
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
Our primary backup unit is hyperconverged under VMware (using PCI passthrough of the HBA controller).
Code: Select all
Agent: Failed to process method {Transform.CompileFIB}: Resource temporarily unavailable Failed to clone file extent. Source offset: 1835008, target offset: 1703936, length: 131072
Code: Select all
Creating synthetic full backup (xx% done) [fast clone]
I was checking some of the I/O throughput stats though, and noticed that when we last recreated the zpool we forgot to disable atime, so I've just disabled that. (We run with sync disabled as well in our case for performance; a quick check for that is shown after the atime commands below.)
I noticed dozens of z_wr_int processes in the process list while the synthetic fulls were running.
Code: Select all
# zpool status VeeamBackup
# zfs get atime VeeamBackup
# zfs set atime=off VeeamBackup
# zfs get atime VeeamBackup
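For completeness, the sync setting mentioned above can be checked and changed the same way (a sketch, same pool name):
Code: Select all
# zfs get sync VeeamBackup
# zfs set sync=disabled VeeamBackup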
-
- Influencer
- Posts: 12
- Liked: 3 times
- Joined: Jun 19, 2019 9:55 am
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
zbe wrote: ↑Sep 08, 2024 6:05 pm Well, all I can say is that after switching to privileged container it all works as expected.
As your veeam user you can try to block clone a testfile directly on your server:
Code: Select all
cp --reflink=always testfile testfile2
That made it a lot easier to track down than waiting for an elusive synthetic full backup.
Apparently I never had block cloning enabled properly (or permanently).
Though my pool showed:
Code: Select all
$ zpool get feature@block_cloning
NAME PROPERTY VALUE SOURCE
repopool1 feature@block_cloning enabled local
$
Code: Select all
$ cat /sys/module/zfs/parameters/zfs_bclone_enabled
0
$
Code: Select all
options zfs zfs_bclone_enabled=1
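In case it helps others: as I understand it, that options line belongs in a modprobe config file so it survives reboots (and if the zfs module sits in your initramfs, the initramfs may need regenerating), while the running module can also be switched over via sysfs. A sketch, with the file name being just an example:
Code: Select all
$ cat /etc/modprobe.d/zfs.conf
options zfs zfs_bclone_enabled=1
$ echo 1 | sudo tee /sys/module/zfs/parameters/zfs_bclone_enabled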
-
- Product Manager
- Posts: 14881
- Liked: 3098 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
It seems Veeam only checks if the /etc/veeam/EnableZFSBlockCloning file exists when the repository is created.
Correct. That file disables all reflink checks, and technically it allows reflink to be used on any file system that supports it (bcachefs, btrfs, ocfs2...).
Best regards,
Hannes
-
- Chief Product Officer
- Posts: 31835
- Liked: 7325 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
Due to reliability concerns with the synthetic roll-ups failing, we've switched our backup job to run active full backups once per week instead - this is working without issue.
However, our backup copy jobs still generate synthetic tasks that fail at random times.
E.g. on a backup copy job of 6 VMs, 3 of the VMs fail in this style:
Code: Select all
14/10/2024 1:21:13 PM :: Agent: Failed to process method {Transform.CompileFIB}: Resource temporarily unavailable
Failed to clone file extent. Source offset: 1809317888, target offset: 2890792960, length: 1048576
Code: Select all
14/10/2024 1:21:25 PM :: Agent: Failed to process method {Transform.CompileFIB}: Resource temporarily unavailable
Failed to clone file extent. Source offset: 1808007168, target offset: 2999058432, length: 2359296
14/10/2024 1:21:25 PM :: Processing finished with errors at 14/10/2024 1:21:25 PM
Code: Select all
[14.10.2024 13:20:57.859] <139869015758400> alg | Transmission pipe configuration, CPU number: 48, digests calc: enabled, crypto policy: Policy: 'enNoCryptoProcessing', target keyset ID: 'null'
[14.10.2024 13:20:57.859] <139869015758400> cli | --pex:0;4194304;4194304;0;4194304;0;0;0;0;82;3;62;133733388578598370
[14.10.2024 13:20:57.859] <139869015758400> cli | - 0%
[14.10.2024 13:21:09.629] <139869015758400> cli | --pex:49;6392119296;6392119296;0;3988783104;0;0;0;0;0;8;99;133733388696290450
[14.10.2024 13:21:09.629] <139869015758400> cli | - 49%
[14.10.2024 13:21:19.946] <139869015758400> cli | --pex:56;7335837696;7335837696;0;4932501504;0;0;0;0;0;3;99;133733388799465590
[14.10.2024 13:21:19.946] <139869015758400> cli | - 56%
[14.10.2024 13:21:24.036] <139869531645504> cli | ERR |Command 'FibWrite(0x000x7f35cc07f7c0)' has failed.
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |Resource temporarily unavailable
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |Failed to clone file extent. Source offset: 1808007168, target offset: 2999058432, length: 2359296
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |--tr:Failed to duplicate file extent from HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib:1808007168 to /VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk:2999058432. Extent length: 2359296
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |--tr:Failed to process de-duplicated block
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |--tr:Failed to write block to blocks store.
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |--tr:Failed to append block. FIB index: [2060]. FIB: [VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |--tr:Failed to process area [Data block. Start offset: [8640266240], Length: [4194304], Area ID: [1289].].
[14.10.2024 13:21:24.036] <139869531645504> cli | >> |An exception was thrown from thread [139869531645504].
[14.10.2024 13:21:24.036] <139868524877376> cli | ERR |Send thread (channel) has failed.
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |Resource temporarily unavailable
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |Failed to clone file extent. Source offset: 1808007168, target offset: 2999058432, length: 2359296
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Failed to duplicate file extent from HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib:1808007168 to /VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk:2999058432. Extent length: 2359296
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Failed to process de-duplicated block
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Failed to write block to blocks store.
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Failed to append block. FIB index: [2060]. FIB: [VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Failed to process area [Data block. Start offset: [8640266240], Length: [4194304], Area ID: [1289].].
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Failed to wait for recorder invoke.
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Failed to check queue and pending invoke status. Wait flag 'true'.
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:FIB proxy was unable to process write operation, recorder '0x000x7f35cc07f7c0', area offset '8778678272', FIB 'Backup of the FIB VeeamProxy3-flat.vmdk'.
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |--tr:Unable to asynchronously write data block. Block identity: [Data block. Start offset: [8778678272], Length: [4194304], Area ID: [1318].].
[14.10.2024 13:21:24.036] <139868524877376> cli | >> |An exception was thrown from thread [139869531645504].
[14.10.2024 13:21:24.036] <139868524877376> cli | Thread finished. Role: 'Area writer: Backup of the FIB VeeamProxy3-flat.vmdk'.
[14.10.2024 13:21:24.036] <139869015758400> alg | Stopping processing threads....
[14.10.2024 13:21:24.036] <139868076234304> cli | Thread finished. Role: 'Processing thread: Digests calculator.'.
[14.10.2024 13:21:24.036] <139868516484672> cli | Thread finished. Role: 'Area processing dispatcher: Digests calculator.'.
[14.10.2024 13:21:24.036] <139868550055488> cli | Thread finished. Role: 'area reader : VeeamProxy3-flat.vmdk'.
[14.10.2024 13:21:24.036] <139869015758400> alg | Processing threads were stopped.
[14.10.2024 13:21:24.036] <139869015758400> alg | Processing conveyored task... Failed.
[14.10.2024 13:21:24.036] <139869015758400> alg | ERR |Transmission pipe error.
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |Resource temporarily unavailable
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |Failed to clone file extent. Source offset: 1808007168, target offset: 2999058432, length: 2359296
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Failed to duplicate file extent from HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib:1808007168 to /VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk:2999058432. Extent length: 2359296
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Failed to process de-duplicated block
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Failed to write block to blocks store.
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Failed to append block. FIB index: [2060]. FIB: [VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Failed to process area [Data block. Start offset: [8640266240], Length: [4194304], Area ID: [1289].].
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Failed to wait for recorder invoke.
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Failed to check queue and pending invoke status. Wait flag 'true'.
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:FIB proxy was unable to process write operation, recorder '0x000x7f35cc07f7c0', area offset '8778678272', FIB 'Backup of the FIB VeeamProxy3-flat.vmdk'.
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Unable to asynchronously write data block. Block identity: [Data block. Start offset: [8778678272], Length: [4194304], Area ID: [1318].].
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |--tr:Processing of asynchronous write requests has failed. Output file: [Backup of the FIB VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.036] <139869015758400> alg | >> |An exception was thrown from thread [139869531645504].
[14.10.2024 13:21:24.037] <139869015758400> stg | Updating item in local backup. Item details: [Internal FIB 'VeeamProxy3-flat.vmdk']. Is incremental CTK is available: [false]. Failed.
[14.10.2024 13:21:24.037] <139869015758400> dsk | Unbuffered file I/O statistics for '/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib': 417792 bytes read (2 requests), 1052672 bytes read unaligned (3 requests)
[14.10.2024 13:21:24.037] <139869015758400> stg | Closing storage file [HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib].
[14.10.2024 13:21:24.037] <139869015758400> stg | Removing lock on file [/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib].
[14.10.2024 13:21:24.037] <139869015758400> dsk | Unbuffered file I/O statistics for '/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-11T060019_1FAE.vib': 417792 bytes read (2 requests), 1052672 bytes read unaligned (3 requests)
[14.10.2024 13:21:24.037] <139869015758400> stg | Closing storage file [HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-11T060019_1FAE.vib].
[14.10.2024 13:21:24.037] <139869015758400> stg | Removing lock on file [/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-11T060019_1FAE.vib].
[14.10.2024 13:21:24.037] <139869015758400> dsk | Unbuffered file I/O statistics for '/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-10T060024_1ED1.vib': 417792 bytes read (2 requests), 1052672 bytes read unaligned (3 requests)
[14.10.2024 13:21:24.037] <139869015758400> stg | Closing storage file [HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-10T060024_1ED1.vib].
[14.10.2024 13:21:24.037] <139869015758400> stg | Removing lock on file [/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-10T060024_1ED1.vib].
[14.10.2024 13:21:24.037] <139869015758400> dsk | Unbuffered file I/O statistics for '/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-09T060051_59B8.vib': 417792 bytes read (2 requests), 1052672 bytes read unaligned (3 requests)
[14.10.2024 13:21:24.037] <139869015758400> stg | Closing storage file [HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-09T060051_59B8.vib].
[14.10.2024 13:21:24.037] <139869015758400> stg | Removing lock on file [/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-09T060051_59B8.vib].
[14.10.2024 13:21:24.037] <139869015758400> dsk | Unbuffered file I/O statistics for '/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-08T060222_64E0.vbk': 417792 bytes read (2 requests), 1052672 bytes read unaligned (3 requests)
[14.10.2024 13:21:24.037] <139869015758400> stg | Closing storage file [HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-08T060222_64E0.vbk].
[14.10.2024 13:21:24.037] <139869015758400> stg | Removing lock on file [/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-08T060222_64E0.vbk].
[14.10.2024 13:21:24.037] <139869015758400> bkp | Building new FIB "VBK: 'veeamfs:a:747fb8ab-a505-4ee1-aa0e-c28b12257900 (vm-620369)\VeeamProxy3-flat.vmdk@/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk'". Failed.
[14.10.2024 13:21:24.037] <139869015758400> cli | ERR |Failed to process method {Transform.CompileFIB}
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |Resource temporarily unavailable
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |Failed to clone file extent. Source offset: 1808007168, target offset: 2999058432, length: 2359296
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to duplicate file extent from HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib:1808007168 to /VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk:2999058432. Extent length: 2359296
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to process de-duplicated block
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to write block to blocks store.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to append block. FIB index: [2060]. FIB: [VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to process area [Data block. Start offset: [8640266240], Length: [4194304], Area ID: [1289].].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to wait for recorder invoke.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to check queue and pending invoke status. Wait flag 'true'.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:FIB proxy was unable to process write operation, recorder '0x000x7f35cc07f7c0', area offset '8778678272', FIB 'Backup of the FIB VeeamProxy3-flat.vmdk'.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Unable to asynchronously write data block. Block identity: [Data block. Start offset: [8778678272], Length: [4194304], Area ID: [1318].].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Processing of asynchronous write requests has failed. Output file: [Backup of the FIB VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to process conveyored task.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to update item in local backup. Item details: [Internal FIB 'VeeamProxy3-flat.vmdk'].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to build new FIB "VBK: 'veeamfs:a:747fb8ab-a505-4ee1-aa0e-c28b12257900 (vm-620369)\VeeamProxy3-flat.vmdk@/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk'" from backup chain.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |An exception was thrown from thread [139869531645504].
[14.10.2024 13:21:24.037] <139869015758400> cli | ERR |Failed to process {Invoke} command
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |Duration: [00:00:26.533]
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |Details:
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |Agent: Failed to process method {Transform.CompileFIB}: Resource temporarily unavailable
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |Failed to clone file extent. Source offset: 1808007168, target offset: 2999058432, length: 2359296
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to duplicate file extent from HostFS:///VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T060045_79DE.vib:1808007168 to /VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk:2999058432. Extent length: 2359296
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to process de-duplicated block
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to write block to blocks store.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to append block. FIB index: [2060]. FIB: [VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to process area [Data block. Start offset: [8640266240], Length: [4194304], Area ID: [1289].].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to wait for recorder invoke.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to check queue and pending invoke status. Wait flag 'true'.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:FIB proxy was unable to process write operation, recorder '0x000x7f35cc07f7c0', area offset '8778678272', FIB 'Backup of the FIB VeeamProxy3-flat.vmdk'.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Unable to asynchronously write data block. Block identity: [Data block. Start offset: [8778678272], Length: [4194304], Area ID: [1318].].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Processing of asynchronous write requests has failed. Output file: [Backup of the FIB VeeamProxy3-flat.vmdk].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to process conveyored task.
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to update item in local backup. Item details: [Internal FIB 'VeeamProxy3-flat.vmdk'].
[14.10.2024 13:21:24.037] <139869015758400> cli | >> |--tr:Failed to build new FIB "VBK: 'veeamfs:a:747fb8ab-a505-4ee1-aa0e-c28b12257900 (vm-620369)\VeeamProxy3-flat.vmdk@/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk'" from backup chain.
[14.10.2024 13:21:24.039] <139868990580288> cli | Next client command: [closeStg].
[14.10.2024 13:21:24.039] <139868990580288> cli | :> /VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk
[14.10.2024 13:21:24.039] <139868990580288> cli | Closing storage '/VeeamBackup/XFSBackups/backups/copyjob:Orbit_Critical Management VMs/Orbit_Critical Management VMs/VeeamProxy3.vm-620369D2024-10-14T132043_8D80.vbk'.
Below is what I think is relevant to our target backup copy server.
The source backup server has a similar configuration.
Code: Select all
# cat /proc/mounts|grep VeeamBackup
VeeamBackup /VeeamBackup zfs rw,seclabel,noatime,xattr,noacl,casesensitive 0 0
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
VeeamBackup 98.5T 61.0T 98.4T /VeeamBackup
# zpool status
pool: VeeamBackup
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
VeeamBackup ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
sdc ONLINE 0 0 0
scsi-35000039ae873d3e5 ONLINE 0 0 0
scsi-35000039ae873c119 ONLINE 0 0 0
scsi-35000039ae873d3c9 ONLINE 0 0 0
scsi-35000039ae878d31d ONLINE 0 0 0
sdf ONLINE 0 0 0
raidz1-1 ONLINE 0 0 0
sdg ONLINE 0 0 0
sdh ONLINE 0 0 0
sdi ONLINE 0 0 0
sdj ONLINE 0 0 0
sdk ONLINE 0 0 0
sdm ONLINE 0 0 0
raidz1-2 ONLINE 0 0 0
sdn ONLINE 0 0 0
sdo ONLINE 0 0 0
sdp ONLINE 0 0 0
sdq ONLINE 0 0 0
sdr ONLINE 0 0 0
sds ONLINE 0 0 0
raidz1-3 ONLINE 0 0 0
sdt ONLINE 0 0 0
sdu ONLINE 0 0 0
sdv ONLINE 0 0 0
sdw ONLINE 0 0 0
spares
scsi-STOSHIBA_MG06SCA800EY_51X0A13GF1GF AVAIL
errors: No known data errors
Could it be something like this that is overloading the storage throttling, since the number of parallel tasks hitting the storage isn't being represented under the total number of tasks?
As said, it's strange that this error is intermittent, and it's strange that 50% of the transformation stages on a 6 VM copy job are failing, whereas only about 9% of the transformation stages fail on a 322 VM copy job.
I suspect that whatever is causing the copy job failures is also what causes our synthetic transformation stages to fail when we enable synthetic fulls.
Any thoughts?
I've attached the log files to the ticket I opened a while back (Hannes knows the ticket number: 07451905).
Thanks,
Ashley
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
It appears there are essentially two scripts out there, zhammer.sh and reproducer.sh, which were used to investigate block cloning issues early on, when the OpenZFS team introduced reflink support into ZFS.
https://raw.githubusercontent.com/numin ... zhammer.sh
https://gist.github.com/tonyhutter/d ... roducer.sh
reproducer.sh appears to write 1000 files of 1M each, so we'll use the GNU utility parallel to spin up 8 instances of it concurrently to simulate the concurrent load patterns on the storage repository.
Code: Select all
# yum install parallel
# cd /VeeamBackup/Testing
# wget https://gist.github.com/tonyhutter/d69f305508ae3b7ff6e9263b22031a84/raw/c543f1c60eec6c115bd3d1d97fa45c4bb8b3e573/reproducer.sh
# chmod o+x reproducer.sh
# echo "will cite"|parallel --citation
# seq 8 | parallel -n0 /VeeamBackup/Testing/reproducer.sh
Code: Select all
# wget https://raw.githubusercontent.com/numinit/nixpkgs/refs/heads/zhammer/nixos/tests/zhammer.sh
# chmod o+x zhammer.sh
# ./zhammer.sh
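While these run, it may also help to keep an eye on the pool and the kernel log from another shell; a sketch (pool name matches our setup, the interval is arbitrary):
Code: Select all
# watch -n 5 'zpool status -x VeeamBackup; dmesg | tail -n 20'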
I'll now attempt to reproduce the issue from a Veeam perspective with the smallest job and proxy+storage throttle limits that I can, in the absence of a version of those scripts that can simulate the Veeam traffic patterns.
My other concern is that the synthetic full process is by nature heavy (in terms of RAM and everything else)... If we are running multiple proxies (like we do), what is stopping the scheduler from allocating multiple synthetic processes to the same proxy and thereby overloading it? (The reasoning here is that the synthetic transformation process is a very different workload pattern from an incremental.)
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
My understanding is we can't do this (even via PowerShell) multiple times in one day - so it becomes very hard to attempt to replicate a transient error.
-
- Product Manager
- Posts: 14881
- Liked: 3098 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
there is a registry key that enforces transforms at every job run. That's a global setting. The key can be provided via support (it's a "restricted" key, so I cannot post it in public here).
Best regards,
Hannes
-
- Service Provider
- Posts: 208
- Liked: 43 times
- Joined: Oct 28, 2010 10:55 pm
- Full Name: Ashley Watson
- Contact:
Re: OpenZFS 2.2 support for reflinks now available
It enabled me to consistently reproduce the issue using a single proxy (with 8 parallel processes) and a storage device running ZFS on Rocky 9 with a throttle limit of 4.
Both the proxy and the storage unit run on the same physical piece of tin under ESXi 8. The Rocky 9 VM uses PCI passthrough of the HBA controller.
We are running a 23-disk RAID set striping across 4 raidz1 sets, each spindle being a 16TB disk (SEAGATE ST16000NM006) - not that any of that is relevant.
The Veeam job crashes out with the same messages as before:
17/10/2024 2:20:40 PM :: Agent: Failed to process method {Transform.CompileFIB}: Resource temporarily unavailable
Failed to clone file extent. Source offset: 3345874944, target offset: 14280163328, length: 1835008
In our lab that is about 12 minutes into a large job.
The error causes subsequent VM backups within the job to fail.
I've attached a full set of log files to the ticket (including /var/log).
If I retry the job afterwards, the job restarts OK and the VM that was previously failing on the transformation stage then goes through a successful transformation - indicating to me this is not a ZFS block cloning/corruption issue. (Note: without the registry key from support it's not possible to force the transformation stage on reruns of the job within the same day, and the only reason I needed this key was to attempt to reproduce the issue.)
Not sure where I go from here, but for now we are continuing to use ZFS but we have synthetics disabled, and we are able to ingest quickly enough and have sufficient storage where we can run full backups once per week.
Code: Select all
# zpool status
pool: VeeamBackup
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
VeeamBackup ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
sdc ONLINE 0 0 0
sdd ONLINE 0 0 0
sde ONLINE 0 0 0
sdf ONLINE 0 0 0
sdg ONLINE 0 0 0
sdh ONLINE 0 0 0
raidz1-1 ONLINE 0 0 0
sdi ONLINE 0 0 0
sdj ONLINE 0 0 0
sdk ONLINE 0 0 0
sdl ONLINE 0 0 0
sdm ONLINE 0 0 0
sdn ONLINE 0 0 0
raidz1-2 ONLINE 0 0 0
sdo ONLINE 0 0 0
sdp ONLINE 0 0 0
sdq ONLINE 0 0 0
sdr ONLINE 0 0 0
sds ONLINE 0 0 0
sdt ONLINE 0 0 0
raidz1-3 ONLINE 0 0 0
sdu ONLINE 0 0 0
sdv ONLINE 0 0 0
sdw ONLINE 0 0 0
sdx ONLINE 0 0 0
sdy ONLINE 0 0 0
spares
sdz AVAIL
errors: No known data errors
Code: Select all
diskseq/18, /dev/disk/by-id/scsi-SSEAGATE_ST16000NM006J_???????
Device Number: block 8:240-8:255 (char 21:17)
Geometry (Logical): CHS 1945332/255/63
Size: 31251759104 sectors a 512 bytes
Capacity: 14902 GB (16000900661248 bytes)
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #14 (Serial Attached SCSI controller)