Comprehensive data protection for all workloads
StanoSedliak
Service Provider
Posts: 94
Liked: 8 times
Joined: Mar 20, 2018 6:31 am
Full Name: Stano Sedliak
Contact:

Backup Copy Job - Compact of Full Backup File, Transformation Processes, health check

Post by StanoSedliak »

Hi,

I was reading the Veeam user guide about compact, transformation, full backup file merge and health check, but I couldn't really find how the logic behind them works.

The problem is that we have a lot of jobs in Veeam that we duplicate with backup copy jobs to a NAS.
Source/primary backup storage is a physical hardened linux repo and secondary storage is a NAS in RAID6 attached as SMB backup repo.

We have a lot of jobs/data that we duplicate to the NAS with forever incremental, so I scheduled a compact of the full backup file once per month for every job, spread across the whole month. I scheduled the health checks the same way.

As we have a Linux repo, we also created some physical servers as gateways (gatewayA, B, etc.), since the Linux repo cannot send data to the NAS directly.

And now my questions, if someone has the info:
1. Compact - "creates a new full backup file in the target repository: it copies existing data blocks from the old backup file, rearranges and stores them close to each other" - in our case this means: customerA has a 1 TB full backup on the NAS; on the day the compact is scheduled, VBR triggers a job that copies the existing customerA full backup to the same NAS. But in reality VBR copies that 1 TB through the gatewayA server - it reads 1 TB from the NAS to gatewayA and writes 1 TB back to the NAS. So 1 TB goes through the gatewayA server, right?

2. Transformation - this is the task that injects the oldest increment into the full (since we use forever incremental), i.e. the full backup file merge that now has to run on every cycle. How is this task actually performed?
I also see in the jobs "Processing VMName_xy:vm-138285 CTransformAlg_f849dc04-71bb-4b3f-9aef-973875a5bc65" etc. Which task does this belong to? Inside the job I have, for example, 10 VMs; 9 VMs are finished with state "Success", but when I only open the job and check the general info (without clicking on a VM), I still see this kind of processing running for 8/10 VMs. Does this job delete the oldest restore point after the oldest increment has been injected into the full? So the steps are: incremental backup -> inject oldest increment into the full -> delete oldest increment from storage?


3. Health check - "calculates CRC values for backup metadata and hash values for data blocks of a disk in the backup file and saves these values in the metadata of the backup file" - who does the main work here?

The problem is that performance is bad, and a compact, transformation or health check sometimes takes many hours. I am trying to figure out how to make this faster so that the backup copy completes within 24 hours - whether it makes sense for big jobs to schedule GFS so we no longer need the compact and transformation, or whether we can change something in the setup to speed it up. If you have any info, please let me know. Thank you!
Mildur
Product Manager
Posts: 8735
Liked: 2294 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Backup Copy Job - Compact of Full Backup File, Transformation Processes, health check

Post by Mildur » 1 person likes this post

Hi Stano
secondary storage is a NAS in RAID6 attached as SMB backup repo.
---
We have a lot of jobs/Data which we are duplicating to the NAS with forever incremental,
May I ask why you use forever incremental with an SMB repository? All those operations require your gateway server to read an entire full backup's worth of data from the NAS and write it back. A forever forward incremental backup chain does that every day to merge the oldest increment into the full backup.
1. Compact - "creates a new full backup file in the target repository: it copies existing data blocks from the old backup file, rearranges and stores them close to each other" - in our case this means: customerA has a 1 TB full backup on the NAS; on the day the compact is scheduled, VBR triggers a job that copies the existing customerA full backup to the same NAS. But in reality VBR copies that 1 TB through the gatewayA server - it reads 1 TB from the NAS to gatewayA and writes 1 TB back to the NAS. So 1 TB goes through the gatewayA server, right?
Correct. 1 TB will be read and 1 TB will be written back to the SMB share. Which server performs the operation depends on your repository settings; with automatic gateway server selection, the mount server may be used as the gateway server:
https://helpcenter.veeam.com/docs/backu ... -selection
2. Transformation - this is the task that injects the oldest increment into the full (since we use forever incremental), i.e. the full backup file merge that now has to run on every cycle. How is this task actually performed?
I also see in the jobs "Processing VMName_xy:vm-138285 CTransformAlg_f849dc04-71bb-4b3f-9aef-973875a5bc65" etc. Which task does this belong to? Inside the job I have, for example, 10 VMs; 9 VMs are finished with state "Success", but when I only open the job and check the general info (without clicking on a VM), I still see this kind of processing running for 8/10 VMs. Does this job delete the oldest restore point after the oldest increment has been injected into the full? So the steps are: incremental backup -> inject oldest increment into the full -> delete oldest increment from storage?
1. Send the incremental backup to the SMB share
2. Read the oldest increment and the entire full backup with the gateway server
3. Write an entire full backup's worth of data back to the SMB share
Again: 1 TB plus the oldest increment's size read, 1 TB plus the new increment's size written.
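Those three steps can be put into a quick back-of-the-envelope model of the traffic through the gateway server. This is only a sketch; the 1024 GB full and 50 GB increments are made-up example numbers, not figures from this thread:

```python
# Rough model of the data moved through the gateway server for one
# backup copy run with a forever-forward-incremental merge.
def merge_io_gb(full_gb, oldest_incr_gb, new_incr_gb):
    """Return (read_gb, written_gb) for a single run."""
    read_gb = full_gb + oldest_incr_gb      # step 2: read full + oldest increment
    written_gb = new_incr_gb + full_gb      # steps 1 and 3: new increment + rebuilt full
    return read_gb, written_gb

read_gb, written_gb = merge_io_gb(full_gb=1024, oldest_incr_gb=50, new_incr_gb=50)
print(f"per run: read {read_gb} GB, write {written_gb} GB")
# Over a 30-day month the daily merges move roughly 30x the full backup
# size in each direction, versus about one full read/write for a single
# monthly compact - which is why the merge dominates the gateway load.
```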
3. Health check - "calculates CRC values for backup metadata and hash values for data blocks of a disk in the backup file and saves these values in the metadata of the backup file" - who does the main work here?
With automatic gateway selection, it is the associated mount server: https://helpcenter.veeam.com/docs/backu ... ml?ver=120
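Conceptually, that work is why a health check has to pull the whole backup file through the gateway: every block must be read to be hashed. A minimal sketch of that kind of computation (the 1 MiB block size and SHA-256 are illustrative choices, not Veeam's actual on-disk format):

```python
import hashlib
import zlib

BLOCK_SIZE = 1024 * 1024  # illustrative 1 MiB blocks

def fingerprint(backup_bytes: bytes, block_size: int = BLOCK_SIZE):
    """CRC over the whole payload plus one hash per data block --
    the same shape of work a health check performs."""
    crc = zlib.crc32(backup_bytes) & 0xFFFFFFFF
    block_hashes = [hashlib.sha256(backup_bytes[i:i + block_size]).hexdigest()
                    for i in range(0, len(backup_bytes), block_size)]
    return crc, block_hashes

backup = b"example backup payload " * 200_000   # ~4.6 MB of fake data
crc, hashes = fingerprint(backup)
print(f"crc={crc:08x}, blocks hashed={len(hashes)}")
# A later health check recomputes both and compares against the stored values:
assert (crc, hashes) == fingerprint(backup)
```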
The problem is that performance is bad, and a compact, transformation or health check sometimes takes many hours. I am trying to figure out how to make this faster so that the backup copy completes within 24 hours - whether it makes sense for big jobs to schedule GFS so we no longer need the compact and transformation, or whether we can change something in the setup to speed it up. If you have any info, please let me know. Thank you!
If you can attach your NAS device via iSCSI as a ReFS or XFS repository and use regular synthetic full backups, the backup processes will be much faster, and you will save space through Fast Clone.
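To illustrate the mechanism behind Fast Clone: on a block-cloning filesystem, a "copy" can share data blocks instead of rewriting them. A minimal sketch with made-up file names, using GNU coreutils `cp` (on a non-reflink filesystem `--reflink=auto` falls back to a normal copy):

```shell
# On a reflink-capable filesystem (e.g. XFS formatted with -m reflink=1),
# cloning shares data blocks instead of duplicating them, so a synthetic
# full costs metadata updates rather than a full-size write.
dd if=/dev/urandom of=old_full.vbk bs=1M count=8 status=none

# --reflink=auto clones blocks where supported and silently falls back
# to a regular copy otherwise; --reflink=always would require cloning.
cp --reflink=auto old_full.vbk new_full.vbk

cmp -s old_full.vbk new_full.vbk && echo "copies are identical"
```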

I also suggest opening a case with our customer support. Let them investigate the logs to see where the bottleneck is, and then come up with a recommendation. I still think the best way forward is to reevaluate the use of the SMB share and instead attach your NAS via iSCSI as a ReFS or XFS repository.

Best,
Fabian

Product Management Analyst @ Veeam Software
StanoSedliak
Service Provider
Posts: 94
Liked: 8 times
Joined: Mar 20, 2018 6:31 am
Full Name: Stano Sedliak
Contact:

Re: Backup Copy Job - Compact of Full Backup File, Transformation Processes, health check

Post by StanoSedliak »

Hi Fabian, thank you for your input!
May I ask why you use forever incremental with an SMB repository? All those operations require your gateway server to read an entire full backup's worth of data from the NAS and write it back. A forever forward incremental backup chain does that every day to merge the oldest increment into the full backup.
We are using forever incremental because retention on this NAS (the secondary location) is only the last few days, and we would not have enough space for a regular full backup chain.
If you can attach your NAS device via iSCSI as a ReFS or XFS repository and use regular synthetic full backups, the backup processes will be much faster, and you will save space through Fast Clone.
We are using SMB because:
1. our IS team told us not to use iSCSI, as the network was not stable in the beginning
2. if we mount this NAS to a server and a "hacker" attacks that server, he can just delete the data on the NAS
3. we use snapshotting, so if ransomware encrypts the data, we just revert it from a "snapshot backup"

But we could use it via iSCSI with a physical Linux box, as you mentioned, format it with XFS, and enable hardened repository on it plus synthetic fulls/reflink. That way we would also no longer need the gateway servers, because the backup copy would go: primary Linux hardened repo -> secondary Linux hardened repo (NAS), i.e. directly Veeam Data Mover on Linux Server01 -> Veeam Data Mover on Linux Server02.

Do you have any more hints for the setup? Thank you again for your input!
