Understanding the File Merge Process

May 09, 2023 3:50 pm

In attempting to troubleshoot some issues with unusually high amounts of data transfer from our Cloud Connect server to an agent being backed up, I realized that some data transfer seems to occur partway through the backup process, not exclusively at the start as I would expect if the transfer were just retrieving a list of files to determine what's changed based on modified dates.

So I'm attempting to understand more about how the backup file merge process works. The documentation doesn't really give a technical explanation of what's going on, at least not one that I could find.

Feel free to correct any of my assumptions if anything looks inaccurate.

My current understanding of a "forever forward" backup is:
The first time a computer attempts to back up, a VBK file is created which should contain the entire contents of the computer, or whatever's specified if it's not the entire computer.
The next backup will read file modified dates on the computer and modified dates of the files in the VBK file to determine what's changed and then backup newly modified files.
This next backup will create a VIB file on the repository containing an incremental backup with newly modified files in it.
The process of creating new VIB continues until a specified retention period is met
Then, after creating another VIB file, the oldest VIB file is merged into the original VBK file
Then, every time a backup occurs, the merge process occurs again provided the desired number of restore points hasn't been increased
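
The chain behavior described in the list above can be sketched as a small toy model. This is only an illustration of the retention logic as described in this thread, not Veeam's implementation: the `Chain` class, its field names, and the idea of representing backup contents as a dictionary are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Chain:
    """Toy forever-forward chain: one full file (VBK) plus increments (VIBs)."""
    retention: int                             # desired number of restore points
    vbk: dict = field(default_factory=dict)    # block or file id -> content
    vibs: list = field(default_factory=list)   # oldest first; each holds only changes
    has_full: bool = False

    def backup(self, changes: dict) -> None:
        if not self.has_full:
            # First run: everything goes into the full backup file.
            self.vbk = dict(changes)
            self.has_full = True
            return
        # Later runs: only the changes go into a new increment.
        self.vibs.append(dict(changes))
        # Restore points = the full plus every increment (an assumption of this model).
        while 1 + len(self.vibs) > self.retention:
            oldest = self.vibs.pop(0)
            self.vbk.update(oldest)            # merge the oldest increment into the full

    def restore_points(self) -> int:
        return (1 if self.has_full else 0) + len(self.vibs)


chain = Chain(retention=3)
chain.backup({"a": "v1", "b": "v1"})   # full
chain.backup({"a": "v2"})              # increment 1
chain.backup({"b": "v2"})              # increment 2, retention reached
chain.backup({"c": "v1"})              # increment 3, oldest increment is merged
print(chain.restore_points(), chain.vbk)
```

Once the retention is reached, every further run in this model adds one increment and merges one, so the number of restore points stays constant.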

Primarily what I'm curious about is that merging process, my understanding is that the VIB file to be merged is downloaded by the agent and then reuploaded into the VBK file in the repository.
I don't remember where I read it, but I believe I saw another forum post that mentioned none of the Cloud Connect components will perform that merging process locally, so the data must be redownloaded by the agent and then reuploaded to the repository.

Now I'm assuming some part of my understanding is inaccurate because the VBR console does not show enough data being transferred to the agent for it to be transferring the VIB file back to the agent as I described. Also this method of "copy data to the server, later download the same data from the server, then reupload the same data to the same server" seems highly inefficient to me, so I assume either I misunderstand the process or I'm missing how that isn't ridiculously inefficient.

For a little bit of background, my past experience is primarily with a different product that allows you to store all data for a backup in a single file, as opposed to a VBK file, a bunch of VIB files, and a VBM file. My understanding was always that the backup process is a simple "copy new data to the file on the server, delete expired data based on configured retention settings", so the Veeam method, per my understanding, seems very inefficient, as if someone went out of their way to design an overly complicated process that, for me anyway, seems quite error-prone. But since it doesn't make sense to develop software that is intentionally confusing and complicated, both to develop and subsequently to support, I assume that Veeam did not actually design it that way on purpose, and that I'm missing something instead.

Post by **Mildur** » May 09, 2023 4:19 pm

Hello Tim

Are you doing volume level or file level backups?
For file-level backups, I recommend using Veeam's CBT driver. With changed block tracking, we don't have to compare the source filesystem with the last state in the backup files.
https://helpcenter.veeam.com/docs/agent ... tml?ver=60

The first time a computer attempts to back up, a VBK file is created which should contain the entire contents of the computer, or whatever's specified if it's not the entire computer.

Correct.

The next backup will read file modified dates on the computer and modified dates of the files in the VBK file to determine what's changed and then backup newly modified files.

Depends.
For Entire Computer or Volume Level backups, the agent only reads new and changed blocks. We never compare files for these block-level backups. Changed block tracking helps to speed up this process.
When doing File Level backups, the agent will synchronize the content of the backup files with the content of the source filesystem. That may lead to higher data traffic, because the agent needs to access the metadata of the filesystem in the backup files.
Starting in Veeam Agent for Windows v6, you can also use the changed block tracking driver for File Level backup jobs to speed up the backup process.

This next backup will create a VIB file on the repository containing an incremental backup with newly modified files in it.

Blocks or Files. Depends on your job type.

The process of creating new VIB continues until a specified retention period is met

Correct.

Then, after creating another VIB file, the oldest VIB file is merged into the original VBK file

Correct. It's explained here: https://helpcenter.veeam.com/docs/backu ... ml?ver=120

Then, every time a backup occurs, the merge process occurs again provided the desired number of restore points hasn't been increased

Correct.

Primarily what I'm curious about is that merging process, my understanding is that the VIB file to be merged is downloaded by the agent and then reuploaded into the VBK file in the repository.
I don't remember where I read it, but I believe I saw another forum post that mentioned none of the Cloud Connect components will perform that merging process locally, so the data must be redownloaded by the agent and then reuploaded to the repository.

The agent only sends the new VIB file to the Cloud Connect repository. The repository server behind the Cloud Connect repository will take care of the transformation process (merge oldest VIB with VBK).

Best,
Fabian

May 10, 2023 2:51 pm

Okay, well now I'm confused about how I got the impression that the merge process is handled by the agent rather than the server, but I can't find the forum post I thought I saw that said that so whatever. Good to know that works in a more logical way than I previously thought.

The majority of the jobs we have configured are for "Entire Computer" so I assume from your explanation that they're all backed up as blocks rather than files.

Do I understand correctly the CBT driver would make those "Entire Computer" backups more efficient or faster in some manner? Can you give a brief explanation how it's more efficient or faster?

Do I also understand that having the CBT driver would cause File and Folder backups to be handled at a block level rather than a file level?

We currently do not use the CBT driver, but I'm really not sure why. If you can give me an explanation of how it's more efficient to do so, I'll certainly inquire with the person who made that decision back when we first started using Veeam. Except for the notes in the page you linked regarding particular issues with 10+ year old versions of Windows, is there any reason why the CBT driver would break things or otherwise not be recommended today? If it does make things more efficient in any scenario I have to ask why it's not just enabled by default, there must be some reason why it could be undesirable?

I can ask in the relevant forum if there's more to it than this, but we also have Mac agents in use, do I correctly understand there is no CBT driver available for anything other than Windows and that the Mac and Linux agents always handle backups as Files with no regard for blocks at all?

Post by **Mildur** » May 11, 2023 10:28 am

The majority of the jobs we have configured are for "Entire Computer" so I assume from your explanation that they're all backed up as blocks rather than files.

Correct. Entire Computer is backed up at the block level.

Do I understand correctly the CBT driver would make those "Entire Computer" backups more efficient or faster in some manner? Can you give a brief explanation how it's more efficient or faster?

Without changed block tracking, we need to check every block and compare it with the previous backup session to see whether it has changed. With changed block tracking, we can read from a list which block IDs have changed. That is much faster than going through the entire disk every time.
With Veeam Agent, whether you need our own CBT driver depends on the source filesystem type.
For NTFS filesystems, we use the default block tracking mechanism. We get the information about the filesystem from the Master File Table of the backed-up volume and create a snapshot of the metadata in the backup file. On each run we repeat this process and compare the new snapshot with the snapshot created in the previous job session.

If you use FAT or ReFS volumes on the backed-up machine, we cannot use the default mechanism, and our own CBT driver is required. The driver also works for NTFS volumes, where it can improve performance. With the driver, the changed block tracking information is stored directly on the backed-up machine in VCT files in the following folder: C:\ProgramData\Veeam\EndpointData\CtStore.
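
The difference between checking every block and reading a list of changed block IDs can be illustrated with a short sketch. It is a generic model of the idea only: the `TrackedDisk` class is a hypothetical stand-in for a tracking driver, and neither function reflects how Veeam's default mechanism or its CBT driver is actually implemented.

```python
import hashlib


def digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def changed_by_full_scan(disk: list[bytes], previous_digests: list[str]) -> set[int]:
    """Without tracking: read every block and compare it with the previous state."""
    return {i for i, block in enumerate(disk) if digest(block) != previous_digests[i]}


class TrackedDisk:
    """With tracking: writes are recorded as they happen (hypothetical driver model)."""

    def __init__(self, blocks: list[bytes]):
        self.blocks = list(blocks)
        self.changed: set[int] = set()

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self.changed.add(index)

    def take_changed(self) -> set[int]:
        """Return the changed block IDs and reset the list for the next session."""
        changed, self.changed = self.changed, set()
        return changed


disk = TrackedDisk([b"a", b"b", b"c", b"d"])
baseline = [digest(b) for b in disk.blocks]   # state at the previous backup
disk.write(2, b"C")

scan_result = changed_by_full_scan(disk.blocks, baseline)   # reads all 4 blocks
cbt_result = disk.take_changed()                            # reads a 1-entry list
print(scan_result == cbt_result, sorted(cbt_result))
```

Both approaches find the same changed block here, but the scan cost grows with the size of the disk while the tracked list grows only with the number of writes.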

Do I also understand that having the CBT driver would cause File and Folder backups to be handled at a block level rather than a file level?

Yes. But only for files bigger than 50MB. Our own CBT driver is required.

If it does make things more efficient in any scenario I have to ask why it's not just enabled by default, there must be some reason why it could be undesirable?

For machines with NTFS it is not strictly required. Also, the machine requires a reboot each time you run an update or a new installation. Besides those side effects, it's safe to use on supported systems. Just consider the limitations in our "important" note box on the link I have provided. If you install the driver on an unsupported system and it boots to a BSOD, you can use the recovery media to uninstall it: https://helpcenter.veeam.com/docs/agent ... tml?ver=60

I can ask in the relevant forum if there's more to it than this, but we also have Mac agents in use, do I correctly understand there is no CBT driver available for anything other than Windows and that the Mac and Linux agents always handle backups as Files with no regard for blocks at all?

Our changed block tracking driver for file-level backups is only available with the Windows Agent.
Veeam Agent for Mac only provides file-level backups.
Veeam Agent for Linux has its own changed block tracking method, only for volume-level backups: https://helpcenter.veeam.com/docs/agent ... tml?ver=60

Best,
Fabian

May 11, 2023 1:59 pm

Thanks for the information there.

I'll share if the reason turns out to be different, but based on that, my guess is that the decision was made not to use the CBT driver on our devices at all, since the vast majority are agent-based Windows backups of NTFS disks configured to back up the Entire Computer. So there's no notable benefit. But perhaps we'll begin using it for specific devices where it's more useful, like when backing up virtual machines or some non-NTFS disks.

May 19, 2023 3:06 pm

Back to the original question about how the merge process works: I have an agent that frequently reports "Failed to perform backup. Gateway is not available. Failed to perform backup files merge." as the failure message on the job. From what I can tell, everything finished fine up until it needed to do the merge process, and then it failed. I assume some sort of connection issue occurred and was reported by the Veeam agent as an unavailable gateway. I can say the gateway was definitely available and had other jobs running at the time without a problem, and obviously this one did start the backup job fine, so I'm not sure what caused the error. But the error text would seem to imply that the merge process is not actually done by the server, as suggested in an earlier post.

Could you clarify that the merge process is in fact done by the server in all cases? Or are there some where the agent does that?

Update: it did retry the job, connected successfully, and performed the merge process. However, that definitely leads me to believe that the merge process does not actually occur on the server, as would make sense.

Post by **Mildur** » May 20, 2023 8:15 am

I'll check with QA and let you know.

Best,
Fabian

Post by **Mildur** » Jun 05, 2023 12:01 pm

Hi Tim

I'm still working on this.
Could you maybe share what repository type you are using on the cloud connect side?
I checked all your comments, but couldn't find a hint which type it could be.

Is there any chance you have opened a case for it?

Best,
Fabian

Jun 05, 2023 12:46 pm

I have not opened a case, but I certainly can if you think there's something to review in the logs.

My understanding is our repository is not a SOBR. I don't believe it's technically a hardened repo either. It is on Linux, with both ZFS and XFS filesystems, one inside the other. I didn't actually set up the repository, nor do I usually do any of the repository maintenance, but I can certainly collect logs and ask any questions of the person who does that. Just let me know what you'd like to know. If there's a label somewhere that shows a "repository type", I'm not sure where it is.

Post by **Mildur** » Jun 05, 2023 1:02 pm

Hi Tim

You can find the type of the repository in your backup server console (Service Provider VBR). Please note that the type "Hardened", for a Linux server with the immutability feature, is available starting in v12.

Best,
Fabian

Jun 05, 2023 1:10 pm

The type just says Linux.

Jun 16, 2023 3:40 pm

Any update on this? I have another recent issue where the computer experienced a connection failure right as the backup portion of the job was completing, and it seems it couldn't reconnect to do the merge portion of the job. I suppose it's not impossible that it coincidentally lost the connection with a tiny amount of data remaining and so was technically still in the backup portion, but from what I can tell the agent definitely needs to be connected to the repository for the merge process to run. Whether the actual merge happens on the server or the client side, I'm unsure. It makes sense from an efficiency perspective for it to be processed on the server side, but I keep encountering different connection issues that all happen before the merge can complete, and those shouldn't be relevant if the agent isn't involved in the process.

Here's the thread about that issue. Again, it's not entirely impossible that the connection was dropped with just bytes of data remaining and so it only looks to me like it finished the backup portion, in which case that's probably an entirely unrelated issue. But the only way I know of to get status information is by digging through the log files, and from my digging it does look like the backup portion finished, and then the agent disconnected, intentionally or not I'm not sure, but it couldn't reconnect to do the merge and so the merge never happened.
veeam-cloud-providers-forum-f34/connect ... 87778.html

Post by **Mildur** » Jul 03, 2023 8:23 am

Hello Tim

I apologize. I wasn't working for the last 3 weeks.
While I was gone, I forwarded the question to one of my team colleagues. My colleague discussed it with our QA team and we got the confirmation that it works as I commented earlier in this topic:

- The merge process happens on the Service Provider side.
- The agent downloads a small amount of metadata.
- With the default CBT mechanism, this metadata contains the changed block information.
- No backup data is downloaded while the merge process happens.

Please let our support team have a look at the logs, if you still see multiple gigabytes of data downloaded to the agent when a merge process happens.

The retry issues mentioned in the other topics of yours are already checked and discussed by someone else from our team.

Best,
Fabian

Jul 03, 2023 3:35 pm

I've looked over it, and the issue with multiple gigabytes of data being downloaded ended up being unrelated to the merge process itself, so your explanation makes sense. However, it does appear the agent needs to remain connected to the repository the entire time the merge process is happening. I got another error just this morning where an agent got disconnected during the process and so the job failed.

So it seems, although the process does appear to be happening on the repository side properly, the agent still needs to be actively connected the entire time in order for the process to finish. So it does have the speed benefit of processing on the server, but still struggles with some of my customers who have network connection issues. Usually it finishes fine the next time it runs, typically these connection issues are very temporary. However the behavior still results in a job listed as "Failed" so in my opinion it's still broken. Even though it seems to not matter long-term.

Is it expected that the agent needs to be connected for the duration of the merge process?

Post by **Mildur** » Jul 04, 2023 11:30 am

Yes, the agent must still be able to talk to the Cloud Connect environment while the job is running.
The synthetic full is orchestrated by the agent, but executed by the repository server after the incremental backup data has been uploaded from the agent to the Cloud Connect repository server.
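
The split between orchestration and execution described above can be modeled roughly as follows. This is a sketch under stated assumptions, not Veeam's protocol: the `Repository` class, the `run_job` function, the `link_up_during_merge` flag, and the choice to skip the upload on a retry are all hypothetical and exist only to show why a dropped connection can fail the job even though the merge work itself runs on the repository side.

```python
class ConnectionLost(Exception):
    pass


class Repository:
    """Stand-in for the service provider side, where the merge is executed."""

    def __init__(self, increments):
        self.increments = list(increments)   # oldest first
        self.merged = 0

    def store_increment(self, name):
        self.increments.append(name)

    def merge_oldest(self):
        self.increments.pop(0)               # the oldest increment is folded into the full
        self.merged += 1


def run_job(repo, new_increment, max_increments, link_up_during_merge):
    """Agent-side orchestration: upload first, then request the merge."""
    if new_increment not in repo.increments:  # assumption: a retry does not re-upload
        repo.store_increment(new_increment)
    try:
        while len(repo.increments) > max_increments:
            if not link_up_during_merge:
                raise ConnectionLost("gateway not reachable")
            repo.merge_oldest()               # work happens on the repository
    except ConnectionLost:
        return "Failed"
    return "Success"


repo = Repository(["vib1", "vib2"])
first = run_job(repo, "vib3", max_increments=2, link_up_during_merge=False)
second = run_job(repo, "vib3", max_increments=2, link_up_during_merge=True)
print(first, second, repo.increments)
```

In this model the first run is reported as failed because the orchestrating side lost its connection before the merge could be requested, and the retry finishes the outstanding merge, which matches the behavior reported earlier in the thread.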

Best,
Fabian
