Agent-based backup of Windows, Linux, Mac, AIX and Solaris machines.
hpadm
Enthusiast
Posts: 51
Liked: 10 times
Joined: May 18, 2021 1:55 pm
Location: Slovakia
Contact:

Excluding large temporary files from 'entire computer' server backup

Post by hpadm »

The business software we use has a large database that receives frequent upgrades. The upgrade process creates two temporary copies of the db files. These do not get cleaned up afterwards, and the vendor recommends leaving them around, to be able to investigate if the upgrade process fails. This doubles the amount of data to be read, messes with CBT, degrades compression/deduplication ratio, and results in 3GB larger (five times larger) daily incremental backups compared to if the temporary files are not there.

I would like to exclude these files from the backup. However I'd also like to keep the backup method and performance close to what it currently is. I am not clear on how exactly I am supposed to do this. On one hand I see that file exclusion is only available in File-level backups, but then I also read mention of how backups can be upgraded to volume-level, and how VSS can be configured to exclude things. I can say that the VSS 'FilesNotToSnapshot' registry key is not applied during 'Entire computer' level backup, even though VSS is being used to do the snapshot.

I am wary of using straight File level backup, since the server also hosts several hundred thousand small files. It probably wouldn't make a big overall difference, I just try to consider other options before adding casual inefficiency to the system.
Mildur
Product Manager
Posts: 9848
Liked: 2607 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Excluding large temporary files from 'entire computer' server backup

Post by Mildur »

Hi hpadm
hpadm wrote: "and results in 3GB larger (five times larger) daily incremental backups compared to if the temporary files are not there."
Is it 3GB or 3TB?
3 GB doesn't sound bad. If you choose file-level backup, the required storage for incremental backups could be much higher. Each changed file has to be backed up in full, while volume-level backup only backs up the changed blocks inside the files.

If you want to use file-level backup, I suggest moving the database files to a dedicated volume. Check with your vendor if that's possible.
You can then use a file-level backup job to back up the db files at the file level. For the other volumes, select the entire volume in the job.
This makes the job a hybrid job. All selected "entire volumes" will be processed in volume-level mode, and the volume with the db files will be processed in file-level mode.

Thanks
Fabian
Product Management Analyst @ Veeam Software

Post by hpadm » 1 person likes this post

The 400% overhead on daily backups is a nuisance, but the main issue is with full backups meant to be archived long term. The presence of two near-identical copies of the database more than doubles the size of the backup, and even with max deduplication, for some reason Veeam is not doing anything about the copies (maybe it works differently than I assume).

I could split and resize the storage partitions to keep the database and the linked document archive separate if needed. I don't know what the storage layout change would do to existing backups though. I guess in this case even FLR from orphaned backups would do...

If there is no nicer way, I'll probably go with file level backup. Though when I tried configuring it, the UI threw me off. There is no fancy folder tree as in the standalone client, it just wants file paths. There's also two checkboxes, "operating system" and "personal files", which really doesn't sound like it intends to do a volume-level snapshot.

Post by Mildur »

Database files normally deduplicate much worse than other files. That's probably why you don't see better deduplication between the two database copies. After the upgrade, the copied database will be different from the upgraded one.

If you can't use a dedicated disk and it's only 3GB, I still recommend ignoring it and using just volume-level backups. I don't believe you will save a lot of storage with file-level backup and excluding a 3GB file.
hpadm wrote: "I don't know what the storage layout change would do to existing backups though. I guess in this case even FLR from orphaned backups would do..."
Nothing.
Backups are backups. They keep their content and you can use them to do restores.
hpadm wrote: "There is no fancy folder tree as in the standalone client, it just wants file paths. There's also two checkboxes, 'operating system' and 'personal files', which really doesn't sound like it intends to do a volume-level snapshot."
Yes. Because if you use VBR to configure agent policies, one policy applies to every agent assigned to it. VBR doesn't know what the path structure will look like on each agent, so you have to provide the paths you want to back up.
Personal files will be a file-level backup, because it backs up specific files. Operating system will back up the system reserved partition and the system volume at the volume level.
https://helpcenter.veeam.com/docs/backu ... ml?ver=110

The behavior of these hybrid jobs is documented here:

https://helpcenter.veeam.com/docs/backu ... ml?ver=110
"A file-level backup job may contain entire volumes and individual folders or files from the other volumes (this is referred to as hybrid backup job). In this case, entire volumes are processed using volume-level backup mode while specific folders from other volumes are processed using file-level backup mode."

Post by hpadm » 1 person likes this post

I took a hard look at the backup sizes and the database characteristics.
- Only 3% of all 256kB blocks in the database stay the same from day to day, even when there is no user activity.
- But 99+% of the database file's content stays the same; the churn is at the byte level.
- The database 7z-compresses to 10% of its original size (with a large dictionary).
- The overhead of the two temporary copies in the full backup is +0.7x the database size, and represents a +30% increase of the full backup's size.
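
The two churn figures in the list above (share of unchanged 256 kB blocks versus share of unchanged bytes) can be measured by comparing two copies of the same file taken a day apart. A minimal Python sketch, assuming both copies are available as ordinary files; the function name is made up for illustration, and nothing here reflects how Veeam tracks changes internally:

```python
BLOCK_SIZE = 256 * 1024  # 256 kB, the block size mentioned above


def churn_stats(old_path, new_path, block_size=BLOCK_SIZE):
    """Return (fraction of identical blocks, fraction of identical bytes).

    Compares the files position by position, so the result is only
    meaningful if the file does not shift its contents around.
    """
    same_blocks = total_blocks = 0
    same_bytes = total_bytes = 0
    with open(old_path, "rb") as old, open(new_path, "rb") as new:
        while True:
            a = old.read(block_size)
            b = new.read(block_size)
            if not a and not b:
                break
            total_blocks += 1
            total_bytes += max(len(a), len(b))
            if a == b:
                same_blocks += 1
                same_bytes += len(a)
            else:
                # zip stops at the shorter block; extra bytes count as changed
                same_bytes += sum(x == y for x, y in zip(a, b))
    if total_blocks == 0:
        return 1.0, 1.0
    return same_blocks / total_blocks, same_bytes / total_bytes
```

A file where a few bytes change in nearly every block will show a low block fraction and a high byte fraction, which is the pattern described in the list.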

If I were to leave things as-is, the yearly storage overhead for these temporary files would be +20%.
If I were to additionally raise the short-term retention for incrementals from 2mo to 12mo, to have access to accurate historical snapshots, the total yearly storage overhead would become +95%. That would be a solid chunk of our NAS storage, and would definitely need addressing.

I am considering splitting up the job for full host backup, into OS + database + documents, to be able to give them vastly different backup and retention policies. For example, the OS is boring and doesn't care about long term retention. The database could probably use daily forward incremental with 5y retention (currently the Veeam UI caps at 2y). The documents archive is an ever-growing pile of static data that could probably get away with just 1 full backup per year. But I'm not sure if splitting it up like that would be a good idea.
Mildur wrote: "VBR doesn't know what the path structure will look like on each agent. So you have to provide the paths you want to back up."
Actually, the VBR console does have a 'Files' section which seems to be able to browse the full contents of every host with an agent on it (I think). So the capability is there, it might just be an unimplemented UI feature.
Mildur wrote: "Personal files will be a file-level backup, because it backs up specific files. Operating system will back up the system reserved partition and the system volume at the volume level."
What confuses me is, if you back up the OS volume, it will grab the user data as well. Checking one and leaving the other checked makes no sense, unless the "file-level operating system backup" would just grab select OS-related directories on C:\, skipping everything else. I was kinda expecting the 'personal files' checkbox to get greyed out. Maybe it exists for cases where the Users directory is moved via Folder redirection to a different volume?
Mildur wrote: "If you choose file-level backup, the required storage for incremental backups could be much higher. Each changed file has to be backed up, while volume-level backup only backs up the changed blocks inside the files."
Huh, wait, hold on a moment there. So there are no incremental differences being computed against previous versions of matching file paths, it's just all-or-nothing?
EDIT: the guide for the standalone agent 5.0 states as much. The guide for VBR agent management does not mention this show-stopping detail.

Post by Mildur »

I suggest creating two test backup jobs and letting them run for a few days. Then compare job performance and storage usage between the file-level and volume-level jobs.
hpadm wrote: "So the capability is there, it might just be an unimplemented UI feature."
Files only shows files from managed servers. So yes, it could probably be ported to work for agents as well.
But I assume that, with the recommended approach being entire computer or volume-level backups, not many Veeam users would use it on a daily basis.
hpadm wrote: "What confuses me is, if you back up the OS volume, it will grab the user data as well."
Correct. Selecting operating system will also include personal files, and they will be backed up at the volume level.
hpadm wrote: "Maybe it exists for cases where the Users directory is moved via Folder redirection to a different volume?"
Some end users (workstations) only want to protect personal files. If you have a Windows provisioning system in place, you don't have to back up the entire machine. Just deploy a new machine with it and restore your personal files.
hpadm wrote: "Huh, wait, hold on a moment there. So there are no incremental differences being computed against previous versions of matching file paths, it's just all-or-nothing?"
File-level backup jobs will always back up the entire file if it has changed. Unchanged files will not be backed up.
Example:
You have a Word document with a size of 200MB. Then you add a single letter on the last page.
With volume-level backup, only the 1-2kB of changed blocks would be backed up. With file-level backup, the entire 200MB would be backed up.
Now imagine hundreds of files each changing just a little bit. They will be backed up entirely each time they change.

File level backups can take up more space and will have lower backup performance compared to a volume backup.
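
To put rough numbers on the difference between the two modes, here is a small Python sketch that estimates how much data one incremental run would have to copy. The 256 kB block size, the file-relative block grid and the function names are assumptions for illustration only; real block sizes and change tracking depend on the job settings and the product's internals.

```python
BLOCK_SIZE = 256 * 1024  # assumed block size, for illustration only


def file_level_increment(files):
    """Bytes copied when every changed file is backed up in full.

    `files` is a list of (file_size, changed_offsets) tuples, where
    changed_offsets lists the byte offsets that were modified.
    """
    return sum(size for size, changed in files if changed)


def block_level_increment(files, block_size=BLOCK_SIZE):
    """Bytes copied when only the blocks containing changes are backed up."""
    total = 0
    for size, changed in files:
        dirty_blocks = {offset // block_size for offset in changed}
        for block in dirty_blocks:
            start = block * block_size
            # the last block of a file may be shorter than block_size
            total += min(block_size, size - start)
    return total


# A 200 MB document with a single byte changed near the end.
doc_size = 200 * 1024 * 1024
files = [(doc_size, [doc_size - 10])]
print(file_level_increment(files))   # the whole file
print(block_level_increment(files))  # a single block
```

With hundreds of slightly changed files in the list, the gap between the two totals grows accordingly.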

Thanks
Fabian

Post by hpadm »

Mildur wrote: "File level backups can take up more space and will have lower backup performance compared to a volume backup."
Yeah... so it surprises me that in multiple previous threads where people were asking for volume level backup with file exclusion, the suggested answer was to use File-level backup. The only remaining explanation for that would be if the File-level backup was able to upgrade to volume-level, but would still apply the configured file exclusions. In that case, I'm confused why Veeam did not add file exclusion UI into volume-level and full computer backup modes as well. According to MS documentation, VSS is clearly able to apply file exclusions while performing a volume-level snapshot (with some considerations), however Veeam is not using it.

Post by Mildur »

I will test it in my lab.
Someone on my team told me a few minutes ago that the key should work.
I will update the topic once I have tested it.

Thanks
Fabian

Post by hpadm »

I have switched the backup job to File-level backup, selected OS and D:\ to back up, and added an exclusion entry for D:\temp\*. I then observed the backup job. I saw that at the beginning, it creates FilesNotToSnapshot\VeeamExcludePaths and fills in paths to the usual stuff, like Recycle Bin, swapfile, temporary user files and some System Volume Information stuff. Disturbingly, my exclusion path was not included in there. The backup completed, I inspected it, and I see that the excluded path is still there.

So, to summarize:
- when the volume root is selected for backup, the procedure gets upgraded to volume-level backup and behaves exactly like before
- path exclusions have no effect
- adding a separate set of values under FilesNotToSnapshot\ has no effect
- freezing Veeam's values from being altered, and appending my exclusion paths to it, causes an error.

Post by Mildur »

Thanks for testing it.
We will surely check it. If the documentation needs an adjustment, I will make sure that we can update it.

Post by hpadm »

Thank you for bearing with me. I have reverted everything to a stable state and then tried again, carefully.

I noticed that I made a mistake when entering the exclusion paths. It is not perfectly clear, but to make this work, the user needs to enter a folder path (cannot exclude individual files), and must not add a wildcard character, because this changes it from a folder path to an exclusion mask, and it skips adding these to FilesNotToSnapshot. Which is odd, because VSS does accept wildcards here. Anyway, when I formatted the paths properly, they did get added to Veeam's custom exclusion registry setting.

I found that Veeam's way of converting exclusion file paths into patterns inserted into FilesNotToSnapshot is defective. Veeam converts "path" into "path\* /s", whereas all other entries in that key use "*.* /s". When I ran my tests, I discovered that files without a dot suffix were being excluded, but files with a dot suffix, most importantly "data.db", were not being excluded. This is a case where * is being interpreted differently than *.*.
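
The conversion observed above can be modeled in a few lines of Python. This is only a sketch of the behavior reported in this thread (folder paths become a `path\* /s` entry, entries containing a wildcard are treated as masks and skipped); it is not Veeam's actual code, the function name is made up, and treating `?` as a wildcard as well is an assumption.

```python
def to_vss_exclusion(entry):
    """Model of the observed conversion of a job exclusion entry.

    Returns the string that would be written to FilesNotToSnapshot, or
    None when the entry is treated as a mask and left out.
    """
    if "*" in entry or "?" in entry:
        # Observed: entries with wildcards are treated as masks and skipped.
        return None
    folder = entry.rstrip("\\")
    # Observed: folder paths become "<folder>\* /s" (/s = recurse).
    return folder + "\\* /s"


for entry in (r"D:\temp", "D:\\temp\\", r"D:\temp\*"):
    print(repr(entry), "->", to_vss_exclusion(entry))
```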

I also noticed that I didn't fully confirm how things work, and ran today's testing in incremental backup mode to speed things up. Just in case, I redid all my tests in Active Full mode.

Finally, I noticed that various backup modes have certain quirks:
- When defining exclusions in File-level mode, the text box description says "Exclude masks:", even though to make this work the entry must not be formatted in the form of a mask.
- When defining exclusions in Volume-level mode, the boxes all say "Volume name:". It is possible to enter a folder path, and it will get a folder-like icon, but then I get a warning, and these entries get skipped.
- Entire Computer mode does not have a UI for defining exclusions, even though technically there's nothing preventing it from being there.

Post by hpadm » 1 person likes this post

To summarize the above:
- Veeam skips exclusion file paths containing wildcards even though VSS supports them.
- Veeam converts exclusion file paths into a wildcard pattern for VSS, but mistakenly uses * instead of *.* so it misses stuff.
- The UI only allows folder exclusions in one of three modes, but technically it should work for all of them.

Suggestions for improvement:
- When an exclusion entry is a path without wildcards, append "\*.* /s". At the moment there is a mistake and files with a dot are not excluded.
- When an exclusion entry is a path with wildcards, do not skip it. Add it to the list for VSS, just append /s for recursion.
- The UI for full computer, volume-level and file-level backups could be merged into a single screen, and the capabilities for selecting what to back up and what to exclude could be merged together. The system would then display the proposed backup strategy.

Post by hpadm »

I have submitted the issue with the wrong wildcard used as Case #05688841.
I have submitted the suggestion to make the exclusion patterns not restricted to just folders, as Case #05688805.

Post by Mildur »

Thanks for the summary and the cases.

Thanks
Fabian

Post by hpadm »

Unfortunately both cases were closed due to unavailability of staff. Which is a problem since both of these are fairly basic mistakes and they have significant consequences - one causes file exclusions to not work as documented, the other prevents (otherwise supported) file-level exclusions and wildcard-based exclusions. I really wish these would get picked up and resolved by the time of VBR 12's release.

One thing I found just now that confuses me, though. The builtin exclusion rules include

$UserProfile$\AppData\Local\Temp\* /s

which also uses the single wildcard format, and thus should also be causing exclusion issues. However, I checked my latest backup, and those files are all gone.

Post by Mildur »

Hi

That can happen with cases for our free and community products. They only get support on a best-effort basis. I will check the behavior later this week in my lab. I'm currently out of office.

Thanks
Fabian

Post by hpadm » 1 person likes this post

My conclusion in the former case was incorrect. Wildcard '*' does work in a clean test environment. My exclusion tests on the database server were failing due to some other factor. I have mirrored the database files into my test environment and reproduced the behavior there. Close examination of the files' attributes and permissions revealed that the 'Read Only' attribute is the cause.
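
Since the Read Only attribute turned out to be the deciding factor, it helps to know in advance which files under an excluded folder carry it. A small Python sketch, assuming it is acceptable to walk the folder on the live system. It checks the write bit that Python reports via `os.stat`, which on Windows reflects the Read Only attribute. The function name and the example folder are placeholders.

```python
import os
import stat


def find_read_only_files(root):
    """Yield paths of regular files under `root` that are not writable
    according to their mode bits (the Read Only attribute on Windows)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.stat(path).st_mode
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            if stat.S_ISREG(mode) and not mode & stat.S_IWRITE:
                yield path


if __name__ == "__main__":
    for path in find_read_only_files(r"D:\temp"):  # placeholder folder
        print(path)
```

Any path this prints is a candidate for not being excluded, going by the findings in this thread.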

My other conclusion, that Veeam/VSS only uses Veeam's 'VeeamExcludePaths', was also incorrect. I was too focused on my issue and my testing was too narrow. The reality is that every entry under the FilesNotToSnapshot registry key gets added together. So the admin is not entirely restricted by Veeam's UI; if they really need to, they can manually write a local exclusion entry using VSS's syntax.
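
Because every value under the key contributes to the final exclusion list, it can be useful to dump all of them at once. A sketch using Python's standard `winreg` module (Windows only). The key path is the usual VSS location, `HKLM\SYSTEM\CurrentControlSet\Control\BackupRestore\FilesNotToSnapshot`; the helper names are made up, the values are assumed to be multi-string (REG_MULTI_SZ) entries, and the `/s` parsing follows the syntax seen in this thread.

```python
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\BackupRestore\FilesNotToSnapshot"


def parse_entry(entry):
    """Split one exclusion string into (pattern, recursive)."""
    entry = entry.strip()
    if entry.lower().endswith(" /s"):
        return entry[:-3].rstrip(), True
    return entry, False


def read_exclusions():
    """Return {value_name: [(pattern, recursive), ...]} for every value
    under FilesNotToSnapshot. Windows only."""
    import winreg  # imported here so parse_entry works on any platform

    result = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value_count = winreg.QueryInfoKey(key)[1]
        for index in range(value_count):
            name, data, value_type = winreg.EnumValue(key, index)
            if value_type == winreg.REG_MULTI_SZ:
                result[name] = [parse_entry(line) for line in data if line]
    return result
```

Run on the server, `read_exclusions()` should list Veeam's 'VeeamExcludePaths' next to any manually added value, which makes it easy to confirm that a local entry is actually in place before a backup run.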

veeam-agent-for-windows-f33/filesnottos ... 75094.html
I have discovered that in June 2021, a Veeam user encountered this exact behavior, went through the same investigation steps, questioned the * vs *.* difference, identified the read-only attribute as the culprit, and also did tests with other attributes. However they seem to have given up pursuing the topic. It is unfortunate that I was unable to find this thread before. Even now, it requires some very specific keywords to locate via web search. I found it while searching for 'VSS FilesNotToSnapshot read only not excluded'.

Here is my revised list of recommendations to Veeam:
  1. Improve documentation on 'Specifying Folders to Back Up' for VBR and for Agent. The 'VSS exclusion rules skip over read-only files' thing seems to be undocumented behavior - I could not find any relevant mention of 'read only' in MS documentation for VSS (I searched the PDF version). If you have a developer support contract with Microsoft, you could ask them for an explanation. It could be of value to warn users that read-only files will not be excluded, for some reason. I have to say though, those pages are really heavy on warnings, exceptions and side-notes.
  2. Revise the UI and exclusion pattern logic, as outlined in case #05688805. The manual currently warns "If you include a whole volume in the file-level backup, you cannot apply filters to include or exclude files of a specific type in/from the backup. You can only exclude specific folders that reside on the volume.". However, this is an artificial restriction. VSS allows excluding individual files, allows excluding patterns with wildcards, and allows using certain documented $variables. I assume this happened because the 'file-level backup upgraded to volume-level' mechanism was added later on, and was bolted onto the existing UI. As it is right now, exclusion entries with wildcards get silently skipped over.