Agent-based backup of Windows, Linux, Mac, AIX and Solaris machines.
chrisflyckelen
Service Provider
Posts: 59
Liked: 7 times
Joined: Oct 15, 2019 7:51 am
Contact:

Backup Size much larger than original size

Post by chrisflyckelen »

Hello guys,

I am backing up a virtual Microsoft failover cluster using the Veeam Agent for Windows. The job runs every day and creates synthetic full backups on Saturdays.
The original size of the active cluster node is 16.9 TB. I noticed that the synthetic full backup is about three times the original size (45.4 TB).

This consumes too much storage space, and I have to change the retention policy to avoid a backup outage.

What can be the reason for that?

Case ID 04802722

Greetz,
Christian
Mildur
Product Manager
Posts: 8549
Liked: 2223 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Backup Size much larger than original size

Post by Mildur »

What are you using as a backup repo?
If you use a ReFS or XFS formatted backup repo, FastClone can be used with synthetic fulls.
In that case the storage usage should only be around 18-24 TB.
Without FastClone, you have three full backups plus the daily increments, so 45 TB is realistic. Or do you mean that a single VBK file is 45 TB?

https://helpcenter.veeam.com/docs/backu ... ml?ver=110
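
To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. The full and increment sizes are placeholders loosely based on the figures in this thread, and the fast-clone estimate simply assumes each synthetic full only physically stores the blocks changed since the previous one:

```python
# Rough repository-usage estimate for a weekly-synthetic-full chain.
# All sizes are hypothetical placeholders, not taken from any job log.
full_tb = 15.0        # one compressed full backup (assumed)
increment_tb = 0.2    # one daily increment (assumed)
fulls_kept = 3        # full backups kept by retention
increments_kept = 18  # daily increments kept alongside them (assumed)

# Without fast clone, every synthetic full is written out in full.
without_fastclone = fulls_kept * full_tb + increments_kept * increment_tb

# With ReFS/XFS fast clone, later fulls mostly reference existing blocks,
# so roughly one physical full plus the changed blocks is stored.
with_fastclone = full_tb + (increments_kept + fulls_kept - 1) * increment_tb

print(f"Without fast clone: ~{without_fastclone:.0f} TB consumed")
print(f"With fast clone:    ~{with_fastclone:.0f} TB consumed")
```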
Product Management Analyst @ Veeam Software
chrisflyckelen
Service Provider
Posts: 59
Liked: 7 times
Joined: Oct 15, 2019 7:51 am
Contact:

Re: Backup Size much larger than original size

Post by chrisflyckelen »

Hey Mildur,

thanks for your reply.
We are using a ReFS-formatted repo on a physical machine running Windows Server 2019. I assume this configuration meets the requirements, as FastClone is enabled by default.
The VBK file is around 27 TB due to compression.

We have had some conversation with the support team. The action plan is to run a compact job. We configured it to run last night, but it didn't.
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM »

Hello,

Did the compact job fail with an error, or was it just not started?

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 »

I have the same problem now. I recently started backing up our new file server cluster. The cluster is not yet in production. Files are being copied from the old cluster once a week using RoboCopy. The initial .vbk backup file was just over 17 TB. Since then I have run three incremental backups manually, once a week (the job is not yet scheduled). The .vib files are between 100 and 200 GB. Today a synthetic full was created, and the new .vbk file ended up being just over 27 TB. I have never seen this behaviour before; in all other backup jobs the synthetic fulls are virtually the same size.
We are using Windows Server 2016 and the ReFS file system, and fast clone was used when the synthetic full was created.
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM »

Hello,

I suppose some new data appeared on the source and was processed during the latest incremental session before the synthetic full was created. Basically, the latest full must contain the full image of the workload. If my assumption is wrong, I'd suggest contacting our support team, because a precise examination of the infrastructure is needed. By the way, you may also check the ReFS space savings; some examples are described in this blog.
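
If it helps, a very rough way to see the ReFS savings from fast clone is to compare the nominal size of the backup files with what the volume actually reports as used. A minimal Python sketch, assuming the volume holds nothing but this backup chain and using a placeholder path:

```python
import os
import shutil

# Compare the logical size of all backup files with the space the volume
# reports as used. Only meaningful if the volume contains just this chain.
repo = r"D:\Backups\FileCluster"   # placeholder path

nominal = sum(
    os.path.getsize(os.path.join(root, name))
    for root, _, files in os.walk(repo)
    for name in files
)
used = shutil.disk_usage(os.path.splitdrive(repo)[0] + os.sep).used

print(f"Sum of file sizes:      {nominal / 1024**4:.2f} TiB")
print(f"Used on volume:         {used / 1024**4:.2f} TiB")
print(f"Apparent clone savings: {(nominal - used) / 1024**4:.2f} TiB")
```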

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 »

New data appeared on the source, sure. But not 10 TB of compressed data. Not by a long shot. The incrementals together have read approx. 2 TB of uncompressed data since the initial full backup, and that was compressed to about 700 GB according to the job logs. And isn't a synthetic full supposed to be both compacted and defragmented? I don't understand what is happening. What will happen when the next synthetic full is created? I don't understand how 700 GB of compressed data can increase the size of the full backup file from 17 to 27 TB.

I configured the backup job with a GFS policy to keep four weekly and three monthly fulls, but I don't know if I have room for that, even though B&R inline data deduplication is used. And I haven't even started making backup copies of this cluster...

The size of this file server cluster, and the size of the backups, are starting to become a problem, and I must seriously consider backing up the two servers in separate jobs instead of backing up the whole shebang as a failover cluster. Especially if the synthetic fulls keep growing like this, and also since the health check took over 26 hours with the 17 TB backup. How long will the health check take on a 27 TB backup? It is simply not a realistic scenario.
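
For what it's worth, a simple linear extrapolation from those figures (assuming the health check throughput stays roughly constant, which it may not):

```python
# Linear extrapolation of health-check duration from the numbers above.
hours_for_17tb = 26
tb_per_hour = 17 / hours_for_17tb                 # ~0.65 TB/h
mb_per_sec = tb_per_hour * 1024**4 / 3600 / 1e6   # ~200 MB/s

print(f"Observed throughput: ~{mb_per_sec:.0f} MB/s")
print(f"Estimate for 27 TB:  ~{27 / tb_per_hour:.0f} hours")
```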
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM »

Hello,

From my point of view, this behavior is not expected. I'd suggest contacting our support team and asking our engineers to look at the issue.

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 » 1 person likes this post

Case #04826794
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 » 2 people like this post

Veeam support has found the probable cause of the large synthetic full: the dedup was turned off because there were too many blocks in the dedup tree:

"[17.05.2021 14:09:13.806] < 11748> stg | WARN|Dedup index limit of [8388608] blocks was reached. Next block will not be added to dedup tree."

Following support's suggestion, I have now changed the Storage setting in the job to "Local target (Large blocks)" and started an Active Full. Stay tuned.
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 »

The Active Full has completed. The result was a new full backup file of approx. 17 TB, as expected. The new synthetic full will be created on Monday, 7th June. We'll see then how it looks.

A question that I hope someone can answer: is the "dedup tree" an index that is built for each backup file, for the whole backup chain, or for the entire repository?
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM »

Hello,

If you're talking about the log line above, it's related to an index that is built for each backup file. However, I'd leave analyzing and interpreting debug logs to our support team; sometimes more context is required, and we cannot troubleshoot technical issues over forum posts.

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 »

I have an open case, as mentioned above. It was your support technician who mentioned the dedup tree to me.
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 »

When enabling inline data deduplication in a backup job, there are four storage optimization options: "Local target (Large blocks)" (as mentioned above), "Local target", "LAN target" and "WAN target". But in a backup copy job you can enable inline data dedup, yet there are no such options. Why?

What happens if I want to make backup copies of these large backups when I cannot choose Large blocks, even though that is required in the backup job?
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM »

A backup copy operates with the block size used by the source job. You need to change the storage optimization settings of the source job, run an active full with the source job, and then run the backup copy afterwards. On the other hand, I'm not sure that this modification would address the existing issue.

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 »

Okay, it uses the same block size as the source. Super!

Well, if the Large blocks scenario doesn't solve the problem with the overly large synthetic full, then we will probably have to rethink the whole thing and either back up the two servers in separate jobs or use File Share backups instead. I don't really know, though, whether it is a good idea to back up a cluster that way. If one of the servers goes down, its volumes (logical drives) fail over to the other server. How does that affect the backup job(s)? I guess that B&R can handle this if the job is set to back up a failover cluster, like it is now, but I don't know what happens if the two servers are backed up in separate jobs, or if File Share backups are used.
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM »

Hello,

May I ask you to clarify once again the problem that we're trying to solve?

I thought the purpose was to understand how a small amount of compressed data can increase the size of the synthetic full backup from 17 TB to 27 TB; at least that was mentioned in the post above. One more point to consider is that large blocks provide the lowest deduplication ratio, because the likelihood of finding two identical large blocks is lower than for two blocks of a smaller size. If you want to achieve a better deduplication ratio and reduce the overall backup file size, you should decrease the block size, for instance to 512 KB (LAN target). However, in this case you can potentially exhaust the memory and CPU resources of your backup repository, because a large deduplication table is produced when a large file consists of small blocks.

Also, I'm not sure that splitting the job into two separate ones is a reliable approach to processing a cluster.

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 »

As I wrote above, the likely cause of the large synthetic full is that inline data deduplication was turned off because the "dedup tree" reached its limit on the number of blocks (8388608). So your support technician asked me to choose "Large blocks" in order to reduce the number of blocks in the "dedup tree". I don't think going for smaller blocks is an option here, because then I guess the "dedup tree" would fill up even sooner?
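
For what it's worth, a rough calculation seems to bear that out. A sketch using the ~17 TB source size from this thread and the documented per-setting block sizes, measured against the 8388608-block limit from the log line; it counts all source blocks rather than only unique ones, so treat it as an upper-bound estimate:

```python
# Blocks the dedup index would need to track for a ~17 TB source at each
# storage optimization setting, vs. the 8,388,608-block cap from the log
# line earlier in this thread. Counts all blocks, not just unique ones.
source_bytes = 17 * 1024**4
index_cap = 8_388_608

block_sizes_kb = {
    "WAN target": 256,
    "LAN target": 512,
    "Local target": 1024,
    "Local target (Large blocks)": 4096,
}

for setting, kb in block_sizes_kb.items():
    blocks = source_bytes / (kb * 1024)
    verdict = "exceeds cap" if blocks > index_cap else "within cap"
    print(f"{setting:29s} ~{blocks / 1e6:5.1f} M blocks -> {verdict}")
```

Only the 4 MB "Large blocks" setting keeps a source of this size under the cap; the smaller block sizes hit the limit even sooner, as you suspected.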
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM » 1 person likes this post

OK, I see now. You may certainly try what our support has suggested. On the other hand, I'm not sure I clearly understand the purpose of this limit and which criteria are used to determine it. I'm going to clarify that; stay tuned.

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 » 1 person likes this post

The use of Large blocks seems to have solved the problem: the new synthetic full that was created during the night ended up being approx. 17 TB, just like the active full, and not 27 TB like last time.
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM » 1 person likes this post

Hello,

This deduplication tree limit is a hard-coded product setting. The default value is 8 million blocks; once it's exceeded, the backup stops deduplicating data in the backup storage and as a result stores more data. In that case, the size of the synthetic backup will be a multiple of the number of nodes in the WSFC. The limit can be changed with a registry key, but that is not recommended, because a large deduplication table could provoke performance issues on the backup repository. I believe the workaround provided by our support engineer is solid enough, and I suggest sticking with the recommended approach.

Thanks!
perjonsson1960
Veteran
Posts: 443
Liked: 44 times
Joined: Jun 06, 2018 5:41 am
Full Name: Per Jonsson
Location: Sweden
Contact:

Re: Backup Size much larger than original size

Post by perjonsson1960 » 1 person likes this post

Thank you very much!

It says in the job storage configuration that "Local target (Large blocks)" is required for processing source machines with disks larger than 100 TB. So I suppose that I have some margin to grow with my 17 TB file server cluster... :-)
Steve-nIP
Service Provider
Posts: 117
Liked: 49 times
Joined: Feb 06, 2018 10:08 am
Full Name: Steve
Contact:

Re: Backup Size much larger than original size

Post by Steve-nIP » 1 person likes this post

perjonsson1960 wrote: May 31, 2021 8:49 am The Veeam support has found the probable cause for the large synthetic full; The dedup was turned off because there were too many blocks in the dedup tree:

"[17.05.2021 14:09:13.806] < 11748> stg | WARN|Dedup index limit of [8388608] blocks was reached. Next block will not be added to dedup tree."

According to the support's suggestion, I have now changed the Storage setting in the job to "Local target (Large blocks)" and started an Active Full. Stay tuned.
Thanks for this, this is fantastic information; I've hit the same issue myself without ever knowing it. It would be nice if this issue were exposed in the UI, with a recommendation to use 4 MB blocks and run an active full.
PetrM
Veeam Software
Posts: 3229
Liked: 520 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Backup Size much larger than original size

Post by PetrM »

Hi Steve,

Many thanks for your feedback, your request is noted!

Thanks!