alwubfje
Novice
Posts: 8
Liked: never
Joined: May 20, 2026 12:20 am
Full Name: John Smith
Contact:

Design Philosophy of Backup Size Estimation in a Data Domain Environment

Post by alwubfje »

We would like to confirm the design philosophy behind the backup size estimation performed during the pre-processing phase in Veeam, before backup data is transferred to the repository.

In this environment, we are using Veeam Backup & Replication v12, and Dell EMC Data Domain is configured as a backup repository via SOBR (Scale-Out Backup Repository).

In terms of actual behavior, the backup size estimation in Veeam does not take into account the space reduction effects of Data Domain deduplication.
As a result, a job failed during the pre-processing phase because the estimation is based on logical size, even though sufficient free space actually existed on the Data Domain thanks to deduplication.

Therefore, we would like not only to confirm this behavior but also to understand the design intent behind it, as well as whether there are other considerations that could impact future operations.

We would appreciate your clarification on the following points:
  1. Our understanding is that this behavior represents a conservative design intended to prevent job failures due to insufficient capacity during data transfer.
    Is this understanding correct?
    If there are any other reasons, please let us know.
  2. Why is it not explicitly stated as a limitation or consideration that “the effects of deduplication storage are not taken into account in this estimation”?
  3. Similarly, are there any other designs, configurations, or behaviors where characteristics of deduplication storage (such as Data Domain) are not taken into account?
    If so, please also advise on any operational precautions or risks associated with them.
  4. We understand that this behavior can currently be avoided by adjusting registry keys. However, dependency on such undocumented parameters raises significant concerns regarding operational continuity, especially for versions v13 and later.
    For example, if these keys are removed, modified, or disabled in future versions, it may no longer be possible to avoid the same issue.
    Therefore, please clarify the following:
    • Are these tuning parameters (for example, SOBRFullSizeEstimatePercent) expected to remain available in future versions, including v13 and beyond?
    • Are there any plans to implement an official feature in the VBR console that allows enabling/disabling size estimation or configuring thresholds?

In the past, we used Veritas NetBackup, where no such size estimation was performed during the pre-processing phase, and jobs did not stop prior to data transfer for this reason.

Due to this behavioral difference, we are strongly concerned that other implicit design constraints may exist and could affect future operations.

We appreciate your response.
david.domask
Product Manager
Posts: 3674
Liked: 896 times
Joined: Jun 28, 2016 12:12 pm
Contact:

Re: Design Philosophy of Backup Size Estimation in a Data Domain Environment

Post by david.domask »

Hi alwubfje,

Since you're using Scale-out Backup Repositories (SOBR), there are actually two logical estimates you need to take into consideration here:

SOBR Estimates => SOBR size estimates and Free space calculations

DataDomain space savings

For SOBR, the above linked User Guide page and KB article explain the reasoning and logic behind this approach.

For DataDomain (and any deduplication appliance / fast-clone capable file system), the answer is simply that we cannot accurately know the efficacy of the space-saving mechanisms in advance, even with estimates provided by the vendors. Too many factors, such as what is in the workload, use of encryption, etc., come into play, and that makes such estimates less reliable than required to avoid filling the physical space on the appliance.

The end result of being too optimistic about the space-saving mechanisms is that your deduplication appliance runs out of physical space, and the normal garbage collection / deduplication processes are impacted. It is not always straightforward to recover from a dedup appliance that is physically full, so we err on the side of caution and fail the job based on logical estimates to avoid such situations.
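To make the idea of a logical pre-flight check concrete, here is a minimal sketch in Python. It is not Veeam's implementation: the `Extent` class, the `estimate_backup_bytes` and `pick_extent` functions, and the percentage values are all hypothetical names and numbers chosen for illustration. The only assumption taken from this thread is that the estimate is derived from the logical source size and compared against the free space an extent reports, with no credit given for deduplication.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Extent:
    """Hypothetical stand-in for one SOBR extent."""
    name: str
    free_bytes: int  # free space as reported by the extent


def estimate_backup_bytes(source_bytes: int, estimate_percent: int) -> int:
    """Logical estimate: a fixed percentage of the source size.

    Deduplication savings are deliberately NOT considered here.
    """
    return source_bytes * estimate_percent // 100


def pick_extent(
    extents: Iterable[Extent], source_bytes: int, estimate_percent: int
) -> Optional[Extent]:
    """Return the first extent whose free space covers the estimate.

    Returns None when no extent qualifies, which corresponds to the job
    being stopped before any data is transferred.
    """
    needed = estimate_backup_bytes(source_bytes, estimate_percent)
    for extent in extents:
        if extent.free_bytes >= needed:
            return extent
    return None


TIB = 1024 ** 4
extents = [Extent("dd-extent-1", 3 * TIB)]

# 10 TiB source at an assumed 50% estimate needs 5 TiB: no extent fits.
blocked = pick_extent(extents, 10 * TIB, 50)

# Lowering the assumed percentage to 20% needs 2 TiB: the extent fits.
allowed = pick_extent(extents, 10 * TIB, 20)
```

In this sketch, lowering the percentage plays the role that a tuning parameter would play: it makes the check less conservative, at the cost of relying on space savings that are only known after the data has been written.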

The configuration parameters you're discussing have been in place for many versions (I believe since the introduction of SOBRs) and there is no intention of removing them -- if there are changes to a configuration parameter name, we will include them in the release notes for new versions.
David Domask | Product Management: Principal Analyst