Host-based backup of VMware vSphere VMs.
FrenchBlue
Expert
Posts: 138
Liked: 23 times
Joined: Mar 18, 2021 6:04 pm

Questions on deduplication

Post by FrenchBlue »

Hello,

I've read the documentation at https://helpcenter.veeam.com/docs/backu ... ml?ver=120 but it's not a simple topic :)
- I understand it's only online (inline) dedup, not offline (post-process), right?
- The doc says that dedup is performed by the Veeam Data Mover, so it effectively happens on the repository, not on the backup proxy? (That would make sense, I just want to be sure.)
- At which level does dedup apply: within a given backup job, or globally? For example, if I have 2 backup jobs, each backing up a single VM deployed from the same template, will dedup work across them?
- Does dedup apply in the same way to performance and capacity tiers?

Thanks.
david.domask
Veeam Software
Posts: 2756
Liked: 631 times
Joined: Jun 28, 2016 12:12 pm

Re: Questions on deduplication

Post by david.domask »

Hi FrenchBlue,

Did the wrong link maybe get copied and pasted? That one looks like a topic about how to import backups from Object Storage repositories in a disaster recovery event, and I don't see deduplication and compression being discussed there.

Check our User Guide link here: https://helpcenter.veeam.com/docs/backu ... ation.html

I think it answers your questions. As noted in the document, space reduction happens on both the proxy and the repository to maximize efficiency and potential savings.
David Domask | Product Management: Principal Analyst
FrenchBlue

Re: Questions on deduplication

Post by FrenchBlue »

Hello, yes, sorry, that was a bad paste on my part. I've corrected it with the proper link, but my questions still stand :)
david.domask

Re: Questions on deduplication

Post by david.domask » 1 person likes this post

Source side and target side refer to the backup proxy and the repository respectively. I think that may be what is causing the confusion, so read the excerpt below with that mapping in mind:
Veeam Backup & Replication uses Veeam Data Movers to deduplicate VM data:
  • Veeam Data Mover in the source side deduplicates VM data at the level of VM disks. Before the source-side Veeam Data Mover starts processing a VM disk, it obtains digests for the previous restore point in the backup chain from Veeam Data Mover in the target side. The source-side Veeam Data Mover consolidates this information with CBT information from the hypervisor and filters VM disk data based on it. If some data block exists in the previous restore point for this VM, the source-side Veeam Data Mover does not transport this data block to the target. In addition, in the case of thin disks, the source-side Veeam Data Mover skips unallocated space.
  • Veeam Data Mover in the target side deduplicates VM data at the level of the backup file. It processes data for all VM disks of all VMs in the job. The target-side Veeam Data Mover uses digests to detect identical data blocks in transported data and stores only unique data blocks in the resulting backup file.
Deduplication happens per job, so it is not global across multiple jobs. The same data movers are used for both backups to the Performance Tier and offloads (copy and move) to the Capacity Tier, though the behavior is a bit different: since offloads to the Capacity Tier work from backups that are already deduplicated and compressed, the data movers won't yield much additional savings there.
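
To make the per-job scope concrete, here is a toy Python sketch of the two stages described above. It is only an illustration under loud assumptions, not Veeam's implementation: the 4-byte block size, SHA-256 digests, and every name in it (`source_side_filter`, `JobBackupFile`, `run_full_backup`) are hypothetical, and skipping unallocated space on thin disks is left out.

```python
import hashlib

BLOCK_SIZE = 4  # toy block size in bytes (assumption); real block sizes are far larger


def digest(block: bytes) -> str:
    """Fingerprint of one data block (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(block).hexdigest()


def split_blocks(disk: bytes) -> list[bytes]:
    """Cut a disk image into fixed-size blocks."""
    return [disk[i:i + BLOCK_SIZE] for i in range(0, len(disk), BLOCK_SIZE)]


def source_side_filter(disk: bytes, previous_digests: set[str],
                       changed_indexes: set[int]) -> dict[int, bytes]:
    """Hypothetical stand-in for the source-side (proxy) step.

    Only blocks flagged as changed (the CBT stand-in) AND not already present
    in the previous restore point are sent to the target.
    """
    to_send = {}
    for index, block in enumerate(split_blocks(disk)):
        if index not in changed_indexes:
            continue  # CBT says this block did not change
        if digest(block) in previous_digests:
            continue  # block already exists in the previous restore point
        to_send[index] = block
    return to_send


class JobBackupFile:
    """Hypothetical stand-in for the target-side (repository) step of ONE job."""

    def __init__(self) -> None:
        self.unique_blocks: dict[str, bytes] = {}       # digest -> stored block
        self.layout: dict[tuple[str, int], str] = {}    # (vm, index) -> digest

    def write(self, vm_name: str, blocks: dict[int, bytes]) -> None:
        for index, block in blocks.items():
            d = digest(block)
            self.unique_blocks.setdefault(d, block)  # store each unique block once
            self.layout[(vm_name, index)] = d


def run_full_backup(vms: dict[str, bytes]) -> JobBackupFile:
    """First (full) run of a job: no previous digests, every block counts as changed."""
    backup = JobBackupFile()
    for vm_name, disk in vms.items():
        all_indexes = set(range(len(split_blocks(disk))))
        backup.write(vm_name, source_side_filter(disk, set(), all_indexes))
    return backup


if __name__ == "__main__":
    template = b"AAAABBBBCCCC"  # two VMs cloned from the same template

    one_job = run_full_backup({"vm1": template, "vm2": template})
    job_a = run_full_backup({"vm1": template})
    job_b = run_full_backup({"vm2": template})

    print("one job, two VMs:", len(one_job.unique_blocks), "blocks stored")
    print("two jobs, one VM each:",
          len(job_a.unique_blocks) + len(job_b.unique_blocks), "blocks stored")
```

In this toy model, one job containing both template-based VMs stores 3 unique blocks, while two separate jobs store 3 each (6 in total), which mirrors the per-job behavior described above.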
David Domask | Product Management: Principal Analyst
FrenchBlue

Re: Questions on deduplication

Post by FrenchBlue » 1 person likes this post

Thanks, all clear now.