Comprehensive data protection for all workloads
tntteam
Enthusiast
Posts: 68
Liked: 5 times
Joined: Aug 28, 2015 12:40 pm
Full Name: tntteam
Contact:

Theoretical question about deduplication

Post by tntteam »

Hi,

I have some questions regarding how deduplication is supposed to work with Veeam.

Veeam deduplication works on a "per job" basis.
I set up a backup job of 117 VMs (a mix of Windows 2008, Windows 2012, and Linux).
I scheduled a full backup every Saturday and started the job for the first time on a Thursday.
So I get a .vbk (= full) on Thursday, one .vib on Friday, then another .vbk on Saturday.
Deduplication is on and set to "local target".
Compression is on, with default settings.

Why is my first full backup 3.4 TB, and the second full backup, taken on Saturday (day+2), 3.4 TB too?
The data can't have changed that much.

What am I missing? Does "per job" deduplication actually mean "per run of each job", so that deduplication doesn't work between subsequent runs of the same job?

Sorry if this is not very clear :(
PTide
Product Manager
Posts: 6551
Liked: 765 times
Joined: May 19, 2015 1:46 pm
Contact:

Re: Theoretical question about deduplication

Post by PTide »

Hi,
tntteam wrote: Why is my first full backup 3.4 TB, and the second full backup, taken on Saturday (day+2), 3.4 TB too?
Deduplication takes place inside a job, across its VMs. If many VMs in the same job hold similar data (OS files, database files, etc.), the resulting full backup file will be smaller than the total amount of data on all VMs, because identical blocks are stored only once. So, the statement
tntteam wrote:<...>deduplication doesn't work between each subsequent run of the same job<...>
is correct.
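
To make the distinction concrete, here is a toy Python sketch of block-level deduplication that is scoped to a single backup file. It is only an illustration under stated assumptions, not how Veeam is implemented: the 4-byte block size, the SHA-256 block index, and the names `split_blocks` and `write_backup_file` are all hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # toy block size in bytes (assumption); real block sizes are far larger


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut a byte string into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def write_backup_file(vm_disks: list[bytes]) -> dict[str, bytes]:
    """Build one backup file, storing each distinct block only once.

    The block index starts empty on every call, so nothing is shared
    with files written by earlier runs of the same job.
    """
    stored: dict[str, bytes] = {}
    for disk in vm_disks:
        for block in split_blocks(disk):
            stored.setdefault(hashlib.sha256(block).hexdigest(), block)
    return stored


# Three hypothetical VMs that share the same "OS" blocks plus some unique data.
os_image = b"AAAABBBBCCCC"
vms = [os_image + b"VM1_", os_image + b"VM2_", os_image + b"VM3_"]

thursday_full = write_backup_file(vms)
saturday_full = write_backup_file(vms)  # data unchanged two days later

source_blocks = sum(len(split_blocks(vm)) for vm in vms)
print("blocks read per run:  ", source_blocks)
print("blocks in Thursday full:", len(thursday_full))
print("blocks in Saturday full:", len(saturday_full))
```

Each full is smaller than the data it read because the shared blocks are stored once, but the second full is just as large as the first: it holds the same blocks again because its index never sees the earlier file.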

Thank you.
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Theoretical question about deduplication

Post by dellock6 »

Since you are running a "full" backup weekly, each full has to be independent, with NO relation to the previous chain. For this reason it only deduplicates within itself and doesn't look at blocks stored in previous restore points. If you want to avoid periodic fulls, you could look at forever forward incremental or reverse incremental instead.
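
As a rough back-of-the-envelope comparison, the sketch below estimates the space used by a chain with an independent full every 7 days versus a chain with one full followed only by incrementals. The 3.4 TB full size comes from this thread; the 0.1 TB daily incremental size is purely an assumption for illustration, and the function names are hypothetical. Retention and merge behaviour are ignored.

```python
FULL_TB = 3.4        # size of one full backup, taken from the thread
INCREMENT_TB = 0.1   # ASSUMPTION: size of one daily incremental, not from the thread


def weekly_full_chain_tb(days: int, full_every: int = 7) -> float:
    """Space used when an independent full is written every `full_every` days."""
    fulls = 1 + (days - 1) // full_every   # day 1 is always a full
    increments = days - fulls              # every other day is an incremental
    return fulls * FULL_TB + increments * INCREMENT_TB


def forever_forward_chain_tb(days: int) -> float:
    """Space used with a single full followed only by incrementals."""
    return FULL_TB + (days - 1) * INCREMENT_TB


if __name__ == "__main__":
    for days in (7, 14, 28):
        print(
            days,
            round(weekly_full_chain_tb(days), 1),
            round(forever_forward_chain_tb(days), 1),
        )
```

With these assumed numbers, 28 days of restore points come to roughly 16 TB with weekly fulls against roughly 6.1 TB with a single full, which is why periodic fulls dominate repository usage when the source data barely changes.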
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
tntteam
Enthusiast
Posts: 68
Liked: 5 times
Joined: Aug 28, 2015 12:40 pm
Full Name: tntteam
Contact:

Re: Theoretical question about deduplication

Post by tntteam »

Thank you guys, I had misunderstood the internal mechanics. I now understand better why Veeam published whitepapers about combining Veeam with the built-in deduplication in Windows Server 2012.

Also, when I see a dedup ratio of 1.0x or 1.1x in the backup results, can I conclude that deduplication is not worthwhile?
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Theoretical question about deduplication

Post by dellock6 »

Again, it depends on the kind of operation taking place.
Those numbers usually come up during an incremental run. Because an incremental only extracts the blocks that changed since the previous run, chances are those blocks are all new and unique, so there are no similar blocks to deduplicate against :)
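
A small Python sketch of why the ratio differs between run types. The ratio here is simply blocks read divided by distinct blocks kept; the block contents, the VM count, and the `dedup_ratio` helper are hypothetical and only meant to illustrate the reasoning, not to reproduce how the product computes the figure it reports.

```python
def dedup_ratio(blocks: list[bytes]) -> float:
    """Blocks read divided by distinct blocks kept."""
    return len(blocks) / len(set(blocks))


# Full run: 10 hypothetical VMs, each with the same 50 "OS" blocks
# plus 50 blocks of its own data.
os_blocks = [f"os-{i}".encode() for i in range(50)]
full_run: list[bytes] = []
for vm in range(10):
    full_run += os_blocks
    full_run += [f"vm{vm}-data-{i}".encode() for i in range(50)]

# Incremental run: only changed blocks, assumed here to be all different.
incremental_run = [f"changed-{i}".encode() for i in range(200)]

print(f"full run:        {dedup_ratio(full_run):.2f}x")
print(f"incremental run: {dedup_ratio(incremental_run):.2f}x")
```

In this toy model a ratio near 1.0x on an incremental says nothing about whether deduplication helps on the full, where the shared blocks are.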

I know sometimes deduplication can be tricky to understand...