-
- Enthusiast
- Posts: 68
- Liked: 5 times
- Joined: Aug 28, 2015 12:40 pm
- Full Name: tntteam
- Contact:
Theoretical question about deduplication
Hi,
I have some questions regarding how deduplication is supposed to work with Veeam.
Veeam deduplication works on a "per job" basis.
I set up a backup job of 117 VMs (a mix of Windows 2008, Windows 2012, and Linux).
I scheduled a full backup each Saturday and started the job for the first time on a Thursday.
So I got a .vbk (= full) on Thursday, one .vib on Friday, then another .vbk on Saturday.
Deduplication is on and set to "local target".
Compression is on, with default settings.
Why is my first full backup 3.4 TB, and the second full backup, taken on Saturday (day+2), 3.4 TB too?
The data can't have changed that much, can it?
What am I missing? Does "per job" deduplication mean "per run of each job", so that deduplication doesn't work between subsequent runs of the same job?
Sorry if this is not very clear.
-
- Product Manager
- Posts: 6551
- Liked: 765 times
- Joined: May 19, 2015 1:46 pm
- Contact:
Re: Theoretical question about deduplication
Hi,
tntteam wrote: Why do my first full backup os 3.4TB and the second full backup that did happen on saturday (day+2) is 3.4TB too?
Deduplication takes place inside the job, between VMs. If many VMs in the same job hold similar data (OS files, database files, etc.), the resulting full backup file will be smaller than the total amount of data on all VMs, because the similar data is deduplicated. So the statement
tntteam wrote: <...>deduplication doesn't work between each subsequent run of the same job<...>
is correct.
Thank you.
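To illustrate the point about deduplication between VMs inside one job, here is a toy model in Python. It is only a sketch: the block contents, the SHA-256 hashing and the name `dedupe_job` are assumptions made for the example, not how Veeam actually implements it.

```python
import hashlib

def dedupe_job(vms):
    """Return (total_blocks, unique_blocks) for one run of a job.

    `vms` maps a VM name to its list of data blocks (bytes).
    Blocks are compared by SHA-256 digest across all VMs in the job.
    """
    seen = set()
    total = 0
    for blocks in vms.values():
        for block in blocks:
            total += 1
            seen.add(hashlib.sha256(block).digest())
    return total, len(seen)

# Three hypothetical VMs that share the same OS blocks but have their own data.
os_blocks = [b"os-%d" % i for i in range(100)]
vms = {
    name: os_blocks + [b"%s-data-%d" % (name.encode(), i) for i in range(20)]
    for name in ("vm1", "vm2", "vm3")
}

total, unique = dedupe_job(vms)
print(total, unique)  # blocks read from the VMs vs. blocks that need storing
```

The shared OS blocks are stored once for the whole job, so the full backup ends up smaller than the sum of the VM disks. Nothing in this model looks at blocks from an earlier run.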
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Theoretical question about deduplication
Since you are running a full backup weekly, each full has to be independent, with NO relation to the previous chain. For this reason it only deduplicates within itself and doesn't look at blocks stored in previous restore points. As an alternative, you may look at forever-forward incremental or reverse incremental.
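A rough way to see why two independent fulls take about twice the space is the sketch below. The 2% change rate and the block names are hypothetical, and `full_backup` is just a stand-in for "a file that keeps its own unique blocks".

```python
def full_backup(blocks):
    """Model an independent full: it keeps its own set of unique blocks."""
    return set(blocks)

# Thursday's data: 1000 distinct blocks.
thursday = ["blk-%d" % i for i in range(1000)]
# By Saturday, assume only 2% of the blocks have changed (hypothetical figure).
saturday = ["new-%d" % i for i in range(20)] + thursday[20:]

vbk_thu = full_backup(thursday)
vbk_sat = full_backup(saturday)

# Each full stores everything it needs, so the costs simply add up.
stored_independent = len(vbk_thu) + len(vbk_sat)
# What the cost would be IF blocks were shared across runs (they are not).
stored_if_shared = len(vbk_thu | vbk_sat)

print(stored_independent, stored_if_shared)
```

In this model the Saturday full is the same size as the Thursday one even though almost nothing changed, which matches the two 3.4TB files in the original question.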
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Enthusiast
- Posts: 68
- Liked: 5 times
- Joined: Aug 28, 2015 12:40 pm
- Full Name: tntteam
- Contact:
Re: Theoretical question about deduplication
Thank you guys, I was misunderstanding the internal mechanics. Now I understand better why Veeam produced whitepapers about Veeam + Windows Server 2012 built-in dedup.
Also, when I see dedup 1.0x or 1.1x in backup results, can I conclude that deduplication is not worthwhile?
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Theoretical question about deduplication
Again, it depends on the kind of operation in place.
Those numbers usually come up during an incremental run. Because an incremental only extracts the blocks that changed since the previous run, chances are those blocks are all new and unique, so there are no similar blocks to deduplicate them against.
I know deduplication can sometimes be tricky to understand...
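As a small worked example of why an incremental run tends to report 1.0x or 1.1x, the sketch below computes a per-run ratio as "blocks read / unique blocks stored". This definition and the block lists are assumptions for illustration; the exact formula Veeam reports is not given in this thread.

```python
def dedup_ratio(blocks):
    """Ratio of blocks read to unique blocks stored, for a single run."""
    return len(blocks) / len(set(blocks))

# Full run: 300 blocks read, but only 100 distinct ones (shared OS data).
full_run = ["os-%d" % (i % 100) for i in range(300)]

# Incremental run: 50 changed blocks, 5 of which happen to repeat.
incremental_run = ["chg-%d" % i for i in range(50)] + ["chg-%d" % i for i in range(5)]

print(dedup_ratio(full_run))                   # many repeats inside the run
print(round(dedup_ratio(incremental_run), 2))  # almost every block is unique
```

A low ratio on an incremental therefore says little about how well the full backups of the same job deduplicate.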
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1