Dedupe rate question


by kan_two » Tue Jan 09, 2018 2:48 pm

Is there any data on Veeam efficiency wrt deduplication vs other deduplication methods?

As I understand it, Veeam dedupe is inline as opposed to post-process. I imagine that post-process dedupe could greatly reduce the space used by, e.g., two separate full backups of two separate but similar VMs (e.g. same OS). How efficient would Veeam inline dedupe be in this scenario? Is the checksumming and duplicate block checking so efficient that one would achieve almost the same dedupe rates, or is the inline dedupe mainly able to match against data in the same backup job/stream?

I imagine that changed block tracking, deltas + synthetic fulls, ReFS and so on used by Veeam are contributing to a LOT of space savings, but these are not strictly dedupe tools even though the outcome is much the same, so I wonder how much dedupe in itself contributes, compared to these others.

Cheers,
kan_two
Lurker
 
Full Name: Kan Two

Re: Dedupe rate question

by foggy » Tue Jan 09, 2018 3:48 pm

Hi Kan, you're correct: Veeam B&R performs inline deduplication within its backup files, which is completely different from what dedupe appliances or Windows Deduplication do (and typically doesn't affect their dedupe rate). Here are a couple of links that should give you a better understanding:

Deduplication
Data Compression and Deduplication
foggy
Veeam Software
 
Full Name: Alexander Fogelson

Re: Dedupe rate question

by kan_two » Fri Jan 12, 2018 3:54 pm

Hi, thanks for the info!

It would be nice to know how Veeam inline dedupe compares with post-process dedupe.

Is it so efficient at checksumming and comparing blocks that it approaches post-process dedupe, or are there situations where Veeam dedupe would be significantly less efficient (e.g. across different backup jobs or runs, after manual full backups and so on)? I ask because post-processing will scour every stored bit for duplicate occurrences, whereas inline dedupe somehow needs to keep track of data already in the pipeline or previously processed.

Re: Dedupe rate question

by tdewin » Sat Jan 13, 2018 3:06 pm

Veeam is inline dedupe
tdewin
Veeam Software
 
Full Name: Timothy Dewin

Re: Dedupe rate question

by tdewin » Sat Jan 13, 2018 4:02 pm

I didn't complete the previous post. Veeam is inline dedupe. It is based on a default block size of 1 MB (Local target), although you can change this to, for example, WAN target (256 KB). This keeps the deduplication overhead quite low while still giving good results when, for example, encountering empty blocks or VM OS data deployed from the same template. Veeam does keep track of blocks in the current chain. The good thing is that this way Veeam does really fast backups, and really fast restores since data fragmentation is low, with a limited amount of resources.
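
To make the fixed-block idea concrete, here is a minimal Python sketch of inline dedupe on fixed-size blocks. It is only an illustration, not Veeam's actual implementation: the `InlineDedupeStore` class, the use of SHA-256 as the block fingerprint, and the tiny 4 KB demo block size (instead of the 1 MB default mentioned above) are all assumptions made to keep the example small.

```python
import hashlib

BLOCK_SIZE = 4096  # demo value only; the post above mentions a 1 MB default


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield consecutive fixed-size blocks (the last one may be shorter)."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


class InlineDedupeStore:
    """Hypothetical store that keeps each unique block once, keyed by digest."""

    def __init__(self, block_size: int = BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}        # digest -> block bytes, stored once
        self.logical_bytes = 0  # bytes written before dedupe

    def write(self, data: bytes):
        """Dedupe while writing; return the ordered list of block digests."""
        recipe = []
        for block in split_blocks(data, self.block_size):
            digest = hashlib.sha256(block).digest()
            self.blocks.setdefault(digest, block)  # store only if unseen
            recipe.append(digest)
            self.logical_bytes += len(block)
        return recipe

    def read(self, recipe):
        """Reassemble the original data from its list of digests."""
        return b"".join(self.blocks[d] for d in recipe)

    @property
    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


# Two made-up "VM disks": same OS template, some empty space, one unique block.
os_template = b"".join(bytes([i]) * BLOCK_SIZE for i in range(1, 9))
empty_space = bytes(4 * BLOCK_SIZE)
vm_a = os_template + empty_space + b"\xc8" * BLOCK_SIZE
vm_b = os_template + empty_space + b"\xc9" * BLOCK_SIZE

store = InlineDedupeStore()
recipe_a = store.write(vm_a)
recipe_b = store.write(vm_b)
print(store.logical_bytes, "bytes written,", store.stored_bytes, "bytes stored")
```

Both disks go through the same store, so the shared template blocks and the empty blocks are stored only once. A smaller block size would find more duplicates at the cost of a larger index, which is the trade-off behind the block size setting.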

Global dedupe with post-processing might give you more gain. This is exactly the reason why we integrate with StoreOnce and Data Domain. However, the rehydration process has a significant impact during restore. That's why it is mostly recommended to use them as a second tier.
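
As a rough illustration of why the scope of the dedupe index matters, the following Python sketch counts stored blocks for two backup chains that contain the same OS template, first with one index per chain and then with a single global index. The data, the `unique_blocks` helper, the 4 KB block size, and the SHA-256 fingerprint are all hypothetical; real appliances may chunk and index data quite differently.

```python
import hashlib

BLOCK = 4096  # small illustrative block size


def unique_blocks(data: bytes, index: set) -> int:
    """Add the block digests of data to index; return how many were new."""
    new = 0
    for off in range(0, len(data), BLOCK):
        digest = hashlib.sha256(data[off:off + BLOCK]).digest()
        if digest not in index:
            index.add(digest)
            new += 1
    return new


# Ten shared "OS template" blocks plus one unique block per chain.
template = b"".join(bytes([i]) * BLOCK for i in range(1, 11))
chains = {
    "chain_a": template + b"\xa0" * BLOCK,
    "chain_b": template + b"\xb0" * BLOCK,
}

# Per-chain scope: every chain starts with its own empty index.
per_chain_stored = sum(unique_blocks(data, set()) for data in chains.values())

# Global scope: one index shared by all chains.
global_index = set()
global_stored = sum(unique_blocks(data, global_index) for data in chains.values())

print("per-chain:", per_chain_stored, "blocks; global:", global_stored, "blocks")
```

With a per-chain index the template is stored once per chain, while the global index stores it once overall. The price is that a restore has to reassemble (rehydrate) data from blocks that may be spread across the whole store.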

