Dedupe rate question

Post by **kan_two** » Jan 09, 2018 2:48 pm

Is there any data on Veeam efficiency wrt deduplication vs other deduplication methods?

As I understand it, Veeam dedupe is inline as opposed to post-process. I imagine that post-process dedupe could greatly reduce the space used by, e.g., two separate full backups of two separate but similar VMs (e.g. same OS). How efficient would Veeam inline dedupe be in this scenario? Is the checksumming and duplicate block checking so efficient that one would achieve almost the same dedupe rates, or is inline dedupe mainly able to match against data in the same backup job/stream?

I imagine that changed block tracking, deltas + synthetic fulls, ReFS and so on used by Veeam are contributing to a LOT of space savings, but these are not strictly dedupe tools even though the outcome is much the same, so I wonder how much dedupe in itself contributes, compared to these others.

Cheers,

Post by **foggy** » Jan 09, 2018 3:48 pm

Hi Kan, you're correct: Veeam B&R performs inline deduplication within its backup files, which is completely different from what dedupe appliances or Windows Deduplication do (and typically doesn't affect their dedupe rate). Here are a couple of links that should give you a better understanding:

Deduplication
Data Compression and Deduplication

Post by **kan_two** » Jan 12, 2018 3:54 pm

Hi, thanks for the info!

It would be nice to know how Veeam inline dedupe compares with post-process dedupe.

Is its checksumming and block comparison so efficient that it approaches post-process dedupe, or are there situations where Veeam dedupe would be significantly less efficient (e.g. across different backup jobs or runs, after manual full backups and so on)? Post-processing can scour every stored bit for duplicate occurrences, whereas inline dedupe somehow needs to keep track of data already in the pipeline or previously processed.

Post by **tdewin** » Jan 13, 2018 3:06 pm

Veeam is inline dedupe

Post by **tdewin** » Jan 13, 2018 4:02 pm

So, I didn't complete the previous post. Veeam is inline dedupe. It is based on a default 1 MB block size (Local target), although you can change this, for example to WAN target (256 KB). This keeps the deduplication overhead quite low while still giving good results when, for example, encountering empty blocks or VM OS data deployed from the same template. Veeam keeps track of blocks in the current chain. The good thing is that this way Veeam does really fast backups and really fast restores, since data fragmentation is low, with a limited amount of resources.
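To make the idea concrete, here is a minimal sketch of fixed-block inline dedupe within a single chain. It is not Veeam's actual implementation: the `BackupChain` class, the use of SHA-256, and the in-memory dictionary are all assumptions chosen for illustration, and the demo uses a 4 KB block size only to keep the test data small.

```python
import hashlib

BLOCK_SIZE = 1024 * 1024  # 1 MB, the default block size mentioned in the post


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield fixed-size blocks; the last block may be shorter."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


class BackupChain:
    """Hypothetical toy model of one backup chain with its own block index."""

    def __init__(self, block_size: int = BLOCK_SIZE):
        self.block_size = block_size
        self.store = {}         # digest -> block bytes (unique blocks only)
        self.logical_bytes = 0  # bytes written before dedupe

    def write(self, data: bytes):
        """Dedupe inline: hash each block before storing it.

        Returns the list of digests needed to rebuild the data.
        """
        recipe = []
        for block in split_blocks(data, self.block_size):
            digest = hashlib.sha256(block).digest()
            if digest not in self.store:  # only new blocks consume space
                self.store[digest] = block
            recipe.append(digest)
            self.logical_bytes += len(block)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from its list of digests."""
        return b"".join(self.store[d] for d in recipe)

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.store.values())


# Demo: two VMs deployed from the same template, backed up into one chain.
DEMO_BLOCK = 4096
template = b"".join(bytes([i]) * DEMO_BLOCK for i in range(1, 9))  # 8 distinct blocks
empty = bytes(DEMO_BLOCK * 4)                                      # 4 all-zero blocks
vm_a = template + empty + b"A" * DEMO_BLOCK
vm_b = template + empty + b"B" * DEMO_BLOCK

chain = BackupChain(block_size=DEMO_BLOCK)
recipe_a = chain.write(vm_a)
recipe_b = chain.write(vm_b)
print(chain.logical_bytes, chain.stored_bytes(), len(chain.store))
```

In this toy run the 26 logical blocks collapse to 11 stored ones: the template blocks and the single all-zero block are kept once, and only the per-VM blocks differ. The lookup is one hash and one dictionary probe per block, which is why a large fixed block size keeps the overhead low.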

Global dedupe with post-processing might give you more gain. This is exactly why we integrate with StoreOnce and DataDomain. However, the rehydration process has a significant impact during restore. That's why it is mostly recommended to use them as a second tier.
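The difference in scope can be sketched as follows. This is only an illustration under assumptions, not a model of how StoreOnce, DataDomain or Veeam work internally: the function names are hypothetical, blocks are fixed-size, and real appliances typically use more sophisticated chunking.

```python
import hashlib


def block_digests(data: bytes, block_size: int) -> set:
    """Return the set of SHA-256 digests of the fixed-size blocks in data."""
    return {
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    }


def stored_blocks_per_chain(chains, block_size: int) -> int:
    """Each chain has its own index, so a block shared by two chains is stored twice."""
    return sum(len(block_digests(data, block_size)) for data in chains)


def stored_blocks_global(chains, block_size: int) -> int:
    """One index across all chains, so every distinct block is stored once."""
    seen = set()
    for data in chains:
        seen |= block_digests(data, block_size)
    return len(seen)


# Two similar VMs (same template) backed up into two separate chains.
BLOCK = 4096
template = b"".join(bytes([i]) * BLOCK for i in range(1, 9))  # 8 distinct blocks
chains = [template + b"A" * BLOCK, template + b"B" * BLOCK]

per_chain = stored_blocks_per_chain(chains, BLOCK)
global_scope = stored_blocks_global(chains, BLOCK)
print(per_chain, global_scope)
```

Here the per-chain scope stores 18 blocks and the global scope stores 10, because the shared template blocks are only recognised as duplicates when the index spans both chains. The sketch says nothing about restore cost, which is where the rehydration penalty mentioned above comes in.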
