- 
				kan_two
- Lurker
- Posts: 2
- Liked: never
- Joined: Jan 09, 2018 2:35 pm
- Full Name: Kan Two
- Contact:
Dedupe rate question
Is there any data on Veeam efficiency wrt deduplication vs other deduplication methods?
As I understand it, Veeam dedupe is inline as opposed to post-process. I imagine that post-process dedupe could greatly reduce the space used by, e.g., two separate full backups of two separate but similar VMs (e.g. same OS). How efficient would Veeam inline dedupe be in this scenario? Is the checksumming and duplicate block checking so efficient that one would achieve almost the same dedupe rates, or is inline dedupe mainly able to match against data in the same backup job/stream?
I imagine that changed block tracking, deltas + synthetic fulls, ReFS and so on, as used by Veeam, contribute a LOT of space savings, but these are not strictly dedupe tools even though the outcome is much the same, so I wonder how much dedupe in itself contributes compared to these others.
Cheers,
- 
				foggy
- Veeam Software
- Posts: 21182
- Liked: 2164 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Dedupe rate question
Hi Kan, you're correct: Veeam B&R performs inline deduplication within its backup files, which is completely different from what dedupe appliances or Windows Deduplication do (and typically doesn't affect their dedupe rate). Here are a couple of links that should give you a better understanding:
Deduplication
Data Compression and Deduplication
- 
				kan_two
- Lurker
- Posts: 2
- Liked: never
- Joined: Jan 09, 2018 2:35 pm
- Full Name: Kan Two
- Contact:
Re: Dedupe rate question
Hi, thanks for the info!
It would be nice to know how Veeam inline dedupe compares with post-process dedupe.
Is it so efficient at checksumming and comparing blocks that it approaches post-process dedupe, or are there situations where Veeam dedupe would be significantly less efficient (e.g. across different backup jobs or runs, after manual full backups and so on)? Post-processing can scour every stored bit for duplicate occurrences, whereas inline dedupe somehow needs to keep track of data already in the pipeline or previously processed.
- 
				tdewin
- Veeam Software
- Posts: 1856
- Liked: 669 times
- Joined: Mar 02, 2012 1:40 pm
- Full Name: Timothy Dewin
- Contact:
Re: Dedupe rate question
Veeam is inline dedupe
- 
				tdewin
- Veeam Software
- Posts: 1856
- Liked: 669 times
- Joined: Mar 02, 2012 1:40 pm
- Full Name: Timothy Dewin
- Contact:
Re: Dedupe rate question
I didn't complete the previous post. Veeam is inline dedupe. It is based on a default 1 MB block size (Local target), although you can change this to, for example, WAN target (256 KB). This keeps the deduplication overhead quite low while still giving good results when, for example, encountering empty blocks or VM OS data deployed from the same template. Veeam does keep track of blocks in the current chain. The good thing is that this way Veeam does really fast backups and really fast restores, since data fragmentation is low, with a limited amount of resources.
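To make the block-size trade-off concrete, here is a minimal Python sketch of fixed-block inline dedupe. It is an illustration only, not Veeam's implementation: the SHA-256 digest, the `dedupe`, `restore` and `pseudo_random` helpers, and the 2 MB toy disk layout are all assumptions made up for the example.

```python
import hashlib

KB = 1024
MB = 1024 * KB

def pseudo_random(seed, size):
    """Deterministic filler bytes standing in for real, non-repeating disk data."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])

def dedupe(data, block_size):
    """Hash each fixed-size block as it streams past and store it only
    if its digest has not been seen yet (hypothetical helper)."""
    store = {}    # digest -> block bytes, one copy per unique block
    layout = []   # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)
        layout.append(digest)
    return store, layout

def restore(store, layout):
    """Rebuild the original bytes from the block store and the layout."""
    return b"".join(store[d] for d in layout)

# Toy 2 MB disk: 256 KB of data, 768 KB empty, 256 KB of other data, 768 KB empty.
disk = (pseudo_random("a", 256 * KB) + bytes(768 * KB)
        + pseudo_random("b", 256 * KB) + bytes(768 * KB))

results = {}
for block_size in (1 * MB, 256 * KB):
    store, layout = dedupe(disk, block_size)
    assert restore(store, layout) == disk
    stored_kb = sum(len(b) for b in store.values()) // KB
    results[block_size] = (len(layout), len(store), stored_kb)
    print(f"{block_size // KB:>5} KB blocks: {len(layout)} read, "
          f"{len(store)} stored ({stored_kb} KB)")
```

With 1 MB blocks each block mixes data and empty space, so both blocks are stored (2048 KB). With 256 KB blocks the empty ranges collapse into a single stored block (3 blocks, 768 KB), at the cost of four times as many digests to compute and track.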
Global dedupe with post-processing might give you more gain. This is exactly why we integrate with StoreOnce and DataDomain. However, the rehydration process has a significant impact during restore. That's why it is mostly recommended to use them as a second tier.
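As a rough illustration of why the scope of the block index matters for the "two similar VMs" case from the original question, the Python sketch below counts stored blocks for two VMs built from the same template. It is a toy model under stated assumptions, not a description of how Veeam or any appliance works internally: `store_new_blocks`, `pseudo_random`, the SHA-256 digest and the 3 MB template are all hypothetical. Separate indexes stand in for backups that cannot see each other's blocks; a shared index stands in for anything that dedupes across both.

```python
import hashlib

BLOCK = 1024 * 1024  # 1 MB fixed block size (assumed for this example)

def pseudo_random(seed, size):
    """Deterministic filler bytes standing in for real, non-repeating disk data."""
    out = bytearray()
    counter = 0
    while len(out) < size:
        out += hashlib.sha256(f"{seed}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:size])

def store_new_blocks(data, index):
    """Add the digests of data's blocks to index (a set) and return how
    many blocks were new, i.e. would actually be written to storage."""
    new = 0
    for offset in range(0, len(data), BLOCK):
        digest = hashlib.sha256(data[offset:offset + BLOCK]).digest()
        if digest not in index:
            index.add(digest)
            new += 1
    return new

# Two VMs deployed from the same 3 MB template, each with 1 MB of its own data.
template = pseudo_random("os-template", 3 * BLOCK)
vm1 = template + pseudo_random("vm1-data", BLOCK)
vm2 = template + pseudo_random("vm2-data", BLOCK)

# Separate indexes: neither backup can see the other's blocks.
separate_total = store_new_blocks(vm1, set()) + store_new_blocks(vm2, set())

# Shared index: the template blocks are stored once for both VMs.
shared_index = set()
shared_total = (store_new_blocks(vm1, shared_index)
                + store_new_blocks(vm2, shared_index))

print(f"separate indexes: {separate_total} blocks stored")
print(f"shared index:     {shared_total} blocks stored")
```

In this toy case the separate indexes store 8 blocks and the shared index stores 5, because the three template blocks are kept only once. How much a wider index gains on real data depends on how similar the VMs actually are and on block alignment.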