Deduplication - Feature Enhancement?


by ekisner » Wed Mar 15, 2017 5:55 pm

Hi there... this post rests entirely on the assumption that my understanding of how Veeam's dedup works is correct.

My understanding is that Veeam dedups on a per-chain basis. So if you've got multiple jobs, each with per-VM chains enabled, each VBK and VRB is deduplicated only against itself. If you've got one super-job without per-VM chains, it effectively dedups the entire backup. Alternatively, you can use file-system or block-level deduplication.
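To illustrate the difference in scope I mean, here is a toy Python sketch. It is not how Veeam actually works internally; the tiny fixed block size, the function names and the sample data are all made up for the example. It just counts how many unique blocks would need to be stored when each chain is deduplicated on its own versus when everything shares one scope.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical tiny block size so the example stays readable


def unique_blocks(data: bytes) -> set[str]:
    """Return the SHA-256 digests of the distinct fixed-size blocks in data."""
    return {
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    }


def stored_blocks_per_chain(chains: dict[str, bytes]) -> int:
    """Each chain is deduplicated only against itself."""
    return sum(len(unique_blocks(data)) for data in chains.values())


def stored_blocks_global(chains: dict[str, bytes]) -> int:
    """All chains share a single deduplication scope."""
    seen: set[str] = set()
    for data in chains.values():
        seen |= unique_blocks(data)
    return len(seen)


# Two made-up "VMs" that share the blocks AAAA and BBBB.
chains = {
    "vm1": b"AAAABBBBCCCCAAAA",
    "vm2": b"AAAABBBBDDDD",
}

print(stored_blocks_per_chain(chains))  # blocks kept with per-chain dedup
print(stored_blocks_global(chains))     # blocks kept with one shared scope
```

The blocks the two VMs have in common are stored twice in the per-chain case and once in the shared case, which is the gap I'd like to close.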

I was recently thinking about how nice it would be to have dedup with ReFS3, without having to resort to storage hardware with built-in dedup.

While I'm sure it would take a significant re-code, what are the thoughts on this kind of file layout:

1) A tiered "chunk" repository file, stored on nice fast flash storage (i.e. it can be stored "somewhere" defined by the user). It would of course need to be configured with a maximum size.
2) Each VBK, VRB, etc. is deduplicated against the chunk file.
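The two points above can be sketched roughly as follows. This is only a minimal in-memory illustration of the idea, not a proposal for the real on-disk format: `ChunkStore`, `write_backup`, `read_backup`, the tiny chunk size and the `max_chunks` limit are all hypothetical names and values I picked for the example.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical; a real chunk size would be far larger


class ChunkStore:
    """Shared pool of unique chunks, keyed by SHA-256 digest."""

    def __init__(self, max_chunks: int) -> None:
        # Stands in for the configured maximum size of the chunk file.
        self.max_chunks = max_chunks
        self.chunks: dict[str, bytes] = {}

    def put(self, chunk: bytes) -> str:
        """Store the chunk if it is new and return its digest."""
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self.chunks:
            if len(self.chunks) >= self.max_chunks:
                raise RuntimeError("chunk store is full")
            self.chunks[digest] = chunk
        return digest

    def get(self, digest: str) -> bytes:
        return self.chunks[digest]


def write_backup(store: ChunkStore, data: bytes) -> list[str]:
    """Deduplicate data against the store; the 'backup file' is a list of digests."""
    return [
        store.put(data[i:i + CHUNK_SIZE])
        for i in range(0, len(data), CHUNK_SIZE)
    ]


def read_backup(store: ChunkStore, manifest: list[str]) -> bytes:
    """Rehydrate a backup by looking every digest up in the store."""
    return b"".join(store.get(digest) for digest in manifest)


store = ChunkStore(max_chunks=16)
vm1 = write_backup(store, b"AAAABBBBCCCC")
vm2 = write_backup(store, b"AAAABBBBDDDD")

print(len(store.chunks))        # unique chunks held across both backups
print(read_backup(store, vm1))  # original data for the first backup
```

Each backup keeps its own list of digests, so per-VM chains stay separate, while the shared chunks are stored only once. In this sketch a list of digests cannot be restored without the store it was written against.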

With this layout you get the resilience of ReFS3 plus deduplication on tiered storage (cost-effective, and probably faster, since most people don't run their backups to flash). If instead you back everything up to homogeneous storage (i.e. the chunk file is not on faster storage), you could leverage ReFS3 fast-clone to make deduplication of duplicate data, and rehydration of data that is no longer duplicated, almost immediate. You also get ideal-scenario deduplication: both the flexibility of per-VM chains and the storage efficiency of a single-blob job.
Full Name: Erik Kisner

Re: Deduplication - Feature Enhancement?

by Gostev » Thu Mar 16, 2017 1:32 am


We've evaluated this before but decided against it, mostly because our backups being self-contained files appears to be one of the top features that customers like about our product. It gives them better reliability (no SPOF in the form of a dedupe pool) and multiple operational benefits. For example, you can easily copy the required backup to an external drive and take it with you, something that is not possible with your proposed architecture, where each file requires the dedupe blob.

And in any case, we can expect Microsoft to eventually port their dedupe to ReFS as well (because ReFS is going to completely replace NTFS down the road), which will give you the best of both worlds.

Veeam Software
Full Name: Anton Gostev
