Hi there... so this post is based entirely on the assumption that my understanding of how Veeam's dedup works is correct.
My understanding is that Veeam dedups on a per-chain basis... so if you've got multiple jobs, each with per-VM backup chains enabled, each VBK and VRB is deduplicated only against itself. If you've got one super-job without per-VM chains, then it will effectively dedup the entire backup. Beyond that, your options are file-system or block-level deduplication.
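To make the difference in dedup scope concrete, here's a minimal sketch (purely illustrative, nothing to do with Veeam's actual on-disk format) that hashes fixed-size blocks per file versus across all files. The block size and sample data are made up for the demo:

```python
# Illustrative only: shows why dedup scope (per-chain vs. whole-repository)
# changes how many blocks you actually store.
import hashlib

BLOCK = 4  # tiny block size for the demo; real backup blocks are far larger


def unique_blocks(data: bytes) -> set:
    """Return the set of block hashes in one backup file."""
    return {hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)}


# Two hypothetical per-VM chains holding mostly identical data
# (e.g. two VMs deployed from the same template).
vm1_vbk = b"AAAABBBBCCCCDDDD"
vm2_vbk = b"AAAABBBBCCCCEEEE"

# Per-chain dedup: each VBK is deduplicated only against itself.
per_chain = len(unique_blocks(vm1_vbk)) + len(unique_blocks(vm2_vbk))

# Whole-repository dedup: blocks shared between chains are stored once.
global_scope = len(unique_blocks(vm1_vbk) | unique_blocks(vm2_vbk))

print(per_chain)     # 8 blocks stored with per-chain dedup
print(global_scope)  # 5 blocks stored when deduping across both chains
```

With per-VM chains you pay for the duplicate template blocks in every chain; a single dedup scope stores them once.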
I was recently thinking about how nice it would be to have dedup with ReFS3, without having to resort to storage hardware with built-in dedup.
While I'm sure it would take a significant re-code, what are your thoughts on this kind of file layout:
1) A tiered "chunk" repository file, stored on nice fast flash storage (i.e. it can be stored "somewhere" defined by the user)... it would of course need a configurable maximum size.
2) Each VBK, VRB, etc. is deduplicated against the chunk file.
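Roughly what I have in mind, sketched in Python (all class and function names here are hypothetical, and this is a toy model, not a proposal for Veeam's actual internals): backup files hold chunk references, while the chunk bodies live in one shared, size-capped repository file:

```python
# Toy model of the proposed layout: a shared chunk store on fast storage,
# with each backup file reduced to a list of chunk references.
import hashlib


class ChunkRepository:
    """Hypothetical shared chunk file with a user-configured size cap."""

    def __init__(self, max_size: int):
        self.max_size = max_size  # maximum size of the chunk file
        self.chunks = {}          # sha256 digest -> chunk bytes
        self.size = 0

    def put(self, chunk: bytes) -> str:
        """Store a chunk (if new) and return its reference."""
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self.chunks:
            if self.size + len(chunk) > self.max_size:
                raise RuntimeError("chunk repository full")
            self.chunks[digest] = chunk
            self.size += len(chunk)
        return digest


def dedup_backup(repo: ChunkRepository, data: bytes, block: int = 4) -> list:
    """Split a VBK/VRB into blocks; store each once, keep only references."""
    return [repo.put(data[i:i + block]) for i in range(0, len(data), block)]


def rehydrate(repo: ChunkRepository, refs: list) -> bytes:
    """Rebuild a backup file from its chunk references."""
    return b"".join(repo.chunks[r] for r in refs)


repo = ChunkRepository(max_size=1024)
refs1 = dedup_backup(repo, b"AAAABBBBCCCC")  # first chain: 3 new blocks
refs2 = dedup_backup(repo, b"AAAABBBBDDDD")  # second chain: shares 2 blocks
print(repo.size)  # 16 bytes stored instead of 24
```

The point being that every VBK/VRB stays an independent file (so per-VM chains keep working), but duplicate blocks across all of them land in the chunk file exactly once.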
With this layout, you now have the resilience of ReFS3, with deduplication, on tiered storage (both cost-effective and probably faster, since most people probably don't run their backups to flash). Assuming you were backing everything up to homogeneous storage (i.e. not storing the chunk file on faster storage), you could leverage the ReFS3 fast-clone to make deduplicating duplicate data, and rehydrating data that is no longer duplicated, almost immediate. You also get best-case deduplication: the flexibility of per-VM chains combined with the storage efficiency of a single blob job.