Comprehensive data protection for all workloads
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Windows 2012 and deduplication

Post by Yuki » Feb 21, 2013 12:05 am

This question is not about the restore procedure, but about how Veeam and VMware see/treat deduplicated volumes.

On our file server (Windows 2012 with dedupe enabled) we have terabytes of data and are using dedupe to achieve some space savings. On a 10TB volume (5x2TB VMDKs mounted in Windows) we have 3.9TB of used space after deduplication, with a 34% dedupe ratio and 2TB of dedupe "savings". So in my understanding we are actually storing about 5.9TB raw equivalent of what it would be without this nifty feature.

Our Veeam jobs show that we actually process 4.9TB of data. Why the mismatch? I assume 4.9TB is what VMware VMDKs actually contain (not empty space)? Or is it just the blocks that have ever been touched by windows, even if they now contain no data (for example files were deleted or reduced in size after deduplication had processed them)?

tsightler
VP, Product Management
Posts: 5310
Liked: 2162 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Windows 2012 and deduplication

Post by tsightler » Feb 21, 2013 12:59 am

Yuki wrote:Or is it just the blocks that have ever been touched by windows, even if they now contain no data (for example files were deleted or reduced in size after deduplication had processed them)?
This. Veeam does not know anything about the underlying filesystem, only about the disk and which blocks have been "used". That's the nature of an image-based backup. Deleting files does not delete data from the disk blocks; the blocks are simply marked as free so that they can be reused. If you create a new, completely empty VMDK, write a 100GB file to it, and then delete it, Veeam will still back up 100GB of data.
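To illustrate the point, here is a tiny Python sketch of that behavior. It is purely illustrative: the `BlockDevice` model and its names are made up for this post, not anything from the Veeam or VMware APIs. The filesystem tracks which blocks are allocated, while an image-level backup reads every block that was ever written, whether or not a file still references it.

```python
# Toy model: why an image-level backup still copies blocks of deleted files.
# BlockDevice and its fields are hypothetical, invented for illustration.

BLOCK_SIZE = 4096  # bytes per block

class BlockDevice:
    def __init__(self, num_blocks):
        self.blocks = [b"\x00" * BLOCK_SIZE] * num_blocks
        self.allocated = set()     # filesystem's view: blocks used by files
        self.ever_written = set()  # image-backup view: blocks ever touched

    def write_file(self, start, num_blocks, payload=b"\xff"):
        for i in range(start, start + num_blocks):
            self.blocks[i] = payload * BLOCK_SIZE
            self.allocated.add(i)
            self.ever_written.add(i)

    def delete_file(self, start, num_blocks):
        # Deleting only updates filesystem metadata; block contents remain.
        for i in range(start, start + num_blocks):
            self.allocated.discard(i)

def image_backup_size(dev):
    # An image-level backup reads blocks reported as used on disk,
    # regardless of whether the guest filesystem still references them.
    return len(dev.ever_written) * BLOCK_SIZE

disk = BlockDevice(num_blocks=1024)
disk.write_file(0, 100)   # write a file occupying 100 blocks
disk.delete_file(0, 100)  # guest deletes the file

print(len(disk.allocated))      # 0 -> filesystem says the volume is empty
print(image_backup_size(disk))  # 409600 -> backup still reads 100 blocks
```

Same story as the 100GB example above: the filesystem reports zero usage, but the image-level view still sees every block the file ever touched.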

Windows dedupe actually dirties more blocks than non-dedupe on a "per-file" basis, since each new (or changed) file is originally written non-deduped. The Windows dedupe process then reads the file, splits it into a "common" portion (potentially adding new segments to the dedupe pool) and a "unique" portion, and frees the original blocks. But those blocks are still "dirty" from the Veeam perspective: VMware will report them as changed, and they will still contain new data.
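A quick sketch of that block-counting, again purely illustrative: the block ranges are invented and this is not how Windows lays out its chunk store. The point is just that a post-process dedupe pass touches the chunk-store blocks on top of the blocks the file itself dirtied, so changed-block tracking reports more than the file's size.

```python
# Toy model: post-process dedupe dirties more blocks than the file itself.
# Block ranges below are hypothetical, chosen only for illustration.

touched = set()  # blocks changed since the last backup (the CBT-style view)

def write_blocks(block_range):
    touched.update(block_range)

# Step 1: a new file lands on disk non-deduped, dirtying 100 blocks.
file_blocks = range(0, 100)
write_blocks(file_blocks)

# Step 2: the dedupe pass writes the file's common segments into the
# chunk store (new blocks elsewhere on the volume)...
chunk_store_blocks = range(1000, 1070)
write_blocks(chunk_store_blocks)

# ...then frees the original file blocks. Freeing does not undo the
# change tracking: those blocks were modified in this backup cycle.

print(len(touched))  # 170 blocks changed, though the file is only 100
```

So between two backup runs, the 100-block file causes 170 blocks of changed data to be processed, which is why a freshly deduped volume can look "busier" to an incremental backup than a plain one.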
