Windows Server 2012 R2 No Deduplication

mgruben0L3oVVGD1V0C6 · Sep 18, 2017 3:37 pm

Hey all,

I'm aware that this is yet another Windows Server 2012 Data Deduplication thread, but after reading the FAQ and searching through some of the other posts, I didn't find anything that was exactly on-point, so I thought I would post my question here.

My question:
I have 0 B SavedSpace (0% deduplication rate) on a Data-Deduplication-enabled volume containing only a single-VM backup job.
Is this right?

My setup:
The VM being backed up is a Windows Server 2012 R2 file server, which itself has Data Deduplication enabled on its meaningfully-large volumes.
The Veeam Backup and Replication server is Windows Server 2012 R2, with Data Deduplication installed.
The backup repository is an NTFS-formatted 6-trillion-byte (~5.45 TiB) volume, on which Data Deduplication has been enabled for files older than 1 day, with the general-purpose file server usage type.
Its storage settings are as follows: (screenshot not preserved)

The backup job is forever-forward incremental.
Its storage settings are as follows: (screenshot not preserved)

A single full backup exists, followed by multiple incrementals: (screenshot not preserved)

The FAQ (veeam-backup-replication-f2/read-this-f ... tml#p95276) seems to suggest that Windows Server 2012's deduplication will give you savings when you have multiple jobs.
Could it be that Server 2012's deduplication has no savings to give me here, because I have only a single job backing up the incremental deltas of a single VM?

skrause · Sep 18, 2017 8:58 pm

Are you sure the de-duplication process is actually running on the server?

I get some level of savings on pretty much everything and I don't align blocks or decompress the files before writing.

Run "Get-DedupStatus [Volume]" on the volumes and it will tell you how much free/saved space and how many files are optimized/in policy.

An example from one of my repository servers just now (this volume is 6TB total):

PS C:\Users\skrause> Get-DedupStatus J:

FreeSpace    SavedSpace   OptimizedFiles     InPolicyFiles      Volume
---------    ----------   --------------     -------------      ------
1.31 TB      4.81 TB      80                 79                 J:
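
To check whether optimization is actually scheduled and running, something like the following may help. It is only a sketch: it assumes the Data Deduplication feature is installed and reuses `J:` from the example above as a placeholder drive letter.

```powershell
# Assumption: J: is the deduplicated repository volume; substitute your own.
$volume = 'J:'

# Schedules that should trigger optimization, garbage collection and scrubbing.
Get-DedupSchedule

# Jobs that are queued or running right now (no output means none).
Get-DedupJob

# Result and time of the last optimization pass on the volume.
Get-DedupStatus -Volume $volume |
    Format-List Volume, OptimizedFilesCount, InPolicyFilesCount, SavedSpace,
                LastOptimizationTime, LastOptimizationResult

# Start an optimization pass by hand instead of waiting for the schedule.
Start-DedupJob -Volume $volume -Type Optimization
```

If `InPolicyFilesCount` stays at 0, no file currently matches the volume's dedup policy, which points at the settings rather than at the job itself.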

nmdange · Sep 18, 2017 9:35 pm

I don't think your dedupe settings are correct. By setting the file age to 1 day, it can't process the full backup, which is always updated on the current day. The default 2012 R2 settings are also not optimal for backup files. On 2016, you can just choose the option for "backup server" as the profile, but on 2012 R2 you have to do some manual tweaking. This is for DPM, but the section on what dedupe settings to use should also apply to Veeam: https://technet.microsoft.com/en-us/lib ... c.12).aspx
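
For reference, the current policy on a volume can be read back as text. This is a minimal sketch, with `E:` as a hypothetical drive letter standing in for the repository volume:

```powershell
# Assumption: E: is a placeholder for the repository volume.
Get-DedupVolume -Volume 'E:' |
    Format-List Volume, Enabled, UsageType, MinimumFileAgeDays,
                MinimumFileSize, OptimizeInUseFiles, OptimizePartialFiles
```

A `MinimumFileAgeDays` of 1 would mean that a file modified within the last day is skipped by the optimization job.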

In particular these registry settings:

Tune dedup processing for backup data files: set the volume to start optimization without delay and not to optimize partial file writes. Note that by default Garbage Collection (GC) jobs are scheduled every week, and every fourth week the GC job runs in "deep GC" mode for a more exhaustive and time-intensive search for data to remove. For the DPM workload, this "deep GC" mode does not result in any appreciable gains and reduces the amount of time in which dedup can optimize data. We therefore disable this deep mode:

Set-ItemProperty -Path HKLM:\Cluster\Dedup -Name DeepGCInterval -Value 0xFFFFFFFF
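
The text above mentions starting optimization without delay and skipping partial file writes, but only the registry line is quoted here. On the volume side, those two settings could be expressed roughly as follows. This is a sketch rather than the exact command from the linked article, and `E:` is a hypothetical drive letter:

```powershell
# Assumption: E: is a placeholder for the repository volume.
# MinimumFileAgeDays 0   -> files become eligible for optimization immediately.
# OptimizePartialFiles   -> disabled, so files are only optimized as a whole.
Set-DedupVolume -Volume 'E:' -MinimumFileAgeDays 0 -OptimizePartialFiles:$false
```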

Tune performance for large-scale operations. Run the following PowerShell script to:
Disable additional processing and I/O when deep garbage collection runs
Reserve additional memory for hash processing
Enable priority optimization to allow immediate defragmentation of large files

Set-ItemProperty -Path HKLM:\Cluster\Dedup -Name HashIndexFullKeyReservationPercent -Value 70
Set-ItemProperty -Path HKLM:\Cluster\Dedup -Name EnablePriorityOptimization -Value 1

These settings modify the following:
HashIndexFullKeyReservationPercent: This value controls how much of the optimization job memory is used for existing chunk hashes, versus new chunk hashes. At high scale, 70% results in better optimization throughput than the 50% default.
EnablePriorityOptimization: With files approaching 1TB, fragmentation of a single file can accumulate enough fragments to approach the per file limit. Optimization processing consolidates these fragments and prevents this limit from being reached. By setting this registry key, dedup will add an additional process to deal with highly fragmented deduped files with high priority.
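
The `HKLM:\Cluster\Dedup` path comes from the clustered DPM guidance. On a standalone server that key may not exist; as far as I know, Microsoft's deduplication documentation lists `HKLM:\SYSTEM\CurrentControlSet\Services\ddpsvc\Settings` as the non-clustered location, but treat that as an assumption to verify on your own system. A read-only sketch to see which values are currently set:

```powershell
# Assumption: the standalone path below is the non-clustered equivalent; verify before relying on it.
$clusterPath    = 'HKLM:\Cluster\Dedup'
$standalonePath = 'HKLM:\SYSTEM\CurrentControlSet\Services\ddpsvc\Settings'
$path = if (Test-Path $clusterPath) { $clusterPath } else { $standalonePath }

foreach ($name in 'DeepGCInterval', 'HashIndexFullKeyReservationPercent', 'EnablePriorityOptimization') {
    # Returns nothing (and suppresses the error) when the value has never been created.
    $item = Get-ItemProperty -Path $path -Name $name -ErrorAction SilentlyContinue
    if ($item) { '{0} = {1}' -f $name, $item.$name } else { "$name is not set (default applies)" }
}
```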

That said, it's possible you won't see any benefit from deduping the backups because Veeam is already doing its own compression and dedupe and your backup job has only a single full backup file. You won't see massive savings unless you have multiple full backup files on disk.

Not related to your issue specifically, but you would also want your backup volume to be enabled for large file record segments (FRS). This is done either with "format /L" in cmd or "Format-Volume -UseLargeFRS" in PowerShell.
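
A quick way to see whether a volume already uses large file record segments, sketched with `J` as a placeholder drive letter. The format command is left commented out because it erases the volume:

```powershell
# Assumption: J is a placeholder for the repository drive letter.
# Read-only check: look for "Bytes Per FileRecord Segment" in the output.
# Large FRS reports 4096; default-formatted volumes typically report 1024.
fsutil fsinfo ntfsinfo J:

# DESTRUCTIVE: reformatting wipes all data on the volume, so move backups elsewhere first.
# Format-Volume -DriveLetter J -FileSystem NTFS -UseLargeFRS
```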

skrause · Sep 19, 2017 1:21 pm

Oh, wait. You are using Reverse incremental. I didn't notice that.

You will not see nearly as much deduplication savings as you would with other methods (though the transform process on synthetics is pretty slow with dedup on).

mgruben0L3oVVGD1V0C6 · Sep 19, 2017 6:14 pm

nmdange wrote: By setting the file age to 1 day, it can't process the full backup which is always updated on the current day.

Ah yes, I hadn't considered that the full backup is updated every day after the number of restore points is met.

nmdange wrote: it's possible you won't see any benefit from deduping the backups because Veeam is already doing its own compression and dedupe and your backup job has only a single full backup file. You won't see massive savings unless you have multiple full backup files on disk.

This I think suggests the answer for our setup; it would be better if we had space for two full backups at once, which could then be deduplicated by Server 2012, but this is not possible as the full backup file is ~70% the size of the target volume.

nmdange wrote: Not related to your issue specifically, but you would also want your backup volume to be enabled for large file record segments (FRS). This is done either with "format /L" in cmd or "Format-Volume -UseLargeFRS" in PowerShell.

The source I was able to find on this, https://www.veeam.com/kb2023, suggests that trying to deduplicate files larger than 1TB is bad per se, which may be an issue that I am running into here as well.
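
To see which backup files cross that 1 TB mark, a listing along these lines could be used. It is only a sketch, and `E:\Backups` is a hypothetical repository path:

```powershell
# Assumption: E:\Backups is a placeholder for the repository folder.
Get-ChildItem -Path 'E:\Backups' -Recurse -File |
    Where-Object { $_.Length -gt 1TB } |
    Select-Object FullName,
        @{ Name = 'SizeTB'; Expression = { [math]::Round($_.Length / 1TB, 2) } }
```

PowerShell's `1TB` literal is 2^40 bytes, so the sizes are reported in the same binary units that Windows shows as "TB".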

It sounds like my File Server Veeam volume is not properly configured to deduplicate the backup files that it currently stores, but nevertheless it also sounds like, in the best case, my deduplication savings would be meager anyway.

Thanks for the help all!
