RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

Hi,

We have been doing a lot of testing with Server 2012 R2, Server 2016, and now Server 2019 to get the most out of our backup repository with Windows Deduplication, so I would appreciate any feedback and experiences you have already gathered. There are so many guides, KBs, and best practices out there that I no longer know which settings give the best results.

Our current Setup:

Windows Server 2019 Standard
2x Xeon E5-2620 v2, 6 cores each (24 logical cores with Hyper-Threading)
128 GB RAM
22 x 900 GB SAS
4 x 800 GB Intel PCIe SSD Cards
2 x 300 GB SAS RAID 1 for OS

We created a tiered storage pool in Windows with 19.3 TB capacity:
22 x 900 GB SAS = RAID 6
4 x 800 GB Intel PCIe SSD cards
Formatted with ReFS and a 64 KB cluster size
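
Roughly, that kind of pool and volume can be built from PowerShell as follows. This is only a simplified sketch, not our exact commands: the pool/tier names, drive letter, and tier sizes are placeholders, and resiliency settings are omitted.

# Sketch: pool the HDDs and SSDs, then create a tiered 64 KB ReFS volume
$disks = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "BackupPool" `
    -StorageSubSystemFriendlyName "Windows Storage*" -PhysicalDisks $disks
New-StorageTier -StoragePoolFriendlyName "BackupPool" -FriendlyName "SSDTier" -MediaType SSD
New-StorageTier -StoragePoolFriendlyName "BackupPool" -FriendlyName "HDDTier" -MediaType HDD
# Tier sizes below are placeholders - adjust to the real usable capacity of each tier
New-Volume -StoragePoolFriendlyName "BackupPool" -FriendlyName "Repo" `
    -FileSystem ReFS -AllocationUnitSize 65536 -DriveLetter B `
    -StorageTierFriendlyNames "SSDTier","HDDTier" -StorageTierSizes 2TB,16TB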

Backup Job:
Incremental Mon-Fri
Active Full on Saturday
Compression Level: Optimal
Storage optimization: LAN target.

Backup Repo:
No storage compatibility settings enabled

Full backups are 4.5-5.1 TB
Incrementals are 150-210 GB

On Windows Server 2012 R2 (NTFS) with an iSCSI LUN (Synology 12 x 3 TB, RAID 6) we got the highest dedup ratio: ~75%
(but it was very slow due to the dedup core limitation)
On Windows Server 2016 (NTFS) with an iSCSI LUN (Synology 12 x 3 TB, RAID 6) we got: ~70%
Now on the new Windows Server 2019 (ReFS) we only get: ~50%

Since Microsoft changed the file size limit for deduplication to 4 TB in Server 2016, we now always have roughly 500 GB - 1.1 TB per full backup left that is not deduplicated. This wasn't an issue on Server 2012 R2.

On top of that we currently have a problem: dedup stopped working a few days ago (new files are no longer optimized, without any apparent cause and with no errors in Event Viewer), so we are now thinking about reconfiguring and reformatting the backup repository, and whether we should still use ReFS at all.
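
For reference, this kind of thing can be checked with the standard Deduplication PowerShell cmdlets before rebuilding anything. A sketch, assuming B: is the repository volume:

# Volume-level savings, optimized file count, and last optimization result
Get-DedupStatus -Volume B: | Format-List *
# Any dedup jobs queued or running right now?
Get-DedupJob
# Kick off a manual optimization pass and check the dedup event log afterwards
Start-DedupJob -Volume B: -Type Optimization
Get-WinEvent -LogName "Microsoft-Windows-Deduplication/Operational" -MaxEvents 20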

Questions:
1. Would you use NTFS or ReFS in this scenario?
2. Would you recommend any storage compatibility settings
(align backup file data blocks, or decompression)?
3. Would the "Use per-VM backup files" setting bypass the 4 TB file size limit (our biggest VM is roughly 2.6 TB)?

Thanks a lot in advance.

Peter
doktornotor
Enthusiast
Posts: 94
Liked: 29 times
Joined: Mar 07, 2018 12:57 pm
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by doktornotor »

RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

Already tried that, and it got me an even worse dedup ratio.
doktornotor
Enthusiast
Posts: 94
Liked: 29 times
Joined: Mar 07, 2018 12:57 pm
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by doktornotor » 1 person likes this post

Well, the dedup ratio is ~75% here with 2019/ReFS set up per the blog link post above. The "Compression Level: Optimal" certainly sounds wrong according to pretty much any howto I've seen anywhere.
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

In the meantime I found out why dedup stopped working... OMG, that's such a shame, Microsoft...

There are three folders on our repo drive:

b:\veeam (Veeam Catalog and NfsDatastore)
b:\veeam_repository (Veeam Backups)
b:\veeam_endpoints (Veeam Endpoint Backups)

Last week, in the deduplication configuration, I set an exclusion for the folder Veeam.
Microsoft stores this as \Veeam in the exclusion list.
BTW: the folder was chosen via the Add button in the File Explorer window that opens.

But this is a trap!

\veeam is now treated as a wildcard entry matching any folder whose name starts with "veeam", hence deduplication stopped working for the other two folders.

Now I changed the exclusion list to:

\veeam\NfsDatastore
\veeam\VBRCatalog

and I am back in business - it's working as usual!
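
For anyone hitting the same trap: the exclusion list can also be set and verified from PowerShell, which makes the stored paths visible. A sketch using the folder names above (B: is the repo volume; the Deduplication cmdlets store the folders volume-relative, e.g. \veeam\NfsDatastore):

# Exclude only the catalog/NFS subfolders instead of everything starting with "veeam"
Set-DedupVolume -Volume B: -ExcludeFolder "B:\veeam\NfsDatastore","B:\veeam\VBRCatalog"
# Verify what is actually stored in the exclusion list
Get-DedupVolume -Volume B: | Select-Object Volume, ExcludeFolder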
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

doktornotor wrote: Feb 13, 2019 11:59 am Well, the dedup ratio is ~75% here with 2019/ReFS set up per the blog link post above. The "Compression Level: Optimal" certainly sounds wrong according to pretty much any howto I've seen anywhere.
With compression set to None, my fulls would be 7-8 TB, and because Microsoft limited deduplication to files up to 4 TB, I would just lose more space and get an even worse ratio.
doktornotor
Enthusiast
Posts: 94
Liked: 29 times
Joined: Mar 07, 2018 12:57 pm
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by doktornotor »

Have no idea about 8TB backups. Perhaps you should switch to per-VM backup files.
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

but this sounds worth a try
doktornotor
Enthusiast
Posts: 94
Liked: 29 times
Joined: Mar 07, 2018 12:57 pm
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by doktornotor »

Even if it doesn't help with dedup (though I'd say it should avoid the LFS dedup limit altogether), I'd feel much more comfortable with those regarding possible corruption/issues with restore. Let us know how it goes. ;)
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by foggy »

Hi Peter, are you running Veeam B&R v9.5 U4? Have you enabled support for block cloning on deduplicated files?
engedib
Influencer
Posts: 11
Liked: 7 times
Joined: Jan 04, 2019 2:40 am
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by engedib » 1 person likes this post

RedVision81 wrote: Feb 13, 2019 12:05 pm With compression set to None, my fulls would be 7-8 TB, and because Microsoft limited deduplication to files up to 4 TB, I would just lose more space and get an even worse ratio.
I would try the job with per-VM backup files and dedup friendly compression method.
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

foggy wrote: Feb 14, 2019 3:55 pm Hi Peter, are you running Veeam B&R v9.5 U4? Have you enabled support for block cloning on deduplicated files?
Hey Foggy,
Yes, we are already on the latest version. But block cloning only works with synthetic fulls, right? At the moment we use an active full once a week. Maybe I will switch to a synthetic full once a week and an active full once a month.
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

engedib wrote: Feb 14, 2019 7:52 pm I would try the job with per-VM backup files and dedup friendly compression method.
Yes, this is definitely worth a try. But I still don't know why it should be better to use dedup-friendly compression. Windows dedups the files anyway, regardless of the compression level - or am I missing something?
doktornotor
Enthusiast
Posts: 94
Liked: 29 times
Joined: Mar 07, 2018 12:57 pm
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by doktornotor » 1 person likes this post

At minimum, the compression sounds like a total waste of resources (CPU cycles, RAM and time) if deduplication is supposed to be on filesystem level.
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

But you save traffic, and we have to use NBD mode for transport (SimpliVity NFS datastores), so the compression level is not unimportant for us because of the throughput limit of the VMkernel interface.
YoMarK
Enthusiast
Posts: 55
Liked: 8 times
Joined: Jul 13, 2009 12:50 pm
Full Name: Mark
Location: The Netherlands
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by YoMarK »

You should switch to synthetic fulls (fast clone) and per-VM backup files to actually use ReFS's capabilities. We've got great results with that, both in backup speeds and storage savings (Windows 2016 ReFS).
If you don't want to use fast clone, there is very little to gain with ReFS (yes, it's in theory more robust, but in practice it isn't).
benthomas
Veeam Vanguard
Posts: 39
Liked: 11 times
Joined: Apr 22, 2013 2:29 am
Full Name: Ben Thomas
Location: New Zealand
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by benthomas »

I could be wrong, but I'm pretty sure the 4 TB limit is the maximum supported size, not the maximum size that gets deduplicated. E.g. an 8 TB file would still be completely deduplicated, rather than only the first 4 TB being deduplicated and the last 4 TB staying at full size - but if it were to corrupt or perform slowly, Microsoft doesn't support it.

As for compression, the reason it's not recommended is that it masks dedupable data and increases CPU overhead.

It's also possible that your data sets have changed over time, and your data just isn't as dedupable as it used to be.
Ben Thomas | Solutions Advisor | Veeam Vanguard 2023 | VMCE2022 | Microsoft MVP 2018-2023 | BCThomas.com
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

No, since Server 2016 only the first 4 TB are deduplicated. Files over 4 TB always show some remaining space on disk: a 4.6 TB file takes 600 GB on disk, a 5.5 TB file takes 1.5 TB, and so on. Only files under 4 TB show 0 bytes on disk.
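
A quick way to spot the affected files is simply to list everything in the repository above the 4 TB mark. A sketch; the path and extensions are placeholders for a typical Veeam repository layout:

# Backup files larger than 4 TB keep an un-deduplicated remainder on disk
Get-ChildItem B:\veeam_repository -Recurse -Include *.vbk,*.vib |
    Where-Object { $_.Length -gt 4TB } |
    Select-Object FullName, @{Name='SizeTB'; Expression={ [math]::Round($_.Length / 1TB, 2) }}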
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

YoMarK wrote: Feb 15, 2019 2:04 pm You should switch to synthetic fulls (fast clone) and per-VM backup files to actually use ReFS's capabilities. We've got great results with that, both in backup speeds and storage savings (Windows 2016 ReFS).
If you don't want to use fast clone, there is very little to gain with ReFS (yes, it's in theory more robust, but in practice it isn't).
Yes, I will give it a try; I just need to reorganize my tape jobs then. Per-VM backup is already activated and it's really a great advantage.
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by DonZoomik »

Cross-post from veeam-backup-replication-f2/refs-3-0-an ... 56658.html
Currently investigating with support, but it seems that when you enable deduplication on ReFS, you practically lose all block clone benefits. It "works", but as slowly as on NTFS with deduplication enabled, and the processed (block-cloned) data loses any savings (it is neither deduplicated nor a clone of existing data).
I rehydrated the files of a test job and block cloning worked as expected (fast!), but once the files were re-deduplicated it got slow again.
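
Rehydrating individual files for such a test can be done with Expand-DedupFile. A sketch; the job folder is a placeholder, and the volume needs enough free space to hold the expanded data:

# Un-deduplicate the files of one test job so fast clone behaviour can be compared
Get-ChildItem 'B:\veeam_repository\TestJob' -Recurse -Include *.vbk,*.vib |
    ForEach-Object { Expand-DedupFile -Path $_.FullName }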
mkaec
Veteran
Posts: 462
Liked: 133 times
Joined: Jul 16, 2015 1:31 pm
Full Name: Marc K
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by mkaec »

doktornotor wrote: Feb 15, 2019 8:33 am At minimum, the compression sounds like a total waste of resources (CPU cycles, RAM and time) if deduplication is supposed to be on filesystem level.
Sometimes I like using compression with the repository set to decompress in order to save on network bandwidth.
virtualguru
Lurker
Posts: 2
Liked: 3 times
Joined: Feb 08, 2019 5:22 pm
Full Name: Michael Schreiber
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by virtualguru »

When considering compression and deduplication, you first need to identify what your desired outcome is. Both technologies can save disk space, but using them together can reduce the effectiveness of deduplication because compression gets applied first.

As stated previously in this thread, compression can reduce network bandwidth, but at the expense of increased CPU demand on the proxy. Also previously stated in this thread, compression can mask dedupable data, thereby decreasing deduplication rates. It's important to remember that deduplication will increase CPU demand on the repository server, and both technologies, whether used individually or together, will increase CPU demand during a restore.

Returning to where I started, you need to decide what your desired outcome is. If reducing network bandwidth is most important, use the highest compression setting. If you want the best deduplication ratios, don't use compression at all. If your backup system is CPU constrained, you may not want to use either technology. If you're looking for the best balance of compression and deduplication, I would suggest you start with the "Dedupe-friendly" compression setting.
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by DonZoomik » 1 person likes this post

As I mentioned in the other thread, the case has been closed out by support - it's confirmed in the lab to be slow, and since it's handled by refs.sys, there is nothing Veeam can do about it.

On enabling compression even with dedup: it also helps when your repository is limited by write performance.
Windows deduplication can still do well even on compressed data. Since Veeam's compression works on fixed blocks, the result can still be deduplicated further. Cloned systems compress to the same data set, etc. Multiple fulls are still likely very similar (even when compressed) and can be deduplicated.
Look at the test results here: https://www.craigrodgers.co.uk/index.ph ... on-part-3/ While total savings are greatest with uncompressed data, in the end it's a game of compromises.
RedVision81
Enthusiast
Posts: 39
Liked: 2 times
Joined: Jan 14, 2016 8:02 am
Full Name: Peter
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by RedVision81 »

That's a very interesting article, thanks a lot!
craig.rodgers2
Novice
Posts: 4
Liked: 1 time
Joined: Sep 24, 2015 11:55 am
Full Name: Craig Rodgers
Location: Northern Ireland
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by craig.rodgers2 »

Glad you liked it :)
xarses
Lurker
Posts: 2
Liked: never
Joined: Jun 21, 2017 6:15 pm
Full Name: Sebastian Santamaria
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by xarses »

Interesting article!
We are thinking about updating our backup scheme. We currently have:
a DR4100 as primary storage
a physical Veeam server
approx. 300 VMs organized in grouped backup jobs

We have the option to reformat the DR4100 (currently it is not very performant) and install Windows Server (2016 or 2019, we still need to choose) with ReFS.
Frenchyaz
Influencer
Posts: 13
Liked: 4 times
Joined: Nov 01, 2018 8:32 pm
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by Frenchyaz »

RedVision81 wrote: Feb 13, 2019 12:05 pm With compression set to None, my fulls would be 7-8 TB, and because Microsoft limited deduplication to files up to 4 TB, I would just lose more space and get an even worse ratio.
Not sure the size limit is 4 TB for Server 2019, as I can dedup a 7.42 TB file and its size on disk is only 3.48 TB - so that's about 50%?
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by DonZoomik »

7.42 - 3.48 ≈ 4 TB, i.e. only 4 terabytes got deduplicated. Size on disk should be near zero for a fully deduplicated file (the savings are per volume, not per file).
LeoKurz
Veeam ProPartner
Posts: 28
Liked: 7 times
Joined: Mar 16, 2011 8:36 am
Full Name: Leonhard Kurz
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by LeoKurz »

Any insights on which allocation unit size to use with dedup on ReFS? Usually you should choose 64K for a repo, but what about with dedup?

Thanx
__Leo
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: Server 2019 + ReFS + Deduplication Experiences

Post by DonZoomik » 1 person likes this post

It shouldn't matter for deduplication, because deduplication doesn't care about cluster boundaries, but better to stay with 64K in case you ever want to disable deduplication.
There is some effect if you use compression (the resulting data block after compression can be allocated with 4K granularity instead of 64K), but as in practice you can't use any of the ReFS benefits anyway (synthetics, fast merge), it doesn't really matter.
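
For reference, the current cluster size can be checked with fsutil, and a 64K ReFS format looks like this. A sketch only - reformatting is destructive and the drive letter is a placeholder:

# Show the ReFS cluster size ("Bytes Per Cluster") of the repo volume
fsutil fsinfo refsinfo B:
# Format a fresh repo volume with 64 KB clusters - this wipes the volume
Format-Volume -DriveLetter B -FileSystem ReFS -AllocationUnitSize 65536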