-
- Veteran
- Posts: 1264
- Liked: 451 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
How to provision a 1,5 PB ReFS Repo in 2024?
Hello,
we are in the process of building our (hopefully last) ReFS repo. All our other repos already use XFS+LVM, but we don't want to put all our eggs in one basket and want to keep at least one copy of our most important backups on ReFS. The new flash storage system we are planning should provide more than 1.2 PB in one volume (so we can store our biggest backups most efficiently).
The backend storage system we plan to use supports a maximum LUN size of 256 TB (others, like the new Dell arrays, support only 128 TB).
The only problem is that there are not many good options in 2024 to get such a big volume on Windows:
1. In the past we used dynamic, spanned volumes without issues. But this feature has been deprecated for so long that we worry it won't be supported in newer server OS versions.
2. Storage Spaces still does not support disks behind any kind of RAID (it supports FC and iSCSI but still has the requirement of "direct physical disk access").
3. Even without hardware RAID (which is still an option for us), I am not sure Storage Spaces on a single server provides performance as good as a modern flash RAID array. We did a quick benchmark on the newest Windows Server 2025 build with Storage Spaces dual parity, and random I/O is nearly 6 times slower on dual parity than on a simple disk (latest-generation 3.85 GHz AMD CPU).
Any idea how we can best aggregate our LUNs to get one nice, big volume?
Markus
-
- Service Provider
- Posts: 375
- Liked: 123 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: How to provision a 1,5 PB ReFS Repo in 2024?
Why not just do a SOBR out of 256 TB LUNs...?
Using a SAN as the ReFS backend is not optimal, as ReFS will not do SCSI UNMAP on anything but Storage Spaces. And Storage Spaces dual parity needs some CLI tuning to work well...
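(The tuning I mean is mostly about creating the virtual disk from PowerShell so you can pick the column count and interleave yourself and line a full stripe up with the ReFS allocation unit size; otherwise parity writes tend to degrade into read-modify-write. A quick sketch of that arithmetic is below; the 64 KB AUS and the column counts are only illustrative, not official guidance.)

```python
# Sketch: align Storage Spaces dual parity geometry with the ReFS allocation
# unit size (AUS). Rule of thumb (hedged, not official guidance): a full
# stripe, i.e. interleave * (columns - 2) for dual parity, should equal the
# AUS so large sequential writes avoid read-modify-write on parity.

REFS_AUS = 64 * 1024      # 64 KiB ReFS cluster size (illustrative choice)
PARITY_COLUMNS = 2        # dual parity

def suggest_interleave(total_columns: int, aus: int = REFS_AUS):
    """Interleave (bytes) so that data columns * interleave == aus,
    or None if the AUS does not split evenly across the data columns."""
    data_columns = total_columns - PARITY_COLUMNS
    if data_columns <= 0 or aus % data_columns:
        return None
    return aus // data_columns

for columns in range(4, 10):
    interleave = suggest_interleave(columns)
    if interleave:
        # These values map to New-VirtualDisk -NumberOfColumns / -Interleave
        # and Format-Volume -AllocationUnitSize.
        print(f"-NumberOfColumns {columns} -Interleave {interleave} "
              f"(Format-Volume -AllocationUnitSize {REFS_AUS})")
```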
"...at least one copy of our most important backups on ReFS"
I'd rather keep it on something else. About 6 months ago I had to wipe the dust off old WS2016-era workarounds to make a 250 TB ReFS system (on WS2022) stable again, as it would consistently lock up during block cloning. If you can afford to store your backups on an AFA SAN, go buy a StoreOnce or Data Domain instead, for example.
-
- Veteran
- Posts: 1264
- Liked: 451 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: How to provision a 1,5 PB ReFS Repo in 2024?
"Something else" would mean S3 for us... The S3 systems we got a quote on were twice as expensive as block storage.
About SOBR: SOBR still cannot rebalance without "backup downtime"! Our environment's data change rate is very dynamic. Some of our biggest machines peak only once per year, but then their change rate is very high.
So it's extremely difficult to always know the right amount of "spare capacity" per extent. With 7 extents of 256 TB each it would be about 50 TB per extent to be safe... But that also means ~350 TB of our shiny new flash storage has to stay unallocated.
We already had a situation where there was not enough space and synthetics were no longer fast-cloned, which then led to the same situation on the new extent because Veeam chose the "wrong" extent to switch to (it cannot look into the future either - it's hard enough for us to choose the right extents).
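Just to put rough numbers on that spare-capacity problem (a quick sketch; the 7 extents of 256 TB and the ~50 TB per-extent reserve are our illustrative figures, and the reserve really has to cover the largest chain that can land on an extent):

```python
# Sketch: how much SOBR flash capacity ends up idle as per-extent headroom.
# Figures are illustrative: 7 x 256 TB extents, ~50 TB reserve per extent.

EXTENTS = 7
EXTENT_SIZE_TB = 256
RESERVE_PER_EXTENT_TB = 50   # roughly the largest synthetic full we expect

total_tb = EXTENTS * EXTENT_SIZE_TB
reserved_tb = EXTENTS * RESERVE_PER_EXTENT_TB

print(f"Raw capacity:        {total_tb} TB")
print(f"Unallocated reserve: {reserved_tb} TB "
      f"({reserved_tb / total_tb:.0%} of the pool)")
```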
With XFS we do everything with nearly PB-sized XFS volumes (one per storage system). It's so much easier and much more efficient. We have never had any issues.
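(For anyone curious, the per-storage-system layout is just the usual LVM-plus-reflink setup. The sketch below drives it from Python purely for illustration; the device paths, VG/LV names and mount point are placeholders for whatever your array presents.)

```python
# Sketch: aggregate several SAN LUNs into one big XFS volume with LVM,
# one volume per storage system. Device paths and names are placeholders.
import subprocess

LUNS = ["/dev/mapper/mpatha", "/dev/mapper/mpathb", "/dev/mapper/mpathc"]
VG, LV, MOUNT = "vg_veeam", "lv_repo", "/mnt/veeam-repo"

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

for lun in LUNS:
    run(["pvcreate", lun])                         # each LUN becomes an LVM PV
run(["vgcreate", VG] + LUNS)                       # one volume group across all LUNs
run(["lvcreate", "-l", "100%FREE", "-n", LV, VG])  # one big linear LV
run(["mkfs.xfs", "-m", "reflink=1,crc=1", f"/dev/{VG}/{LV}"])  # reflink enables fast clone
run(["mkdir", "-p", MOUNT])
run(["mount", f"/dev/{VG}/{LV}", MOUNT])
```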
Maybe we really have to go all-in with XFS. It's just too good in combination with LVM. And with V13 coming to Linux, it can all run on one big server again...
-
- Service Provider
- Posts: 375
- Liked: 123 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: How to provision a 1,5 PB ReFS Repo in 2024?
"...that also means ~350 TB of our shiny new flash storage has to stay unallocated"
You'll have to treat these LUNs as thick-provisioned anyway due to the lack of UNMAP. I do wonder whether, when ReFS issues UNMAP on Storage Spaces, Storage Spaces passes the discard down to the underlying thin/SSD drives (I know LVM does)...
Actually, Storage Spaces is supported with a SAN for simple disks - I presume that is what you actually want (instead of dual parity), as your SAN likely provides back-end redundancy/RAID anyway. See here: https://learn.microsoft.com/en-us/windo ... requisites
Storage Spaces is supported on iSCSI and Fibre Channel (FC) controllers as long as the virtual disks created on top of them are nonresilient (Simple with any number of columns).
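If you go that route, pooling the LUNs into one simple space is only a few cmdlets. Below is a sketch (driven from Python purely for illustration); the pool/disk friendly names, drive letter and column count are placeholders, and you would obviously check which disks -CanPool actually picks up:

```python
# Sketch: build one big ReFS volume from SAN LUNs using a non-resilient
# ("Simple") storage space; redundancy comes from the array's own RAID.
# Friendly names, drive letter and column count are placeholders.
import subprocess

ps_script = r"""
$luns = Get-PhysicalDisk -CanPool $true
New-StoragePool -FriendlyName "ReFSRepoPool" `
    -StorageSubSystemFriendlyName "Windows Storage*" `
    -PhysicalDisks $luns
New-VirtualDisk -StoragePoolFriendlyName "ReFSRepoPool" `
    -FriendlyName "ReFSRepoDisk" `
    -ResiliencySettingName Simple `
    -NumberOfColumns $luns.Count `
    -UseMaximumSize
Get-VirtualDisk -FriendlyName "ReFSRepoDisk" | Get-Disk |
    Initialize-Disk -PassThru |
    New-Partition -DriveLetter R -UseMaximumSize |
    Format-Volume -FileSystem ReFS -AllocationUnitSize 65536
"""

subprocess.run(["powershell.exe", "-NoProfile", "-Command", ps_script], check=True)
```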
"...with V13 coming to Linux, it can all run on one big server again..."
Yup, currently ReFS is mostly good for an all-in-one box, as XFS is better in scale-out environments. We'll see how v1 of Veeam on Linux turns out...
-
- Veteran
- Posts: 1264
- Liked: 451 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: How to provision a 1,5 PB ReFS Repo in 2024?
UNMAP is not relevant, as this is a dedicated system which will always be at 99 % pool usage.
About the SAN with Storage Spaces and simple disks: what comes after the HBA part does not sound like external RAID is supported in any way:
"Adapters must not abstract the physical disks, cache data, or obscure any attached devices. This guideline includes enclosure services that are provided by attached just-a-bunch-of-disks (JBOD) devices."
I just wonder which systems are external, Fibre Channel attached AND do not have a RAID controller.