Discussions specific to the VMware vSphere hypervisor
pkelly_sts
Expert
Posts: 577
Liked: 64 times
Joined: Jun 13, 2013 10:08 am
Full Name: Paul Kelly
Contact:

Testing ReFS & storage thoughts

Post by pkelly_sts » Sep 06, 2018 9:49 am

I'm in a fortunate position: because I've just migrated some storage arrays from HP P2000 to HPE MSA 2052, I have a bunch of storage due for decom that I can have a "play" with before it goes. I want to take the opportunity to run a test of 2016/ReFS to help plan what we'll end up doing when we migrate for real (our physical backup server is 2012 R2 and due for replacement), so I thought I'd drop the question here along the lines of: what would you do?

Hardware Kit list:
P2000-1:
1 x SFF FC head unit & 1 x expansion enclosure
33 x 300GB 15k SAS
13 x 600GB 10k SAS

P2000-2:
Identical to P2000-1

We currently have these fronted by a pair of FalconStor appliances (until they're decommed, as we've migrated those to new kit as well), so I /could/ pool the whole lot in some way and present it, but we'd be limited by 16TB of FalconStor capacity licensing.

So, bearing in mind the P2000 is limited to vdisks comprising a max of 16 physical disks, that gives either 2.4(ish) TB in RAID-10 or 4.2TB in RAID-6 for each set of 16 of the 300s, and 3.6TB (R10) or 6TB (R6) for the 600s.
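As a sanity check, those capacity figures can be reproduced with a quick calculation (assuming RAID-10 halves raw capacity and RAID-6 loses two disks' worth of parity per vdisk; decimal TB, ignoring formatting overhead):

```python
# Usable capacity per vdisk for the options above (decimal TB).
# Assumptions: RAID-10 halves raw capacity; RAID-6 costs two disks of parity.

def usable_tb(disk_tb: float, disks: int, raid: str) -> float:
    if raid == "raid10":
        return disk_tb * disks / 2
    if raid == "raid6":
        return disk_tb * (disks - 2)
    raise ValueError(f"unknown RAID level: {raid}")

# A 16 x 300GB vdisk, and a 12 x 600GB vdisk (even disk count for RAID-10)
for tb, n, label in [(0.3, 16, "16 x 300GB"), (0.6, 12, "12 x 600GB")]:
    print(f"{label}: RAID-10 {usable_tb(tb, n, 'raid10'):.1f}TB, "
          f"RAID-6 {usable_tb(tb, n, 'raid6'):.1f}TB")
```

Running it prints 2.4TB/4.2TB for the 300s and 3.6TB/6.0TB for the 600s, matching the figures above.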

Either way, it results in 3 chunks of disk that I can present to the new test repository server, which for now will likely be a VM to start with, as I can do all this remotely before I spin up a spare physical server to host the repos.

Current full backup size for this site is around 4TB.

So, the first question is: once the initial full backup (or backup copy) is completed, do the benefits of ReFS for subsequent synthetic fulls outweigh the overhead of using RAID-6?
Assuming I'm testing with a VM repo server initially, if I present the 3 LUNs (from one of the P2000s) to the VM, would I be best just creating a SoBR with the 3 LUNs, or is there a benefit to using Storage Spaces?
How about if I then present a further identical 3 LUNs from the other P2000?

I think my ultimate aim is to
A) see what the performance of ReFS is like given OUR source VMs & change rate (knowing everyone is different)
B) See what we can achieve keeping on-disk long-term to save us having to revert to tape for older backups (we'll still be using tape for the foreseeable as a final stage)
C) Establish whether, when I deploy the final solution on the final hardware, I should just create all repo storage as ReFS and let Veeam get on with it...
D) If it's deemed relevant, learn a bit about Storage Spaces, as I haven't had a need to touch them up to now and haven't really absorbed what they might offer (or what their limitations might be)

What would you do?

:-)

Mike Resseler
Product Manager
Posts: 5887
Liked: 647 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium
Contact:

Re: Testing ReFS & storage thoughts

Post by Mike Resseler » Sep 18, 2018 6:40 am

ReFS (don't forget to use 64K clusters) will give you the benefit of using less storage and very fast merges when you use it in an incremental mode with a (for example, weekly) synthetic full. What actually happens is that you create your first backup (a full, so everything needs to be written: real I/O). Then during the week you run incremental backups; again, all of those are written to disk (more I/O). On Saturday you do a synthetic full. Now the benefit comes in:
1: It will be fast because the I/O is limited. Thanks to the block cloning mechanism in ReFS, we don't need to create a new file from the full backup and merge the changes into it; we simply create a file that consists of pointers to blocks already on the volume, so only pointers and metadata need to be written.
2: There will be less storage usage, simply because we don't write the data twice.
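To make that concrete, here is a toy Python model of the idea (purely illustrative; not Veeam's or ReFS's actual implementation): files are maps from block position to a physical block ID, and a synthetic full is built by copying pointers rather than data.

```python
# Toy model of block cloning: physical blocks live in one shared pool,
# and a "synthetic full" merge writes only pointers, never data.

class ToyRefs:
    def __init__(self):
        self.blocks = {}   # block_id -> data (physical storage)
        self.files = {}    # filename -> {position: block_id}
        self._next = 0

    def write(self, name, changes):
        """Ordinary write: every changed position consumes a new block."""
        f = {}
        for pos, data in changes.items():
            self.blocks[self._next] = data
            f[pos] = self._next
            self._next += 1
        self.files[name] = f

    def synthetic_full(self, name, sources):
        """Block-clone merge: later sources override earlier positions.
        Only block IDs are copied; no physical blocks are written."""
        merged = {}
        for src in sources:
            merged.update(self.files[src])
        self.files[name] = merged

fs = ToyRefs()
fs.write("full.vbk", {0: "a", 1: "b", 2: "c", 3: "d"})   # 4 blocks of I/O
fs.write("inc.vib", {1: "b'"})                           # 1 block of I/O
fs.synthetic_full("synth.vbk", ["full.vbk", "inc.vib"])  # 0 blocks of I/O
print(len(fs.files["synth.vbk"]), len(fs.blocks))        # 4 files blocks, 5 physical
```

The synthetic full references four blocks, yet only five blocks exist on "disk" in total: that is the whole space and I/O saving in miniature.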

ReFS can detect and heal corruption, but you need classic Storage Spaces or S2D to benefit from the healing. If you don't use them, ReFS will still detect corruption but won't be able to repair it, so make sure you apply a good 3-2-1 rule to avoid ending up with bad or no backups.

Personally (but I am biased), I would test S2D (if you have the licenses) in combination with ReFS and see what the benefits are. In my humble opinion, they are really great. I know there have been some issues with ReFS, but most of them seem to be solved. Just make sure that you have enough memory to run it.

My 2 cents

pkelly_sts
Expert
Posts: 577
Liked: 64 times
Joined: Jun 13, 2013 10:08 am
Full Name: Paul Kelly
Contact:

Re: Testing ReFS & storage thoughts

Post by pkelly_sts » Sep 18, 2018 9:24 am

Thanks for the comments. I ended up spinning up a ReFS repo on a VM and provisioned the storage to it just as mount points for now, formatted as ReFS (64K allocation unit size, etc.), and let it rip with an extra/test backup copy.

I found some weird behaviour after the first weekly backup, in that the "deduplication" on the main volume (I added 3 ReFS volumes as a SoBR) was crazy high: a backup folder showing 438GB used (right-click | Properties on the folder), while Storage Management indicated that <10GB was actually used on disk! I wasn't expecting this (at least not so early in the testing), as I know dedupe as such is different from, and not compatible with, fast cloning, but I have the screen-grabs as evidence to look at at some point!
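That mismatch makes sense once you count blocks rather than files: Explorer's Properties sums every file's logical size, while Storage Management counts each physical block only once, however many backup files reference it. A tiny sketch with made-up numbers:

```python
# Logical vs physical size on a block-cloned volume, modelled as sets of
# block IDs per file. Figures are illustrative, not from the real repo.

files = {
    "week1-full.vbk": set(range(0, 100)),                          # original full
    "week2-synth.vbk": set(range(5, 100)) | set(range(100, 105)),  # ~95% cloned
}

logical = sum(len(blocks) for blocks in files.values())  # what Explorer adds up
physical = len(set().union(*files.values()))             # unique blocks on disk
print(f"logical {logical} blocks, physical {physical} blocks")  # 200 vs 105
```

Two "full" backups, but barely more than one full's worth of physical blocks; scale the same effect up over several synthetic fulls and a 438GB folder occupying <10GB of new space is plausible.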

I still have much to learn about S2D, but from what I gather it's primarily aimed at internal storage, whereas I'm lucky in this case to have spare external/FC storage that I'm looking to use until it drops, basically (it's out of hardware support and not worth spending more on as we have new prod storage, but we see no sense in disposing of it whilst it still works and we internally have a load of spare disks available).

So, more reading on my part but thanks for the feedback...
