Novox
Expert
Posts: 128
Liked: 22 times
Joined: Jul 12, 2016 12:51 pm
Location: Vermont, U.S.A.
Contact:

ReFS w/Integrity Streams in resilient Storage Spaces (Software RAID) vs. Hardware RAID for data reliability?

Post by Novox »

I posted this on ServerFault but am getting no results. This isn't specifically related to Veeam, but I know Veeam's users will likely have insight into my question and it does impact DAS repository targets in VBR :D

I am specifically comparing two scenarios to determine which is most fault tolerant/data resilient, WITHOUT regard for speed.

I could implement a hardware RAID solution; however, hardware RAID HBAs are file system agnostic and do not recalculate/check parity on read.
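To make the "no parity check on read" concern concrete, here is a toy sketch of single-parity (RAID 5 style) XOR math. It is not how any particular controller is implemented; the function names and the three-block stripe are made up for illustration. It shows that parity can rebuild a block that is known to be missing, but a silent bit flip is only noticed if something actually recomputes parity, and even then parity alone does not say which block is wrong.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Reconstruct the one missing block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

def stripe_is_consistent(data_blocks, parity):
    """True if the stored parity still matches the data blocks."""
    return xor_blocks(data_blocks) == parity

# Hypothetical stripe: three data blocks plus one parity block.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripe)

# Case 1: a drive is known to be dead, so parity rebuilds its block.
rebuilt = rebuild_missing([stripe[0], stripe[2]], parity)

# Case 2: a block is silently corrupted. A plain read returns it as-is;
# only an explicit parity check reveals that the stripe is inconsistent.
silently_bad = [stripe[0], b"BBXB", stripe[2]]
mismatch_detected = not stripe_is_consistent(silently_bad, parity)

# Case 3: rebuilding block 0 from a stripe that contains the silent
# corruption propagates the error into the rebuilt block.
bad_rebuild = rebuild_missing([silently_bad[1], silently_bad[2]], parity)
```

In case 3 the rebuilt block differs from the original, which is the scenario where undetected bitrot meets a drive failure.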

Alternatively, ReFS with Integrity Streams enabled has direct integration with Storage Spaces (software RAID) such that:
When used in conjunction with a mirror or parity space, ReFS can automatically repair detected corruptions using the alternate copy of the data provided by Storage Spaces. Repair processes are both localized to the area of corruption and performed online, requiring no volume downtime.
https://docs.microsoft.com/en-us/window ... s-overview
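As I understand the quoted behavior, it boils down to "verify a checksum on every read, and if it fails, fetch another copy and repair the bad one in place." Below is a minimal sketch of that idea, assuming a two-way mirror. The `MirroredVolume` class is hypothetical, and CRC32 is only a stand-in checksum for the example; none of this reflects how ReFS or Storage Spaces are actually implemented.

```python
import zlib

class MirroredVolume:
    """Toy mirror that keeps a checksum per block, separate from the data."""

    def __init__(self, copies=2):
        self.copies = [dict() for _ in range(copies)]
        self.checksums = {}

    def write(self, block_id, data):
        self.checksums[block_id] = zlib.crc32(data)
        for copy in self.copies:
            copy[block_id] = data

    def read(self, block_id):
        expected = self.checksums[block_id]
        for copy in self.copies:
            data = copy[block_id]
            if zlib.crc32(data) == expected:
                # Found a valid copy: repair any copy that fails the check,
                # touching only this block and without taking anything offline.
                for other in self.copies:
                    if zlib.crc32(other[block_id]) != expected:
                        other[block_id] = data
                return data
        # No copy matches the checksum: report the error instead of
        # returning data that is known to be bad.
        raise IOError(f"block {block_id}: all copies failed checksum")

vol = MirroredVolume()
vol.write(7, b"backup block")
vol.copies[0][7] = b"backup blocc"   # simulate bitrot on one mirror copy
result = vol.read(7)                 # served from the good copy, bad one repaired
```

The key design point is that the checksum lives apart from the data it describes, so the reader can tell which copy is valid rather than just noticing that the copies disagree.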

Additionally, I've read that hardware RAID can still lose the entire array during a drive failure due to (a) bitrot that hasn't been detected by a patrol read and/or (b) the heavy read load placed on the surviving drives during the rebuild, which may itself cause an additional drive to fail, destroying the entire array.
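For a rough sense of scale on the rebuild risk, here is a back-of-the-envelope estimate of the chance of hitting at least one unrecoverable read error (URE) while reading the surviving drives. It assumes independent bit errors at a fixed rate, which is a simplification, and the numbers (a URE rate of 1 in 10^14 or 10^15 bits, three surviving 4 TB drives) are assumed for illustration rather than taken from any specific hardware.

```python
import math

def p_ure_during_rebuild(bytes_read, ure_per_bit=1e-14):
    """Probability of at least one URE when reading `bytes_read` bytes.

    Computes 1 - (1 - p)^n with log1p/expm1 to stay accurate for tiny p.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))

# Assumed example: a 4-drive single-parity array of 4 TB drives loses one
# drive, so the rebuild must read the 3 surviving drives in full.
surviving_bytes = 3 * 4 * 10**12

p_consumer = p_ure_during_rebuild(surviving_bytes, ure_per_bit=1e-14)
p_enterprise = p_ure_during_rebuild(surviving_bytes, ure_per_bit=1e-15)

print(f"assumed 1e-14 URE rate: {p_consumer:.1%}")
print(f"assumed 1e-15 URE rate: {p_enterprise:.1%}")
```

Under these assumptions the estimate is very sensitive to the quoted URE rate and to how much data has to be read, which is one reason a second parity drive (RAID 6) is often suggested for large arrays.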

Given the above, is ReFS w/Integrity Streams in a Parity or Mirror Storage Pool in Storage Spaces more resilient/less prone to unrecoverable corruption issues than hardware RAID?

ServerFault isn't giving me much love (https://serverfault.com/questions/10350 ... ta-regardl)

Thank you!
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: ReFS w/Integrity Streams in resilient Storage Spaces (Software RAID) vs. Hardware RAID for data reliability?

Post by Gostev »

Yes, in theory it should be more resilient thanks to seamless error correction on read. As you noted, RAID can potentially just return bad data (a bad read due to a URE), while my understanding is that ReFS on Storage Spaces Direct (S2D) with integrity streams enabled for files detects a "bad" payload and seamlessly delivers a mirror or parity copy of the affected block.

But when talking about corner cases like a rebuild after partial loss of drives, you should probably also consider how mature RAID technology is vs. S2D. Both will put quite some stress on the remaining storage, but S2D may additionally have bugs that have not been caught yet. I'm not aware of the current state of S2D in terms of reliability, but I know a few years ago it was pretty bad (the S2D cluster we used as a repository needed to be completely rebuilt every few weeks).
Andreas Neufert
VP, Product Management
Posts: 6749
Liked: 1408 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: ReFS w/Integrity Streams in resilient Storage Spaces (Software RAID) vs. Hardware RAID for data reliability?

Post by Andreas Neufert »

Storage Spaces is not a simple tool that should be used just because it is there. You need good preparation in terms of HCL hardware, controllers and infrastructure knowledge. If you do not want to invest a lot of time in the correct preparation, always go with a battery-backed RAID controller => a "good" one.

Re: ReFS w/Integrity Streams in resilient Storage Spaces (Software RAID) vs. Hardware RAID for data reliability?

Post by Novox »

Thank you both for responding so quickly!

It sounds like the consensus (both here and in the few comments at ServerFault *1,2) is that ReFS w/IS in Storage Spaces may/should provide improved data reliability, as in, it seems like it should given the Storage Spaces integration described in Microsoft's documentation. However, SS/S2D isn't as well established a technology, with as lengthy a track record of reliability, as hardware RAID HBAs. (In fact, there are recent instances which show other issues with SS/S2D.)

Given the totality of this information, the "better" of the two options with respect to data reliability at this time seems to be a battery-backed RAID controller. (I could also increase the patrol read/scrub frequency to help catch latent UREs, and use RAID6 to increase a degraded array's chances of a successful rebuild.) I could still use ReFS with Integrity Streams ON in this scenario: I might know sooner if there's a URE, but ReFS itself wouldn't be able to correct the issue, since it would have no alternate copy of the data to repair from.

I think I have my answer, thanks again!

*1.)
"What is the Storage Space configuration for the best protection of data" - not to use it? It randomly died for us, we never went back.
*2.)
From my experience, in terms of reliability, hardware RAID is better comparing to Storage Spaces. I had several times when Storage Spaces failed to rebuild after a drive failure (not related to bitrot).

Re: ReFS w/Integrity Streams in resilient Storage Spaces (Software RAID) vs. Hardware RAID for data reliability?

Post by Novox »

Just found this, https://smbitjournal.com/2012/05/when-n ... -reliable/, for an alternative perspective. However, it doesn't seem to contradict the consensus: that the SS/S2D ReFS w/IS "OnRead" data-recovery features are only as good as their implementation.

Re: ReFS w/Integrity Streams in resilient Storage Spaces (Software RAID) vs. Hardware RAID for data reliability?

Post by Novox »

Sorry for all the repeated posts...

Can we at least agree that if ReFS w/IS in Storage Spaces worked as intended, i.e.
When integrity streams are enabled, ReFS can clearly determine if data is valid or corrupt. Additionally, ReFS and Storage Spaces can jointly correct corrupt metadata and data automatically [via "OnRead" checksum verification AND periodic Integrity Scrubbing].
(The text in [brackets] is my addition.)

(https://docs.microsoft.com/en-us/window ... ty-streams)

That it is likely more resilient than a hardware RAID solution? If only for the fact that ReFS and Storage Spaces have tight integration and hardware RAID solutions know nothing about the underlying file system?

Again, this assumes that the Windows OS, Storage Spaces, ReFS, Integrity Streams, Integrity Scrubbing, etc. all work as intended.

Re: ReFS w/Integrity Streams in resilient Storage Spaces (Software RAID) vs. Hardware RAID for data reliability?

Post by Gostev »

Yep, agreed - this is exactly what I said in my first reply.