RGijsen
Expert
Posts: 124
Liked: 25 times
Joined: Oct 10, 2014 2:06 pm

New repository - NTFS or REFS with regards to restore speeds

Post by RGijsen »

Hi. So far I've always had a lot of trouble fitting our backups on our main repository. Our remote repository uses ReFS, which works fine there, but our main site still uses NTFS. We are a small company and budget isn't 'just' available. I now have an 8TB NTFS repository with dedupe enabled, storing about 28TB of data. We dedupe everything older than 0 days, so we dedupe everything. Last week sh#t hit the fan for one VM, so I had to restore it. It was just a 30GB VM, but restoring took about 30 minutes. The bottleneck is the storage. The repository is on a 16 x 1TB 7200rpm SAS disk array, on its own dedicated 8Gb FibreChannel SAN (by that I mean it's not on the production SAN). The rest of the storage on that disk array is just archive storage; 99% of all IO on that SAN comes from Veeam's repository. Of course, this is somewhat to be expected. Over time, dedupe causes heavy fragmentation of big files, so restoring a VM takes much more IO than when stored without dedupe.

I've had it with the crappy solutions, which take up a lot of my time, which also translates into money. So next Wednesday I'm installing a new disk shelf on the backup SAN with 12x 2TB 7200rpm disks. That one will be used solely for Veeam. That means space is no longer an issue as long as I do some type of dedupe - be that actual NTFS dedupe or ReFS block cloning (I know technically that's not dedupe, but the result is more or less the same). Now I'm not sure what to format the repository with.

If I use NTFS with dedupe, and set it to dedupe only files older than, let's say, a week, my restores should be quick. We do one active full a week, which would mean that even on the last day of the backup chain, I'd still be able to restore a VM from non-deduped, i.e. pretty much sequential, data. If I need to restore something quickly, that mostly means I need to restore the last backed-up state. If I need older data, that has always been less urgent. So I think NTFS with dedupe > 7 days is a good option here.

If I use ReFS, I don't have that choice. I see the benefit of ReFS in it preventing redundant data in the first place, however, that means over time all full backup files will get so fragmented that restore times will get longer and longer again, just as my current issue is.

The main reason for finally ordering new storage is that restoring that 30GB VM took about 30 minutes, which is unacceptably slow. I can't imagine how long it would take to restore a 2TB VM, even if we use instant recovery. So what do you think? Should I go ReFS or NTFS?
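To put a rough number on that, here is a back-of-the-envelope sketch in Python. It assumes the restore rate stays constant regardless of VM size, uses binary units (1GB = 1024MB, 2TB = 2048GB), and models a plain full restore rather than instant recovery. The function names are made up for illustration.

```python
def restore_rate_mb_s(size_gb: float, minutes: float) -> float:
    """Average restore rate in MB/s for a restore of size_gb taking `minutes`."""
    return size_gb * 1024 / (minutes * 60)


def restore_hours(size_gb: float, rate_mb_s: float) -> float:
    """Hours needed to restore size_gb at a constant rate (assumption: linear scaling)."""
    return size_gb * 1024 / rate_mb_s / 3600


# Observed: 30GB VM restored in about 30 minutes.
observed_rate = restore_rate_mb_s(30, 30)

# Extrapolation to a 2TB VM at the same rate.
hours_for_2tb = restore_hours(2048, observed_rate)

print(f"observed rate: {observed_rate:.1f} MB/s")
print(f"2TB restore at that rate: {hours_for_2tb:.1f} hours")
```

So at the observed rate of roughly 17 MB/s, a 2TB VM would take well over a day, if the rate really does stay flat.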
Gostev
Chief Product Officer
Posts: 31556
Liked: 6719 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: New repository - NTFS or REFS with regards to restore sp

Post by Gostev »

RGijsen wrote: I see the benefit of ReFS in it preventing redundant data in the first place, however, that means over time all full backup files will get so fragmented that restore times will get longer and longer again, just as my current issue is.
"Fragmentation" is very different on ReFS, since what gets cloned are Veeam blocks, which are 512KB in size on average, so the impact will be nowhere near what you're seeing with Windows deduplication, which uses variable-block-size dedupe down to a few KB. The drives you're buying provide about 100 IOPS each, so simple math gives you a total throughput of about 50MB/s per spindle from a 100% fragmented volume (100 IOPS x 512KB) - and you have 12 of those.
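The simple math above can be written out as a small Python sketch. It assumes every IO is fully random and moves exactly one block, and it ignores RAID layout, controller cache, per-IO transfer time and the SAN link, so treat the results as a rough ceiling rather than a prediction. The 4KB figure is only an illustrative stand-in for "a few KB", and the function name is hypothetical.

```python
def random_read_mb_s(iops: float, block_kb: float) -> float:
    """Throughput in MB/s when each random IO reads one block of block_kb."""
    return iops * block_kb / 1024


IOPS_PER_DRIVE = 100   # rough figure for a 7200rpm drive, as quoted above
DRIVES = 12            # the new disk shelf

# ReFS block cloning: Veeam blocks of about 512KB on average.
refs_per_spindle = random_read_mb_s(IOPS_PER_DRIVE, 512)
refs_total = refs_per_spindle * DRIVES

# Windows dedupe: chunks down to a few KB (4KB assumed for illustration).
dedupe_per_spindle = random_read_mb_s(IOPS_PER_DRIVE, 4)
dedupe_total = dedupe_per_spindle * DRIVES

print(f"512KB blocks: {refs_per_spindle:.1f} MB/s per spindle, {refs_total:.0f} MB/s total")
print(f"4KB chunks:   {dedupe_per_spindle:.2f} MB/s per spindle, {dedupe_total:.2f} MB/s total")
```

Under these assumptions the block size alone accounts for a 128x difference in worst-case throughput for the same number of IOPS.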
RGijsen

Re: New repository - NTFS or REFS with regards to restore sp

Post by RGijsen »

Thanks Gostev. I'm not entirely sure about that IOPS calculation, it seems a bit too simplified ;) But of course, random 512KB vs random 4k is a no-brainer. I'll test with ioMeter in advance. Another thing with ReFS, though, is one I covered here: microsoft-hyper-v-f25/inline-dedupe-on- ... 38900.html. We talked about a form of active full with ReFS. Or more like a semi-full: a full disk scan that only transfers the blocks that actually changed. I haven't had a response there since; is anything like that in the pipeline?
Gostev

Re: New repository - NTFS or REFS with regards to restore sp

Post by Gostev »

I should have used "simplified math" instead of "simple math" :D

Sorry for not getting back to you on the other thread - that was when I left for a few weeks of business travel. Yes, this is definitely something we're planning to investigate and test post-v10. Generally, we see many benefits in leveraging advanced file system capabilities, so you can expect us to build more stuff on top of that.