JPMS
Expert
Posts: 105
Liked: 31 times
Joined: Nov 02, 2019 6:19 pm
Contact:

REFS Restore Performance

Post by JPMS »

I have had a search through the forum and there seems to be little discussion of ReFS restore performance. It's all very well having faster backups and saving space with block cloning, but what's the impact on restore speed?

What originally got me thinking about this were the issues tape users are having with Synthetic Fulls created with block cloning - tape-f29/slow-tape-job-performance-t61054.html#p356233. Spoiler alert: don't bother with block cloning if you don't want to see your tape backups crippled! In that thread, the reason Veeam gave for the massive reduction in tape backup speed was the rehydration of ReFS backups. When I initially read the posts, I thought that if that was true then it should be the same for restores, as they use the same process.

Anyway, as you can read there, I have just implemented a Server 2019 ReFS repo and have run some test restores to see the impact of block cloning on restore speeds, and it was greater than I expected.

The full details of what I did are in my post on the other thread, but in summary:
Did an Active Full of all our VMs (repo set to 'Use per-VM backup files'). The next night, ran a Forward Incremental with Synthetic Full (to dump to tape). The following night, ran another Forward Incremental. Then did test restores from each night, both of 14 VMs together and of a single VM.

The 14 VMs had a total size of 710GB. Data changed in the first incremental: 16GB; in the second incremental: 33GB.
Restore speed from Active Full: 563MB/s. After one incremental: 473MB/s. After two incrementals: 438MB/s.

The single VM I restored had a size of 61GB. Data changed in the first incremental: 2GB; in the second incremental: 6GB.
Restore speed from Active Full: 488MB/s. After one incremental: 458MB/s. After two incrementals: 384MB/s.

Caveat: I only ran these tests once, and no doubt other people's experience may differ, especially depending on the amount of data change. This was to give me a rough idea, not meant to be a 'proper' test. That said, I was quite surprised by the results: a >20% drop after just two incremental backups!
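For anyone wanting to sanity-check that percentage, here is the quick arithmetic as a PowerShell snippet, using only the figures quoted above:

```powershell
# Drop in restore speed after two incrementals, from the figures above
$drop14 = (563 - 438) / 563 * 100   # 14-VM restore: ~22.2%
$drop1  = (488 - 384) / 488 * 100   # single-VM restore: ~21.3%
"14 VMs: {0:N1}% drop; single VM: {1:N1}% drop" -f $drop14, $drop1
```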

What is other people's experience with restore speeds from block-cloned backups? My implementation is based on Server 2019 LTSC with the latest mid-January ReFS update. Is this sort of performance drop to be expected? Is it going to keep getting worse with each incremental backup? Is it better with other ReFS implementations (2016, or Server 1903/1909)?

Finally, unless Veeam can sort out the crappy tape performance with block-cloned backups, I am going to have to disable block cloning. In that situation, am I better off reformatting the drive as NTFS or sticking with ReFS?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

This comes down to how storage works, and the simple fact that random I/O is much slower than sequential I/O...

Also, the issue is definitely not specific to ReFS, as it was no different in the NTFS days. Rather, the issue is specific to using forever-incremental backup with no periodic fulls. You can get great restore performance on any file system when using periodic Active Fulls, but then you're obviously trading that for slower backups, a larger impact on the production environment, and significantly increased disk space usage.

The good news is that storage only gets faster every year, so these days the performance drop due to random I/O is much less of a concern than it was when I started at Veeam almost 12 years ago - especially since Veeam uses a fairly large block size.
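If you want to see this effect on your own repository volume, comparing sequential vs random reads with Microsoft's diskspd tool will show it clearly. A minimal sketch - the path, file and block size below are placeholders, not Veeam's actual I/O pattern, and -w0 keeps both runs read-only:

```powershell
# Sequential read of an existing backup file, 512KB blocks, caching disabled
.\diskspd.exe -b512K -d30 -t1 -o8 -w0 -Sh D:\Backups\test.vbk

# Same parameters, but random I/O (-r): expect a much lower MB/s figure on HDDs
.\diskspd.exe -b512K -d30 -t1 -o8 -w0 -Sh -r D:\Backups\test.vbk
```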

v10 does bring some tape offload performance improvements which should help with enterprise-grade backup storage.
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: REFS Restore Performance

Post by mkretzer »

What kind of improvements are there for tape in v10? Real parallel processing?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

Not sure what you mean by "real parallel processing" in the context of offloading backups to tape...
But there are a bunch of under-the-hood enhancements and optimizations for this specific use case in v10.
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: REFS Restore Performance

Post by mkretzer »

"real parallel processing" = Not only sequentially reading the backup files and writing to tape but reading with multiple streams from multiple files even to one tape drive.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

Well, that would be a questionable feature to have really, as it would result in multiple full backups spread across multiple tapes. That means much slower restores (as you will need to mess with multiple tapes to restore a single backup file), but more importantly worse archive reliability (as losing a single tape means losing multiple backups, even if the other tapes are still fine). Let's not forget we do backups not for the sake of fast backups, but to ensure fast and reliable restores!
JPMS
Expert
Posts: 105
Liked: 31 times
Joined: Nov 02, 2019 6:19 pm
Contact:

Re: REFS Restore Performance

Post by JPMS » 1 person likes this post

I'm with mkretzer on this. I understand your reply, Gostev, but with tapes running to 12TB capacity we have no need to exceed a single tape, so multi-tape problems are not an issue for us. I understand what you say about the effect on restore speeds, but as tape is a secondary backup medium, we are unlikely ever to need to restore from it - yet we do have to back up weekly, and tape backup speeds are a real issue. It could be implemented as a user-selectable option.

I noted your comment "v10 does bring some tape offload performance improvements" (my emphasis), but what we need is major tape offload performance improvements. My post tape-f29/slow-tape-job-performance-t61054.html#p356233 has received no reply. If my repo is capable of delivering a single stream at 400MB/s, it is unacceptable that this can only be written to tape at 100MB/s when the tape drive is capable of 300MB/s. Not only is there the greatly increased backup time, but also the additional wear on tapes and tape drive from 'shoe-shining'.
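To put those speeds in perspective, here is the back-of-the-envelope maths for offloading a full 12TB tape's worth of data, assuming the drive actually sustains the quoted rates:

```powershell
# Hours to write 12TB to tape at the observed vs the rated drive speed
$bytes = 12e12
"At 100MB/s: {0:N1} h; at 300MB/s: {1:N1} h" -f ($bytes / 100e6 / 3600), ($bytes / 300e6 / 3600)
# => roughly 33 hours vs 11 hours
```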

Parallel processing is one possible solution, buffering in memory could be another.
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: REFS Restore Performance

Post by mkretzer »

Gostev, sorry to say this, but I think you really do not see the problems with that standpoint. We have really fast backup storage, yet the tapes still cannot keep streaming all the time, which is not good for the tapes at all. Furthermore, we need more tape drives just to get our backups done within the backup window!
Once we get to LTO-9 or LTO-10, no rotating-disk-based storage will be able to keep the tapes streaming at optimal speed.

I simply cannot understand why, for a feature that has been in other backup solutions (for example HP Data Protector) for more than 15 years, Veeam tells me again and again "this does not make sense!".

With Data Protector you can configure 1-32 streams. Why not give users the choice?
With Veeam you have no chance at all. I have had cases again and again and never a solution. Veeam simply cannot use tape drives efficiently unless the source storage is SSD-based or has the data laid out on disk so that it can all be read sequentially.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

Giving users choices means creating more code paths, further diluting QC resources and, as a result, delivering worse quality for all users, including those NOT using the new options. My job here at Veeam is specifically to say "No" to adding choices :D

Next, comparing Veeam with legacy tape backup tools designed 30 years ago for file-level backup is totally invalid. Obviously, multi-streaming does not present any of the issues I was talking about in my previous post when you're backing up files with an average size well under 100KB. Whatever functionality those tools have is totally irrelevant for Veeam, which deals with multi-TB backup files - making the concerns explained above extremely relevant.

Having said that, I will be much more willing to consider adding this feature as soon as we get to LTO-9 or LTO-10 and it becomes the new norm for the majority of our customers. But until that happens, trust me - there are a number of much more pressing and essential feature requests for our tape development team to work on.
mkretzer wrote: Jan 29, 2020 9:09 pm Veeam simply cannot use tape drives efficiently unless the source storage is SSD-based or has the data laid out on disk so that it can all be read sequentially.
This, on the other hand, is a true statement - which is why this was the area I focused our tape dev team on for v10. It is also the reason you'll see almost no other "visible" tape enhancements in v10. It was a big task, but down the road other product functionality will also benefit from the work they've done.
RGijsen
Expert
Posts: 124
Liked: 25 times
Joined: Oct 10, 2014 2:06 pm
Contact:

Re: REFS Restore Performance

Post by RGijsen » 4 people like this post

JPMS wrote: Jan 29, 2020 8:51 am I'm with mkretzer on this. I understand your reply, Gostev, but with tapes running to 12TB capacity we have no need to exceed a single tape, so multi-tape problems are not an issue for us. I understand what you say about the effect on restore speeds, but as tape is a secondary backup medium, we are unlikely ever to need to restore from it - yet we do have to back up weekly, and tape backup speeds are a real issue. It could be implemented as a user-selectable option.

I noted your comment "v10 does bring some tape offload performance improvements" (my emphasis), but what we need is major tape offload performance improvements. My post tape-f29/slow-tape-job-performance-t61054.html#p356233 has received no reply. If my repo is capable of delivering a single stream at 400MB/s, it is unacceptable that this can only be written to tape at 100MB/s when the tape drive is capable of 300MB/s. Not only is there the greatly increased backup time, but also the additional wear on tapes and tape drive from 'shoe-shining'.

Parallel processing is one possible solution, buffering in memory could be another.
The following comment is meant to be constructive, not offensive.

I also agree with mkretzer. While I understand Gostev's post about backups needing to be sequential on tape, that's just theoretical and not really feasible with today's streamers anymore - nor was it 10 years ago with that era's technology. We don't all have the budget for all-flash 32Gbps+, multi-PB SANs just for backup purposes. Much worse than a tape backup not being sequential is a tape streamer never getting into streaming mode and shoe-shining your tapes.
Honestly, I have mixed feelings. On one (small) side I go with Gostev on this, but on a much bigger side I feel he's doing exactly what he's more or less accusing Microsoft of with his Office 365 GUI 'rant' in his latest digests. MS dictates what is good for us, and now so does Veeam. It's certainly not that black and white, but still, users should have an option. The argument about having multiple code paths is a complete non-argument in my book in this matter.

Personally, on our main site, we've stopped using ReFS altogether. On our previous backup repository (a dedicated 16Gbps SAN with, at that point, only mechanical disks), within weeks the restore performance didn't exceed 30MB/s anymore. After running into some corruption, and discovering that there are practically no ReFS tools or troubleshooting options available at all, we lost terabytes of data and space on our repository: ReFS marked a few files bad, making them no longer visible, but one is unable to reclaim the space, so all that space is simply lost. We now have ReFSUtil.exe, which can help recover some of the files, but the lost space is still a huge issue.

When using an actual Storage Spaces setup I'm pretty sure ReFS can shine (though don't read too many topics on TechNet about it, you might get scared), but on single volumes my bet is it's far better to get some more large spindles and stay with NTFS. Spindles aren't that expensive and are well worth the investment in my book. The space savings with ReFS are great, but the obvious randomization of data kills performance - unless you have a nice big SSD solution, of course. We've moved back to NTFS, and for the forthcoming years we won't look back at ReFS until it matures more.

We stopped using our trusty old LTO-3 library a few years ago and switched to a remote-site repository with spindles. However, given that ransomware could be sitting in our systems without being activated or known, and given all the threats around in recent times, we are considering using a tape library again just to have a non-connected backup. Having read dozens of topics about tape with Veeam, it seems Veeam is just not that good with tapes (and of course I understand you read more bad news than good news). We also come from Data Protector, and while it had its nags, it was a blast seeing multiple streamers actually stream and therefore be really efficient. Comparing that with Veeam IS valid - a 100% valid comparison. I understand your point, but from a user's point of view this 30-year-old technology worked better than the new one (with regard to tapes, don't get me wrong).

Often in your digest you 'attack' other companies, claim how good Veeam is and how well it is doing, and how much better Veeam is compared to the others. Fair enough - Veeam is probably the best of the breed at the moment. But it's not Valhalla either. Please be and stay critical of your own company and products as well, and don't pull an MS on your users by deciding you know what's best for them, because you don't.

'I don't dictate how or where the user is going to use his computer; that's all up to the user.' - Jack Tramiel, 1985.
aich365
Service Provider
Posts: 296
Liked: 23 times
Joined: Aug 10, 2016 11:10 am
Full Name: Clive Harris
Contact:

Re: REFS Restore Performance

Post by aich365 »

We have several repositories which are VMs, so they cannot access tape directly.
Their storage is ReFS on a SAN.
Our tape server is a physical machine connected to an IBM tape library.
Our backups run at 25-30MB/s because the data has to be transferred across the 1Gbps network.
Veeam has proxy servers that can transfer data with Direct SAN Access (DSA).
How long before we can have DSA for tape?
The alternative seems to be multiple physical servers with tape configured as repositories.

We have some NTFS repositories which run at 75MB/s - not sure whether to upgrade these to ReFS.
daniel.farrelly
Influencer
Posts: 15
Liked: 5 times
Joined: Feb 29, 2016 5:16 pm
Full Name: Daniel Farrelly
Contact:

Re: REFS Restore Performance

Post by daniel.farrelly »

Please tell me I've missed something, but isn't the point of tape archives that they're dependable, nearly indestructible, and most certainly air-gapped to some extent, depending on the config? Isn't this one of the reasons tape has resurfaced in popularity over the last couple of years? Of course it may be possible to restore at a few hundred MB/s in a perfect setting, but tape has never been designed for performance. If you are that concerned about restore performance, then you need all-flash arrays with 40/50GbE NICs - though I'm not sure about recommending periodic active fulls :o ... then again, if anyone's doing that, I'm curious!
Tape is (still) great. Should one base their entire backup infrastructure around a single tape drive? Probably not. But I do understand budgets can be limited. If that's the case and you have one tape environment for all your backups, then I would hope you're more interested in how accurate your backups are than in how fast you can complete a restore.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

I think you just got lost in the tape vendors' marketing :D While tape indeed has generally better reliability than disk, it is of course not flawless either - it's not uncommon for a certain tape to become unreadable. Which is exactly why you don't want to end up in a situation where that tape contains a few blocks from nearly every backup you have.
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: REFS Restore Performance

Post by mkretzer » 2 people like this post

Gostev,

sorry, but "cuncurrency = 4" setting would most likely solve all our issues and still would mean that with per-VM 99 % of our VMs (yes, most are small, yes, other customers have bigger VMs) will not be spanned over more than one tape. But it would reduce shoe-shining and make tape even more reliable.

And best of all: in the event that everything needs to be restored from tape (not unlikely, as tape is somewhat of a "last line of defense"), the data could be written out to multiple target repositories at the same time.
DonZoomik
Service Provider
Posts: 368
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: REFS Restore Performance

Post by DonZoomik »

I've found that defragmenting backup repositories does help a lot with restore (and even backup) performance over time, even with ReFS. However, you must not have block-cloned synthetic fulls on ReFS (only forever-incremental chains!) - if you do have synthetic fulls, defragmentation will break up the cloned blocks and you will lose the space savings over time.
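If you want to check how fragmented a repository volume actually is before touching anything, an analysis-only pass is safe. A sketch - the drive letter is a placeholder, and whether analysis is supported depends on the OS and ReFS version; it's the full defragmentation pass that un-shares block-cloned data, so run that only on forever-incremental chains:

```powershell
# Report fragmentation only - does not move any data
Optimize-Volume -DriveLetter D -Analyze -Verbose
```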
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

You bring up a good point, as this has been my other concern with the proposal. Introducing concurrency and reading multiple backup files at once actually means putting MORE random I/O on already IOPS-constrained and struggling storage, which is what a typical backup repository is... so I fail to see how doing this can help tape offload performance in typical scenarios.

If you have ever tried to run multiple concurrent copy operations from a single HDD, you know exactly what I'm talking about.
mkretzer
Veeam Legend
Posts: 1145
Liked: 388 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: REFS Restore Performance

Post by mkretzer »

Gostev,

in our case this is simply not the issue, and I have proven that to your support many times!! We have a lot of drives in our repos, but 95% of those drives are idle the whole time because of the extremely inefficient sequential read.

One stream (diskspd) does ~150-250MB/s; many streams do >2GB/s!
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

In that case, v10 should help you.
cfizz34
Expert
Posts: 128
Liked: 14 times
Joined: Jul 02, 2010 2:57 pm
Full Name: Chad
Contact:

Re: REFS Restore Performance

Post by cfizz34 »

When will v10 be released?
veremin
Product Manager
Posts: 20284
Liked: 2258 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: REFS Restore Performance

Post by veremin »

It's already shipped, so you can either request the RTM build via a support ticket - or wait a few weeks until it becomes generally available. Thanks!
JPMS
Expert
Posts: 105
Liked: 31 times
Joined: Nov 02, 2019 6:19 pm
Contact:

Re: REFS Restore Performance

Post by JPMS »

We've got v10 and have found no discernible difference in the speed of output to tape - the caveat being we haven't done extensive testing, just taken some backups that we already had speed measurements for and run them again.

We've had B&R for 10 months and have struggled to find a setup we are happy with (quite a bit of it discussed here in the forum). We started with a CentOS repo, changed to Server 2019 LTSC with ReFS, and then tried Server 1903, but saw no noticeable improvement over 2019. We have now gone back to 2019 LTSC with NTFS. We use forward incremental with synthetic fulls, transform previous backup chains into rollbacks, and take a monthly active full. It has the highest I/O requirement, but that isn't an issue for us, and it is the only solution we have found that allows us to dump out to tape at a consistent 300MB/s.

Gostev, you mentioned improvements in tape performance. Can you give us any details, and in what sort of operations would you expect to see an impact?
JPMS
Expert
Posts: 105
Liked: 31 times
Joined: Nov 02, 2019 6:19 pm
Contact:

Re: REFS Restore Performance

Post by JPMS »

Gostev wrote: Feb 03, 2020 6:30 pm I think you just got lost in the tape vendors' marketing :D While tape indeed has generally better reliability than disk, it is of course not flawless either - it's not uncommon for a certain tape to become unreadable. Which is exactly why you don't want to end up in a situation where that tape contains a few blocks from nearly every backup you have.
Isn't this what happens anyway if you don't set 'Use per-VM backup files'? The backup file that is created is a mix of blocks from different VMs, so when that is dumped out to tape, isn't the result no different from writing multiple streams of individual VM backups to tape?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS Restore Performance

Post by Gostev »

v10 will only help tape offload performance in scenarios where the backup storage does have spare IOPS capacity, such as the perfect case mkretzer shared in his last post. But if the backup storage is already hitting its I/O performance limits, then obviously no tricks can make the offload any faster.

Anyway, by now this topic has been completely hijacked :) and since I believe the original question was fully answered in my first post, I will lock this topic now. If there's a need to discuss our tape functionality, or if you have questions about how it works, you're more than welcome to start a new topic in the appropriate sub-forum - where it can actually be seen by the responsible PMs :wink:

Thanks!