Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev » 2 people like this post

Hi Iain,

Yes, according to the feedback so far, with the latest fixes ReFS is now usable for the majority of users. Some still have issues, which are suspected to be caused by lack of RAM on the backup repository server. You can read the detailed feedback on the last few pages of this thread.

You need to make sure you have KB4088787 installed (this update is downloaded and installed automatically from Windows Update, but you can also download it from the Microsoft Update Catalog). This is the update from the second week of March, so all future updates will include the required fix as well.

Normally, there's no need for any reg changes.

Until 4KB reliability is proven in the field, it is safer to stay with the 64KB block size, especially since Microsoft is reportedly working on reducing ReFS memory pressure in the April update, and that memory pressure will of course be much higher with the smaller block size.
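
To put the block size point in numbers, here is a minimal sketch that only counts allocation units on a volume at 4KB versus 64KB. The 10 TiB volume size is a made-up example, and the script does not model real ReFS metadata or memory usage, which is not something anyone outside Microsoft can calculate exactly; it just shows why the smaller block size means far more blocks to track.

```python
# Rough comparison of allocation-unit counts at 4 KB vs 64 KB cluster size.
# This counts clusters only; it does NOT model actual ReFS memory usage.

KIB = 1024
TIB = 1024 ** 4


def cluster_count(volume_bytes: int, cluster_bytes: int) -> int:
    """Number of clusters needed to cover the volume, rounded up."""
    return -(-volume_bytes // cluster_bytes)


volume = 10 * TIB  # hypothetical 10 TiB repository volume (assumption)

clusters_4k = cluster_count(volume, 4 * KIB)
clusters_64k = cluster_count(volume, 64 * KIB)

print(f"4 KB clusters:  {clusters_4k:,}")
print(f"64 KB clusters: {clusters_64k:,}")
print(f"ratio: {clusters_4k // clusters_64k}x")
```

Whatever the volume size, the 4KB layout has 16 times as many blocks as the 64KB one.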

Thanks!
JaySt
Service Provider
Posts: 454
Liked: 86 times
Joined: Jun 09, 2015 7:08 pm
Full Name: JaySt
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by JaySt » 1 person likes this post

Gostev, all good news, but I still don't get why there's no Veeam KB describing the current status. Some kind of advice on the minimum requirement of KB4088787 and, as you mentioned, things like "use 64K for now" is good stuff to get out there if you ask me.
Veeam Certified Engineer
Iain_Green
Service Provider
Posts: 158
Liked: 9 times
Joined: Dec 05, 2014 2:13 pm
Full Name: Iain Green
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Iain_Green »

Hi Gostev,
Thanks for the confirmation, however regarding:
Gostev wrote:Some still have issues, which are suspected to be caused by lack of RAM on the backup repository server. You can read the detailed feedback on the last few pages of this thread.
I have gone back over ten pages, and unless I have missed it, there is no real specific information.

I assume the online documentation is relevant to REFS?
Hardware
CPU: x86 processor (x86-64 recommended).
Memory: 4 GB RAM, plus up to 2 GB RAM (32-bit OS) or up to 4 GB RAM (64-bit OS) for each concurrent job depending on backup chain’s length and backup files sizes. For more information, see Limitation of Concurrent Tasks.
Network: 1 Gbps or faster for on-site backup and replication, and 1 Mbps or faster for off-site backup and replication. High latency and reasonably unstable WAN links are supported.
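As a rough reading of the memory line quoted above, the upper bound works out as sketched below. The function name and the job counts are hypothetical, and the result reflects only the quoted documentation formula (4 GB base plus up to 2 GB or 4 GB per concurrent job); it says nothing about any extra memory that ReFS itself may need.

```python
def repo_ram_upper_bound_gb(concurrent_jobs: int, os_64bit: bool = True) -> int:
    """Upper-bound RAM estimate in GB from the quoted documentation line.

    4 GB base, plus up to 4 GB (64-bit OS) or 2 GB (32-bit OS) per
    concurrent job. ReFS overhead is NOT included (unknown here).
    """
    if concurrent_jobs < 0:
        raise ValueError("concurrent_jobs must be non-negative")
    base_gb = 4
    per_job_gb = 4 if os_64bit else 2
    return base_gb + per_job_gb * concurrent_jobs


# Hypothetical job counts, for illustration only.
for jobs in (1, 4, 8):
    print(f"{jobs} concurrent job(s): up to {repo_ram_upper_bound_gb(jobs)} GB")
```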
Many thanks

Iain Green
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev » 1 person likes this post

JaySt wrote:Gostev, all good news, but I still don't get why there's no Veeam KB describing the current status. Some kind of advice on the minimum requirement of KB4088787 and, as you mentioned, things like "use 64K for now" is good stuff to get out there if you ask me.
Because the KB was literally just published, and we can't possibly have enough information to make any recommendations. Please don't forget we're not the vendor behind ReFS, so all we can base our recommendations on is empirical evidence. We don't have the luxury of approaching this scientifically by looking at the ReFS code to calculate total metadata array sizes, consider the peculiarities of the cleanup algorithms, etc.

We don't ask Michelin to recommend an oil type for our car just because we use their tires, right? And yet pressing Veeam for recommendations on Microsoft ReFS block size and memory footprint has become so normal that people "don't get" why Veeam does not have a KB a mere few days after the major ReFS patch :D This doesn't seem very fair to me!

Anyway, please just give us some time to collect some real-world data. There's little use in semi-scientific guesstimates anyway, which is the only thing we would be able to provide without performing long-term stress testing of the new ReFS code. Thank you!
kubimike
Veteran
Posts: 391
Liked: 56 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by kubimike » 1 person likes this post

My run time is still looking good. I haven't updated; this Veeam box is still running BETA2.

Image
Giacomo_N
Enthusiast
Posts: 93
Liked: 16 times
Joined: Feb 15, 2013 1:56 pm
Full Name: Giacomo
Location: Italy
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Giacomo_N »

Hi all, it's time to refresh our repositories. I want to replace our current two NAS units with a server running Windows 2016.
I've read some posts, but I'm not sure whether ReFS speeds up every backup method or only synthetic fulls.
Currently I run a full on Sunday and a daily incremental. In the past I tried reverse incremental and forever incremental, but both were slow compared to standard incremental, and we don't have a repository space problem.
Can ReFS also boost standard incremental? And, most importantly, is it stable now?
Thanks!
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev »

No, classic incremental backup does not benefit from ReFS block cloning, so you can use any file system at all.
As for stability, I just answered this question to someone else (see the first post on this very page). Thanks!
Giacomo_N
Enthusiast
Posts: 93
Liked: 16 times
Joined: Feb 15, 2013 1:56 pm
Full Name: Giacomo
Location: Italy
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Giacomo_N »

Gostev wrote:No, classic incremental backup does not benefit from ReFS block cloning, so you can use any file system at all.
As for stability, I just answered this question to someone else (see the first post on this very page). Thanks!
And forever incremental?
Thank you, Gostev, for the reply and for everything. My choice to replace the NAS with a general-purpose server came after reading your "last word" newsletter; it's the only vendor newsletter I haven't sent to the trash since 2013 :D
mkaec
Veteran
Posts: 465
Liked: 136 times
Joined: Jul 16, 2015 1:31 pm
Full Name: Marc K
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by mkaec » 1 person likes this post

I quite enjoy Gostev's newsletter. It was an unexpected perk of licensing Veeam.
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev »

Thank you for your kind words, guys :oops:
Giacomo_N wrote:And forever incremental?
Yes, every backup mode that updates an existing full backup file or creates new synthetic full backup files benefits from fast cloning.
CloudMSP
Service Provider
Posts: 43
Liked: 11 times
Joined: Jul 16, 2017 5:39 am
Full Name: Veeam MSP
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by CloudMSP »

Gostev wrote:Thank you for your kind words, guys :oops:
Yes, every backup mode that updates an existing full backup file or creates new synthetic full backup files benefits from fast cloning.
Guys, I just migrated my CC repositories by stopping all Veeam services, moving the data to new disks, reformatting with ReFS 64KB, moving the data back, and restarting services. Everything is working, but will Veeam automatically see that it is ReFS and fast clone capable now?

Also, I have another thread about how I'm still spinning up Windows 2012 R2 for VM hosts due to all the snapshot problems. Should I be using ReFS on Hyper-V hosts (about 6 TB of data)? If so, does that mean I should definitely be using 2016 now? Do you know if the production snapshot issues are better now?

Thanks.
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev »

CloudMSP wrote:Everything is working, but will Veeam automatically see that it is ReFS and fast clone capable now?
Yes. You will start seeing benefits after the next active or synthetic full backup.
Yes, the new Hyper-V 2016 snapshots are MUCH better.
mweissen13
Enthusiast
Posts: 93
Liked: 54 times
Joined: Dec 28, 2017 3:22 pm
Full Name: Michael Weissenbacher
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by mweissen13 » 1 person likes this post

Gostev wrote: Yes. You will start seeing benefits after the next active or synthetic full backup.
Hi Gostev. We did exactly the same thing as CloudMSP and it did not work that way for us. The support told me it would work that way but it actually didn't. The solution was to create a new Repository pointing to the very same directory after moving the data back. My guess is that Veeam doesn't re-check the block alignment after the repository was created and also it will not automatically activate the associated checkbox (Align backup file data blocks) in the repository's advanced settings. See Support Case ID 02308704. But maybe this was fixed in 9.5 Update 3?
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev »

That is correct, you cannot just edit the existing repository - you have to create a new one, even if it points to the same volume. Thanks for reminding me of this!
kubimike
Veteran
Posts: 391
Liked: 56 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by kubimike »

Following up with everyone: has the latest Microsoft KB solved this issue for you guys?
JimmyO
Enthusiast
Posts: 55
Liked: 9 times
Joined: Apr 27, 2014 8:19 pm
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by JimmyO » 1 person likes this post

I've only installed it on one of my repos, but it works really well, so I plan to reformat my other ones to ReFS in the next couple of months.
myFist
Enthusiast
Posts: 33
Liked: 7 times
Joined: Nov 29, 2017 1:06 pm
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by myFist »

Gostev wrote:
Until 4KB reliability is proven in the field, it is safer to stay with 64KB block size. Especially since Microsoft is reportedly working on optimizing ReFS memory pressure in the April update, which of course will be much higher with the smaller block size.

Thanks!
Hello,

Is the memory optimization included in the April update, KB4093119?
stoal76
Novice
Posts: 7
Liked: never
Joined: Mar 10, 2015 8:18 am
Full Name: Shane
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by stoal76 »

Hi everyone,

When our Veeam backups run out of disk space on our Server 2016 ReFS volume, the server stops allowing logins via RDP, and the console gets stuck at this message:

Setting up personalized settings for web platform customizations

I can get to Task Manager on the console and see some services stopping and starting. If I try to log out, it gets stuck; the only option is to force the server off.

This message appears in the logs around the time the backup runs and fills the disk:

tcp/ip failed to establish an outgoing connection - and there appear to be lots of Veeam event log messages about File is locked by 1 processes

I upgraded the server to the March cumulative update mentioned above but it still had the same problem, maybe it will be fixed in the April update.

We are only using reverse incremental backups, but it does seem to be using this fast cloning dedupe thing, as the total size of the backups is more than the total disk size.
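
One way to check that observation is to compare the sum of the backup file sizes with what the volume reports as used. The sketch below is only an illustration: the function names are made up, and the assumption that a logical total larger than the used space points to shared (cloned) blocks is a plausible reading, not something confirmed in this thread.

```python
import os
import shutil


def logical_size_bytes(root: str) -> int:
    """Sum of file sizes under root, as reported by the file system."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # File vanished or is inaccessible; skip it.
                continue
    return total


def compare(root: str) -> dict:
    """Logical size of files under root vs. usage of the volume holding root."""
    usage = shutil.disk_usage(root)  # (total, used, free) for the whole volume
    return {
        "logical_bytes": logical_size_bytes(root),
        "volume_used_bytes": usage.used,
        "volume_total_bytes": usage.total,
    }
```

Calling `compare()` on the repository folder (the path is yours to supply) returns the three numbers; if `logical_bytes` exceeds `volume_total_bytes`, the files cannot all be occupying their own space on disk.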

Does anyone know if you can disable Veeam from using this new REFS Fast cloning feature without rolling back to NTFS??

Thanks
Bacon
Novice
Posts: 3
Liked: 1 time
Joined: Jan 26, 2018 9:35 am
Full Name: Alexander
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Bacon »

I still have the problem on the target repo for my backup copy job. It only happens when Veeam tries to merge the oldest restore point into the full backup file. The merge starts, and after 4-6 hours the server's CPU load climbs to 100% and it gets stuck. I can see it in the VMware performance overview: the CPU load goes to 100% and disk activity drops to 0%.

I installed the April updates but no luck there.
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev »

myFist wrote:is the memory optimization included in April Update KB4093119?
I haven't heard any news, but I am expecting the update we're waiting for to be a "download only" update at first (available only via Microsoft Update Catalog). This one looks to have been published on Windows Update.
dive7
Novice
Posts: 3
Liked: never
Joined: Mar 21, 2018 8:41 pm
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by dive7 »

Is anyone still experiencing this issue when using 64k block size ReFS volumes? If so, are there any workarounds?
kubimike
Veteran
Posts: 391
Liked: 56 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by kubimike » 1 person likes this post

@dive7 what issue?
dive7
Novice
Posts: 3
Liked: never
Joined: Mar 21, 2018 8:41 pm
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by dive7 »

Sorry, I should have clarified. We were seeing our repo server lock up and become unresponsive to RDP and the HP iLO console. I originally typed out a post with more info, but it got denied for not having a support case to reference. The engineer we worked with suggested offloading the proxy function to a different server. Previously we allowed the jobs to automatically select a proxy, and each job would auto-select the repo server as the proxy. The issue hasn't come up since the last server reboot, and we have since set some of our jobs to go to a new proxy.

We have never used 4k block size ReFS volumes, but we were seeing similar symptoms with our 64k block size volumes (server lockups/high RAM).


case 02767982
DonZoomik
Service Provider
Posts: 372
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by DonZoomik »

myFist wrote:is the memory optimization included in April Update KB4093119?
I extracted both KB4093119 and KB4093120 and refs.sys is still at 10.0.14393.2097 (from February).

I don't use ReFS and therefore don't have any issues, but I've been following this thread for a long time. :)
Cicadymn
Enthusiast
Posts: 26
Liked: 12 times
Joined: Jan 30, 2017 7:42 pm
Full Name: Sam
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Cicadymn » 1 person likes this post

Anybody seen any word on the April update that's supposed to help out a bit too?

I'm doing really good now with the previous update, but I'm almost scared to patch it and fall back into the world of ReFS horror!
BigJack
Lurker
Posts: 2
Liked: never
Joined: Apr 27, 2018 9:29 pm
Full Name: Jack Clark
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by BigJack »

dive7 wrote:Is anyone still experiencing this issue when using 64k block size ReFS volumes? If so, are there any workarounds?
I am currently experiencing this issue. I have a pair of 64-disk arrays, each formatted as ReFS with 64K blocks. Fully updated, these Windows 2016 servers still suffer terrible performance with fast-clone operations. The onsite repository has enough time to perform backups with weekly synthetic fulls. The offsite repository can't process all of our backup copy jobs. The data copies from onsite to offsite quickly, but merge operations take forever. Veeam support suggested enabling "Defragment and compact full backup file" within the backup copy jobs. Not only did this not help, but it added to the copy job lengths. I don't understand how this option would benefit an ReFS volume anyway. Also, I think our off-site copy repository is in such bad shape because I believe copy jobs can only use the "Forward Incremental-Forever" storage method, which triggers a fast-clone (SLOW) operation with every backup copy instead of just during synthetic fulls. Help!
EricJ
Influencer
Posts: 20
Liked: 4 times
Joined: Jan 12, 2017 7:06 pm
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by EricJ »

Has anyone heard anything new about the ReFS memory fix due out in April?

We don't have server lockups anymore, but during nightly backup copy jobs, the filesystem gets bogged down enough that ongoing SQL transaction jobs fail. Each night we'd reduce the transfer limit on each repo, and had to crank it down to 30MB/sec to keep the jobs from failing. It's a band-aid, but backup windows are much larger and we're not getting anywhere near the max performance of the disk arrays that we paid all this money for. :?
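
To show what the throttle costs in backup window terms, here is a small worked calculation. The 500 GB nightly volume and the comparison rates are made-up figures for illustration, not our actual numbers; only the 30 MB/s limit comes from the setup described above.

```python
def transfer_hours(data_gb: float, limit_mb_per_s: float) -> float:
    """Hours to move data_gb at a sustained rate (decimal units, 1 GB = 1000 MB)."""
    if limit_mb_per_s <= 0:
        raise ValueError("limit_mb_per_s must be positive")
    return data_gb * 1000 / limit_mb_per_s / 3600


nightly_gb = 500  # hypothetical nightly copy volume (assumption)

# 30 MB/s is the throttle mentioned above; the other rates are for comparison.
for limit in (30, 100, 300):
    print(f"{limit:>3} MB/s -> {transfer_hours(nightly_gb, limit):.1f} h")
```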

Hoping the rumored "April fix" will address our issue.

Thanks,
Eric
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by Gostev » 2 people like this post

I already asked my contact for an update about a week ago, but got OOO back as he's on vacation. I will update once I hear anything back... after he's back. Thanks!
DonZoomik
Service Provider
Posts: 372
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by DonZoomik »

KB4103723 has patched refs.sys to 10.0.14393.2248 (April 28th) but nothing in release notes.
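
For anyone comparing builds across servers, a minimal sketch of the version comparison follows. It assumes you have already read the refs.sys file version as a dotted string (for example from the file properties); reading it from the driver file is not covered here. The "installed" value is the February build mentioned earlier in the thread, written in four-part form, which is an assumption.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string like '10.0.14393.2248' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least(installed: str, required: str) -> bool:
    """True if the installed version is the same as or newer than the required one."""
    return parse_version(installed) >= parse_version(required)


patched_build = "10.0.14393.2248"  # refs.sys version shipped in KB4103723
installed = "10.0.14393.2097"      # example value (assumption, see lead-in)

print(is_at_least(installed, patched_build))
```

Comparing tuples of integers avoids the trap of plain string comparison, where "10.0.14393.999" would sort after "10.0.14393.2248".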
EzE
Influencer
Posts: 19
Liked: never
Joined: Feb 06, 2015 3:48 pm
Full Name: Eric H
Contact:

Re: REFS issues (server lockups, high CPU, high RAM)

Post by EzE »

BigJack wrote: Fully updated, these Windows 2016 servers still suffer terrible performance with fast-clone operations.
Just rebuilt my backup server and having trouble with ReFS fast-clone slowness as well. I figured with the MS patches out, I could move forward with ReFS. The server is fully up to date, and ReFS.sys file is at version 10.0.14393.2273, which comes down as part of CU KB4103720. I'm using 64k clusters as recommended, but I see terrible merge times for both local disk repo and external USB repo (temporary while changing some cloud provider stuff). Local RAID 10 repo takes about 1 hour per TB for fast clone backup chain transformation. So disappointed that even after all the fixes, I need to consider going back to NTFS before I get too far down the rabbit hole. I hate to add yet another ReFS ticket to the support queue... Anyone see performance degrade with the newer version of ReFS? Maybe MS fixed it, and then broke it again?