mark_e
Novice
Posts: 8
Liked: 2 times
Joined: Oct 10, 2016 10:13 am
Full Name: Mark Edmonds

Re: REFS issues (server lockups, high CPU, high RAM)

Post by mark_e »

Sounds like they are feeding you nonsense. It’s in RS4 for a start....
Our backups are being crippled by this issue. We’ve had to resort to rebooting pretty much every day and giving things an extra caring hand to make sure all the backups get done.
Even my deployment got questioned because of the failing backups, even though I only followed best practices by using ReFS!
adrenaline_x
Influencer
Posts: 17
Liked: 2 times
Joined: May 03, 2016 4:24 am
Full Name: Mike Fuller

Post by adrenaline_x »

Well, I wouldn't be surprised if they are feeding me bullshit at this point. They did refund our case (SA number), so I've got that going for me, which is nice.

The rep said that any new premium support cases would not be able to get the fix either until they determine what is happening with the update they gave out. Who knows... it could have been the tech's buddy for all I know. :)

But... I'm moving backups around to other servers now to convert them to NTFS, but I'm only looking at moving 20 TB, as the main production systems are still being backed up by NetBackup and the new Data Domain expansions haven't been rolled out yet. I'm curious to see how fast Veeam backs up over our dual 40GbE links from the blade chassis to the switch, with dual 10GbE connections to the Data Domains. It should be hella fast. We only use Windows servers in remote offices for short-term backups, with long-term backups stored in our main DC on Data Domains, so ReFS is not critical, but you know, I was following best practices.
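As a rough sanity check on the "hella fast" expectation, here is a back-of-the-envelope sketch in Python. It assumes the dual 10GbE connections to the Data Domains are the bottleneck, that both links can be driven at full line rate in parallel, and that protocol overhead, deduplication and disk speed can be ignored. None of that is confirmed in the post, and the 20 TB payload is borrowed from the NTFS migration figure purely as an example size.

```python
def transfer_seconds(size_bytes: int, link_gbps: int, link_count: int) -> float:
    """Ideal transfer time at full line rate, ignoring every kind of overhead."""
    bits = size_bytes * 8
    bits_per_second = link_gbps * 1_000_000_000 * link_count
    return bits / bits_per_second


# Decimal terabyte (assumption: the post does not say TB vs TiB).
TB = 10**12

if __name__ == "__main__":
    seconds = transfer_seconds(20 * TB, link_gbps=10, link_count=2)
    print(f"20 TB over 2 x 10GbE: {seconds / 3600:.1f} hours at best")
```

Under these idealised assumptions the ceiling is 2.5 GB/s, so 20 TB would need a little over two hours. Real throughput will be lower.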
edoyen
Novice
Posts: 5
Liked: 2 times
Joined: Jan 22, 2015 1:29 pm
Full Name: Eric Doyen

Post by edoyen »

We are another victim of the ReFS issues. We have a Sev-A premier support ticket open with MS and burning through hours. We have applied the private fix, but every time we attach the ReFS disk back to our VM, Memory spikes, then CPU, then the VM becomes unresponsive and crashes. This is even with all Veeam components uninstalled on the repo.

Currently, the latest memory dump is being analyzed by debug team.
1. Till now, this case is opened for days, and ESC resource is engaged several hours before;
2. All previous effort proves that existing workarounds/HF doesn’t fix the issue which indicates this issue is quite similar with the known bug, but it differs in detail such like condition/released binary version and so forth;
3. After ESC resource engaged, we now know that the issue is located in ReFS, but as per it’s quite similar as the known bug, it is even harder to identify the difference and find workaround/solution;
4. As per we have taken days to test known guidance/steps to fix this issue, and now it’s narrowed down to be inside ReFS, our EE is still investigating the dump file, so before the solid analysis, there’s limited actions to perform at your side, even though we both are eager to conduct steps to fix it;
Veeam Ticket: 02448810
MS Ticket: 117123117388111
jzilak
Influencer
Posts: 19
Liked: 1 time
Joined: May 10, 2017 9:01 am
Full Name: Josef Zilak

Post by jzilak »

Hi,
A fix for the ReFS performance problems is scheduled for release in February 2018. Some customers obtained a test-signed ReFS.sys through Premier support for verification. There were multiple versions, as Microsoft fixed some regression issues along the way, so in some cases a customer got a new binary drop fixing those regressions (with a recommendation to replace the previous version or roll back). This is probably where the confusion started. Microsoft doesn’t recommend running a production environment on a test driver release. However, the latest drop fixed most (if not all) of the performance issues.
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

I'm running pretty stable with the older beta release test driver. If anyone needs help, please shout. I spent many sleepless nights babysitting my box. It's now 95% stable. I've been harassing Microsoft for their latest fix; I guess since we aren't forking over 100k a year in support fees, I don't get a copy.
thomas.raabo
Service Provider
Posts: 28
Liked: 11 times
Joined: Oct 31, 2016 6:27 pm
Full Name: Thomas Raabo
Location: infrastructure guy

Post by thomas.raabo » 1 person likes this post

edoyen wrote:We are another victim of the ReFS issues. We have a Sev-A premier support ticket open with MS and burning through hours. We have applied the private fix, but every time we attach the ReFS disk back to our VM, Memory spikes, then CPU, then the VM becomes unresponsive and crashes. This is even with all Veeam components uninstalled on the repo.

Currently, the latest memory dump is being analyzed by debug team.
1. Till now, this case is opened for days, and ESC resource is engaged several hours before;
2. All previous effort proves that existing workarounds/HF doesn’t fix the issue which indicates this issue is quite similar with the known bug, but it differs in detail such like condition/released binary version and so forth;
3. After ESC resource engaged, we now know that the issue is located in ReFS, but as per it’s quite similar as the known bug, it is even harder to identify the difference and find workaround/solution;
4. As per we have taken days to test known guidance/steps to fix this issue, and now it’s narrowed down to be inside ReFS, our EE is still investigating the dump file, so before the solid analysis, there’s limited actions to perform at your side, even though we both are eager to conduct steps to fix it;

Veeam Ticket: 02448810
MS Ticket: 117123117388111
Feel your pain ... been on two Sev-A tickets last month.

You do know that Sev-A is not the highest level right?
edoyen
Novice
Posts: 5
Liked: 2 times
Joined: Jan 22, 2015 1:29 pm
Full Name: Eric Doyen

Post by edoyen » 1 person likes this post

thomas.raabo wrote:
Feel your pain ... been on two Sev-A tickets last month.

You do know that Sev-A is not the highest level right?
I had no idea. We have learned that a case has to go through several levels of escalation before we get the full benefit of our block of hours, so finding out that there is a higher crit level does not surprise me. Props to the escalation engineering team assigned to us, though. They seem genuinely eager to help us out.
thomas.raabo
Service Provider
Posts: 28
Liked: 11 times
Joined: Oct 31, 2016 6:27 pm
Full Name: Thomas Raabo
Location: infrastructure guy

Post by thomas.raabo »

edoyen wrote:
thomas.raabo wrote:
Feel your pain ... been on two Sev-A tickets last month.

You do know that Sev-A is not the highest level right?

I had no idea. We have learned that a case has to go through several levels of escalation before we get the full benefit of our block of hours, so finding out that there is a higher crit level does not surprise me. Props to the escalation engineering team assigned to us, though. They seem genuinely eager to help us out.
The highest is Sev-1, and at that level the SVP and VP get hourly updates and, for Windows, the whole BU knows.

You need to start shooting and start being a bad guy to get to this level.
suprnova
Enthusiast
Posts: 38
Liked: never
Joined: Apr 08, 2016 5:15 pm

Post by suprnova »

kubimike wrote: I'm running pretty stable with the older beta release test driver. If anyone needs help, please shout. I spent many sleepless nights babysitting my box. It's now 95% stable. I've been harassing Microsoft for their latest fix; I guess since we aren't forking over 100k a year in support fees, I don't get a copy.
Is this the test driver that Veeam provided a while ago from Microsoft? Are you still using any registry edits?
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

Correct, the original beta from Microsoft has file version '10.0.14393.1100'. I am using the following registry edits:
"RefsDisableCachedPins"=dword:00000001
"RefsProcessedDeleteQueueEntryCountThreshold"=dword:00000200
"RefsNumberOfChunksToTrim"=dword:00000020
"RefsEnableLargeWorkingSetTrim"=dword:00000001

Also the drive timeout, can't remember what that setting was.
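For anyone who wants to keep these four values in a reviewable file, here is a minimal Python sketch that renders them as .reg text. The value names and data are copied from the post above. The key path is an assumption, since the post does not say where the values live, so confirm it with Microsoft support before importing anything. `render_reg_file` is a hypothetical helper name.

```python
# Value names and data exactly as listed in the post above.
REFS_TUNABLES = {
    "RefsDisableCachedPins": 0x00000001,
    "RefsProcessedDeleteQueueEntryCountThreshold": 0x00000200,
    "RefsNumberOfChunksToTrim": 0x00000020,
    "RefsEnableLargeWorkingSetTrim": 0x00000001,
}

# ASSUMPTION: the thread never states the key path; verify before use.
ASSUMED_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem"


def render_reg_file(key_path, dword_values):
    """Return .reg file text that sets each name to a DWORD under key_path."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        # .reg syntax writes DWORDs as eight hex digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_reg_file(ASSUMED_KEY, REFS_TUNABLES))
```

The script only prints text; it does not touch the registry, so it is safe to run anywhere and diff against what is actually on the server.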
suprnova
Enthusiast
Posts: 38
Liked: never
Joined: Apr 08, 2016 5:15 pm

Post by suprnova »

kubimike wrote: Correct, the original beta from Microsoft has file version '10.0.14393.1100'. I am using the following registry edits:
"RefsDisableCachedPins"=dword:00000001
"RefsProcessedDeleteQueueEntryCountThreshold"=dword:00000200
"RefsNumberOfChunksToTrim"=dword:00000020
"RefsEnableLargeWorkingSetTrim"=dword:00000001

Also the drive timeout, can't remember what that setting was.
Thanks very much! I am thinking of going back to this because my issues have gotten worse again, whereas before I thought I had everything fixed.
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

I'm not saying this is the silver bullet. I do still have freezes, but that means it's time for an active full. However, when I run any public release ReFS driver from MSFT, it freezes no matter what.
suprnova
Enthusiast
Posts: 38
Liked: never
Joined: Apr 08, 2016 5:15 pm

Post by suprnova »

kubimike wrote: I'm not saying this is the silver bullet. I do still have freezes, but that means it's time for an active full. However, when I run any public release ReFS driver from MSFT, it freezes no matter what.
Definitely understand. At the moment, even running one incremental merge with [fast clone] makes the repository drive unstable. Everything was fine for a while, but now it's back to square one. I have the same registry keys as you, but I am running the latest refs.sys driver.
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

Need the driver? PM me.
warwickb
Lurker
Posts: 1
Liked: never
Joined: Nov 28, 2017 5:06 am
Full Name: WarwickB

Post by warwickb »

More confirmation from MS regarding the regression and timeline:
Unfortunately we won’t be able to provide the Fix REFS driver download 10.0.14939.1934 as it was under testing (Private Fix) and was never released for all and had regression with it.

There is another Bug open with it and the Fix to this issue would be released around March 2018.
Mike Resseler
Product Manager
Posts: 8044
Liked: 1263 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium

Post by Mike Resseler »

Hi Warwickb,

First: Welcome to the forums!
Second: Is that public information or did you get it from a system engineer?

Thanks
Mike
jamesmay
Lurker
Posts: 2
Liked: never
Joined: Dec 13, 2017 10:04 pm

Post by jamesmay »

Not sure if it's the same issue, but our physical Veeam server running Windows Server 2016 with a ~80 TB ReFS volume started hanging (requiring a reset via iLO) at the completion of every nightly backup of our Hyper-V environment.

After a few days of this, the server now hangs about a minute after boot. In safe mode this can be after login, but in normal mode it only gets as far as "Applying computer settings...". It was on the 2017-11 update and I upgraded to 2017-12, but that didn't help. MS support tried their best but seem to think it's somehow network related because of where in the normal boot process it stops, even though it gets past that point in safe mode only to hang later.
Mike Resseler
Product Manager
Posts: 8044
Liked: 1263 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium

Post by Mike Resseler »

Hi James,
First: Welcome to the forums
Second: The issue discussed here is that ReFS becomes very unstable when there is a lot of activity on it and the volume is large. Not being able to boot is not something I have heard of with this issue. It might be related, but I am not sure. Please keep working with MSFT support for now and keep us posted. Who knows, this may be a new problem with ReFS (I hope not, though).
Mike
GarethUK
Influencer
Posts: 21
Liked: 2 times
Joined: Mar 21, 2014 11:41 am
Full Name: Gareth

Post by GarethUK »

James is indeed correct. This is behaviour I have observed. We have 16 backup repo servers, 5 of which are Windows 2016 servers with 70 TB ReFS volumes.

I have previously raised this issue with MS support. I did get some registry keys, which temporarily fixed the issue. However, it has recurred. Today I again raised a premium support case with Microsoft, and they advised me that they can provide no fix and no workaround, and that I must wait for the permanent fix, for which they are unable to provide a timescale. They have closed the case.

Regards,


Gareth
operations
Service Provider
Posts: 12
Liked: never
Joined: Nov 25, 2017 6:49 pm
Full Name: operations

Post by operations »

Based on the above, and the fact that I have a 240TB REPO on ReFS, I need to do something, so I was wondering which direction people are going:

Migrate to NTFS?
Run fulls or synthetics but stay on ReFS?

... I cannot continue like this, as my backup merges are killing my server and resulting in backups taking days, not a day.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Post by Gostev »

operations wrote:240TB REPO on ReFS
From what I know based on conversations with the ReFS devs, it may be possible to work around this particular bug with huge volumes by adding lots of RAM to the backup repository server. If you can't do this, then I'm afraid the only option is to fall back to NTFS until Microsoft ships that patch.
operations
Service Provider
Posts: 12
Liked: never
Joined: Nov 25, 2017 6:49 pm
Full Name: operations

Post by operations »

I do not see a memory issue. My backup is at a crawl at the moment, but the server, which has 256 GB of RAM, reports 120 GB free. RAM is not a constraint; I could easily put in 768 GB if that would help.

I presume, based on what you are saying, that converting to full backups or synthetic fulls will not fix the issue?
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

operations wrote: I do not see a memory issue. My backup is at a crawl at the moment, but the server, which has 256 GB of RAM, reports 120 GB free. RAM is not a constraint; I could easily put in 768 GB if that would help.

I presume, based on what you are saying, that converting to full backups or synthetic fulls will not fix the issue?
Use RAMMap while a clone/merge is running and get back to us. You'll see :mrgreen:
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

operations wrote: Based on the above, and the fact that I have a 240TB REPO on ReFS, I need to do something, so I was wondering which direction people are going:

Migrate to NTFS?
Run fulls or synthetics but stay on ReFS?

... I cannot continue like this, as my backup merges are killing my server and resulting in backups taking days, not a day.
Type of storage?
How many synthetic fulls per week?
Size of VM causing the issue?
Frequency of backups?
Last active full?
Space available on disk (repo)?
Cluster size?
How much RAM?
Which registry keys + driver are you using?
operations
Service Provider
Posts: 12
Liked: never
Joined: Nov 25, 2017 6:49 pm
Full Name: operations

Post by operations »

Type of storage?
IBM SVC with 900 GB 10K SAS
How many synthetic fulls per week?
None
Size of VM causing the issue?
From 127 GB to 5 TB
Frequency of backups?
Daily
Last active full?
30 days; running incremental forever with no scheduled active fulls, hence the merge process
Space available on disk (repo)?
49 TB free
Cluster size?
?
How much RAM?
256 GB
Which registry keys + driver are you using?
No keys + default driver shipped with the OS
thomas.raabo
Service Provider
Posts: 28
Liked: 11 times
Joined: Oct 31, 2016 6:27 pm
Full Name: Thomas Raabo
Location: infrastructure guy

Post by thomas.raabo »

operations wrote: Type of storage?
IBM SVC with 900 GB 10K SAS
How many synthetic fulls per week?
None
Size of VM causing the issue?
From 127 GB to 5 TB
Frequency of backups?
Daily
Last active full?
30 days; running incremental forever with no scheduled active fulls, hence the merge process
Space available on disk (repo)?
49 TB free
Cluster size?
?
How much RAM?
256 GB
Which registry keys + driver are you using?
No keys + default driver shipped with the OS
That will not work! contact MS and get them to help you.

You do have a support contract right?
operations
Service Provider
Posts: 12
Liked: never
Joined: Nov 25, 2017 6:49 pm
Full Name: operations

Post by operations »

No, like many, I have no support contract.

Which is why I ask: what are people doing who cannot get M$ to fix the issue but still need to run production?
thomas.raabo
Service Provider
Posts: 28
Liked: 11 times
Joined: Oct 31, 2016 6:27 pm
Full Name: Thomas Raabo
Location: infrastructure guy

Post by thomas.raabo »

suprnova wrote:
kubimike wrote: I'm not saying this is the silver bullet. I do still have freezes, but that means it's time for an active full. However, when I run any public release ReFS driver from MSFT, it freezes no matter what.
Definitely understand. At the moment, even running one incremental merge with [fast clone] makes the repository drive unstable. Everything was fine for a while, but now it's back to square one. I have the same registry keys as you, but I am running the latest refs.sys driver.
I recommend disabling the fast clone API in Veeam:

RefsVirtualSyntheticDisabled DWORD = 1


what version of the driver do you have?
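One way to apply the RefsVirtualSyntheticDisabled value suggested above is Windows' built-in `reg add` command. The Python sketch below only builds and prints that command line so it can be reviewed first. The key path is an assumption, because the post gives the value name and data but not its location, and `reg_add_dword` is a hypothetical helper name. Check with Veeam support for the right key and machine before running the result.

```python
import subprocess

# ASSUMPTION: the post does not say which key holds this value; verify first.
ASSUMED_VEEAM_KEY = r"HKLM\SOFTWARE\Veeam\Veeam Backup and Replication"


def reg_add_dword(key_path, name, data):
    """Build (but do not run) a reg.exe command that sets a DWORD value."""
    return [
        "reg", "add", key_path,
        "/v", name,          # value name
        "/t", "REG_DWORD",   # value type
        "/d", str(data),     # value data
        "/f",                # overwrite without prompting
    ]


if __name__ == "__main__":
    cmd = reg_add_dword(ASSUMED_VEEAM_KEY, "RefsVirtualSyntheticDisabled", 1)
    # Print a copy-pasteable command line instead of executing it.
    print(subprocess.list2cmdline(cmd))
```

Setting the data back to 0, or deleting the value, would be the obvious way to undo the change, though the thread does not confirm how Veeam treats either.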
suprnova
Enthusiast
Posts: 38
Liked: never
Joined: Apr 08, 2016 5:15 pm

Post by suprnova »

thomas.raabo wrote:
That will not work! contact MS and get them to help you.

You do have a support contract right?
Can you explain why this doesn't work? Thank you!
suprnova
Enthusiast
Posts: 38
Liked: never
Joined: Apr 08, 2016 5:15 pm

Post by suprnova »

thomas.raabo wrote: what version of the driver do you have?
10.0.14393.1770
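Since several refs.sys builds are being compared in this thread, here is a small Python sketch for ordering dotted file versions numerically rather than as strings. The two version numbers are the ones posted above; the helper names are hypothetical. The comparison says nothing about which build actually contains a fix.

```python
def parse_file_version(text):
    """Turn a dotted version such as '10.0.14393.1770' into a tuple of ints."""
    cleaned = text.strip().strip("'\"")
    return tuple(int(part) for part in cleaned.split("."))


def is_newer(candidate, baseline):
    """True if candidate has a strictly higher file version than baseline."""
    return parse_file_version(candidate) > parse_file_version(baseline)


# Versions quoted in this thread.
BETA_DRIVER = "10.0.14393.1100"      # kubimike's original beta test driver
SUPRNOVA_DRIVER = "10.0.14393.1770"  # suprnova's current refs.sys

if __name__ == "__main__":
    print(is_newer(SUPRNOVA_DRIVER, BETA_DRIVER))
```

Comparing tuples of integers avoids the string-comparison trap where a revision like "999" would sort after "1100".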