-
- Novice
- Posts: 8
- Liked: 2 times
- Joined: Oct 10, 2016 10:13 am
- Full Name: Mark Edmonds
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Sounds like they are feeding you nonsense. It’s in RS4 for a start....
Our backups are being crippled by this issue. We’ve had to resort to rebooting pretty much every day and giving things an extra helping hand to make sure all the backups get done.
Even my deployment got questioned because of the backups failing, even though I only followed best practices using ReFS!
-
- Influencer
- Posts: 17
- Liked: 2 times
- Joined: May 03, 2016 4:24 am
- Full Name: Mike Fuller
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Well, I wouldn't be surprised if they are feeding me bullshit at this point. They did refund our case (SA number), so I've got that going for me, which is nice.
The rep said that any new premium support cases would not be able to get the fix either until they determine what is happening with the update they gave out. Who knows, it could have been the tech's buddy for all I know.
But... I'm moving backups around to other servers now to convert them to NTFS, but I'm only looking at moving 20 TB, as the main production systems are still being backed up by NetBackup and the new Data Domain expansions haven't been rolled out yet. I'm curious to see how fast Veeam backs up over our dual 40GbE links from the blade chassis to the switch, with dual 10GbE connections to the Data Domains. It should be hella fast. We are only using Windows servers in remote offices for short-term backups, with long-term backups being stored in our main DC on Data Domains, so ReFS is not critical, but you know, I was following best practices.
-
- Novice
- Posts: 5
- Liked: 2 times
- Joined: Jan 22, 2015 1:29 pm
- Full Name: Eric Doyen
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
We are another victim of the ReFS issues. We have a Sev-A premier support ticket open with MS and burning through hours. We have applied the private fix, but every time we attach the ReFS disk back to our VM, Memory spikes, then CPU, then the VM becomes unresponsive and crashes. This is even with all Veeam components uninstalled on the repo.
Currently, the latest memory dump is being analyzed by debug team.
1. Till now, this case is opened for days, and ESC resource is engaged several hours before;
2. All previous effort proves that existing workarounds/HF doesn’t fix the issue which indicates this issue is quite similar with the known bug, but it differs in detail such like condition/released binary version and so forth;
3. After ESC resource engaged, we now know that the issue is located in ReFS, but as per it’s quite similar as the known bug, it is even harder to identify the difference and find workaround/solution;
4. As per we have taken days to test known guidance/steps to fix this issue, and now it’s narrowed down to be inside ReFS, our EE is still investigating the dump file, so before the solid analysis, there’s limited actions to perform at your side, even though we both are eager to conduct steps to fix it;
Veeam Ticket: 02448810
MS Ticket: 117123117388111
-
- Influencer
- Posts: 19
- Liked: 1 time
- Joined: May 10, 2017 9:01 am
- Full Name: Josef Zilak
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Hi,
The fix for the ReFS performance problems is scheduled for release in Feb 2018. Some customers obtained a test-signed ReFS.sys through Premier support for verification. There were multiple versions, as Microsoft fixed some regression issues along the way. So in some cases the customer gets a new binary drop fixing those regressions (with a recommendation to replace the previous version or roll back). This is probably where the confusion started. Microsoft doesn’t recommend running a production environment on a test driver release. However, the latest drop fixed most (if not all) performance issues.
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
I'm running pretty stable with the older beta release test driver. If anyone needs help, please shout. I spent many sleepless nights babysitting my box. It's now 95% stable. I've been harassing Microsoft for their latest fix; I guess since we aren't forking over 100k a year in support fees, I don't get a copy.
-
- Service Provider
- Posts: 28
- Liked: 11 times
- Joined: Oct 31, 2016 6:27 pm
- Full Name: Thomas Raabo
- Location: infrastructure guy
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Feel your pain ... been on two Sev-A tickets last month.
You do know that Sev-A is not the highest level, right?
edoyen wrote:
We are another victim of the ReFS issues. We have a Sev-A premier support ticket open with MS and burning through hours. We have applied the private fix, but every time we attach the ReFS disk back to our VM, Memory spikes, then CPU, then the VM becomes unresponsive and crashes. This is even with all Veeam components uninstalled on the repo.
Currently, the latest memory dump is being analyzed by debug team.
1. Till now, this case is opened for days, and ESC resource is engaged several hours before;
2. All previous effort proves that existing workarounds/HF doesn’t fix the issue which indicates this issue is quite similar with the known bug, but it differs in detail such like condition/released binary version and so forth;
3. After ESC resource engaged, we now know that the issue is located in ReFS, but as per it’s quite similar as the known bug, it is even harder to identify the difference and find workaround/solution;
4. As per we have taken days to test known guidance/steps to fix this issue, and now it’s narrowed down to be inside ReFS, our EE is still investigating the dump file, so before the solid analysis, there’s limited actions to perform at your side, even though we both are eager to conduct steps to fix it;
Veeam Ticket: 02448810
MS Ticket: 117123117388111
-
- Novice
- Posts: 5
- Liked: 2 times
- Joined: Jan 22, 2015 1:29 pm
- Full Name: Eric Doyen
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
I had no idea. We have learned that a case has to go through several levels of escalation before we get the full benefit of our block of hours, so finding out that there is a higher crit level does not surprise me. Props to the escalation engineering team assigned to us, though. They seem genuinely eager to help us out.
thomas.raabo wrote:
Feel your pain ... been on two Sev-A tickets last month.
You do know that Sev-A is not the highest level, right?
-
- Service Provider
- Posts: 28
- Liked: 11 times
- Joined: Oct 31, 2016 6:27 pm
- Full Name: Thomas Raabo
- Location: infrastructure guy
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
The highest is Sev-1, and at that level the SVP and VP get hourly updates, and for Windows the whole BU knows.
You need to start shouting and start being the bad guy to get to this level.
edoyen wrote:
I had no idea. We have learned that a case has to go through several levels of escalation before we get the full benefit of our block of hours, so finding out that there is a higher crit level does not surprise me. Props to the escalation engineering team assigned to us, though. They seem genuinely eager to help us out.
thomas.raabo wrote:
Feel your pain ... been on two Sev-A tickets last month.
You do know that Sev-A is not the highest level, right?
-
- Enthusiast
- Posts: 38
- Liked: never
- Joined: Apr 08, 2016 5:15 pm
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Is this the test driver that Veeam provided a while ago from Microsoft? Are you still using any registry edits?
kubimike wrote:
I'm running pretty stable with the older beta release test driver. If anyone needs help, please shout. I spent many sleepless nights babysitting my box. It's now 95% stable. I've been harassing Microsoft for their latest fix; I guess since we aren't forking over 100k a year in support fees, I don't get a copy.
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Correct, the original beta from Microsoft has file version '10.0.14393.1100'. I am using the following registry edits:
"RefsDisableCachedPins"=dword:00000001
"RefsProcessedDeleteQueueEntryCountThreshold"=dword:00000200
"RefsNumberOfChunksToTrim"=dword:00000020
"RefsEnableLargeWorkingSetTrim"=dword:00000001
Also the drive timeout, can't remember what that setting was.
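For anyone who would rather keep these four values in an importable file, here is a minimal Python sketch that renders them as .reg text. It only builds a string and changes nothing on the machine. The key path is an assumption (it is not stated in this thread, so confirm it with Microsoft support before importing), and `render_reg`, `ASSUMED_KEY` and `REFS_VALUES` are hypothetical names used purely for illustration.

```python
# Hypothetical helper: renders the four values quoted above as .reg file text.
# ASSUMPTION: the key path below is NOT given in this thread; verify it with
# Microsoft support before importing anything on a production repository.
ASSUMED_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem"

# Values exactly as posted (the dword literals in the post are hexadecimal).
REFS_VALUES = {
    "RefsDisableCachedPins": 0x1,
    "RefsProcessedDeleteQueueEntryCountThreshold": 0x200,
    "RefsNumberOfChunksToTrim": 0x20,
    "RefsEnableLargeWorkingSetTrim": 0x1,
}


def render_reg(key_path, values):
    """Return .reg file text that sets each name to a DWORD under key_path."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in values.items():
        # DWORDs are written as eight zero-padded lowercase hex digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(render_reg(ASSUMED_KEY, REFS_VALUES))
```

Review the printed text before saving it to a file, and treat the values as one poster's configuration rather than a recommendation.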
-
- Enthusiast
- Posts: 38
- Liked: never
- Joined: Apr 08, 2016 5:15 pm
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Thanks very much! I am thinking of going back to this because my issues have gotten worse again, where before I thought I had everything fixed.
kubimike wrote:
Correct, the original beta from Microsoft has file version '10.0.14393.1100'. I am using the following registry edits:
"RefsDisableCachedPins"=dword:00000001
"RefsProcessedDeleteQueueEntryCountThreshold"=dword:00000200
"RefsNumberOfChunksToTrim"=dword:00000020
"RefsEnableLargeWorkingSetTrim"=dword:00000001
Also the drive timeout, can't remember what that setting was.
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
I'm not saying this is the silver bullet. I do still have freezes, but that means it's time for an active full. However, with any public release ReFS driver from MSFT, it freezes no matter what.
-
- Enthusiast
- Posts: 38
- Liked: never
- Joined: Apr 08, 2016 5:15 pm
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Definitely understand. At the moment, even running one incremental merge with [fast clone] makes the repository drive unstable. Everything was fine for a while, but now it's back to ground zero. I have the same registry keys as you, but I am running the latest refs.sys driver.
kubimike wrote:
I'm not saying this is the silver bullet. I do still have freezes, but that means it's time for an active full. However, with any public release ReFS driver from MSFT, it freezes no matter what.
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
need the driver? pm me
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Nov 28, 2017 5:06 am
- Full Name: WarwickB
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
more confirmation from MS regarding regression & timeline:
Unfortunately we won’t be able to provide the Fix REFS driver download 10.0.14939.1934 as it was under testing (Private Fix) and was never released for all and had regression with it.
There is another Bug open with it and the Fix to this issue would be released around March 2018.
-
- Product Manager
- Posts: 8191
- Liked: 1322 times
- Joined: Feb 08, 2013 3:08 pm
- Full Name: Mike Resseler
- Location: Belgium
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Hi Warwickb,
First: Welcome to the forums!
Second: Is that public information or did you get it from a system engineer?
Thanks
Mike
-
- Lurker
- Posts: 2
- Liked: never
- Joined: Dec 13, 2017 10:04 pm
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Not sure if it's the same issue, but our physical Veeam server running 2016 with a ~80TB ReFS volume started hanging (requiring a reset via the iLO) at the completion of every nightly backup of our Hyper-V environment.
After a few days of this, the server now hangs about a minute after boot: in safe mode this can be after login, but in normal mode it only gets as far as "Applying computer settings...". We were on the 2017-11 update and upgraded to 2017-12, but it didn't help. MS support tried their best but seem to think it's somehow network related because of where in the normal boot process it hangs, even though it gets past this point in safe mode only to hang later.
-
- Product Manager
- Posts: 8191
- Liked: 1322 times
- Joined: Feb 08, 2013 3:08 pm
- Full Name: Mike Resseler
- Location: Belgium
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Hi James,
First: Welcome to the forums
Second: The issue discussed here is that ReFS becomes very unstable if there is a lot of activity on it and the volume is large. Not being able to boot is not something I have heard of with this issue. It might be related, but I am not sure. Please keep working with MSFT support for now and keep us posted. Who knows, this may be a new problem with ReFS (I hope not, though).
Mike
-
- Influencer
- Posts: 22
- Liked: 2 times
- Joined: Mar 21, 2014 11:41 am
- Full Name: Gareth
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
James is indeed correct. This is behaviour I have observed. We have 16 backup repo servers, 5 of which are 70TB ReFS-enabled Windows 2016 servers.
I have previously raised this issue with MS support. I did get some registry keys which temporarily fixed the issue. However, it has recurred. Today I again raised a premium support case with Microsoft, and they advised me that they can provide no fix and no workaround, and that I must wait for the permanent fix, for which they are unable to provide a timescale. They have closed the case.
Regards,
Gareth
-
- Service Provider
- Posts: 12
- Liked: never
- Joined: Nov 25, 2017 6:49 pm
- Full Name: operations
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Based on the above and the fact that I have a 240TB repo on ReFS, I need to do something, so I was wondering what direction people are going:
Migrate to NTFS?
Run fulls or synthetics but stay on ReFS?
... I cannot continue like this: my backup merges are killing my server and resulting in backups taking days rather than a day.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7299 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
From what I know based on the conversation with ReFS devs, it may be possible to work around this particular bug with huge volumes by adding lots of RAM to the backup repository server. If you can't do this, then I'm afraid the only option is to fall back to NTFS until Microsoft ships that patch.
operations wrote:
240TB REPO on ReFS
-
- Service Provider
- Posts: 12
- Liked: never
- Joined: Nov 25, 2017 6:49 pm
- Full Name: operations
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
I do not see a memory issue. My backup is at a crawl at the moment, but the server, which has 256GB of RAM, reports 120GB free. RAM is not an issue; I could easily put in 768GB if that would help.
I presume, based on what you are saying, that converting to full backups or synthetic fulls will not fix the issue?
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Use RAMMap while a clone/merge is running and get back to us. You'll see.
operations wrote:
I do not see a memory issue my Backup is at a crawl at the moment but the server that has 256GB ram in it says there is 120GB free, RAM is not an issue I could easily put in 768GB if that would help.
I presume based on what you are saying that converting to Full backups or synthetic fulls will not fix the issue ?
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Type of storage?
How many synthetic fulls per week?
Size of VM causing the issue?
Frequency of backups?
Last active full?
Space available on disk (repo)?
Cluster size?
How much RAM?
Which registry keys + driver are you using?
operations wrote:
Based on the above and the fact that I have a 240TB REPO on ReFS, I need to do something so I was wondering what direction people are going ..
Migrate to NTFS ?
Run Full or Synthetics but stay on ReFS ?
... I cannot continue as my backup merges are killing my server and resulting backups taking days not a day.
-
- Service Provider
- Posts: 12
- Liked: never
- Joined: Nov 25, 2017 6:49 pm
- Full Name: operations
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Type of storage?
IBM SVC with 900GB 10K SAS
How many synthetic fulls per week ?
NONE
Size of VM causing the issue ?
from 127GB to 5TB
Frequency of backups ?
Daily
Last Active full ?
30 days ago; running incremental forever with no scheduled active fulls, hence the merge process
Space available on disk (REPO)
49 TB free
Cluster size ?
?
How Much ram ?
256GB
Which registry keys + driver are you using ?
No keys + default driver shipped with OS
-
- Service Provider
- Posts: 28
- Liked: 11 times
- Joined: Oct 31, 2016 6:27 pm
- Full Name: Thomas Raabo
- Location: infrastructure guy
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
That will not work! Contact MS and get them to help you.
You do have a support contract, right?
operations wrote:
Type of storage?
IBM SVC with 900GB 10K SAS
How many synthetic fulls per week ?
NONE
Size of VM causing the issue ?
from 127GB to 5TB
Frequency of backups ?
Daily
Last Active full ?
30 days running incr forever no scheduled active fulls hence the merge progress
Space available on disk (REPO)
49Tb free
Cluster size ?
?
How Much ram ?
256GB
Which registry keys + driver are you using ?
No keys + default driver shipped with OS
-
- Service Provider
- Posts: 12
- Liked: never
- Joined: Nov 25, 2017 6:49 pm
- Full Name: operations
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
No, like many, I have no support contract.
Which is why I ask: what are people doing who cannot get M$ to fix the issue but still need to run production?
-
- Service Provider
- Posts: 28
- Liked: 11 times
- Joined: Oct 31, 2016 6:27 pm
- Full Name: Thomas Raabo
- Location: infrastructure guy
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
I recommend you disable the fast clone API in Veeam:
RefsVirtualSyntheticDisabled DWORD = 1
What version of the driver do you have?
suprnova wrote:
Definitely understand, at the moment even running one incremental merge with [fast clone] makes the repository drive become unstable. Everything was fine for awhile, but now it's back to ground zero. I have your same registry keys, but I am running the latest refs.sys driver.
kubimike wrote:
Im not saying this is the silver bullet. I do still have freezes but that means its time for an active full. However, when any public release refs driver from msft it freezes no matter what.
-
- Enthusiast
- Posts: 38
- Liked: never
- Joined: Apr 08, 2016 5:15 pm
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
Can you explain why this doesn't work? Thank you!
thomas.raabo wrote:
That will not work! contact MS and get them to help you.
You do have a support contract right?
-
- Enthusiast
- Posts: 38
- Liked: never
- Joined: Apr 08, 2016 5:15 pm
- Contact:
Re: REFS issues (server lockups, high CPU, high RAM)
10.0.14393.1770
thomas.raabo wrote:
What version of the driver do you have?