REFS 4k horror story

Re: REFS 4k horror story

by alesovodvojce » Mon Feb 27, 2017 11:28 pm

@graham8 news from MS - short answer: not tried. Longer answer: we thought support was covered by our SA, but it is not. Opening a ticket will cost us $500. Given that MS already postponed February's patches due to this ReFS issue, we are very likely to get the answer "we are working on it, wait", which is not worth the expense.

We are eager to read more answers to your questions here, as our VBR repos (ReFS 4k) are failing regularly.
alesovodvojce
Influencer
 
Posts: 23
Liked: 1 time
Joined: Tue Nov 29, 2016 10:09 pm

Re: REFS 4k horror story

by mkretzer » Tue Feb 28, 2017 6:41 am

Microsoft closed our ticket, in which we found the memory issue with the KB, because they do not think an update will arrive until the March patch day...
mkretzer
Expert
 
Posts: 251
Liked: 61 times
Joined: Thu Dec 17, 2015 7:17 am

Re: REFS 4k horror story

by Mike Resseler » Tue Feb 28, 2017 6:50 am

@john,

Keep us informed about your support case, since it is an issue with 64K. You didn't install KB32-something, right?

@all: Let's hope that MSFT indeed has a fix in March because this is really not good. And @Alesovodvojce: I'm pretty surprised there are no support tickets when you have an SA agreement :-(
Mike Resseler
Veeam Software
 
Posts: 2795
Liked: 343 times
Joined: Fri Feb 08, 2013 3:08 pm
Location: Belgium, the land of the fries, the beer, the chocolate and the diamonds...
Full Name: Mike Resseler

Re: REFS 4k horror story

by kubimike » Tue Feb 28, 2017 6:34 pm

@j.forsythe
Interesting that you're seeing crashes with 64k. My server is also crashing daily after doing backups. Some days it works fine; other days it might crash twice a day. What sort of bugcheck are you seeing, if any? My repository is 48TB 64k ReFS connected via a SAS HP P841 controller.
kubimike
Expert
 
Posts: 141
Liked: 20 times
Joined: Fri Feb 03, 2017 2:34 pm
Full Name: MikeO

Re: REFS 4k horror story

by graham8 » Tue Feb 28, 2017 7:48 pm

Thanks for the updates everyone.

@alesovodvojce - it's been a while since I've felt desperate enough to call Microsoft for something, but I seem to recall that they don't charge if the issue is a legitimate Microsoft bug. At least, they've waived it for me in the past. I understand not wanting to take that risk though, of course.
graham8
Enthusiast
 
Posts: 54
Liked: 20 times
Joined: Wed Dec 14, 2016 1:56 pm

Re: REFS 4k horror story

by kubimike » Tue Feb 28, 2017 7:52 pm

The way I see it, $500 is a drop in the bucket compared to the time I'd have to spend fixing stuff because it keeps crashing. Yes, they will refund you if it's a bug.
kubimike
Expert
 
Posts: 141
Liked: 20 times
Joined: Fri Feb 03, 2017 2:34 pm
Full Name: MikeO

Re: REFS 4k horror story

by mkretzer » Tue Feb 28, 2017 8:04 pm

I am confused about the kind of crashes you are seeing... In our case we never saw a bluescreen; the system only "hangs". But from what I understand now, some of you see bluescreens?
mkretzer
Expert
 
Posts: 251
Liked: 61 times
Joined: Thu Dec 17, 2015 7:17 am

Re: REFS 4k horror story

by graham8 » Tue Feb 28, 2017 8:13 pm

Right, likewise - only "hangs" here (due to extreme memory exhaustion). I've never had an actual crash. It makes me wonder if it's the same problem if someone is getting a bluescreen.
graham8
Enthusiast
 
Posts: 54
Liked: 20 times
Joined: Wed Dec 14, 2016 1:56 pm

Re: REFS 4k horror story

by kubimike » Tue Feb 28, 2017 8:20 pm

Sometimes it bluescreens, sometimes it just freezes and I have to reboot the system. When it does bluescreen, I get a stop error, bugcheck 0x133. When it freezes, the iLO remote window is just BLACK.
kubimike
Expert
 
Posts: 141
Liked: 20 times
Joined: Fri Feb 03, 2017 2:34 pm
Full Name: MikeO

Re: REFS 4k horror story

by alesovodvojce » Tue Feb 28, 2017 9:24 pm

Symptoms of crash
- CPU allowance of the guest VM is at 100% (seen from the host)
- high memory demand (always higher than what is satisfied), no disk activity or queue
- sometimes it is possible to move the mouse or switch the window, but not to launch a new app
These are common to the dozens of "crashes" we have observed at our facility during the last two months. Note that Veeam (together with its storage) is on a virtual, not physical, server.
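
The symptom list above can be turned into a rough check, for example for an alert on the host. This is only a sketch: the field names, the 99% CPU threshold and the idea of feeding it host-side counters are assumptions made for illustration, not anything from Veeam or Microsoft tooling.

```python
from dataclasses import dataclass


@dataclass
class GuestSnapshot:
    """One reading of the guest VM as seen from the host (hypothetical fields)."""
    cpu_percent: float          # CPU allowance currently used by the guest
    memory_demand_mb: float     # memory the guest is asking for
    memory_assigned_mb: float   # memory the host has actually given it
    disk_queue_length: float    # outstanding disk requests


def looks_like_refs_hang(snap: GuestSnapshot,
                         cpu_threshold: float = 99.0,
                         max_disk_queue: float = 0.0) -> bool:
    """Return True when all three symptoms from the list are present at once."""
    cpu_pegged = snap.cpu_percent >= cpu_threshold
    memory_starved = snap.memory_demand_mb > snap.memory_assigned_mb
    disk_idle = snap.disk_queue_length <= max_disk_queue
    return cpu_pegged and memory_starved and disk_idle
```

A single matching reading proves little; it probably only means something if it holds for several readings in a row.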

What next
If you see these symptoms and attribute them to ReFS, I would suggest force-stopping the machine immediately.
We tried waiting, even a few days, to see whether the situation would recover, but it never did. Instead, the prolonged trouble in the guest VM made its time service behave erratically, sometimes shifting the clock hours or even a few months into the future. Neither the guest nor the host was able to repair this. The wrong dates caused a lot of other strange things, e.g. bad records saved to the SQL database, so our Veeam backup scheduler has not behaved correctly since (our case #02063994).
After a forced restart the machine sometimes gets into trouble again immediately. We ended the trouble by starting the machine and either
a) gracefully shutting down the machine just after Windows booted, if it allowed us to do so (if not, then a hard shutdown and try again)
b) waiting a few hours after start for it to recover (worked only sometimes for us)

Hope this info saves you some time.

A little OT: thanks for the MSFT ticket advice. We are an NGO, so maybe our SA lacks the support benefit because of that.
alesovodvojce
Influencer
 
Posts: 23
Liked: 1 time
Joined: Tue Nov 29, 2016 10:09 pm

Re: REFS 4k horror story

by kubimike » Tue Feb 28, 2017 11:15 pm

alesovodvojce wrote: Symptoms of crash

- sometimes it is possible to move the mouse or switch the window, but not to launch new app


Exactly what happens to me. iLO won't respond (black window), but if I happen to have an RDP session going I can move boxes around and switch windows, though the Start button doesn't work. I can open WINKEY + R and type 'shutdown /r /t 0', but nothing happens. I have to hard reset the machine.
kubimike
Expert
 
Posts: 141
Liked: 20 times
Joined: Fri Feb 03, 2017 2:34 pm
Full Name: MikeO

Re: REFS 4k horror story

by Mike Resseler » Wed Mar 01, 2017 7:00 am

For all the guys out there running 64k ReFS repositories and seeing these freeze issues: have you created a support call with us? I am sure we want to investigate those! Even if we find something which is not related to us but to MSFT, we want to know so we can notify them and give them our analysis.

Please open the support call, post the IDs here and keep us informed on the forums.

Thanks
Mike
Mike Resseler
Veeam Software
 
Posts: 2795
Liked: 343 times
Joined: Fri Feb 08, 2013 3:08 pm
Location: Belgium, the land of the fries, the beer, the chocolate and the diamonds...
Full Name: Mike Resseler

Re: REFS 4k horror story

by kubimike » Wed Mar 01, 2017 3:04 pm

No ticket open, but I'm getting freezes with 64k ReFS. I can open a ticket. I do have a Microsoft ticket. My machine is currently running with verifier with special pool enabled.
kubimike
Expert
 
Posts: 141
Liked: 20 times
Joined: Fri Feb 03, 2017 2:34 pm
Full Name: MikeO

Re: REFS 4k horror story

by EricJ » Thu Mar 02, 2017 9:10 pm

We are seeing the same issues described, with memory exhaustion leading to server lock-up. 32TB repository with 4k ReFS. Not a total freeze or blue-screen - RDP sometimes cuts out, sometimes stays connected but nearly unusable.

A forced reset causes a vicious cycle, as ReFS detects an unclean reboot and kicks off background integrity checks. Even with Veeam services disabled, memory usage climbs and will become exhausted again within 10-15 minutes.
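
To put a number on "exhausted again within 10-15 minutes", a straight-line fit over a few free-memory readings gives a rough time-to-exhaustion estimate. This is a minimal sketch under assumptions: the function name and the sample values are made up for illustration, the readings can come from whatever counter you already collect, and real memory growth is not guaranteed to be linear.

```python
from typing import Optional, Sequence, Tuple


def minutes_until_exhausted(samples: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Estimate minutes left until free memory reaches zero.

    samples: (minutes_since_start, free_memory_mb) pairs, oldest first.
    Returns None when free memory is flat or growing.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must be taken at different times")
    # Least-squares slope: change in free MB per minute.
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / var_t
    if slope >= 0:
        return None
    intercept = mean_m - slope * mean_t
    zero_at = -intercept / slope      # time at which the fitted line hits 0 MB
    last_t = samples[-1][0]
    return max(0.0, zero_at - last_t)


# Illustrative readings only: free memory dropping 1000 MB per minute.
readings = [(0.0, 12000.0), (1.0, 11000.0), (2.0, 10000.0), (3.0, 9000.0)]
remaining = minutes_until_exhausted(readings)
```

If the estimate drops under a few minutes it may be worth stopping the machine in a controlled way rather than waiting for the lock-up.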

Poolmon shows the culprit is refs.sys and refsv1.sys. Last weekend I reformatted the primary repository with 64k, and have not had the issue since then... but it sounds like 64k isn't a total remedy either?

Hopefully Microsoft will have a fix on patch day (March 14).
EricJ
Influencer
 
Posts: 11
Liked: 1 time
Joined: Thu Jan 12, 2017 7:06 pm

Re: REFS 4k horror story

by tsightler » Thu Mar 02, 2017 9:41 pm

I don't believe that 64K is a remedy, more a mitigation. In load testing with a 100TB repository I was able to crash the 4K ReFS system pretty much nightly. With 64K it ran a month without issues, but did eventually have a hang around day 40 or so.
tsightler
Veeam Software
 
Posts: 4687
Liked: 1698 times
Joined: Fri Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
