REFS 4k horror story

Re: REFS 4k horror story

by mkretzer » Fri Feb 10, 2017 8:07 am

Hello Oliver,

attached FC Array with HW RAID.

Scaling to more than 100 3.5" disks with integrated storage is kind of difficult...

Markus
mkretzer
Expert
 
Posts: 342
Liked: 74 times
Joined: Thu Dec 17, 2015 7:17 am

Re: REFS 4k horror story

by rendest » Fri Feb 10, 2017 10:58 am

Same!

We have one legacy setup with iSCSI, which craps out even faster on ReFS (the iSCSI timeouts seem to make it worse).
rendest
Influencer
 
Posts: 18
Liked: 5 times
Joined: Wed Feb 01, 2017 8:36 pm
Full Name: Stef

Re: REFS 4k horror story

by tschwendemann » Mon Feb 13, 2017 7:49 am

Hello Everyone,

We had the same issue on Saturday. I installed a new backup server at a customer's site last week using Windows Server 2016 with ReFS. There is one Veeam repository on a ReFS volume formatted with 4k clusters. We have about 12TB of .vbk files on a 100TB volume. Even 128GB of RAM was not enough to prevent the server from crashing. This is what I see in the Windows event log:

---

Log Name: System
Source: Microsoft-Windows-WER-SystemErrorReporting
Date: 2/11/2017 6:01:34 PM
Event ID: 1001
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer:
Description:
The computer has rebooted from a bugcheck. The bugcheck was: 0x00000133 (0x0000000000000001, 0x0000000000001e00, 0x0000000000000000, 0x0000000000000000). A dump was saved in: C:\Windows\MEMORY.DMP. Report Id: 7246bf68-4886-4772-a8ea-168290eb66e7.

---
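For anyone who wants to pull the bugcheck code and parameters out of such an Event ID 1001 description automatically, here is a minimal Python sketch. The function name `parse_bugcheck` and the two lookup tables are hypothetical helpers written for this example. The parameter meanings are quoted from Microsoft's bug check reference as best I recall it, so treat them as an assumption and verify them there before relying on them.

```python
import re

# Description text copied from the event above (truncated before the dump path).
SAMPLE = (
    "The computer has rebooted from a bugcheck. The bugcheck was: 0x00000133 "
    "(0x0000000000000001, 0x0000000000001e00, 0x0000000000000000, "
    "0x0000000000000000)."
)

# Only the code seen in this thread; extend the table as needed.
BUGCHECK_NAMES = {0x133: "DPC_WATCHDOG_VIOLATION"}

# Assumed meaning of parameter 1 for bug check 0x133 (check Microsoft's reference).
DPC_WATCHDOG_PARAM1 = {
    0: "a single DPC or ISR exceeded its time allotment",
    1: "the system cumulatively spent an extended period at IRQL DISPATCH_LEVEL or above",
}

BUGCHECK_RE = re.compile(r"The bugcheck was:\s*(0x[0-9a-fA-F]+)\s*\(([^)]*)\)")


def parse_bugcheck(description):
    """Return the bugcheck code, its name and its parameters, or None if absent."""
    match = BUGCHECK_RE.search(description)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return {
        "code": code,
        "name": BUGCHECK_NAMES.get(code, "unknown"),
        "params": params,
    }


if __name__ == "__main__":
    info = parse_bugcheck(SAMPLE)
    print(hex(info["code"]), info["name"])
    if info["code"] == 0x133:
        # Parameter 1 selects which kind of watchdog violation occurred.
        print(DPC_WATCHDOG_PARAM1.get(info["params"][0], "unrecognised parameter 1"))
```

This only classifies the crash. It does not identify the offending driver; for that the MEMORY.DMP still has to be opened in a debugger.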

Regards
Tobias
tschwendemann
Lurker
 
Posts: 2
Liked: never
Joined: Thu Sep 19, 2013 12:53 pm
Full Name: Tobias Schwendemann

Re: REFS 4k horror story

by Gostev » Mon Feb 13, 2017 7:59 pm

All, it is extremely important that everyone opens support cases with Microsoft on these 4K cluster issues, so that they are aware and prioritize fixing this issue based on the number of bug reports. I am sure they have a lot of bugs to work through given how young Windows Server 2016 is, and at least in Veeam, the number of support cases is the primary metric when prioritizing hot fixes.
Gostev
Veeam Software
 
Posts: 21608
Liked: 2405 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: REFS 4k horror story

by alesovodvojce » Sun Feb 19, 2017 11:20 pm

For the last few weeks we have been having the same issues, affecting only our Veeam 2016 ReFS (4k cluster) servers. Until now we have tried to solve them with Veeam support.
Will open an MS support ticket now.
alesovodvojce
Enthusiast
 
Posts: 27
Liked: 2 times
Joined: Tue Nov 29, 2016 10:09 pm

Re: REFS 4k horror story

by rahrenstorff » Mon Feb 20, 2017 3:22 pm 1 person likes this post

I just wanted to add that we implemented a number of 12TB repository appliances with 2016 ReFS using the 4k cluster setting for SMB customers and have experienced no issues. This 4k problem seems to be limited to larger repositories.
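To put the repository sizes mentioned in this thread side by side, here is a small Python sketch that counts how many clusters a volume holds at 4k versus 64k. It is plain arithmetic under the assumption that "TB" means TiB (2^40 bytes) and "k" means KiB. The helper name `cluster_count` is made up for this example, and the numbers say nothing by themselves about what causes the crashes.

```python
TIB = 2**40  # assumption: "TB" in the thread is treated as TiB
KIB = 2**10


def cluster_count(volume_tib, cluster_kib):
    """Number of whole clusters on a volume of the given size."""
    return (volume_tib * TIB) // (cluster_kib * KIB)


if __name__ == "__main__":
    # Volume sizes reported by posters in this thread.
    for size_tib in (12, 27, 32, 100):
        at_4k = cluster_count(size_tib, 4)
        at_64k = cluster_count(size_tib, 64)
        print(f"{size_tib:>4} TiB: {at_4k:,} clusters at 4k, {at_64k:,} at 64k")
```

At any volume size a 4k format has 16 times as many clusters to track as a 64k format, and the absolute count grows linearly with the volume.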
rahrenstorff
Service Provider
 
Posts: 9
Liked: 2 times
Joined: Thu Apr 23, 2015 4:10 pm
Full Name: Rodd Ahrenstorff

Re: REFS 4k horror story

by Gostev » Mon Feb 20, 2017 9:05 pm

I concur, this has been our experience as well.
Gostev
Veeam Software
 
Posts: 21608
Liked: 2405 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: REFS 4k horror story

by kubimike » Tue Feb 21, 2017 7:26 pm

@tschwendemann Your problem sounds like the problem I had: "DPC WATCHDOG 0x00000133". Pop that sucker open in WinDbg; I'm willing to bet it's networking related!
kubimike
Expert
 
Posts: 244
Liked: 24 times
Joined: Fri Feb 03, 2017 2:34 pm
Full Name: MikeO

Re: REFS 4k horror story

by Delo123 » Thu Feb 23, 2017 8:07 am

Smells like Broadcom...

So why do you guys keep trying 4k repositories instead of 64k on bigger arrays? Seems like the most unstable thing to run in production at the moment...
Delo123
Expert
 
Posts: 355
Liked: 102 times
Joined: Fri Dec 28, 2012 5:20 pm
Full Name: Guido Meijers

Re: REFS 4k horror story

by crackocain » Sun Feb 26, 2017 12:58 pm 1 person likes this post

I experienced this issue on a 27TB 4k ReFS repository a month ago. Since then I've been formatting all ReFS datastores with 64k :D Small or big.
EMC, IBM Storage Specialist - VMCEv9 - Dosbil - Turkey
crackocain
Service Provider
 
Posts: 67
Liked: 7 times
Joined: Mon Dec 14, 2015 8:20 pm
Location: Turkey
Full Name: Mehmet Istanbullu

Re: REFS 4k horror story

by Mike Resseler » Mon Feb 27, 2017 7:08 am

Hey Mehmet,

Are the ones you already changed to 64k experiencing issues? (I assume not, but thought I'd ask anyway ;-))
Mike Resseler
Veeam Software
 
Posts: 3381
Liked: 384 times
Joined: Fri Feb 08, 2013 3:08 pm
Location: Belgium, the land of the fries, the beer, the chocolate and the diamonds...
Full Name: Mike Resseler

Re: REFS 4k horror story

by crackocain » Mon Feb 27, 2017 11:45 am 1 person likes this post

No. Working great.
EMC, IBM Storage Specialist - VMCEv9 - Dosbil - Turkey
crackocain
Service Provider
 
Posts: 67
Liked: 7 times
Joined: Mon Dec 14, 2015 8:20 pm
Location: Turkey
Full Name: Mehmet Istanbullu

Re: REFS 4k horror story

by graham8 » Mon Feb 27, 2017 2:17 pm

Well, we just had another pseudo-crash (the Veeam backup server becomes unresponsive and stays that way until it is hard power-cycled). 32TB, 32GB RAM, still on 4k.

Has anyone gotten anything new from MS on this issue that they could share with the rest of us? alesovodvojce, you said you were going to open an MS ticket. Have you learned anything from them yet?

rendest said that switching to 64k did not fix their issue. Others have said it does fix their issue. Does 64k actually fix things? At this point in the thread it seems somewhat inconclusive. I'm inclined to reformat, but I don't want to mess around with production servers unless I'm certain it will fix things.

@ Veeam Employees - what settings are available to limit the amount of load Veeam puts on the underlying storage? rendest mentioned throttling - is that only available in certain license types? Is there any way to perhaps disable multithreaded IO? What is the default thread count for Veeam? Is there no way a hotfix could be released for Veeam that monitors disk IO latency and backs off before it cripples the server? I'd appreciate any guidance anyone could offer us for how we could limit, as much as possible, the load Veeam puts on the storage subsystem to try to avoid crashing our servers until development (Microsoft or Veeam) can come up with some fix for the issue.

Also, under EventViewer->Microsoft->Windows->ReFS we regularly get these warnings:
An IO took more than 30000 ms to complete:
Process Id: 9932
Process name: VeeamAgent.exe
File name: 0000000000000C57 0000000000000405
File offset: 2181038080
IO Type: Write: Paging, NonCached, Sync
IO Size: 1048576 bytes
0 cluster(s) starting at cluster 0
Latency: 72954 ms
Volume Id: {7085173e-b757-4884-b34a-d23aa46d4941}
Volume name: D:

Is everyone else getting this? These only show up in the extended ("crimson") event channel for ReFS btw - browse to the path I mentioned to check on your repos. Also, we have identically-configured servers hosting VHDXs which are *not* getting those messages. We're getting those messages only on the primary Veeam repo and the offsite one.
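If you want to compare these warnings across repositories, here is a rough Python sketch that turns the text of one such event into a few numbers. It assumes the message layout is exactly the one pasted above; `parse_refs_slow_io` is a hypothetical helper, and how you export the event text from the ReFS channel is left up to you.

```python
import re

# Event text copied from the warning above.
SAMPLE_EVENT = """An IO took more than 30000 ms to complete:
Process Id: 9932
Process name: VeeamAgent.exe
File name: 0000000000000C57 0000000000000405
File offset: 2181038080
IO Type: Write: Paging, NonCached, Sync
IO Size: 1048576 bytes
0 cluster(s) starting at cluster 0
Latency: 72954 ms
Volume Id: {7085173e-b757-4884-b34a-d23aa46d4941}
Volume name: D:"""


def parse_refs_slow_io(text):
    """Extract the key fields of a slow-IO warning into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as "D:" stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()

    threshold = re.search(r"more than (\d+) ms", text)
    return {
        "process": fields.get("Process name"),
        "volume": fields.get("Volume name"),
        "io_type": fields.get("IO Type"),
        "io_size_bytes": int(fields["IO Size"].split()[0]),
        "latency_ms": int(fields["Latency"].split()[0]),
        "threshold_ms": int(threshold.group(1)) if threshold else None,
    }


if __name__ == "__main__":
    event = parse_refs_slow_io(SAMPLE_EVENT)
    seconds = event["latency_ms"] / 1000
    print(f'{event["process"]} on {event["volume"]}: '
          f'{event["io_size_bytes"]} byte write took {seconds:.1f} s')
```

For the event above, that is a single 1 MiB write from VeeamAgent.exe that took about 73 seconds, well past the 30000 ms threshold that triggers the warning.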
graham8
Enthusiast
 
Posts: 59
Liked: 20 times
Joined: Wed Dec 14, 2016 1:56 pm

Re: REFS 4k horror story

by j.forsythe » Mon Feb 27, 2017 2:24 pm

Hi,
I had to open a case as well: #02083290.
I have two repositories, one attached via iSCSI, the other on local SAS.
Both are smaller than 20TB and formatted with 64k. Since this weekend the backup server has crashed each time the backup starts; before that it ran great for about 2 weeks.
j.forsythe
Influencer
 
Posts: 14
Liked: 4 times
Joined: Wed Jan 06, 2016 10:26 am
Full Name: John P. Forsythe

Re: REFS 4k horror story

by rahrenstorff » Mon Feb 27, 2017 2:39 pm 1 person likes this post

Delo123 wrote: So why do you guys keep trying 4k repositories instead of 64k on bigger arrays? Seems like the most unstable thing to run in production at the moment...

Just to confirm: the original appliances were configured with 4k and have experienced no issues. However, we are implementing 64k in our build process going forward.
rahrenstorff
Service Provider
 
Posts: 9
Liked: 2 times
Joined: Thu Apr 23, 2015 4:10 pm
Full Name: Rodd Ahrenstorff
