poulpreben
Service Provider
Posts: 960
Liked: 405 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben » Nov 07, 2019 7:20 pm

It is capable of it, but I can try to disable it as a test.

Were you able to reproduce the issue with a simple 'fio' test or something, or was it only Veeam that triggered the slowdown?

rhys.hammond
Veeam Vanguard
Posts: 57
Liked: 9 times
Joined: Apr 07, 2013 10:36 pm
Full Name: Rhys Hammond
Location: Brisbane , Australia
Contact:

Re: Windows 2019, large REFS and deletes

Post by rhys.hammond » Nov 08, 2019 12:21 am

Was told by another Veeam SP that regularly emptying the 'System Working Set' using RAMMap fixed their ReFS performance issues on 1809.
Just tried it, but unfortunately the customer's disk speed is still 'saw-toothing' and performing very poorly until the next restart.

I'm going to try one last thing before wiping 1809 and loading, going to delete the 1 x 200TB ReFS volume and create 2 x 100TB ReFS volumes.
I don't want to create smaller volumes as ReFS block cloning doesn't work across volumes.
Veeam Certified Architect | Author of http://rhyshammond.com | Veeam Vanguard | vExpert

mkretzer
Expert
Posts: 566
Liked: 127 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer » Nov 09, 2019 5:08 pm

@poulpreben we only had issues with block cloned Veeam data, but we did not test with > 100 TB of block cloned non-Veeam data, which is necessary for the issue to happen for us.

poulpreben
Service Provider
Posts: 960
Liked: 405 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben » Nov 09, 2019 5:29 pm 1 person likes this post

As an isolation step, we ended up taking the old volumes offline and creating new ones. We tested so many things to no avail, and in this case it was easy since the customer has only 30 days of retention.

I guess we’ll see once there is more than 100 TB data on the server again. It’s now running with the 2019-10 Cumulative Update as well. For now it seems super fast.

Ben.online
Lurker
Posts: 2
Liked: never
Joined: Nov 12, 2019 1:30 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by Ben.online » Nov 14, 2019 3:19 pm

Hello,

We are about to go into production with a Veeam repository on Server 2019 build 1809, and this topic does not make me happy.
I thought the ReFS issues were all resolved, but sadly I found this when checking the refs.sys version from Server 2019.

A few things I missed or that are unclear to me:

- Are there any reports of these issues on Server 2016 after the .2457 update? And is 2016 just as good for ReFS as a patched 2019, as Gustov said?
- As I understand it, Server 2019 build 1903 works fine but is Core only and support ends mid-2020. Is that correct?
- What is the best version of refs.sys in Server 2019 build 1809, and has anyone tried to use the refs.sys from build 1903 on an 1809 build?

Gostev
SVP, Product Management
Posts: 24972
Liked: 3628 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev » Nov 14, 2019 3:35 pm

1. 2016 is a safe choice, this topic is about issues with 2019.
2. Correct.
3. Microsoft will not support this driver swap. For 2019 LTSC, just make sure you have all Windows updates installed.

jasonede
Service Provider
Posts: 37
Liked: 9 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede » Nov 14, 2019 5:10 pm

Is the latest build of 2019 with all updates fine on ReFS? I'm a little unclear if all issues are resolved or if they're just with very large repositories.

mkretzer
Expert
Posts: 566
Liked: 127 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer » Nov 14, 2019 5:41 pm 1 person likes this post

2019 still has issues with larger volumes from what I can tell (even if it is not as bad as before). But 1903 has *really, really* been totally stable for us for the first time since we started using ReFS (2 years now). All our issues only went away with 1903. But we have backup copy destinations of ~110 TB which have never had any issue, even with 2016.
So for smaller repos, 2016 and 2019 should both be fine if you have enough RAM.

jasonede
Service Provider
Posts: 37
Liked: 9 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede » Nov 15, 2019 8:21 am

We're not planning on having massive repositories, probably around 50TB each, but would like to use ReFS (currently on NTFS) and Windows Server 2019 (we have to rebuild server anyway and 2019 makes more sense than 2016 at this stage). Are all the improvements/fixes in 1903 build going to be backported into the long term support build?

Gostev
SVP, Product Management
Posts: 24972
Liked: 3628 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev » Nov 15, 2019 10:32 am 1 person likes this post

You're asking the wrong company ;)

jasonede
Service Provider
Posts: 37
Liked: 9 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede » Nov 15, 2019 10:34 am

Yes, but I suspect you'll know far more about what is happening with ReFS than I'll be able to find out from MS (as a member of the general public), as it directly impacts Veeam's operation.

Gostev
SVP, Product Management
Posts: 24972
Liked: 3628 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev » Nov 15, 2019 10:57 am

We work directly with the ReFS dev team, so we do get a lot of low-level information about the current and upcoming branches - of course, under NDA. However, backporting between branches is handled by a different team at Microsoft (Windows Servicing), which is from some kind of parallel universe.

mkretzer
Expert
Posts: 566
Liked: 127 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer » Nov 17, 2019 12:56 pm

The info they gave us (not under NDA) was that, as Gostev hinted, not everything will be backported to 2019. Things which do not affect a lot of users (like the problems that users with ReFS repos of 600 TB have - in other words, us) will most likely not get fixed in 2019.

jamesharper-bsol
Service Provider
Posts: 15
Liked: 4 times
Joined: Jan 16, 2012 10:30 am
Full Name: James Harper
Contact:

Re: Windows 2019, large REFS and deletes

Post by jamesharper-bsol » Nov 18, 2019 4:51 pm

That's bonkers - MS really need to step it up.

I've just migrated ~200TB to a freshly installed 2019 (1809) with ReFS.sys version 10.0.17763.831 (on 2019-10 CU), 64k AU size.

At first speed was fine, then came the merges (not fast clone as active fulls haven't been run yet) - performance tanked (CPU maxed out, WPA showing ReFS.sys as the culprit).

The RAMMap 'Empty System Working Set' does help temporarily.

I'm scheduling an update to the 2019-11 CU and we'll see if there's any change.

EDIT - The file size and modification date for ReFS.sys look the same between 2019-10 and 2019-11, so I don't expect any change.

JaySt
Service Provider
Posts: 176
Liked: 26 times
Joined: Jun 09, 2015 7:08 pm
Full Name: JaySt
Contact:

Re: Windows 2019, large REFS and deletes

Post by JaySt » Nov 19, 2019 9:00 am

How frequently do you run RAMMap 'Empty System Working Set'?
Veeam Certified Engineer

Ben.online
Lurker
Posts: 2
Liked: never
Joined: Nov 12, 2019 1:30 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by Ben.online » Nov 19, 2019 9:46 am

@Gostev thank you for your reply, and sorry I misspelled your name.

So frustrating: the advice from MS, Dell and Veeam is to install all the latest updates, despite the fact that this does not seem to fix the issue.
It is just ridiculous that it has to be this way with a "new" supported OS and an existing stable version of refs.sys. And MS is in essence taking the position of "we may fix it, or maybe not"!?

I think I'll take my chances with 1809 and start testing, but will make a failover plan to switch to 1903, if that is even possible in our situation.

jamesharper-bsol
Service Provider
Posts: 15
Liked: 4 times
Joined: Jan 16, 2012 10:30 am
Full Name: James Harper
Contact:

Re: Windows 2019, large REFS and deletes

Post by jamesharper-bsol » Nov 19, 2019 11:09 am

Hi Jay,

RAMMap can't be scripted so I'm now running a program this chap created:
http://www.toughdev.com/content/2015/05 ... -metafile/

Download the zip, open the solution in VS, re-target to .NET 4.6, check over the code and build. I then have a scheduled task to run this every 20 mins.
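For anyone who wants to script the scheduling step, here is a minimal sketch of how the 20-minute task could be registered with the built-in `schtasks` tool. The executable path `C:\Tools\ClearFSCache.exe` and the task name `ClearFSCache` are assumptions for illustration only; adjust them to wherever you built the tool. The script only prints the command so you can review it before running it from an elevated prompt.

```python
from __future__ import annotations


def build_schtasks_command(exe_path: str, task_name: str,
                           every_minutes: int = 20) -> list[str]:
    """Build a `schtasks /Create` command that runs exe_path every N minutes."""
    # schtasks accepts a modifier of 1-1439 for a MINUTE schedule.
    if not 1 <= every_minutes <= 1439:
        raise ValueError("every_minutes must be between 1 and 1439")
    return [
        "schtasks", "/Create",
        "/TN", task_name,           # task name (hypothetical)
        "/TR", exe_path,            # program to run; a path with spaces needs extra quoting
        "/SC", "MINUTE",            # schedule type
        "/MO", str(every_minutes),  # run every N minutes
        "/RU", "SYSTEM",            # run under the SYSTEM account
        "/F",                       # overwrite the task if it already exists
    ]


if __name__ == "__main__":
    # Hypothetical install location of the cache-clearing tool.
    cmd = build_schtasks_command(r"C:\Tools\ClearFSCache.exe", "ClearFSCache")
    print(" ".join(cmd))
```

Printing rather than executing is deliberate: creating a task that runs as SYSTEM is worth a manual review first.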

This isn't a tenable situation for us. Given that this thread has been going since March, and the hint that MS won't backport everything to 1809, we're currently reviewing whether to move to 1903 or back to 2016.

Does anyone have more details on what improvements 1903 has over 2016?

jasonede
Service Provider
Posts: 37
Liked: 9 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede » Nov 19, 2019 1:32 pm

Ben.online wrote:
Nov 19, 2019 9:46 am
@Gostev thank you for your reply, and sorry I misspelled your name.

So frustrating: the advice from MS, Dell and Veeam is to install all the latest updates, despite the fact that this does not seem to fix the issue.
It is just ridiculous that it has to be this way with a "new" supported OS and an existing stable version of refs.sys. And MS is in essence taking the position of "we may fix it, or maybe not"!?

I think I'll take my chances with 1809 and start testing, but will make a failover plan to switch to 1903, if that is even possible in our situation.

Have you tried these fixes that were suggested earlier on in this thread that made big differences for some people?

Would be nice if some other guys could test it and give feedback. All you have to do is create the following registry key and reboot the server:
Path: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
REG_DWORD value: "RefsEnableLargeWorkingSetTrim"=1

Check whether TRIM is enabled by running the command: "fsutil behavior query DisableDeleteNotify"

If DisableDeleteNotify is 0 for ReFS on the 2019 server (meaning TRIM is enabled), disable TRIM with the following commands:
"fsutil behavior set DisableDeleteNotify 1"
"fsutil behavior set DisableDeleteNotify ReFS 1"
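Because the DisableDeleteNotify flag is inverted (0 means TRIM is on, 1 means TRIM is off), it is easy to misread the query result. Below is a rough sketch that turns the query output into a per-file-system "TRIM enabled" answer. The sample text is an assumption about what the output looks like; the exact wording varies between Windows builds, so the parser only relies on the `<FS> DisableDeleteNotify = <0|1>` part of each line.

```python
from __future__ import annotations

import re
import subprocess

# Matches lines such as "ReFS DisableDeleteNotify = 1"; the file system prefix
# is optional because older builds print a single, unprefixed value.
_LINE = re.compile(r"^\s*(?:(NTFS|ReFS)\s+)?DisableDeleteNotify\s*=\s*([01])",
                   re.MULTILINE)


def parse_delete_notify(output: str) -> dict[str, bool]:
    """Map file system name -> True if TRIM is enabled (flag value 0)."""
    result: dict[str, bool] = {}
    for fs, value in _LINE.findall(output):
        result[fs or "ALL"] = (value == "0")
    return result


def query_delete_notify() -> str:
    """Run the real query (Windows only, elevated prompt)."""
    done = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True, text=True, check=True,
    )
    return done.stdout


if __name__ == "__main__":
    # Assumed sample output, used here so the sketch runs anywhere.
    sample = "NTFS DisableDeleteNotify = 0\nReFS DisableDeleteNotify = 1\n"
    for fs, trim_on in parse_delete_notify(sample).items():
        print(f"{fs}: TRIM {'enabled' if trim_on else 'disabled'}")
```

On the actual server you would pass `query_delete_notify()` to the parser instead of the sample string.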

Ours isn't in production yet, and I'm trying to work out which is best: whether to stay on Win 2019 or try to go back to Win 2016. 1903 (Windows Server) is Core only and, with the requirement to upgrade regularly, isn't great for enterprise.

Gostev
SVP, Product Management
Posts: 24972
Liked: 3628 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev » Nov 19, 2019 1:58 pm 1 person likes this post

If using SAC 1903 is not an option for you - then until Microsoft gets Windows Server 2019 back in order, it is indeed best to go back to tried and true Windows Server 2016.

jamesharper-bsol
Service Provider
Posts: 15
Liked: 4 times
Joined: Jan 16, 2012 10:30 am
Full Name: James Harper
Contact:

Re: Windows 2019, large REFS and deletes

Post by jamesharper-bsol » Nov 21, 2019 9:44 am

Hi Jason,

I set the RefsEnableLargeWorkingSetTrim key and rebooted - performance got worse - instead of IO at 500MB/s-1GB/s for 15 mins then a lull, it started doing IO for 10 seconds, stopping for 10 seconds, and only peaking at ~200MB/s.

I then disabled TRIM on ReFS with "fsutil behavior set DisableDeleteNotify ReFS 1" (I left it on for NTFS) - this increased the length of the IO bursts to ~30 seconds - but then this morning I noticed that the large periods of no IO that I was having before (and used 'Empty System Working Set' to resolve) were back - running RAMMap/ClearFSCache.exe then brought performance back up to 500MB/s-1GB/s.

So with either of these modifications to TRIM behaviour (BTW I'm using local MegaRAID SAS 9380-8e cards with JBODs ~500TB and not an array), it's clear that eventually the bug will bite and you'll still need to clear down the system working set.

I'm going to move back to 2016 as I've got better things to do than babysit Microsoft's poor QA.
