poulpreben
Certified Trainer
Posts: 1024
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben »

It is capable of it, but I can try to disable it as a test.

Were you able to reproduce the issue with a simple 'fio' test or something, or was it only Veeam that triggered the slowdown?
rhys.hammond
Veeam Software
Posts: 72
Liked: 15 times
Joined: Apr 07, 2013 10:36 pm
Full Name: Rhys Hammond
Location: Brisbane , Australia
Contact:

Re: Windows 2019, large REFS and deletes

Post by rhys.hammond »

Was told by another Veeam SP that regularly emptying the 'System Working Set' using RAMMap fixed their ReFS performance issues on 1809.
Just tried it, but unfortunately the customer's disk speed is still 'saw-toothing' and performing very poorly until the next restart.

I'm going to try one last thing before wiping 1809 and loading, going to delete the 1 x 200TB ReFS volume and create 2 x 100TB ReFS volumes.
I don't want to create smaller volumes as ReFS block cloning doesn't work across volumes.
Veeam Certified Architect | Author of http://rhyshammond.com | Veeam Vanguard | vExpert
mkretzer
Veeam Legend
Posts: 1140
Liked: 387 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer »

@poulpreben we only had issues with block-cloned Veeam data, but we did not test with > 100 TB of block-cloned non-Veeam data, which is necessary for the issue to happen for us.
poulpreben
Certified Trainer
Posts: 1024
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben » 1 person likes this post

As an isolation step, we ended up taking the old volumes offline and creating new ones. We tested so many things to no avail, and in this case it was easy since the customer has only 30 days of retention.

I guess we’ll see once there is more than 100 TB data on the server again. It’s now running with the 2019-10 Cumulative Update as well. For now it seems super fast.
Ben.online
Service Provider
Posts: 22
Liked: 2 times
Joined: Nov 12, 2019 1:30 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by Ben.online »

Hello,

We are about to go in production with a Veeam repository Server 2019 build 1809 and this topic does not make me happy.
I thought the ReFS issues were all resolved, but sadly I found this thread when checking the refs.sys version in Server 2019.

A few things I missed or that are unclear to me:

- Are there any reports of these issues on Server 2016 after the .2457 update? And is 2016 just as good for ReFS as a patched 2019, as Gustov said?
- As I understand it, Server 2019 build 1903 works fine but is Core only and support ends mid-2020, is that correct?
- What is the best version of refs.sys in Server 2019 build 1809, and has anyone tried using the refs.sys from build 1903 on an 1809 build?
Gostev
Chief Product Officer
Posts: 31455
Liked: 6646 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

1. 2016 is a safe choice, this topic is about issues with 2019.
2. Correct.
3. Microsoft will not support this driver swap. For 2019 LTSC, just make sure you have all Windows updates installed.
jasonede
Service Provider
Posts: 109
Liked: 24 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede »

Is the latest build of 2019 with all updates fine on ReFS? I'm a little unclear if all issues are resolved or if they're just with very large repositories.
mkretzer
Veeam Legend
Posts: 1140
Liked: 387 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer » 1 person likes this post

2019 still has issues with larger volumes from what I can tell (even if it is not as bad as before). But 1903 has *really, really* been totally stable for us, for the first time since we started using ReFS (2 years ago). All our issues only went away with 1903. But we have backup copy destinations of ~110 TB which have not once had any issue, even with 2016.
So for smaller repos, 2016 and 2019 should both be fine if you have enough RAM.
jasonede
Service Provider
Posts: 109
Liked: 24 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede »

We're not planning on having massive repositories, probably around 50TB each, but would like to use ReFS (currently on NTFS) and Windows Server 2019 (we have to rebuild server anyway and 2019 makes more sense than 2016 at this stage). Are all the improvements/fixes in 1903 build going to be backported into the long term support build?
Gostev
Chief Product Officer
Posts: 31455
Liked: 6646 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev » 1 person likes this post

You're asking the wrong company ;)
jasonede
Service Provider
Posts: 109
Liked: 24 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede »

yes, but I suspect you'll know far more about what is happening with regards to ReFS than I'll be able to find out (as a member of general public) from MS as it impacts directly on veeam's operation.
Gostev
Chief Product Officer
Posts: 31455
Liked: 6646 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

We work directly with the ReFS dev team, so we do get a lot of low-level information about the current and upcoming branches - of course, under NDA. However, backporting between branches is handled by a different team at Microsoft (Windows Servicing), which is from some kind of parallel universe.
mkretzer
Veeam Legend
Posts: 1140
Liked: 387 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer »

The info they gave us (not under NDA) was that, as Gostev hinted, not everything will be backported to 2019. Things which do not affect a lot of users (like the problems that users with ReFS repos of 600 TB have, in other words us) will most likely not get fixed in 2019.
jamesharper-bsol
Influencer
Posts: 15
Liked: 4 times
Joined: Jan 16, 2012 10:30 am
Full Name: James Harper
Contact:

Re: Windows 2019, large REFS and deletes

Post by jamesharper-bsol »

That's bonkers - MS really need to step it up.

I've just migrated ~200TB to a freshly installed 2019 (1809) with ReFS.sys version 10.0.17763.831 (on 2019-10 CU), 64k AU size.

At first speed was fine, then came the merges (not fast clone as active fulls haven't been run yet) - performance tanked (CPU maxed out, WPA showing ReFS.sys as the culprit).

The RAMMap 'Empty System Working Set' does help temporarily.

I'm scheduling an update to the 2019-11 CU and we'll see if there's any change.

EDIT - The file size and modification date of ReFS.sys look the same in 2019-10 and 2019-11, so I don't expect any change.
JaySt
Service Provider
Posts: 415
Liked: 75 times
Joined: Jun 09, 2015 7:08 pm
Full Name: JaySt
Contact:

Re: Windows 2019, large REFS and deletes

Post by JaySt »

how frequently do you run RAMMap 'Empty System Working Set'?
Veeam Certified Engineer
Ben.online
Service Provider
Posts: 22
Liked: 2 times
Joined: Nov 12, 2019 1:30 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by Ben.online »

@Gostev thank you for your reply and sorry i misspelled your name

So frustrating that the advice from MS, Dell and Veeam is: install all the latest updates, despite the fact that this does not seem to fix the issue.
It is just ridiculous that it has to be this way with a "new" supported OS and an existing stable version of refs.sys. And MS is in essence taking the position of "we may fix it or maybe not"!?

I think I'll take my chances with 1809 and start testing, but will make a failover plan to switch to 1903, if that is even possible in our situation.
jamesharper-bsol
Influencer
Posts: 15
Liked: 4 times
Joined: Jan 16, 2012 10:30 am
Full Name: James Harper
Contact:

Re: Windows 2019, large REFS and deletes

Post by jamesharper-bsol »

Hi Jay,

RAMMap can't be scripted so I'm now running a program this chap created:
http://www.toughdev.com/content/2015/05 ... -metafile/

Download the zip, open the solution in VS, re-target to .NET 4.6, check over the code and build. I then have a scheduled task to run this every 20 mins.
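A scheduled task like the one described could be registered with the built-in `schtasks` tool. The sketch below only builds the command line for a 20-minute repeat; the task name, the install path of the compiled executable, and running under the SYSTEM account are assumptions for illustration, not details from the post.

```python
import subprocess
import sys

# Hypothetical values: adjust to wherever the compiled tool was copied.
TASK_NAME = "ClearFSCache"
EXE_PATH = r"C:\Tools\ClearFSCache.exe"


def build_schtasks_command(task_name, exe_path, every_minutes=20):
    """Return the schtasks argument list for a task repeating every N minutes."""
    if every_minutes < 1:
        raise ValueError("every_minutes must be at least 1")
    return [
        "schtasks", "/Create",
        "/TN", task_name,            # task name
        "/TR", exe_path,             # program to run (path without spaces assumed)
        "/SC", "MINUTE",             # schedule type: minute-based
        "/MO", str(every_minutes),   # modifier: repeat every N minutes
        "/RU", "SYSTEM",             # assumption: run under the SYSTEM account
        "/F",                        # overwrite an existing task of the same name
    ]


if __name__ == "__main__":
    cmd = build_schtasks_command(TASK_NAME, EXE_PATH)
    print(" ".join(cmd))
    # Register the task only on Windows and only when explicitly requested.
    if sys.platform == "win32" and "--apply" in sys.argv:
        subprocess.run(cmd, check=True)
```

Without `--apply` the script just prints the command, so it can be reviewed before anything is changed on the repository server.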

This isn't a tenable situation for us, given this thread has been going since March and the hint that MS won't backport everything to 1809, we're currently reviewing whether to move to 1903 or back to 2016.

Does anyone have more details on what improvements 1903 has over 2016?
jasonede
Service Provider
Posts: 109
Liked: 24 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede »

Ben.online wrote: Nov 19, 2019 9:46 am @Gostev thank you for your reply and sorry i misspelled your name

So frustrating that the advice from MS, Dell and Veeam is: install all the latest updates, despite the fact that this does not seem to fix the issue.
It is just ridiculous that it has to be this way with a "new" supported OS and an existing stable version of refs.sys. And MS is in essence taking the position of "we may fix it or maybe not"!?

I think I'll take my chances with 1809 and start testing, but will make a failover plan to switch to 1903, if that is even possible in our situation.
Have you tried these fixes that were suggested earlier on in this thread that made big differences for some people?

Would be nice if some other guys could test it and give feedback. All you have to do is create the following regkey and reboot the server:
Path: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
REG_DWORD value: "RefsEnableLargeWorkingSetTrim"=1

Check whether TRIM is enabled by running the command: "fsutil behavior query DisableDeleteNotify"

If DisableDeleteNotify is 0 for ReFS on the 2019 server (meaning TRIM is enabled), set it to 1 to disable TRIM with the following commands:
"fsutil behavior set DisableDeleteNotify 1"
"fsutil behavior set DisableDeleteNotify ReFS 1"

Ours isn't in production yet and I'm trying to work out which is best: stay on Win 2019, or try to go back to Win 2016. 1903 (Windows Server) is Core only, and the requirement to upgrade regularly isn't great for enterprise.
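The `fsutil behavior query DisableDeleteNotify` check quoted above is easy to misread, because a value of 0 means TRIM is enabled and 1 means it is disabled. A minimal sketch of reading that output in Python follows; the per-file-system line format shown in the comment is an assumption about what recent Windows builds print, and older builds that print a single unprefixed line will simply yield `None` here.

```python
import re
import subprocess
import sys

# Assumed output shape, one line per file system, for example:
#   NTFS DisableDeleteNotify = 0  (Disabled)
#   ReFS DisableDeleteNotify = 1  (Enabled)
_LINE = re.compile(
    r"^\s*(NTFS|ReFS)\s+DisableDeleteNotify\s*=\s*([01])",
    re.IGNORECASE | re.MULTILINE,
)


def parse_delete_notify(output):
    """Map upper-cased file system name -> DisableDeleteNotify value (0 or 1)."""
    return {m.group(1).upper(): int(m.group(2)) for m in _LINE.finditer(output)}


def trim_enabled(output, filesystem="REFS"):
    """TRIM is enabled when DisableDeleteNotify is 0; None if not reported."""
    value = parse_delete_notify(output).get(filesystem.upper())
    return None if value is None else value == 0


if __name__ == "__main__" and sys.platform == "win32":
    result = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True, text=True, check=True,
    )
    print("TRIM enabled on ReFS:", trim_enabled(result.stdout))
```

The script only queries the setting; changing it is still done with the `fsutil behavior set` commands quoted above.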
Gostev
Chief Product Officer
Posts: 31455
Liked: 6646 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev » 1 person likes this post

If using SAC 1903 is not an option for you - then until Microsoft gets Windows Server 2019 back in order, it is indeed best to go back to tried and true Windows Server 2016.
jamesharper-bsol
Influencer
Posts: 15
Liked: 4 times
Joined: Jan 16, 2012 10:30 am
Full Name: James Harper
Contact:

Re: Windows 2019, large REFS and deletes

Post by jamesharper-bsol »

Hi Jason,

I set the RefsEnableLargeWorkingSetTrim key and rebooted - performance got worse - instead of IO at 500MB/s-1GB/s for 15mins then a lull, it started doing IO for 10 seconds, stop for 10 seconds and only peaking to ~200MB/s.

I then disabled TRIM on ReFS with "fsutil behavior set DisableDeleteNotify ReFS 1" (I left it on for NTFS) - this increased the length of the IO bursts to ~30 seconds - but then this morning I noticed that the large periods of no IO that I was having before (and used 'Empty System Working Set' to resolve) were back - running RAMMap/ClearFSCache.exe then brought performance back up to 500MB/s-1GB/s.

So with either of these modifications to TRIM behaviour (BTW I'm using local MegaRAID SAS 9380-8e cards with JBODs ~500TB and not an array), it's clear that eventually the bug will bite and you'll still need to clear down the system working set.

I'm going to move back to 2016 as I've got better things to do than babysit Microsoft's poor QA.
Andrew@MSFT
Technology Partner
Posts: 15
Liked: 31 times
Joined: Nov 19, 2019 5:31 pm
Full Name: Andrew Hansen
Contact:

Re: Windows 2019, large REFS and deletes

Post by Andrew@MSFT » 2 people like this post

DISCLAIMER: I work for Microsoft as a Program Manager on the Storage and File Systems Team – specifically the Resilient File System (ReFS).

First, wanted to give my sincerest THANK YOU! for choosing ReFS with Veeam as your preferred platform. Microsoft has worked directly with Veeam since their integration with ReFS Block Cloning technology to ensure your data integrity is of top priority. Our goal is to make the most performant, space efficient, reliable solution for our customers.

Can you explain the issue?

Veeam uses ReFS block cloning functionality to make backups reliable, fast and efficient. ReFS Block Cloning involves maintaining a reference count of each allocated block. Sometimes, performance can be affected when a system has a large number of cloned files and is doing large numbers of deletes, overwrites, etc. The more frequently your data is changing, and the more data you have, the larger the reference table. This tracking ensures your data remains consistent, available, and correct.
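The reference counting described here can be pictured with a toy model. The sketch below is not how ReFS is implemented; the `ToyVolume` class and the file names are hypothetical and exist only to show why cloning is cheap while deletes have to touch one counter per shared block.

```python
from collections import Counter


class ToyVolume:
    """Toy model: files are lists of block ids, blocks carry a reference count."""

    def __init__(self):
        self.refcount = Counter()   # block id -> number of files referencing it
        self.files = {}             # file name -> list of block ids
        self._next_block = 0

    def write_file(self, name, n_blocks):
        """Allocate n_blocks fresh blocks for a new file."""
        blocks = list(range(self._next_block, self._next_block + n_blocks))
        self._next_block += n_blocks
        for b in blocks:
            self.refcount[b] = 1
        self.files[name] = blocks

    def clone_file(self, src, dst):
        """Clone by sharing blocks: no data is copied, only counts change."""
        blocks = list(self.files[src])
        for b in blocks:
            self.refcount[b] += 1
        self.files[dst] = blocks

    def delete_file(self, name):
        """Drop one reference per block; return how many blocks were freed."""
        freed = 0
        for b in self.files.pop(name):
            self.refcount[b] -= 1
            if self.refcount[b] == 0:
                del self.refcount[b]
                freed += 1
        return freed


vol = ToyVolume()
vol.write_file("full.vbk", 1000)
vol.clone_file("full.vbk", "synthetic.vbk")
freed_first = vol.delete_file("full.vbk")        # blocks still shared: nothing freed
freed_second = vol.delete_file("synthetic.vbk")  # last reference gone: all freed
print(freed_first, freed_second)  # prints "0 1000"
```

Even in this toy version, every delete updates one counter per block, so the bookkeeping grows with the amount of cloned data and the rate at which it changes.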

What is Microsoft doing about it?

Microsoft recognizes the issue and has invested in new optimizations for block cloning. These changes make cloning faster and more efficient. We are considering multiple options to get these optimizations to our customers. I will post again in January 2020 when I have more details.

What can I do now if I am experiencing this issue?

Ensure Trim is disabled "fsutil behavior set DisableDeleteNotify ReFS 1"
Create smaller volumes. This can help with the amount of data churn.
Engage with Microsoft product support. By opening a support case, you get a dedicated resource to help with your specific needs.
jasonede
Service Provider
Posts: 109
Liked: 24 times
Joined: Jan 04, 2018 4:51 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by jasonede »

Andrew,

Thank you for this update. It's good to know that there is work happening to resolve this issue.

We've decided to stick with NTFS (on windows 2019) until we can be sure that all major issues with ReFS have been resolved and it's stable with larger volumes and backup workload, especially as we've a lot of data changing regularly in some cases.
soncscy
Veteran
Posts: 643
Liked: 312 times
Joined: Aug 04, 2019 2:57 pm
Full Name: Harvey
Contact:

Re: Windows 2019, large REFS and deletes

Post by soncscy »

Hi Andrew,
Andrew@MSFT wrote: Nov 22, 2019 10:49 pm DISCLAIMER: I work for Microsoft as a Program Manager on the Storage and File Systems Team – specifically the Resilient File System (ReFS).


Create smaller volumes. This can help with the amount of data churn.
What volume size are you envisioning with this? I know that likely you don't want to (can't?) put a cap on this, but is there any hint you can offer where this cut-off is for 2019?
mkretzer
Veeam Legend
Posts: 1140
Liked: 387 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer »

Andrew,

Great having you on the forum now as well. Since we switched to 1903 + smaller volumes (3x 200 TB instead of 1x 600 TB), as you and your colleagues recommended, everything is running perfectly for us.
Still, having a separate server for 1903 is not optimal in the long run. Does your statement mean that you plan to implement all the fixes from 1903/1909 in 2019?

Markus
JaySt
Service Provider
Posts: 415
Liked: 75 times
Joined: Jun 09, 2015 7:08 pm
Full Name: JaySt
Contact:

Re: Windows 2019, large REFS and deletes

Post by JaySt » 1 person likes this post

that's the only relevant question here imho.
Veeam Certified Engineer
JPMS
Expert
Posts: 103
Liked: 31 times
Joined: Nov 02, 2019 6:19 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by JPMS »

mkretzer wrote: Nov 24, 2019 3:56 pm Does your statement mean that you plan to implement all the fixes from 1903/1909 in 2019?
For us SAC is not an option for a server solution. Until REFS works properly in its LTSB release then Windows Server 2019 is just not an option worth considering.
JaySt
Service Provider
Posts: 415
Liked: 75 times
Joined: Jun 09, 2015 7:08 pm
Full Name: JaySt
Contact:

Re: Windows 2019, large REFS and deletes

Post by JaySt »

@Markus
How much RAM do you have on that 1903 repository?
Veeam Certified Engineer
mkretzer
Veeam Legend
Posts: 1140
Liked: 387 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer » 1 person likes this post

2.2 TB. But only because we had old ESX hardware lying around. But it works quite well with that amount :-)
JaySt
Service Provider
Posts: 415
Liked: 75 times
Joined: Jun 09, 2015 7:08 pm
Full Name: JaySt
Contact:

Re: Windows 2019, large REFS and deletes

Post by JaySt »

OK, that's plenty :). I'm still struggling with the Veeam best practice of 1GB of RAM for every 1TB of data on a ReFS volume.
I want to know if it is a best practice as a result of ReFS not doing things optimally (so more like a workaround). For example: would a properly working refs.sys (the version on 1903 seems to be doing quite well) still require that amount of RAM?
Placing hundreds of TBs of data on a server while maintaining that best practice has a weird financial skew regarding the RAM requirements, imho.
Veeam Certified Engineer
Gostev
Chief Product Officer
Posts: 31455
Liked: 6646 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

There's no such best practice these days, it only existed until September 2018 Windows updates. Since then, you can just follow Veeam system requirements for repository RAM.