Gostev
Chief Product Officer
Posts: 31805
Liked: 7299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

JeremiahDDS wrote: Mar 06, 2020 3:58 pm Can you provide this private patch?
Microsoft specifically restricts this. You will need to open a case with Microsoft to get it. It will be a no-cost support case, because it is associated with the bug.
Andrew@MSFT
Technology Partner
Posts: 15
Liked: 31 times
Joined: Nov 19, 2019 5:31 pm
Full Name: Andrew Hansen
Contact:

Re: Windows 2019, large REFS and deletes

Post by Andrew@MSFT » 2 people like this post

Thanks everyone for your patience here!! Microsoft is working on a resolution and targeting a solution to be available in late March.

Special thanks to those who have validated the private fix and given us feedback along the way.

I'll keep everyone posted on the release of the official patch.

If you are currently experiencing issues, we recommend the following:

• Ensure Trim is disabled:

    fsutil behavior set DisableDeleteNotify ReFS 1

• Set RefsEnableLargeWorkingSetTrim = 1:

    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem
    Value Name: RefsEnableLargeWorkingSetTrim
    Value Type: REG_DWORD
    Value Data: 1

• Create smaller volumes. This can help with the amount of data churn.
• Engage with Microsoft product support. By opening a support case, you get a dedicated resource to help with your specific needs.
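
The first two items in the list above can be scripted. What follows is a minimal sketch in Python, assuming an elevated prompt on the repository server. It only builds and prints the command lines unless told otherwise, so they can be reviewed first. The helper names `build_mitigation_commands` and `apply` are hypothetical, and using `reg add` is just one common way to set the value, not something prescribed in this thread.

```python
import subprocess

# Registry key named in the post above, in the short HKLM form accepted by reg.exe.
FS_KEY = r"HKLM\System\CurrentControlSet\Control\FileSystem"


def build_mitigation_commands():
    """Return the command lines for the two settings listed above."""
    return [
        # Disable TRIM (delete notifications) for ReFS volumes.
        ["fsutil", "behavior", "set", "DisableDeleteNotify", "ReFS", "1"],
        # Create or overwrite the REG_DWORD value with data 1.
        ["reg", "add", FS_KEY, "/v", "RefsEnableLargeWorkingSetTrim",
         "/t", "REG_DWORD", "/d", "1", "/f"],
    ]


def apply(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(subprocess.list2cmdline(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    apply(build_mitigation_commands())
```

Calling `apply(build_mitigation_commands(), dry_run=False)` would run the commands for real; keeping the default makes the script safe to try on any machine.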
evilaedmin
Expert
Posts: 176
Liked: 30 times
Joined: Jul 26, 2018 8:04 pm
Full Name: Eugene V
Contact:

Re: Windows 2019, large REFS and deletes

Post by evilaedmin »

Andrew@MSFT, is it correct that calls associated with this bug will be cost-free? Our org has Premier.
poulpreben
Certified Trainer
Posts: 1025
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben » 1 person likes this post

Yes, my case was refunded immediately after confirming that my issue was resolved by the hotfix.
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer » 1 person likes this post

Ours as well!
JaySt
Service Provider
Posts: 454
Liked: 86 times
Joined: Jun 09, 2015 7:08 pm
Full Name: JaySt
Contact:

Re: Windows 2019, large REFS and deletes

Post by JaySt »

Andrew, do we need to revert the settings after installing the patch when it’s available?
Are these settings recommended post patch install?
Veeam Certified Engineer
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer »

They told us to keep RefsEnableLargeWorkingSetTrim. But our 1903 Repo runs perfectly even without RefsEnableLargeWorkingSetTrim.
dasfliege
Service Provider
Posts: 275
Liked: 61 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by dasfliege » 1 person likes this post

I'm still on vacation, but as far as I have been informed by my colleagues, it's still working pretty well with private hotfix #3. Memory usage is at a minimum, so I might not even have had to double the RAM in our repo server. Hope the fix will be published pretty soon to all of you having problems, so you can also be in love with ReFS again 😉
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer »

Here as well - from my contacts with other customers it seems like every customer who got the hotfix is having no more issues!!
JeremiahDDS
Service Provider
Posts: 31
Liked: 7 times
Joined: Mar 28, 2019 12:52 am
Full Name: Jeremiah Glover
Contact:

Re: Windows 2019, large REFS and deletes

Post by JeremiahDDS »

When is it expected that the private patch will become public?
Gostev
Chief Product Officer
Posts: 31805
Liked: 7299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

There's a message about this directly from the ReFS PM just a few posts above.
azizm
Influencer
Posts: 10
Liked: never
Joined: Mar 21, 2019 12:37 am
Full Name: Aziz M
Contact:

Re: Windows 2019, large REFS and deletes

Post by azizm »

Glad I stumbled across this post, we have zero trust in our Veeam server running backups by itself. Windows Server 2019 1809 with 100TB+ ReFS repositories, latest patches applied.

We opened a case with HPE & Veeam support early last year, no solution was found. We spun up multiple virtual proxies and our hangups disappeared by utilizing these proxies. If we do not utilize virtual proxies, our Veeam server will hang (no RDP, high CPU/mem, no ping).

Experienced this issue on Windows Server 2016 as well as Windows Server 2019 1809 with ReFS repos. I'll ask Microsoft Premier for that fix.

Since opening that case last year, I did increase the amount of RAM to more than 1GB per TB of data. This entire thread seems awfully familiar; hope it works out.

Edit:

Hmmm, might just wait for the official release.
Andrew@MSFT
Technology Partner
Posts: 15
Liked: 31 times
Joined: Nov 19, 2019 5:31 pm
Full Name: Andrew Hansen
Contact:

Re: Windows 2019, large REFS and deletes

Post by Andrew@MSFT » 5 people like this post

It's here! Performance enhancements for ReFS released today in KB 4541331.

https://support.microsoft.com/en-us/hel ... -kb4541331

This update includes all the changes in the private fixes validated by customers in this thread.

If you are on WS2019, we recommend you apply KB 4541331.

Thank you all for your patience, and please let us know of your experiences!
FrancWest
Veteran
Posts: 528
Liked: 104 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Windows 2019, large REFS and deletes

Post by FrancWest »

Thanks for the update. Do we still need to apply the registry settings? Or should it work fine with the defaults?
Andrew@MSFT
Technology Partner
Posts: 15
Liked: 31 times
Joined: Nov 19, 2019 5:31 pm
Full Name: Andrew Hansen
Contact:

Re: Windows 2019, large REFS and deletes

Post by Andrew@MSFT » 1 person likes this post

Yes. The registry tweaks are still recommended.

• Ensure Trim is disabled:

    fsutil behavior set DisableDeleteNotify ReFS 1

• Set RefsEnableLargeWorkingSetTrim = 1:

    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem
    Value Name: RefsEnableLargeWorkingSetTrim
    Value Type: REG_DWORD
    Value Data: 1
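
To check that the Trim setting is actually in effect after patching, the output of `fsutil behavior query DisableDeleteNotify` can be parsed. This is a minimal sketch, assuming the command prints one line per file system in the form `<FS> DisableDeleteNotify = <n>`; the exact output can differ between Windows builds, and older builds that print no file system prefix would yield an empty result. The function name and the sample text are illustrative, not captured from a real server.

```python
import re


def parse_delete_notify(text):
    """Map file system name -> DisableDeleteNotify value from query output."""
    pattern = re.compile(r"^\s*(\w+)\s+DisableDeleteNotify\s*=\s*(\d+)", re.M)
    return {fs: int(value) for fs, value in pattern.findall(text)}


# Illustrative sample only; paste real command output here instead.
sample = """NTFS DisableDeleteNotify = 0
ReFS DisableDeleteNotify = 1"""

state = parse_delete_notify(sample)
# True when Trim is disabled for ReFS, as recommended above.
print(state.get("ReFS") == 1)
```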
PeterC
Enthusiast
Posts: 46
Liked: 12 times
Joined: Apr 10, 2018 2:24 pm
Full Name: Peter Camps
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterC »

Thank you for the info, have been waiting for this one!!!
poulpreben
Certified Trainer
Posts: 1025
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben »

Andrew, can you please confirm if the final hotfix is still on track for release by the end of this month?
Gostev
Chief Product Officer
Posts: 31805
Liked: 7299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

@poulpreben can you clarify what you mean by the "final hotfix"? KB4541331 seems pretty final to me :D
poulpreben
Certified Trainer
Posts: 1025
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben »

I need new glasses. D'oh. Sorry :-)
bbuchan
Service Provider
Posts: 10
Liked: 5 times
Joined: May 19, 2016 3:45 pm
Full Name: Bryan Buchan
Contact:

Re: Windows 2019, large REFS and deletes

Post by bbuchan » 1 person likes this post

A little backstory: we recently stepped into the ReFS world after watching for a while and waiting for 2019 to become stable. We skipped 2016 because we require Dedup AND ReFS. During my initial POC testing I saw no issues with GFS restore points or performance. Since then, we officially added an archival solution to our catalog and sold it to a customer, and also pointed about half of our Cloud Connect traffic to the extents running ReFS. Once all the retention built up and merges started running, we saw a massive performance decrease. Our dedup jobs were barely progressing, and many of our merges were taking more than 24 hours or hanging altogether.

I have been following this thread for a while and trying the fixes and not seeing much improvement but chalking that up to the fact that no one else here has mentioned they are also running dedup. In addition, once our first archival customer's jobs attempted to generate a GFS restore point we started getting an error as soon as it would start the fast clone:
Failed to merge full backup file Error: Agent: Failed to process method {Transform.CompileFIB}: The request is not supported.
Failed to duplicate extent. Target file: RelativePath:\...
I created support case 03914893 to troubleshoot the issue. We have not reached a solution yet, although we are still working on it.
On 03/18 I installed KB4541331 and immediately noticed a significant decrease in merge times. All of our local and remote merges completed overnight. Many of them still took 5 - 10 hours (@Andrew@MSFT, do you think that is because of the dedup overhead? At this point I would call this "fast-enough clone", but nowhere near the sub-5-minute times block cloning should be capable of). I almost feel like with Dedup mixed with ReFS, block cloning should be even faster; it should be virtually 100% pointer manipulation. Thoughts, anyone?
Andrew@MSFT
Technology Partner
Posts: 15
Liked: 31 times
Joined: Nov 19, 2019 5:31 pm
Full Name: Andrew Hansen
Contact:

Re: Windows 2019, large REFS and deletes

Post by Andrew@MSFT » 1 person likes this post

This is indeed dedup overhead. In short, if you try to clone a deduped file, dedup will rehydrate the file inline before forwarding the cloning API call. This can be expensive and slow, depending on the system...

Also, deduping a cloned file would increase the storage footprint, at least initially, until all the clones have been dehydrated by dedup and their chunks deduplicated.
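
The difference described above can be put into a toy model. This is only a sketch under stated assumptions, not a description of how ReFS or dedup are implemented: it assumes a plain block clone moves no file data, and that cloning from a deduped file rehydrates the whole cloned range first. The function name `clone_cost_bytes` is made up for illustration.

```python
def clone_cost_bytes(clone_bytes, source_is_deduped):
    """Toy estimate of data bytes moved when block-cloning a range.

    Assumption (not from Microsoft documentation): a plain clone only
    updates references, while a clone from a deduped file rehydrates the
    cloned range inline, so that range is written out in full first.
    """
    if source_is_deduped:
        return clone_bytes  # rehydrated before the clone call is forwarded
    return 0  # reference update only


TB = 10**12
print(clone_cost_bytes(2 * TB, source_is_deduped=False))
print(clone_cost_bytes(2 * TB, source_is_deduped=True))
```

Under these assumptions the cost of a clone from a deduped file grows with file size, which would fit merges taking hours rather than minutes.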
bbuchan
Service Provider
Posts: 10
Liked: 5 times
Joined: May 19, 2016 3:45 pm
Full Name: Bryan Buchan
Contact:

Re: Windows 2019, large REFS and deletes

Post by bbuchan » 1 person likes this post

Andrew@MSFT,

Are there any plans for performance improvements or tighter integration between ReFS and Dedup? Is it possible to ever get to the point where it is purely pointer manipulation? A cloned file essentially gives the same result as two identical (or mostly identical) files that have been fully deduped, correct?
Andrew@MSFT
Technology Partner
Posts: 15
Liked: 31 times
Joined: Nov 19, 2019 5:31 pm
Full Name: Andrew Hansen
Contact:

Re: Windows 2019, large REFS and deletes

Post by Andrew@MSFT » 1 person likes this post

Thanks for your feedback! I can't comment on future plans here unfortunately.
PeterC
Enthusiast
Posts: 46
Liked: 12 times
Joined: Apr 10, 2018 2:24 pm
Full Name: Peter Camps
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterC »

We installed KB4541331 and had high hopes that it would finally resolve some of the performance issues we are having with our Server 2019 repository server. But it looks like it is not the resolution for us. It has improved things a little, but surely not as much as we had hoped.
We are still experimenting with the task limits settings on this Apollo 4200, but at the moment it still looks like when a few merges start during the regular backups the performance decreases dramatically.

The throughput of network traffic is constant during the backups, but when merges start the traffic gets very irregular and drops to 0 randomly. Depending on the number of jobs we have seen throughput as much as 19 Gbps(!) but as soon as the merges start it will drop to Kbps and never spike above 200 Mbps anymore. For longer periods it will even be 0 Kbps. When merges are done it will get better again.

We still have some registry settings in place:

ReFS DisableDeleteNotify = 1
RefsDisableDeleteNotification = 1
RefsDisableLastAccessUpdate = 1
RefsEnableLargeWorkingSetTrim = 1
RefsNumberOfChunksToTrim = 128

Can someone confirm if we have to change or remove some of these settings after the last update?
Any suggestion would be very much appreciated!
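
One way to approach the question above is to separate the two settings recommended by Andrew in this thread from everything else that is configured. This is a minimal sketch; the names `RECOMMENDED` and `review_settings` are hypothetical, the dictionary keys are just labels taken from the list above, and "extra" only means "not among the two recommended settings", not "known to be harmful".

```python
# The two settings recommended in this thread, with their expected values.
RECOMMENDED = {
    "DisableDeleteNotify (ReFS)": 1,
    "RefsEnableLargeWorkingSetTrim": 1,
}


def review_settings(current):
    """Split settings into ok, mismatch, missing, and extra."""
    report = {"ok": [], "mismatch": [], "missing": [], "extra": []}
    for name, wanted in RECOMMENDED.items():
        if name not in current:
            report["missing"].append(name)
        elif current[name] == wanted:
            report["ok"].append(name)
        else:
            report["mismatch"].append(name)
    # Anything configured beyond the recommended pair is a candidate for review.
    report["extra"] = sorted(set(current) - set(RECOMMENDED))
    return report


current = {
    "DisableDeleteNotify (ReFS)": 1,
    "RefsDisableDeleteNotification": 1,
    "RefsDisableLastAccessUpdate": 1,
    "RefsEnableLargeWorkingSetTrim": 1,
    "RefsNumberOfChunksToTrim": 128,
}
print(review_settings(current)["extra"])
```

For the values listed above, the three settings other than the recommended pair end up under "extra"; whether to revert them is a decision for testing or a support case.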
nmdange
Veteran
Posts: 528
Liked: 144 times
Joined: Aug 20, 2015 9:30 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by nmdange » 1 person likes this post

I just applied the update and can report that merge times are significantly improved now! It did seem like the first merge after the update was still slow for a couple of jobs, but right now the latest merge on all jobs was fast.
PeterC wrote: The throughput of network traffic is constant during the backups, but when merges start the traffic gets very irregular and drops to 0 randomly. Depending on the number of jobs we have seen throughput as much as 19 Gbps(!) but as soon as the merges start it will drop to Kbps and never spike above 200 Mbps anymore
Merge is only working locally on the repository, there is no data to transfer to a different server, so network traffic isn't really a good indication of there being a problem or not. I only have the two changes Andrew mentioned, not the other ones. Might be worth removing them to see if that helps.
PeterC
Enthusiast
Posts: 46
Liked: 12 times
Joined: Apr 10, 2018 2:24 pm
Full Name: Peter Camps
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterC »

PeterC wrote: Mar 20, 2020 7:42 am The throughput of network traffic is constant during the backups, but when merges start the traffic gets very irregular and drops to 0 randomly. Depending on the number of jobs we have seen throughput as much as 19 Gbps(!) but as soon as the merges start it will drop to Kbps and never spike above 200 Mbps anymore. For longer periods it will even be 0 Kbps. When merges are done it will get better again.
I mean the throughput of the other running backup jobs that are still transferring data to the repository at the same time as other jobs start merging.

We also notice that during the backups/merges the ReFS volume is very slow in Windows Explorer; deleting files or (empty) folders takes several minutes. I will see if I can remove the other registry settings, but I would be surprised if this changes much.
poulpreben
Certified Trainer
Posts: 1025
Liked: 448 times
Joined: Jul 23, 2012 8:16 am
Full Name: Preben Berg
Contact:

Re: Windows 2019, large REFS and deletes

Post by poulpreben » 3 people like this post

One of the most impressive results we have seen after applying the hotfix:
  • 60 VM job with only SQL Servers
  • 104 TB source data
  • 60 TB VBKs
  • Synthetic full with fast clone: 1 hour 8 minutes.
Prior to applying the hotfix, this job would never finish.
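
To put that result in perspective, here is a rough calculation of the rate it would imply if the data were actually copied. It is a sketch only: it assumes the synthetic full references roughly the full 60 TB of VBK data and uses decimal units; the function name is made up for illustration.

```python
def effective_rate_gb_per_s(size_tb, hours, minutes):
    """Rate implied if size_tb were physically copied in the given time."""
    seconds = hours * 3600 + minutes * 60
    return size_tb * 1000 / seconds  # decimal units: 1 TB = 1000 GB


rate = effective_rate_gb_per_s(60, 1, 8)
print(f"{rate:.1f} GB/s")  # prints "14.7 GB/s"
```

A rate in that range is far beyond what a typical repository could sustain as real I/O, which fits fast clone being mostly metadata work rather than data movement.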
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer » 1 person likes this post

The difference in synthetic backup hours on our weekend synthetics between 1903 and 1909 is also nice:

1903: 10:25:35
1909: 3:49:47
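
The two durations above can be turned into a speedup figure with a few lines of Python. This is just arithmetic on the numbers quoted in this post; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(hms):
    """Convert an h:mm:ss string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


before = to_seconds("10:25:35")  # 1903
after = to_seconds("3:49:47")    # 1909
print(f"{before / after:.2f}x faster, {100 * (1 - after / before):.0f}% less time")
# prints "2.72x faster, 63% less time"
```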
JeremiahDDS
Service Provider
Posts: 31
Liked: 7 times
Joined: Mar 28, 2019 12:52 am
Full Name: Jeremiah Glover
Contact:

Re: Windows 2019, large REFS and deletes

Post by JeremiahDDS » 4 people like this post

I can confirm that the hotfix fixes major issues with ReFS on Windows Server 2019 1809. I was generally sitting at 50+ jobs in progress (because they were taking so long to merge), 128GB+ of RAM, and a high disk queue. After the hotfix and a reboot, with no other changes, RAM is generally at 64GB, the disk queue is low, and there are generally 10 or fewer jobs running.
PeterC
Enthusiast
Posts: 46
Liked: 12 times
Joined: Apr 10, 2018 2:24 pm
Full Name: Peter Camps
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterC » 3 people like this post

We finally got to the sweet spot of the best setting for the task limit on our Server 2019 repository. Now we can also see that the hotfix makes a huge difference.

Normally we would have around 40 jobs still running at 09:00 am. Around 14:00 they would be finished. Since the hotfix and correct task limit setting the jobs are all done at 05:30 - 06:00 am.

For now we still have the extra registry settings in place; first we want to be sure this better performance continues. After that maybe we can revert some settings, but for now we are very happy to have the backup window back to what it used to be.