EvoGeek
Service Provider
Posts: 8
Liked: never
Joined: Jun 26, 2019 5:55 pm

Re: Windows 2019, large REFS and deletes

Post by EvoGeek »

@rhys.hammond Are things still going well with a few more days under your belt on the August patch for Server 2019?

I'm ready to install the OS for our next repo and I'm debating between 1903 and Server 2019. If they've fixed the issue in Server 2019, we'll go that route, as we'd prefer to stay on the LTSC for our repos. However, the ReFS fix is more important, so we're willing to go with 1903 if we have to.
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am

Post by mkretzer » 3 people like this post

I heard back from Microsoft.

I asked specifically about the difference between 2019 and 1903.
2019 received backports with the last patches, but these backports are still far from the ReFS stability and quality of 1903.

We did some tests and attached a 600 TB volume filled with 1.2 PB of block-cloned backups to a 2019 repo server and then to a 1903 server. Then we deleted about 200 TB of backups.

2019 was able to free up about 4 TB in 3 hours. Backups running at the same time were still severely impacted in performance, but did not stall completely.
Then we attached this filesystem to our 1903 server (while the deletion in the filesystem was still going on). In the same time (3 hours), 1903 was able to free up nearly 60 TB! And even though Veeam still showed "target" as the bottleneck, write speed was 4-6 times faster than under 2019 at that time!
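
To put those two figures side by side, here is a rough Python sketch of the arithmetic. It assumes the free-up rate stays constant for the whole delete, which the test above does not confirm; the function name and the constant-rate model are only for illustration.

```python
def hours_to_free(total_tb: float, freed_tb: float, elapsed_hours: float) -> float:
    """Estimate hours needed to free total_tb, assuming the observed rate holds."""
    rate_tb_per_hour = freed_tb / elapsed_hours
    return total_tb / rate_tb_per_hour


DELETED_TB = 200  # amount of backups deleted in the test

# Server 2019: about 4 TB freed in 3 hours
print(f"2019: {hours_to_free(DELETED_TB, 4, 3):.0f} h")
# 1903: nearly 60 TB freed in 3 hours
print(f"1903: {hours_to_free(DELETED_TB, 60, 3):.0f} h")
```

Under that assumption, the 200 TB delete would take roughly 150 hours on 2019 versus roughly 10 hours on 1903.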
rhys.hammond
Veeam Software
Posts: 75
Liked: 16 times
Joined: Apr 07, 2013 10:36 pm
Full Name: Rhys Hammond
Location: Brisbane , Australia

Post by rhys.hammond »

Hey Evo,

Cherry-picking some of the jobs we have running,

Backup job protecting 2 VMs, total consumed VM size 30.9 TB: it took 3 hours to create a synth full (fast clone) with the latest August patches on 1809.
Compare this to our previous runs on unpatched 1809, where the same job only progressed to 85% (partial fast clone) after 62 hours before we cancelled it.
So definitely a marked improvement for this customer. To confirm, the backup target was a SOBR with 2 x 145 TB performance extents.

However, we are still seeing some strange performance on the offsite SOBR, which consists of 1 x 200 TB performance extent.
Both times below were recorded with the latest August patches installed on 1809. We saw 5hrs to create a GFS synth full on a BCJ.
Compare this to the 25hrs we witnessed just this past weekend. The only difference was reinstalling the SCCM client.

Could just be a coincidence but we'll be uninstalling the SCCM client again for this weekend run to see if that brings back our 5hr GFS synth full creation times again.
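
For a sense of scale on the first job, here is a small Python sketch that extrapolates the cancelled run to 100%. It assumes progress was linear over the 62 hours, which is a simplification and not something measured in this post; the function name is made up for the example.

```python
def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation of total run time from partial progress."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


unpatched = projected_total_hours(62, 0.85)  # cancelled at 85% after 62 hours
patched = 3.0                                # completed synth full on patched 1809

print(f"unpatched (projected): {unpatched:.1f} h")
print(f"speedup: about {unpatched / patched:.0f}x")
```

With these inputs the unpatched run projects to about 73 hours, so the patched run is on the order of 24 times faster for this job.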
Veeam Certified Architect | Author of http://rhyshammond.com | Veeam Vanguard | vExpert
rhys.hammond
Veeam Software
Posts: 75
Liked: 16 times
Joined: Apr 07, 2013 10:36 pm
Full Name: Rhys Hammond
Location: Brisbane , Australia

Post by rhys.hammond »

Thanks for the update, mkretzer. Glad to see data isn't lost if we do end up wiping and moving from 1809 to 1903.
Veeam Certified Architect | Author of http://rhyshammond.com | Veeam Vanguard | vExpert
EvoGeek
Service Provider
Posts: 8
Liked: never
Joined: Jun 26, 2019 5:55 pm

Post by EvoGeek »

Thanks to both of you for your updates. Very helpful.
Steve-nIP
Service Provider
Posts: 129
Liked: 59 times
Joined: Feb 06, 2018 10:08 am
Full Name: Steve

Post by Steve-nIP » 1 person likes this post

mkretzer wrote: Aug 31, 2019 6:27 am I heard back from Microsoft.

I asked specifically about the difference between 2019 and 1903.
2019 received backports with the last patches, but these backports are still far from the ReFS stability and quality of 1903.

We did some tests and attached a 600 TB volume filled with 1.2 PB of block-cloned backups to a 2019 repo server and then to a 1903 server. Then we deleted about 200 TB of backups.

2019 was able to free up about 4 TB in 3 hours. Backups running at the same time were still severely impacted in performance, but did not stall completely.
Then we attached this filesystem to our 1903 server (while the deletion in the filesystem was still going on). In the same time (3 hours), 1903 was able to free up nearly 60 TB! And even though Veeam still showed "target" as the bottleneck, write speed was 4-6 times faster than under 2019 at that time!

This is kind of absurd. Microsoft really needs to get its act together: most people run the long-term support channel versions of server OSes, so these fixes need to come to 2019 in full.
b.vanhaastrecht
Service Provider
Posts: 880
Liked: 164 times
Joined: Aug 26, 2013 7:46 am
Full Name: Bastiaan van Haastrecht
Location: The Netherlands

Post by b.vanhaastrecht »

I agree 100%. Can't Veeam put some pressure on the ReFS team to backport it fully to 2019?
======================================================
Veeam ProPartner, Service Provider and a proud Veeam Legend
ejenner
Veteran
Posts: 636
Liked: 100 times
Joined: Mar 23, 2018 4:43 pm
Full Name: EJ
Location: London

Post by ejenner »

This article gives some interesting info on ways to tune ReFS.

http://www.checkyourlogs.net/?p=62683
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am

Post by mkretzer »

... this only helps with dedup. We have our problems even without dedup.
But I wonder: since with dedup files are not really deleted until GC is running, would these delete issues not happen the same way as without dedup?
ejenner
Veteran
Posts: 636
Liked: 100 times
Joined: Mar 23, 2018 4:43 pm
Full Name: EJ
Location: London

Post by ejenner »

Apologies, wasn't aware of that.

Good point. If you had deduplication enabled, you might be able to tune the deletion process as described in the blog. You could even schedule it for a specific time to check whether it is the filesystem that is causing the problems.

After searching the same blog I found another post. There could be something interesting there. He describes using PowerShell to change processor and memory usage.

http://www.checkyourlogs.net/?p=57793
dasfliege
Service Provider
Posts: 276
Liked: 61 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland

Post by dasfliege » 1 person likes this post

We ran into the same problem on our 200 TB ReFS repo recently, after we did a fresh installation of Server 2019 (1809) and attached our existing 2016 Storage Spaces volume to it.
Today I found out that even though the Windows Update history showed KB4511553 (the August cumulative update) as installed successfully, it was still offered for download and installation when I searched for updates online. Since I had already opened a case with Microsoft Premier Support, we also checked the ReFS driver version. We had version 10.0.17763.379 while ReFS was performing slowly. After I installed KB4511553 again, the ReFS driver was updated to 10.0.17763.652, and the issues were gone!

It seems the cumulative update KB4511553 wasn't installed completely during the first run, or something in it changed between August 21 and today. I've never seen that behavior before, so I thought it might be worth sharing with you.
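
If you want to check several repo servers against the driver version mentioned above, a minimal Python sketch of the comparison could look like this. It only compares version strings you have already read off refs.sys (for example from the file's Details tab); the helper names are made up for the example, and using .652 as the threshold is just taken from this post, not from any Microsoft guidance.

```python
def parse_version(text: str) -> tuple:
    """Turn '10.0.17763.652' into (10, 0, 17763, 652) for numeric comparison."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least(installed: str, required: str) -> bool:
    """True if the installed version is the same as or newer than the required one."""
    return parse_version(installed) >= parse_version(required)


PATCHED = "10.0.17763.652"  # refs.sys version seen after KB4511553 applied fully

print(is_at_least("10.0.17763.379", PATCHED))  # version seen while ReFS was slow
print(is_at_least("10.0.17763.652", PATCHED))
```

Comparing integer tuples instead of raw strings avoids the trap where "10.0.17763.1000" would sort before "10.0.17763.652" as text.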
ejenner
Veteran
Posts: 636
Liked: 100 times
Joined: Mar 23, 2018 4:43 pm
Full Name: EJ
Location: London

Post by ejenner »

Not sure if it's of any use to anyone but our ReFS driver versions were updated with a Windows Cumulative Update at the beginning of the month. I just checked out of curiosity to see which version we were on currently and found they had been updated. We're on 2016 though. Current version on our systems is 10.0.14393.3179
dasfliege
Service Provider
Posts: 276
Liked: 61 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland

Post by dasfliege »

I wanted to give an update on the situation I described above. I thought the problems were fixed by updating the ReFS driver to the latest version, but as soon as parallel jobs run against our repo (e.g. a few backup jobs and merges), the performance drops drastically. If I compare the disk activity in Resource Monitor between a 2019 and a 2016 server with the exact same hardware, I see drops to 0 disk activity on the 2019 server almost every second. It seems it cannot write to disk in a constant stream. I'm still working the case with MS support.
rhys.hammond
Veeam Software
Posts: 75
Liked: 16 times
Joined: Apr 07, 2013 10:36 pm
Full Name: Rhys Hammond
Location: Brisbane , Australia

Post by rhys.hammond »

Quick update from me: our performance has nose-dived off a cliff. I'm currently watching the .vbk file size increase at a rate of 3-4 MB/s.
For the past 3 weekends we've seen "merging oldest restore points into full backup file [GFS]" take multiple days instead of the usual hours.
Currently running Refs.sys v10.0.17763.719 (still on 2019 build 1809).
Veeam Certified Architect | Author of http://rhyshammond.com | Veeam Vanguard | vExpert
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am

Post by mkretzer » 1 person likes this post

Again, I can only recommend upgrading to 1903. It really feels like NTFS performance with all the ReFS benefits! I did not believe it myself at first.

Don't let the old-looking refs.sys in 1903 fool you: it is far superior to the one in 2019!
rhys.hammond
Veeam Software
Posts: 75
Liked: 16 times
Joined: Apr 07, 2013 10:36 pm
Full Name: Rhys Hammond
Location: Brisbane , Australia

Post by rhys.hammond »

So this customer has over 720TB worth of backup data on the ReFS volume, consuming only 155TB of space.
They've just started offloading the GFS fulls to the cloud tier. RPS has estimated that after all eligible backup files are moved, the total backup size should drop from 720TB to around 110TB.
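
As a quick illustration of what those numbers imply, here is a small Python sketch of the ratio between logical backup size and space actually consumed. It treats the whole difference as space savings on the volume without separating fast clone from anything else, and the function name is made up for the example.

```python
def space_savings(logical_tb: float, physical_tb: float) -> tuple:
    """Return (ratio, saved_fraction) for logical data vs. space consumed."""
    ratio = logical_tb / physical_tb
    saved_fraction = 1 - physical_tb / logical_tb
    return ratio, saved_fraction


ratio, saved = space_savings(720, 155)  # 720TB of backups consuming 155TB
print(f"ratio: {ratio:.2f}:1")
print(f"saved: {saved * 100:.1f}%")
```

With 720TB of backups on 155TB of disk, that works out to roughly 4.65:1, or about 78.5% saved.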

This ReFS volume was never designed to store anything more than 110TB but it's taken a long time to get the upgraded internet links ready.

Gostev mentioned in his weekly digest "this issue is not something you should worry about unless your ReFS volumes are north of 100TB. At least this seems to be the threshold where the impact becomes noticeable enough for some customers to open support cases and/or post on forums."

So we're waiting to see how the performance goes after shrinking back to 110TB.

I'll report back after the offload is complete; it will take a few weeks.
Veeam Certified Architect | Author of http://rhyshammond.com | Veeam Vanguard | vExpert
Seve CH
Enthusiast
Posts: 89
Liked: 35 times
Joined: May 09, 2016 2:34 pm
Full Name: JM Severino
Location: Switzerland

Post by Seve CH »

rhys.hammond wrote: Sep 02, 2019 1:00 am Compare this to the 25hrs we witnessed just this past weekend. The only difference was reinstalling the SCCM client.

Could just be a coincidence but we'll be uninstalling the SCCM client again for this weekend run to see if that brings back our 5hr GFS synth full creation times again.

Hi Rhys.

We had trouble with ReFS, W2016 and the SCCM client a long time ago. All problems went away as soon as we uninstalled the SCCM agent. That led to the SCCM comment in the KB known issues:
https://www.veeam.com/kb2792

I think we will stay with W2016 for a while, even if patching is painfully slow. We use 2x 250TB ReFS repos.

It would be great if they backported the ReFS driver from 1903, but MS didn't do so for the servicing stack (Windows Update) on W2016 / W10 LTSB, so I do not expect them to backport the exact ReFS driver to W2019. That's the point of LTSB/LTSC: no major changes, only bugfixes.

Thank you all for keeping us informed :-)

Regards
Seve.
dasfliege
Service Provider
Posts: 276
Liked: 61 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland

Post by dasfliege » 2 people like this post

Another update from Microsoft Premier Support. Our case has finally been recognized as a general problem with ReFS on Server 2019 :-)

Quote:
We are now observing similar other issues where ReFS block cloning functionality is resulting degraded performance on Windows Server 2019. I am checking with Product Group as it seems there are aware and working on it. We just need to validate if we are hitting same issue. Parallelly we are getting Xperf reviewed which might give us better insight.
EvoGeek
Service Provider
Posts: 8
Liked: never
Joined: Jun 26, 2019 5:55 pm

Post by EvoGeek »

That's great news!
fsr
Enthusiast
Posts: 30
Liked: 1 time
Joined: Mar 27, 2019 5:28 pm
Full Name: Fernando Rapetti

Post by fsr »

And just in case you're thinking about using 2019 as a Hyper-V host: don't. Take a look at this: https://social.technet.microsoft.com/Fo ... rverhyperv
dasfliege
Service Provider
Posts: 276
Liked: 61 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland

Post by dasfliege »

The latest hint I've got from Microsoft:
Check whether TRIM is enabled by running the command: "fsutil behavior query DisableDeleteNotify"

If DisableDeleteNotify is disabled (0) for ReFS on the 2019 server, enable it with the following commands:
"fsutil behavior set DisableDeleteNotify 1"
"fsutil behavior set DisableDeleteNotify ReFS 1"

It didn't really help in my scenario. Is there someone else who can verify?
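
For anyone who wants to check this across several servers, here is a minimal Python sketch that parses the output of the query command. The sample output text is an assumption about what fsutil prints on recent builds (one line per filesystem), so adjust the pattern if your server prints something different; the function name is made up for the example. Remember the setting is inverted: DisableDeleteNotify = 1 means delete notifications (TRIM) are switched off.

```python
import re

# One line per filesystem, e.g. "ReFS DisableDeleteNotify = 0  (Disabled)".
# The exact wording after the number is assumed, so it is not matched.
_LINE = re.compile(r"^\s*(NTFS|ReFS)\s+DisableDeleteNotify\s*=\s*([01])", re.MULTILINE)


def parse_delete_notify(output: str) -> dict:
    """Map filesystem name to its DisableDeleteNotify value (0 or 1)."""
    return {fs: int(value) for fs, value in _LINE.findall(output)}


# Assumed sample of `fsutil behavior query DisableDeleteNotify` output.
SAMPLE_OUTPUT = """\
NTFS DisableDeleteNotify = 0  (Disabled)
ReFS DisableDeleteNotify = 0  (Disabled)
"""

settings = parse_delete_notify(SAMPLE_OUTPUT)
for fs, value in settings.items():
    state = "not sent (TRIM off)" if value == 1 else "sent (TRIM on)"
    print(f"{fs}: delete notifications {state}")
```

On a real server you would feed the function the captured output of the command instead of the sample string.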
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am

Post by mkretzer »

For us DisableDeleteNotify=1 helped limit the backend storage load, but the ReFS issues only went away with 1903.
For us ReFS is truly usable only with 1903. And I just heard from Microsoft that it will get even better (faster fast clones, for example) with 1909.

And from what I understood, MS will never solve all the issues in 2019, as they only fix critical issues there.

There is just no way back once you try 1903 (well, technically there is, as the filesystem structure does not change). The difference is just so huge!
mwant
Enthusiast
Posts: 29
Liked: 1 time
Joined: Oct 04, 2011 10:33 am
Full Name: m want

Post by mwant »

Can anyone from Veeam confirm the 1903 information above from mkretzer? I am planning a new 2019 B&R server and I'm not sure I have access to 1903, as we don't have Software Assurance. If it is true, is it better to go back to 2016?
DonZoomik
Service Provider
Posts: 372
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere

Post by DonZoomik »

I'm marking this thread as well.
I'm planning a few ~200-300 TB repositories with 128 GB RAM, but as they will host the Veeam backup service as well, I can't use 1903.
Gostev
Chief Product Officer
Posts: 31806
Liked: 7300 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Post by Gostev » 1 person likes this post

@mwant confirming the 1903 information, although our knowledge is coming from the same Microsoft support case :D
Going back to 2016 is not necessary at this time, as fully patched 2019 is comparable. Both are worse than 1903 on large volumes.
nmdange
Veteran
Posts: 528
Liked: 144 times
Joined: Aug 20, 2015 9:30 pm

Post by nmdange »

DonZoomik wrote: Oct 07, 2019 12:39 pm I'm marking this thread as well.
I'm planning a few ~200-300 TB repositories with 128 GB RAM, but as they will host the Veeam backup service as well, I can't use 1903.

If you are going to run a standalone server with local storage for both Veeam itself and the backup repository, I would suggest running Hyper-V on the physical host and then running multiple VMs: one for Veeam, one for the repository and possibly a third for the Veeam SQL database. This is the way I run it and it has several benefits: it prevents ReFS on the backup repository from taking up all the memory on the server, allows the repository to be moved to new hardware without losing fast clone, allows the repository to run on a different Windows release than Veeam, allows deduplication to run at the host level without interfering with ReFS block clone inside the repository VM, and allows easy backup/replication of the Veeam service VM.

I'd also suggest increasing the RAM to 256 GB to give you plenty of room.
DonZoomik
Service Provider
Posts: 372
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere

Post by DonZoomik »

Virtualizing the repository? I can kind of see the benefits, but virtualizing a 250 TB VM seems crazy (it's one of the things I wouldn't virtualize), especially when you have to add 62 4 TB VHDX files spread over several 64 TB underlying file systems for deduplication to work... Migrating it to new hardware is possible, but with deduplication it'd take maybe... a month (while rehydrating any savings)? IMHO it's a huge complexity increase and not worth the benefits.
I've been thinking of keeping Veeam in a VM, but then I'd rather keep it somewhere else. RAM is cheap and can probably be scavenged from other systems as needed.
The project is maybe a month or two out from any real steps, so I'll keep an eye on this thread until then.
dasfliege
Service Provider
Posts: 276
Liked: 61 times
Joined: Nov 17, 2014 1:48 pm
Full Name: Florin
Location: Switzerland

Post by dasfliege »

So my case has been escalated to the ReFS development team. I'm still being asked to deliver extensive xperf logs, memory dumps, etc. almost daily. But maybe, at some stage, that will lead to a solution. I also have a case open with Veeam support, which is on hold right now. It actually seems like Veeam isn't really interested in having this issue solved, even though 1903 isn't the solution for everyone. I would be glad to receive some more help from Veeam, as it is quite a pain to reproduce the problem by creating the right amount of load on the repository in order to collect usable dumps for Microsoft to analyze. I guess Veeam should have some tools at hand for that kind of thing.
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am

Post by mkretzer »

In defense of Veeam: they put us in direct contact with the ReFS developers even though the ReFS issues are basically an OS issue. In a month or so they helped us solve an issue that I had been trying to solve with Microsoft support for 2 years across many support cases.

Veeam depends on Microsoft to solve this just as we customers do... Microsoft's policy on fixing issues like this in long-term support versions is not very customer-friendly... But at least there is a solution you can buy for money...
rhys.hammond
Veeam Software
Posts: 75
Liked: 16 times
Joined: Apr 07, 2013 10:36 pm
Full Name: Rhys Hammond
Location: Brisbane , Australia

Post by rhys.hammond » 2 people like this post

Thought it would be good to share this here:
issues on Windows Server 2019 (with ReFS) reported by another forum member; post343367.html#p343367

"bout 2 weeks in one of the physicals servers Hung…..not blue screened just hung
No rdp connectivity and from the ILO no mouse or keyboard movement, NO response at all.
Powered off and back on……backups continued where left off."

Seems like Windows Defender was the main culprit in their case,
"there has been talk that the cause seems to be Windows Av and REFS related (we use REFS Volumes)."
Veeam Certified Architect | Author of http://rhyshammond.com | Veeam Vanguard | vExpert