veeeammeupscotty
Enthusiast
Posts: 33
Liked: 2 times
Joined: May 05, 2017 3:06 pm
Full Name: JP

Re: REFS issues (server lockups, high CPU, high RAM)

Post by veeeammeupscotty » 2 people like this post

The changelog makes it seem so trivial. You know, just doing a bit more thorough unmapping and removing some idle containers, no big deal.
- Improves ReFS performance by more thoroughly unmapping multiple views of a file. See KB4090104 for additional tunable registry parameters to address large ReFS metadata streams.

- Improves ReFS performance by removing idle containers from its hash table.
If anybody switches from the beta refs driver to this version, can you let me know how it goes? I'll do the same if I get to it first.
Mike Resseler
Product Manager
Posts: 8042
Liked: 1262 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium

Post by Mike Resseler »

I'm surely hoping our bumpy ride now comes to an end. For all of you who are patching your servers, please let us know the results. First post seems good, let's hope more come :-)
BartP
Veeam Software
Posts: 230
Liked: 62 times
Joined: Aug 31, 2015 8:24 am
Full Name: Bart Pellegrino
Location: Netherlands

Post by BartP » 1 person likes this post

Installed the update: so far it seems it's a lot better.
Disk latency went down by a HUGE amount already, let's wait for the merges this evening.

PS: couldn't help myself and did a backup; merges are fast and latency is low. CPU is still quite busy, but RAM usage is less than half of what it was.
Bart Pellegrino,
Technical Account Manager - EMEA
soehl
Enthusiast
Posts: 57
Liked: 8 times
Joined: May 09, 2011 12:43 pm
Full Name: Sebastian
Location: Germany

Post by soehl » 2 people like this post

Looking very good so far, no outages, fast speed for merging and general backup. :-)
Slingfox
Service Provider
Posts: 5
Liked: 1 time
Joined: Dec 06, 2017 7:02 pm
Full Name: Chris Imrie
Location: United Kingdom

Post by Slingfox » 1 person likes this post

Before the patch a fast clone was taking around 1 hour 43 minutes (from last night's job summary). Applied the patch and the same fast clone today took 7 minutes 5 seconds.

Quite a significant improvement as far as we're concerned; although I'll need to monitor this over the next week to ensure it stays consistent.
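
To put a number on that improvement, here is a quick Python sketch of the arithmetic. It uses only the two durations quoted above; the helper names (`to_seconds`, `speedup`) are made up for illustration.

```python
def to_seconds(hours=0, minutes=0, seconds=0):
    """Convert a duration given in hours/minutes/seconds to total seconds."""
    return hours * 3600 + minutes * 60 + seconds


def speedup(before_s, after_s):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_s / after_s


before = to_seconds(hours=1, minutes=43)   # pre-patch fast clone
after = to_seconds(minutes=7, seconds=5)   # post-patch fast clone

print(f"{speedup(before, after):.1f}x faster")
print(f"{100 * (1 - after / before):.0f}% less time")
```

That works out to roughly 14.5x faster for this one job. It is a single data point, so it says nothing yet about whether the speed holds across later runs.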
Mike Resseler
Product Manager
Posts: 8042
Liked: 1262 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium

Post by Mike Resseler »

At this moment I like what I am reading (which is great since it is Friday evening ;-))

Keep it coming. For those who have already responded, it would be great if you could post your findings sometime next week, once a few runs have been done.

And I'm going to start the weekend (in a few hours) with a big smile ;-)
joespirit88
Enthusiast
Posts: 39
Liked: 5 times
Joined: Jul 04, 2017 12:53 pm
Full Name: Joe Spirit

Post by joespirit88 »

So far we have definitely seen an improvement since the update; however, we have also seen better performance in the past after rebooting the repo server.

Will update after the weekend's full runs.

What are people's thoughts on the registry key options noted in the KB? Does Veeam have any recommendations after your discussions with clients on the beta?
SBarrett847
Service Provider
Posts: 315
Liked: 41 times
Joined: Feb 02, 2016 5:02 pm
Full Name: Stephen Barrett

Post by SBarrett847 »

Applied it to my Cloud Connect Repo Server - initial test runs are looking good - the real test will be all the GFS runs over the weekend though...
rvvliet78
Novice
Posts: 3
Liked: 1 time
Joined: Apr 26, 2017 9:43 am
Full Name: Rick van Vliet

Post by rvvliet78 »

I plan to update this sometime next week and I'll post my findings.
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

jayscarff wrote: Same version here! Testing some backup jobs currently; copy jobs kick off in the morning, which will hit the repository with multiple GFS at the same time!

First job finished: pre-patch the fast clone took 2 hours, post-patch 25 minutes. :D
See if you get the same speed results when the job runs again WITHOUT rebooting. In the past the first job was fast, then every subsequent job was slow.
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

I'm afraid to take the leap from beta 2 to a production driver! Oh, the sleepless nights!
Gostev
Chief Product Officer
Posts: 31429
Liked: 6633 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Post by Gostev »

veeeammeupscotty wrote: The changelog makes it seem so trivial. You know, just doing a bit more thorough unmapping and removing some idle containers, no big deal.
I too really enjoyed the way they put it. It's really ingenious because I could physically feel discomfort in my brain while reading this :|

Cognitive dissonance
In the field of psychology, cognitive dissonance is the mental discomfort experienced by a person who simultaneously holds two or more contradictory beliefs, ideas, or values.
Gostev
Chief Product Officer
Posts: 31429
Liked: 6633 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Post by Gostev » 1 person likes this post

joespirit88 wrote: What are people's thoughts on the registry key options noted in the KB? Does Veeam have any recommendations after your discussions with clients on the beta?
I asked Microsoft folks the same question a few weeks ago. These values are for corner cases and normally should not be used. Just installing the patch should be sufficient in most cases.
joespirit88
Enthusiast
Posts: 39
Liked: 5 times
Joined: Jul 04, 2017 12:53 pm
Full Name: Joe Spirit

Post by joespirit88 »

Thanks for confirming Gostev!
operations
Service Provider
Posts: 12
Liked: never
Joined: Nov 25, 2017 6:49 pm
Full Name: operations

Post by operations »

It would also be nice to know the repo size and largest VM size for those who upgraded.
soehl
Enthusiast
Posts: 57
Liked: 8 times
Joined: May 09, 2011 12:43 pm
Full Name: Sebastian
Location: Germany

Post by soehl »

Our repos range from around 50TB to around 100TB, all based on RAID 60 with 2 stripes. The newest repos have an SSD read/write cache (HPE SmartCache).
The largest VMs that we have in Veeam are around 14TB.
jayscarff
Service Provider
Posts: 114
Liked: 12 times
Joined: Nov 15, 2016 6:56 pm
Location: Cayman Islands

Post by jayscarff »

operations wrote: It would also be nice to know the repo size and largest VM size for those who upgraded.
If there is a recommended LUN size for Veeam, that would be great to know. I have a 400TB LUN for my ReFS backup volume. MS sizing limits are listed here:
https://docs.microsoft.com/en-us/window ... s-overview
Jason
VMCE
Gostev
Chief Product Officer
Posts: 31429
Liked: 6633 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Post by Gostev »

Before the patch, you really wanted to avoid large LUNs like the one you mentioned. The patch changes the game, but no one will tell you the new recommendations yet, as it has only just been released. I do know that one of the Veeam users that Microsoft was working with very closely had a 400TB ReFS volume.
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm

Post by DaveWatkins » 1 person likes this post

We're running 3 x ~62TB LUNs. We specifically kept them under 64TB because of the various MS/Windows things that have issues above that size. The patch seems to have helped fairly dramatically with latency on the LUNs. Before the patch we had set various registry entries, which got us stable (if not really very fast); I've removed them after applying the patch.

Merge times also seem improved using Fast/Block Clone.
billcouper
Service Provider
Posts: 150
Liked: 30 times
Joined: Dec 18, 2017 8:58 am
Full Name: Bill Couper

Post by billcouper »

I have only had one repo server lock up since the February patch and I feel it was unrelated to REFS anyway.

Fast clones were always fast and are still fast, but overall the performance of the volumes feels lower than before the patch. Read/write speeds feel slow. Moving data from one volume to another is painfully slow. Perhaps MS rate-limited them to avoid locking up servers? I will see if I still have any detailed performance data from before the update and compare.
kubimike
Veteran
Posts: 373
Liked: 41 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO

Post by kubimike »

I noticed in Gostev's Sunday email that it says to uninstall the pre-release driver. I assume the process would be to stop all Veeam services, boot to recovery mode, move out the pre-release driver, and copy back... which version of the driver? Or does that not matter? Then reboot and patch?
anton
Novice
Posts: 7
Liked: 1 time
Joined: Oct 04, 2011 7:22 am
Full Name: Anton van der Linden

Post by anton » 1 person likes this post

Also a major improvement in our environment.
Merges went down from 1 hour to 15 minutes, including in the second run.

Before the patch we had the following registry settings on all repositories (2 backup repositories (210TB + 90TB), 1 copy repository (112TB)):
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsEnableLargeWorkingSetTrim = 1
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsNumberOfChunksToTrim = 32
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsDisableCachedPins = 1
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsProcessedDeleteQueueEntryCountThreshold = 512

I noticed that this morning I only saw the first setting:
HKLM\SYSTEM\CurrentControlSet\Control\FileSystem\RefsEnableLargeWorkingSetTrim = 1

The others were gone.
I also removed this setting this morning; will let you know tomorrow if this influenced the performance.
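
For anyone who wants to check which of these values are still present on a repository, here is a minimal Python sketch that only prints `reg.exe` commands for review; it does not run them. The value names and data come from the list above. The helper names (`query_command`, `delete_command`) are hypothetical, and using `reg query` / `reg delete` is an assumption about how you would check and remove them, not something prescribed by the KB.

```python
# Key path and value names exactly as listed in the post above.
FILESYSTEM_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\FileSystem"
REFS_VALUES = {
    "RefsEnableLargeWorkingSetTrim": 1,
    "RefsNumberOfChunksToTrim": 32,
    "RefsDisableCachedPins": 1,
    "RefsProcessedDeleteQueueEntryCountThreshold": 512,
}


def query_command(value_name, key=FILESYSTEM_KEY):
    """Build a read-only reg.exe command that shows one value if it exists."""
    return f'reg query "{key}" /v {value_name}'


def delete_command(value_name, key=FILESYSTEM_KEY):
    """Build a reg.exe command that deletes one value without prompting."""
    return f'reg delete "{key}" /v {value_name} /f'


# Print the commands so they can be reviewed before anything is run.
for name, old_data in REFS_VALUES.items():
    print(f"rem {name} was set to {old_data} before the patch")
    print(query_command(name))
    print(delete_command(name))
```

Running the printed `reg query` lines first (from an elevated prompt on the repository server) shows which values actually remain, so only those need removing.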
rhiem
Novice
Posts: 8
Liked: 2 times
Joined: Feb 22, 2016 8:49 pm

Post by rhiem » 2 people like this post

The patch definitely fixes the slow Fast Clone process in our environment.

Before the Patch:

Job1: FastClone -> 1 to 2 Hours

After the Patch:

Job1: FastClone -> 6 to 8 Minutes

I will let you know if it is stable.
antipolis
Enthusiast
Posts: 73
Liked: 9 times
Joined: Oct 26, 2016 9:17 am

Post by antipolis » 1 person likes this post

seems better here as well... ~8 hours > ~2 hours

need to confirm over the next few weeks, I'm tempted to temporarily re-enable synthetics on my biggest job to have a better idea of the improvements
mweissen13
Enthusiast
Posts: 93
Liked: 54 times
Joined: Dec 28, 2017 3:22 pm
Full Name: Michael Weissenbacher

Post by mweissen13 » 1 person likes this post

Luckily enough we never had any lock-ups before the patch, but our Repos are fairly small (20TB max) and we always used 64KiB cluster size and plenty of RAM. Now with the patch applied the performance seems to be better, but that was always the case for some days after a reboot. We will see after a few weeks if the performance stays good for a prolonged time.
LeoKurz
Veeam ProPartner
Posts: 28
Liked: 7 times
Joined: Mar 16, 2011 8:36 am
Full Name: Leonhard Kurz

Post by LeoKurz »

WSUS offers "2018-02 Cumulative Update... (KB4074590)"
Update Catalog offers "2018-02 Cumulative Update... (KB4077525)"

Is it safe to import the later patch into WSUS and deploy it from there?

__Leo
suprnova
Enthusiast
Posts: 38
Liked: never
Joined: Apr 08, 2016 5:15 pm

Post by suprnova »

Seems like everyone has better luck than me. I migrated most of my backups back to NTFS after 8 months of battling ReFS so it's difficult for me to test overall improvement, but I just tested out a delete on a patched repo. While it's a slight improvement (I didn't need to reset the repo), the repository drive goes offline for the duration of the delete. I am still using the usual registry keys. I also tested out a synthetic full with block clone to another repo and it completely froze up 30 hours into it with CPU at 100% (I did have to reset this one).

I'm definitely not comfortable enough to recommend ReFS even with this latest patch.
Gostev
Chief Product Officer
Posts: 31429
Liked: 6633 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Post by Gostev »

@suprnova if you are still observing system freezes, then most likely you did not install the patch correctly. From what I remember, this freeze issue was caused by a bug in the OS memory manager (an NTFS-specific optimization that was acting up with ReFS volumes), and the patch does address it.

Also, the presence of "the usual registry keys" indicates you may have tried some older version of the patch at some point, and it may still be there messing things up (I guess this is why the ReFS team insisted all that old stuff must be removed/uninstalled before installing the patch).

If I were you, I would just start from a clean OS install - this is the only way to really make sure you're using the patch in the way that was tested by Microsoft QC.
Ctek
Service Provider
Posts: 83
Liked: 13 times
Joined: Nov 11, 2015 3:50 pm
Location: Canada

Post by Ctek » 1 person likes this post

I applied the patch to 1 server used for DEV servers. I can't really comment on performance as I did not do fulls with it (I'll do that next week), but what I do see in my monitoring is that the RAM dips during the night are less drastic. This means that on my end at least, on only 1 server, there is lower RAM usage overall during an intensive backup window. Once properly tested, I'll report back on bigger production servers.
VMCE
jameskilbynet
Enthusiast
Posts: 32
Liked: 7 times
Joined: Jan 14, 2015 11:18 am
Full Name: James Kilby

Post by jameskilbynet »

We are still seeing some stability issues after this patch. Ours is a 160TB ReFS volume with approx. 80TB in use. We have 128GB of RAM, and this is a Storage Spaces volume (mirror setup) with NVMe cache. We see issues with large data ingestion, i.e. an active full or an evacuation of another repo towards the ReFS one. We will open another call with Veeam/MS tomorrow.