dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by dellock6 »

rgarvelink wrote:As stated in this thread, 4 GB of memory per core is the recommendation, so wouldn't OP need 64 GB just for the Veeam operations, assuming he's at 16 threads and is hitting the recommendation of 1 core for every thread? We know that ReFS prioritizes data availability over everything else, and it appears to do so via memory consumption. We might just need to take that into consideration when sizing repositories.
I just wanted to highlight this part, as you are spot on and people often overlook it. 4 GB per job (sorry, my bad: I wrote "core" following the reply, thanks Anton) is the requested sizing for the Veeam repository services, not for the entire system. If you have additional services, like the ReFS components, you need to add their requirements to the total memory needed. For 16 cores, thus usually handling 16 concurrent jobs, you need at least 64 GB just for Veeam, plus the OS and its services.

Thanks for the reminder!

Luca
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by Gostev »

Guys, you probably meant 4GB per concurrent job (not per core) for the repository. One core = one task (if you follow our recommendation), however a single job usually uses many tasks.
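To make the sizing rule concrete, here is a minimal sketch of the arithmetic discussed above. The 4 GB per concurrent job figure comes from this thread; the function name and the other_services_gb parameter are only illustrative, and the thread gives no number for the OS and ReFS overhead, so that value has to be supplied by whoever sizes the server.

```python
def repository_memory_gb(concurrent_jobs, gb_per_job=4, other_services_gb=0):
    """Estimate the RAM needed on a repository server.

    gb_per_job=4 is the figure quoted in this thread for the Veeam
    repository services. other_services_gb (OS, ReFS, ...) is a
    placeholder: the thread does not give a number for it.
    """
    if concurrent_jobs < 0 or gb_per_job < 0 or other_services_gb < 0:
        raise ValueError("inputs must be non-negative")
    return concurrent_jobs * gb_per_job + other_services_gb


# The example from the discussion: 16 concurrent jobs need 64 GB for Veeam alone.
print(repository_memory_gb(16))
```

Note that the input is the number of concurrent jobs, not cores or tasks, since a single job usually runs many tasks.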
suprnova
Enthusiast
Posts: 38
Liked: never
Joined: Apr 08, 2016 5:15 pm
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by suprnova »

I added a new 64k repository and we had this issue again. The VM was reset and appears to be responsive, when before it would freeze after 30 seconds.

Here's the error message from the job that was running:

Synthetic full backup creation failed Error: Agent: Failed to process method {Transform.CompileFIB}: A file system block being referenced has already reached the maximum reference count and can't be referenced any further. Failed to duplicate extent
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by Gostev »

The error is quite strange, as it would require around 8000 backup files in the chain to trigger. Perhaps it happens due to system instability rather than from running into the actual BlockClone API limitation. Please let me know the support case ID for this issue, as our devs would like to collect more information from debug logs. Thanks!
matteo.tosino
Novice
Posts: 3
Liked: never
Joined: Mar 21, 2017 9:52 am
Full Name: Matteo Tosino
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by matteo.tosino »

Hi, I have the same error (the maximum reference count error from suprnova's last reply) during a Copy job merge, after 2 months of backup chain (7 days, 4 weeks, 3 months, for 5 VMs in one backup job).
This is the second time: the first time I opened a case, but with no solution.
Support tried the following:
1. Remove the configuration of the copy job and import it again, creating a new backup repository in a new location. This is impossible in my case (I have only one volume; I tried within the same volume, and the error remains).
2. Run a volume defrag, but the defrag activity hung after a few minutes with no disk or CPU activity.
3. Refer to Windows error code 0xC000048C STATUS_BLOCK_TOO_MANY_REFERENCES, but there is no documentation.
I closed the ticket because I read (I don't remember where, sorry) that ReFS should be used with Storage Spaces as a backup repository for Veeam. I tried it for some days, but it's too slow compared with hardware (HP Smart Array) management of my backup storage (JBOD).

Now I have the same problem again. What do you suggest?
I'm sure it is a ReFS-related problem, because it appeared with this change in my infrastructure. Before I go back to NTFS, do you know any trick to solve this problem?
kubimike
Veteran
Posts: 391
Liked: 56 times
Joined: Feb 03, 2017 2:34 pm
Full Name: MikeO
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by kubimike »

@matteo.tosino can I chime in and ask if you've tried doing an active full on the job to see if the error goes away?
matteo.tosino
Novice
Posts: 3
Liked: never
Joined: Mar 21, 2017 9:52 am
Full Name: Matteo Tosino
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by matteo.tosino »

Good idea! I have started an active full just now; I'll report back here.
Thanks
mickyv
Novice
Posts: 9
Liked: never
Joined: Apr 04, 2017 2:38 am
Full Name: Michael V
Location: Adelaide, Australia
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by mickyv »

Hey matteo.tosino, did this issue go away for you?
matteo.tosino
Novice
Posts: 3
Liked: never
Joined: Mar 21, 2017 9:52 am
Full Name: Matteo Tosino
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by matteo.tosino »

No, I gave up and went back to NTFS :(
Ctek
Service Provider
Posts: 84
Liked: 13 times
Joined: Nov 11, 2015 3:50 pm
Location: Canada
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by Ctek »

suprnova wrote:I added a new 64k repository and we had this issue again. The VM was reset and appears to be responsive, when before it would freeze after 30 seconds.

Here's the error message from the job that was running:

Synthetic full backup creation failed Error: Agent: Failed to process method {Transform.CompileFIB}: A file system block being referenced has already reached the maximum reference count and can't be referenced any further. Failed to duplicate extent
This exact error started happening in our test environment over time, on a ReFS 64K volume.
VMCE
sitruk
Novice
Posts: 6
Liked: never
Joined: Dec 01, 2016 4:54 pm
Full Name: Kurtis
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by sitruk »

I am so happy I found this thread. I have been battling this issue for a couple weeks and it took me this long to narrow down the cause. I even reimaged my physical server thinking Windows was corrupt.

Physical Server
Win Server 2016 x64
Intel Xeon Quad Core
32GB RAM
2x ReFS Repositories 10TB and 20TB both formatted with 64K cluster
Added to virtual Veeam B&R 6.5 update 2 server as a proxy

I have one job with all my VMs where most are very small. I ordered the VMs to backup smallest first and they all complete successfully. Performance is great, but once I get to the large VM the RAM is slowly consumed until full. Then the physical server locks up due to a lack of resources. The job fails and the RAM is left consumed. I can sometimes remote in and reboot the server to reclaim the RAM.

Microsoft recently released a fix for ReFS memory usage. https://support.microsoft.com/en-us/hel ... windows-10

Unfortunately, when I attempt to install this fix it says it is not applicable to my computer. I am leaning more towards formatting my repositories with NTFS and calling it a day.
graham8
Enthusiast
Posts: 59
Liked: 20 times
Joined: Dec 14, 2016 1:56 pm
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by graham8 »

sitruk wrote:once I get to the large VM the RAM is slowly consumed until full. Then the physical server locks up due to a lack of resources. The job fails and the RAM is left consumed. I can sometimes remote in and reboot the server to reclaim the RAM.
Well, if you look over at veeam-backup-replication-f2/refs-4k-hor ... 40629.html you'll find that plenty have had this issue with the default of 4k. Apparently, from comments, 64k has been an issue as well - just much less frequently. What you're describing is definitely what I was seeing with 4k. What's the size of your largest job?
sitruk wrote:Windows recently released a fix for ReFS memory usage. https://support.microsoft.com/en-us/hel ... windows-10 .. Unfortunately, when I attempt to install this fix it says it is not applicable to my computer.
Odd about not being applicable. YMMV, but in our case, even with all the options enabled for us at their most strict settings, the problem still occurred.
sitruk wrote:I am leaning more towards formatting my repositories with NTFS and calling it a day.
I've had a MS case open about the issue for months now. I got a bunch of manually-initiated (when the server locks due to resource consumption) memory dumps sent to them, and then after that point I had to reload with something that I could rely on since this wasn't a test server. It's been about a month, I think, since I got the memory dumps sent to MS. I got a reply after a few weeks that they had analyzed them, saying that they think they knew what is going on, and that they wanted to do testing on our server. I had to point out that this is a production server and that Microsoft needs to do their testing internally. I asked what the cause was, though, because we have other ReFS+2016+Veeam servers (though those have more ram and have been more stable), but I haven't gotten a reply after asking a few times.

So...if you can deal with the backup space inflation of NTFS, then yes, I think that's probably wise. They may fix the issue "soon", but I personally wouldn't want to bet on it.
sitruk
Novice
Posts: 6
Liked: never
Joined: Dec 01, 2016 4:54 pm
Full Name: Kurtis
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by sitruk »

graham8 wrote:Well, if you look over at veeam-backup-replication-f2/refs-4k-hor ... 40629.html you'll find that plenty have had this issue with the default of 4k. Apparently, from comments, 64k has been an issue as well - just much less frequently. What you're describing is definitely what I was seeing with 4k. What's the size of your largest job?

Odd about not being applicable. YMMV, but in our case, even with all the options enabled for us at their most strict settings, the problem still occurred.

I've had a MS case open about the issue for months now. I got a bunch of manually-initiated (when the server locks due to resource consumption) memory dumps sent to them, and then after that point I had to reload with something that I could rely on since this wasn't a test server. It's been about a month, I think, since I got the memory dumps sent to MS. I got a reply after a few weeks that they had analyzed them, saying that they think they knew what is going on, and that they wanted to do testing on our server. I had to point out that this is a production server and that Microsoft needs to do their testing internally. I asked what the cause was, though, because we have other ReFS+2016+Veeam servers (though those have more ram and have been more stable), but I haven't gotten a reply after asking a few times.

So...if you can deal with the backup space inflation of NTFS, then yes, I think that's probably wise. They may fix the issue "soon", but I personally wouldn't want to bet on it.
My largest job is almost 12TB but the RAM issue occurs on a 1TB backup. I think that update isn't applicable because it was superseded by another one. However, the registry keys mentioned in the KB are not present on my server. Were you able to try changing those REG keys as per the KB? Also, do you know if the ReFS storage needs to be configured within Storage Spaces for it to function properly?

The space savings seem to be significant as well as the incremental full performance. So getting ReFS to work would be pretty awesome.
graham8
Enthusiast
Posts: 59
Liked: 20 times
Joined: Dec 14, 2016 1:56 pm
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by graham8 »

You have to create the registry keys - they aren't there by default.
sitruk
Novice
Posts: 6
Liked: never
Joined: Dec 01, 2016 4:54 pm
Full Name: Kurtis
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by sitruk »

I realized that after looking over the page again. I actually ended up passing my disks through and I'm giving Storage Spaces a shot.
sitruk
Novice
Posts: 6
Liked: never
Joined: Dec 01, 2016 4:54 pm
Full Name: Kurtis
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by sitruk »

graham8 wrote:You have to create the registry keys - they aren't there by default.
Using Storage Spaces seems to prevent memory from filling up, but it writes to the disk in bursts, which results in horrible write performance.
Hauke
Influencer
Posts: 23
Liked: 4 times
Joined: Apr 16, 2015 11:25 am
Full Name: Hauke Ihnen
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by Hauke »

Same issue here.
The storage with ReFS crashed today and stopped responding to connections; Veeam jobs are hanging endlessly.
After a reset the storage worked for a few minutes, then memory and CPU usage grew to 100% and it stopped responding again.
The same happened after the next reboot.
And after another reboot, the ReFS drive is now gone. Trying to access it -> freeze. Opening Disk Management -> freeze.
Other disks on the same RAID are working, so this is not a drive issue.
The Windows Server 2016 installation was fully patched to the latest version.

I will now switch back to NTFS, with 2012 R2. Server 2016 is still much too buggy and not ready for production; I had a lot of other issues with it too.
rkovhaev
Veeam Software
Posts: 39
Liked: 21 times
Joined: May 17, 2010 6:49 pm
Full Name: Rustam
Location: hockey night in canada
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by rkovhaev » 2 people like this post

Just for visibility I am sharing details for the error that was mentioned a few times in this thread
"A file system block being referenced has already reached the maximum reference count and can't be referenced any further."

We execute DeviceIoControl() with the FSCTL_DUPLICATE_EXTENTS_TO_FILE control code to clone a block, and the Windows kernel returns
ERROR_BLOCK_TOO_MANY_REFERENCES (0x0000015b).
We reproduced the issue without our software, with a few lines of code:
https://github.com/rustylife/apitest/bl ... sclone.cpp

This looks like an undocumented ReFS limitation.
A case with MS support has been opened.
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by dellock6 »

Great finding, Rustam!
It sounds like one block can only be referenced X times. Did you find out what this limit is? It's interesting, as GFS or synthetic fulls may hit this limit.
rkovhaev
Veeam Software
Posts: 39
Liked: 21 times
Joined: May 17, 2010 6:49 pm
Full Name: Rustam
Location: hockey night in canada
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by rkovhaev »

dellock6 wrote: did you find what's this limit?
No, still waiting for MS.
christopher-swe
Service Provider
Posts: 21
Liked: 1 time
Joined: Dec 14, 2016 6:54 am
Full Name: Christopher Svensson
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by christopher-swe »

Hi,

I have also run across this problem. We have two Veeam B&R servers with two storage nodes running Windows 2016 and ReFS. Both setups have identical hardware.
I only have this problem with one job, and this job unfortunately contains a whopping 240 VMs.
Other jobs on the same storage have about 50 VMs, and I haven't seen this problem with them.

The jobs run once per night, are set to create a synthetic full every Sunday, and have been running for two months now.

Can the number of VMs per job affect this?
dellock6
VeeaMVP
Posts: 6166
Liked: 1971 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by dellock6 »

I don't think so, especially with per-VM chains enabled. Each VM has its own chain, so blocks can only be cloned inside the same chain. Having 1 VM or 1000 VMs should not make any difference, if I'm not mistaken. The retention selected for a given job may matter much more, and even more so how many times a block is cloned inside that chain, as with synthetic fulls or GFS points.
mkretzer
Veeam Legend
Posts: 1203
Liked: 417 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by mkretzer »

@dellock6 With per-VM our deletion of restore points took hours and blocked the storage completely. Without per-VM we did not have that specific issue.
ivordillen
Enthusiast
Posts: 62
Liked: never
Joined: Nov 03, 2011 2:55 pm
Full Name: Ivor Dillen
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by ivordillen »

I have this issue on one job with a retention of 120, and I think I have 110 synthetic fulls...
Delo123
Veteran
Posts: 361
Liked: 109 times
Joined: Dec 28, 2012 5:20 pm
Full Name: Guido Meijers
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by Delo123 » 1 person likes this post

If there really is a hard limit, it would mean ReFS is practically useless for long retention. I assume Veeam is already in contact with Microsoft regarding this?
ivordillen
Enthusiast
Posts: 62
Liked: never
Joined: Nov 03, 2011 2:55 pm
Full Name: Ivor Dillen
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by ivordillen »

After a few failures I brought the retention back to 100 and all is working well.
rkovhaev
Veeam Software
Posts: 39
Liked: 21 times
Joined: May 17, 2010 6:49 pm
Full Name: Rustam
Location: hockey night in canada
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by rkovhaev »

Per Microsoft Premier Support (case no. 117062915966967), the maximum reference count in the ReFS kernel module is 8175, which is a really low limit.
https://docs.microsoft.com/en-us/window ... ck-cloning

The chances of getting ERROR_BLOCK_TOO_MANY_REFERENCES (0x0000015b) during ReFS fast clone are actually pretty high.
nmdange
Veteran
Posts: 528
Liked: 144 times
Joined: Aug 20, 2015 9:30 pm
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by nmdange » 1 person likes this post

Wouldn't that mean you'd have to do 8175 separate fast clones? In a typical once-a-day backup that'd be 8175 days, or 22 years! Or can a single fast clone reference the same block multiple times? Even if you were doing a backup once per hour, you'd need to have almost a year's worth of hourly backups saved. Typically, I'd expect you to keep hourly/daily backups only for a month or two, and long term you'd have GFS backups monthly/quarterly/yearly, which are much smaller in number.
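The back-of-the-envelope numbers above can be checked with a short sketch. It assumes, as a simplification, that each fast clone adds exactly one reference to a given block and that no references are ever released; the function name is only illustrative, and how ReFS actually counts references is not something this thread documents.

```python
MAX_REFS = 8175  # per-block reference limit quoted earlier in the thread


def days_to_reach_limit(clones_per_day, max_refs=MAX_REFS):
    """Days until one block reaches max_refs references.

    Simplifying assumption: every fast clone adds exactly one reference
    to the block and nothing is ever released.
    """
    return max_refs / clones_per_day


daily = days_to_reach_limit(1)    # one backup per day
hourly = days_to_reach_limit(24)  # one backup per hour

print(f"daily:  {daily:.0f} days (~{daily / 365:.1f} years)")
print(f"hourly: {hourly:.0f} days")
```

Under that assumption a daily schedule takes about 22.4 years to reach the limit and an hourly one about 341 days, which matches the estimate above. If a single operation can reference the same block more than once, the limit would be reached sooner.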
ds2
Enthusiast
Posts: 82
Liked: 19 times
Joined: Jul 16, 2015 6:31 am
Full Name: Rene Keller
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by ds2 »

What will Veeam do in this case? You know how often you map this block, so you can write it to a new location shortly before running into this limit.
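As a rough illustration of that idea, here is a toy in-memory model. It is not how Veeam or ReFS work internally: the BlockStore class and its methods are hypothetical, and only the 8175 figure is taken from this thread. The sketch counts references per block and writes a fresh physical copy once a block is full, instead of failing.

```python
MAX_REFS = 8175  # limit reported by Microsoft support earlier in the thread


class BlockStore:
    """Toy model: physical blocks, each with its own reference counter."""

    def __init__(self, max_refs=MAX_REFS):
        self.max_refs = max_refs
        self.data = {}   # block id -> payload
        self.refs = {}   # block id -> current reference count
        self._next_id = 0

    def write(self, payload):
        """Store a new physical block that starts with one reference."""
        block_id = self._next_id
        self._next_id += 1
        self.data[block_id] = payload
        self.refs[block_id] = 1
        return block_id

    def clone(self, block_id):
        """Reference an existing block, or copy it once its counter is full."""
        if self.refs[block_id] >= self.max_refs:
            # Limit reached: pay for one physical copy instead of failing.
            return self.write(self.data[block_id])
        self.refs[block_id] += 1
        return block_id


store = BlockStore(max_refs=3)  # tiny limit to keep the demo short
current = store.write(b"backup block")
history = []
for _ in range(5):
    current = store.clone(current)
    history.append(current)

print(history)          # ids of the block each clone ended up pointing at
print(len(store.data))  # number of physical copies actually stored
```

With the real limit, the cost would be one extra physical copy of a block for every 8175 references to it, at the price of tracking the counters on the application side.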
rkovhaev
Veeam Software
Posts: 39
Liked: 21 times
Joined: May 17, 2010 6:49 pm
Full Name: Rustam
Location: hockey night in canada
Contact:

Re: 9.5/ReFS/Server 2016 Memory Consumption

Post by rkovhaev »

Per-VM backup chains and disabling "inline dedup" in the job properties should help to avoid this issue.
If you are already affected by it, you can always temporarily disable ReFS fast clone and let the job do the merge with ReadFile() and WriteFile() instead of DeviceIoControl().