-
- Enthusiast
- Posts: 57
- Liked: 8 times
- Joined: Jul 13, 2009 12:50 pm
- Full Name: Mark
- Location: The Netherlands
- Contact:
Re: REFS 4k horror story
Our Veeam repository (a SAN with one 2008R2 and one Windows 2012R2 server), with ~50TB of Veeam backups on it, is going to need replacement very soon.
Same question as David, but I'm also wondering if it's smart to replace it with a single repository on a Windows Server 2016 machine with ReFS (64K).
Is it ready for prime time with volumes of 70-80TB?
-
- Veteran
- Posts: 361
- Liked: 109 times
- Joined: Dec 28, 2012 5:20 pm
- Full Name: Guido Meijers
- Contact:
Re: REFS 4k horror story
Hi Mark,
We have been using a 2016 ReFS 64K repository for about 2 months now: a 189TB volume with 23.4TB of data on it so far. No glitches, everything looks stable.
However, we still make a second backup to 2012R2 dedupe volumes plus another copy job to be sure... I wouldn't trust ReFS 100% yet if that's your "only" repository; it's too early...
-
- Novice
- Posts: 6
- Liked: 2 times
- Joined: Feb 16, 2017 12:25 pm
- Full Name: Ondřej Kraus
- Contact:
Re: REFS 4k horror story
Same here, wondering if I should switch to ReFS or not. Looking forward to today's Veeam webinar: "Scaling backup repositories with Veeam & Microsoft ReFS"
-
- Enthusiast
- Posts: 63
- Liked: 9 times
- Joined: Nov 29, 2016 10:09 pm
- Contact:
Re: REFS 4k horror story
Confirming that the bug still exists despite the update. It just killed our VBR server again, even though we updated all servers 20 hours ago with a patch that also seems to target ReFS, just not this issue. We are going to invest time in finding other solutions. We regret having chosen the ReFS filesystem.
-
- Veeam Legend
- Posts: 1203
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: REFS 4k horror story
@alesovodvojce Killed how? Bluescreen or hang/blackscreen?
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS 4k horror story
@alesovodvojce I'm assuming you're on 4K?
-
- Enthusiast
- Posts: 59
- Liked: 20 times
- Joined: Dec 14, 2016 1:56 pm
- Contact:
Re: REFS 4k horror story
I'm with you, alesovodvojce. I switched to Veeam + Server 2016 + ReFS because I wanted a more supported/mainstream (space-efficient) solution that didn't rely as much on me personally to support Unix systems and elaborate custom scripting, but what was in place before (ZFS + ZFS Send + GFS Snapshots) is looking better and better, in spite of fewer people being able to support it.
I guess it'll get resolved eventually, but it's disheartening when you make a move to try to be more responsible and put things in a more mainstream, supported place with major companies like Microsoft (obviously this isn't Veeam's fault of course) and end up with something far hackier and flakier than the "custom scripted with free stuff" setup that was in place before.
-
- Enthusiast
- Posts: 63
- Liked: 9 times
- Joined: Nov 29, 2016 10:09 pm
- Contact:
Re: REFS 4k horror story
ReFS 4K, and the VM halted, requiring a forced power-off. That is the answer to the question from @mkretzer and @kubimike.
Btw, after months of investigation, and lately finding out that it is a bug in Microsoft's product, here is how to lower the frequency of trouble if you are on ReFS:
- Use 64K cluster size. I have no experience with it myself, but IT pros here report less trouble with it now.
- Use a minimum of concurrent jobs. This is our experience: the fewer jobs running in parallel, the less stress on ReFS, and so fewer timeouts and hangs.
- Impose rate limits on storage in Veeam. In the backup repository properties, you can limit the read/write rate. We have imposed 1/10th of the speed = 70 MB/s. This also reduces stress on the repo.
In our case, the bug first "kills" the guest with the remote repo, at the bkf file consolidation phase. The waiting transaction then stresses and later "kills" the primary backup server that initiated the backup copy job. When we limit concurrency, impose rate limits, and disable backup jobs, ReFS can work for a week without a hang, or even longer.
I am writing here to warn you, and in the hope that somebody might come across a solution...
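To put the rate-limit tip above in numbers: a cap of 70 MB/s described as 1/10th of the speed implies roughly 700 MB/s unthrottled, and the price of the cap is a longer transfer window. The following is a minimal Python sketch of that arithmetic only. The function names and the 1 TB example size are made up for illustration, and it does not touch any Veeam setting.

```python
# Back-of-the-envelope arithmetic for a repository rate limit.
# All names are hypothetical; the 700 MB/s figure is only inferred
# from "1/10th of speed = 70 MB/s" in the post above.

MB_PER_TB = 1024 * 1024  # binary units: 1 TB = 1024 * 1024 MB


def throttled_rate_mb_s(full_rate_mb_s: float, divisor: int) -> float:
    """Return the rate left after limiting to 1/divisor of full speed."""
    return full_rate_mb_s / divisor


def transfer_hours(data_tb: float, rate_mb_s: float) -> float:
    """Hours needed to move data_tb terabytes at a sustained rate_mb_s."""
    return data_tb * MB_PER_TB / rate_mb_s / 3600


if __name__ == "__main__":
    limit = throttled_rate_mb_s(700, 10)
    print(f"limit: {limit:.0f} MB/s")
    print(f"1 TB at full speed: {transfer_hours(1, 700):.2f} h")
    print(f"1 TB at the limit:  {transfer_hours(1, limit):.2f} h")
```

The same data takes ten times as long at the limited rate, so the backup window has to be able to absorb that before throttling this hard.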
-
- Product Manager
- Posts: 8191
- Liked: 1322 times
- Joined: Feb 08, 2013 3:08 pm
- Full Name: Mike Resseler
- Location: Belgium
- Contact:
Re: REFS 4k horror story
Hi Alesovodvojce,
Thank you for your golden tips! They are really appreciated. As some stated here, we are waiting for MSFT to fix this. This seems to be the only real solution to wait for. That and your tips (and please use 64K if you are starting now).
@graham8: As said, this is really painful, but I still believe (but please let it be fixed soon) that ReFS has a bright future, not only in combination with Veeam but also as a file system that hosts VMs (the checkpoint merge is extremely impressive and fast).
From our side, Gostev will continue to "bug" MSFT for this for sure.
This thread will be continued...
-
- Veeam Legend
- Posts: 1203
- Liked: 417 times
- Joined: Dec 17, 2015 7:17 am
- Contact:
Re: REFS 4k horror story
One more thing: our feeling is that it works better without per-VM chains, because at least the filesystem does not have to track so many files. Overall, the filesystem seems less sluggish that way.
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: REFS 4k horror story
All, great news! It appears that KB4013429 does include the fix for this issue, however you need to enable the newly added registry value to activate this new behavior. This makes sense, we also like to introduce major behavior modifiers this way before making them default.
I have the instructions, but I was asked not to share them broadly, because the ReFS dev team is planning to release the detailed blog post today that will have full context and details. If you absolutely cannot wait, shoot me a PM.
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS 4k horror story
@Gostev great news!
-
- Service Provider
- Posts: 84
- Liked: 13 times
- Joined: Nov 11, 2015 3:50 pm
- Location: Canada
- Contact:
Re: REFS 4k horror story
As a side question, why would someone use a 4K allocation unit size for a ReFS repository? Doesn't 64K make more sense, since pretty much all the files in there will be of considerable size? Am I missing something? Unless it was left at the default while creating the repository partitions.
VMCE
-
- Enthusiast
- Posts: 72
- Liked: 16 times
- Joined: Jul 16, 2012 1:54 pm
- Full Name: Harold Adams
- Contact:
Re: REFS 4k horror story
I guess I can wait until the blog post. Gostev, I hope you will give us a link to that blog post on this thread? (Thanks as always)
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: REFS 4k horror story
Sure thing, I will share the link unless someone beats me to it.
@Ctek because of 10% more used space with 64K clusters due to alignment requirement of BlockClone API. Otherwise you're right, 64KB makes sense considering the workload and very large volume sizes.
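A rough illustration of the alignment point above, as a Python sketch: if every stored block has to start on a cluster boundary, each block is effectively rounded up to a whole number of clusters, and larger clusters mean more padding. The block sizes below are invented for the example and the real on-disk layout of Veeam backup files is not modeled, so the percentages it prints are not expected to match the 10% figure.

```python
# Sketch of cluster-alignment padding. Block sizes are hypothetical
# stand-ins for variable-size compressed backup blocks.

def aligned_size(nbytes: int, cluster: int) -> int:
    """Round nbytes up to the next multiple of the cluster size."""
    return -(-nbytes // cluster) * cluster  # integer ceiling division


def overhead_fraction(block_sizes, cluster: int) -> float:
    """Extra space used by alignment padding, relative to the raw data."""
    raw = sum(block_sizes)
    padded = sum(aligned_size(b, cluster) for b in block_sizes)
    return (padded - raw) / raw


if __name__ == "__main__":
    blocks = [300_000, 500_000, 700_000]  # made-up sizes in bytes
    for cluster in (4 * 1024, 64 * 1024):
        pct = overhead_fraction(blocks, cluster) * 100
        print(f"{cluster // 1024}K clusters: {pct:.2f}% padding")
```

With these made-up sizes the 64K padding comes out noticeably larger than the 4K padding, which is the direction of the effect described; the actual overhead depends on the real block-size distribution.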
-
- Service Provider
- Posts: 33
- Liked: 1 time
- Joined: Jun 13, 2016 6:51 am
- Full Name: Søren Emig
- Contact:
Re: REFS 4k horror story
Great news
However, I am a little confused. I have concluded (from reading this forum) that 64K has the same problem, although the likelihood of running into the bug is less. Am I correct?
….will this hotfix+registry fix this as well?
I hope so since I am planning to build a large REFS repository
-
- Product Manager
- Posts: 8191
- Liked: 1322 times
- Joined: Feb 08, 2013 3:08 pm
- Full Name: Mike Resseler
- Location: Belgium
- Contact:
Re: REFS 4k horror story
@Soren,
Yes, the likelihood will be lower to run into this bug. And yes, this hotfix + registry should fix both so...
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS 4k horror story
@Mike Resseler "Fix both soon" can you explain what you mean by that please?
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: REFS 4k horror story
He did not say what you are quoting. The issue may impact any server regardless of ReFS cluster size, so the fix applies to all ReFS deployments.
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS 4k horror story
Huh? That's how I read what he typed. So for those of us with 64K cluster size, will we need any registry tweaks, or can we just install the latest KB? TIA
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: REFS 4k horror story
KB + registry tweaks
-
- Veteran
- Posts: 391
- Liked: 56 times
- Joined: Feb 03, 2017 2:34 pm
- Full Name: MikeO
- Contact:
Re: REFS 4k horror story
@Gostev, cool, and where can I find these top secret registry fixes please, sir?
-
- Veteran
- Posts: 370
- Liked: 97 times
- Joined: Dec 13, 2015 11:33 pm
- Contact:
Re: REFS 4k horror story
I guess we'll find out with the blog post, but I'm confused that the fix for something that's causing blue screens and system crashes requires the user to make a registry change. Assuming that's the actual case, it seems insane to me.
Gostev wrote: All, great news! It appears that KB4013429 does include the fix for this issue, however you need to enable the newly added registry value to activate this new behavior. This makes sense, we also like to introduce major behavior modifiers this way before making them default.
I have the instructions, but I was asked not to share them broadly, because the ReFS dev team is planning to release the detailed blog post today that will have full context and details. If you absolutely cannot wait, shoot me a PM.
-
- Enthusiast
- Posts: 82
- Liked: 19 times
- Joined: Jul 16, 2015 6:31 am
- Full Name: Rene Keller
- Contact:
Re: REFS 4k horror story
I can't understand why this is such a secret.
If it is a fix for an issue, why isn't it enabled by default? Why is there a need to change a registry key?
I'm afraid that there will be side effects from enabling the key.
-
- Enthusiast
- Posts: 63
- Liked: 9 times
- Joined: Nov 29, 2016 10:09 pm
- Contact:
Re: REFS 4k horror story
We are trying the fix as a remedy and evaluating it to avoid further speculations. Thanks for offering private sharing of the fix.
As we are really affected, could anyone please go ahead and share vital information regarding remedies? While I understand others' frustration (mine is big too), it does not help, and there will be enough time to discuss opinions later. Thanks, and wish us brighter backups soon!
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: REFS 4k horror story
All, here is the official KB article from Microsoft > FIX: Heavy memory usage in ReFS on Windows Server 2016 and Windows 10
Please don't forget to install KB4013429 before applying the registry values, and remember to reboot the server after doing so.
Finally, please do remember to share what option has worked for you!
-
- Service Provider
- Posts: 33
- Liked: 1 time
- Joined: Jun 13, 2016 6:51 am
- Full Name: Søren Emig
- Contact:
Re: REFS 4k horror story
@Gostev
Thank you
It would be great if Veeam could spell out some recommendations on this one, perhaps in conjunction with some guidelines for building a repository server.
We are planning a 350TB REFS repository and I’m a little scared moving forward
-
- Service Provider
- Posts: 60
- Liked: 19 times
- Joined: Dec 23, 2014 4:04 pm
- Contact:
Re: REFS 4k horror story
Was expecting a blogpost from Microsoft with some in-depth explanation and guidance on the different options. Is this still upcoming?
-
- Novice
- Posts: 5
- Liked: never
- Joined: Mar 07, 2017 5:57 am
- Full Name: Rich
- Contact:
Re: REFS 4k horror story
@skumflum
I'm in exactly the same position, weighing up different options for a similarly sized repository, and would lean towards ReFS if this is fixed.
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: REFS 4k horror story
@Søren, Richard - we tested a 500TB repository before this fix was available, and it worked like a charm. Don't be scared just based on this forum thread - keep in mind the huge number of users Veeam has, and the fact that people without issues rarely or never come to forums to share that all works well for them. I can tell you that, compared to the number of customers that are testing/using ReFS, the number of actual issues is fairly low, as long as there's enough RAM on the repository server and 64K clusters are used.
@WimVD my bad for calling it a blog post, I expected some article - I did not know what format it would be published in.