graham8 wrote:Latest update. Confirmed no backups/copies/etc. were taking place, and deleted two 6.5TB VBK files from the server. As usual with ReFS, the freed disk space only slowly became available. With all three of the MS workaround options in place, memory usage for this operation climbed over 100%. Then the usual occurred... numlock stopped responding, mouse stopped moving, disk activity lights stopped. I did multiple rounds of initiating manual memory dumps. This time, unlike all the other times this has occurred, disabling all disk-activity-related services/tasks/etc. (Veeam, server shares, scheduled data integrity scans, etc.) doesn't let the problem work itself out. The server now becomes unresponsive within 2-3 minutes of every boot.
Updated Microsoft. Unless they come back to us with some way to set the volume read-only, so that it temporarily stops whatever bug is occurring (even if it means it doesn't free the disk space) and we can recover the data from the volume, it looks like we have permanently lost backup history and will need to nuke this and put in some completely different solution. And again, the volume itself is fine - the data is all accessible... just only for 2-3 minutes until the ReFS driver nukes the server.
I'll submit the memory dumps to Microsoft, so hopefully that at least helps them towards a long-term resolution to the underlying bug.
richardkraal wrote:I've got similar problems with our new Veeam test setup. I opened a call with MS, but they won't help me with troubleshooting.
This is what they told me: we (MS) don't support Veeam, use Windows Backup instead and test with one concurrent backup task at a time. As you can expect, this does not cause any problems... because this load is peanuts.
They don't want to help. Veeam support tells us that we have to go to Microsoft, as they see it as a performance issue on the host...
going nuts here
kubimike wrote:@graham8 you're lucky that you got dump files. I couldn't get it to crash and produce one with verifier turned on. With it turned off, it would crash and create dump files easily, but they didn't give a complete picture. With that said, Microsoft couldn't help unless I could get it to crash with verifier ON.
mkretzer wrote:OK. Bad news from our installation.
Even with 384 GB RAM we started to get the latency messages, the filesystem started "hanging" and WMI monitoring stopped working. It resolved itself after a while, but my optimism is slowly going away. At the time there were no synthetic operations going on, which is strange, only normal backup writes...
The thing is, it took nearly one month and > 100 TB of backed-up data to get to this point. But in the end ReFS seems to be somewhat unstable no matter what you do... It also does not seem to be RAM related, as there is > 200 GB available.
richardkraal wrote:OK, that's bad news.
I was in the mood for buying a lot of RAM, hoping that would resolve the issue.