I've got an environment with a Windows Server 2019 backup repository on ReFS. The server experienced a power loss and now the ReFS volume fails to mount. The Windows Event Viewer log reports:
Volume D: is formatted as ReFS but ReFS is unable to mount it; ReFS encountered status The volume repair was not successful..
The file system detected a checksum error and was not able to correct it. The name of the file or folder is "Container Table".
The file system detected a checksum error and was not able to correct it. The name of the file or folder is "Duplicate Container Table".
The file system detected a global metadata corruption and was not able to repair it on volume D:. Attempting a readonly volume mount may succeed.
This server has been in use for nearly 2 years without issue, with approx. 50 TB in use (100 TB capacity) and storage-level corruption guard enabled weekly for all jobs. Storage is directly attached via a Supermicro AVAGO 3108 MegaRAID controller with 16 disks, and everything is healthy. Running refsutil in diagnose mode fails immediately:
PS C:\> refsutil salvage -D d: c:\salvage -v -x
Microsoft ReFS Salvage [Version 10.0.11070]
Copyright (c) 2015 Microsoft Corp.
Local time: 6/24/2021 14:45:30
Option(s) specified: -v -x
Error: Initialization failed.
Error: The specified request is not a valid operation for the target device.
Run time = 0 seconds.
Is there anything else that can be done? I'd be happy to have it mounted read-only to get some key data off ASAP.
RAID controllers without battery-backed cache can lose data on power loss, and it seems you got very unlucky because in your case, it was a critical piece of file system metadata that was lost - which is why the OS can no longer mount the volume.
I assume you don't have a copy of your backups as the 3-2-1 rule of backups requires? Restoring from a copy would certainly be the easiest here.
Otherwise, I'm afraid at this point there are only two options:
1. Open a support case with Microsoft and see if they can do anything to help you mount the volume.
2. Look for a specialized data recovery company and be ready to dish out some $$$.
Hey Gostev, thank you for your quick reply. MegaRAID Storage Manager reports the BBU as online and Optimal, so I'm beginning to doubt that hardware reliability is the cause.
The customer wasn't too worried about the Veeam backups (they have multiple copies), but the backup location was unfortunately also used as an archive for less critical data, which they'd like to recover if possible. They've explored the Microsoft support option and it doesn't appear to be available. I'm having some success with ReclaiMe File Recovery; it's a bit cumbersome to use, but better than nothing.
Thanks again for your response. I was hoping there was some magical toolchain or set of refsutil flags that might temporarily get it back online, but that doesn't appear to be the case.
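
For anyone landing here later: besides the diagnose run (-D) shown earlier in the thread, Microsoft's refsutil documentation also lists quick (-QA) and full (-FA) automatic salvage modes, which take an extra target directory. Below is a minimal sketch in Python that only builds those command lines so the argument order is easy to compare; it does not run anything. The helper name salvage_command and the e:\recovered target path are made up for illustration, and since -D already failed at initialization on this volume, the other modes may well fail the same way.

```python
from typing import List, Optional

# Mode switches as listed in Microsoft's refsutil salvage documentation.
MODES = {"diagnose": "-D", "quick": "-QA", "full": "-FA"}


def salvage_command(mode: str, volume: str, workdir: str,
                    target: Optional[str] = None,
                    verbose: bool = True, dismount: bool = True) -> List[str]:
    """Build (but do not run) a refsutil salvage argument list.

    Hypothetical helper for illustration only.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    cmd = ["refsutil", "salvage", MODES[mode], volume, workdir]
    if mode != "diagnose":
        # Quick and full automatic modes copy recovered files to a target directory.
        if target is None:
            raise ValueError("quick and full modes need a target directory")
        cmd.append(target)
    if verbose:
        cmd.append("-v")   # verbose output
    if dismount:
        cmd.append("-x")   # force a dismount of the volume first, if necessary
    return cmd


if __name__ == "__main__":
    # Same command as the one posted at the top of the thread.
    print(" ".join(salvage_command("diagnose", "d:", r"c:\salvage")))
    # Full automatic mode; e:\recovered is a hypothetical path on a healthy volume.
    print(" ".join(salvage_command("full", "d:", r"c:\salvage", r"e:\recovered")))
```

The printed lines can be pasted into an elevated prompt on the affected server. Keeping the working and target directories on a volume other than D: is an assumption on my part, but it seems the safe choice when D: is the damaged volume.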