SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am

Disk errors on Windows server 2016 with deduplication

Post by SeandG »

Hi all,

We have recently upgraded our B&R servers to Windows Server 2016. We are using Windows deduplication and have installed hotfix KB3216755 to deal with deduplication corruption issues.
We use our backup repositories only for backups, we do not save or write any other data to those drives.

We seem to be experiencing a high level of volume issues that have to be fixed with CHKDSK.
The errors we find seem to always be "The Volume Bitmap is incorrect".

On the 7th of Feb we checked all our repositories and found issues on these drives:
Server Drive
ABK01 F:
BBK01 F:
CBK01 F:
DBK02 I:
DBK02 H:

On the 8th we had no reported issues.

On the 14th we have these:
Server Drive
ABK01 F:
ABK01 G:
CBK01 F:
DBK01 F:

So drive F: on three different servers has reported "The Volume Bitmap is incorrect" twice within a week. This has happened before, which is why we started testing and logging. I think the drive letter F: is coincidental; that drive is Tier 2 storage on all the servers.
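For anyone logging results the same way, here is a minimal Python sketch of how the findings could be tallied. The data is copied from the two lists above; the function names are hypothetical and only there for illustration.

```python
from collections import defaultdict

# Findings copied from the two lists above: (day in February, server, drive).
findings = [
    (7, "ABK01", "F:"), (7, "BBK01", "F:"), (7, "CBK01", "F:"),
    (7, "DBK02", "I:"), (7, "DBK02", "H:"),
    (14, "ABK01", "F:"), (14, "ABK01", "G:"), (14, "CBK01", "F:"),
    (14, "DBK01", "F:"),
]

def servers_per_drive(rows):
    """Map (day, drive letter) to the sorted list of servers that reported it."""
    grouped = defaultdict(set)
    for day, server, drive in rows:
        grouped[(day, drive)].add(server)
    return {key: sorted(servers) for key, servers in grouped.items()}

def repeat_offenders(rows):
    """Return (server, drive) pairs that were flagged on more than one day."""
    days = defaultdict(set)
    for day, server, drive in rows:
        days[(server, drive)].add(day)
    return sorted(pair for pair, seen in days.items() if len(seen) > 1)

summary = servers_per_drive(findings)
print(summary[(7, "F:")])          # ['ABK01', 'BBK01', 'CBK01']
print(summary[(14, "F:")])         # ['ABK01', 'CBK01', 'DBK01']
print(repeat_offenders(findings))  # [('ABK01', 'F:'), ('CBK01', 'F:')]
```

Keeping the raw (day, server, drive) rows makes it easy to add later check dates and see which volumes keep coming back.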
HP diagnostics and utilities are not reporting any issues with the servers.
The servers are fairly up to date as far as Windows updates and HP firmware and drivers are concerned. Windows was updated last week and HP firmware and drivers about a month ago.
Veeam B&R is version 9.5 with update 1.
We have not seen this behaviour on drives that are not deduplicated.

Is anyone else experiencing this?

Do you think changing the repositories to "Align backup file data blocks" could help? Should we do this anyway on our deduplicated volumes? Are there any downsides?

I have logged a call on this and the call ID is: 02068714

Thanks,
Sean
MOBO
Influencer
Posts: 18
Liked: 5 times
Joined: Jan 24, 2015 7:26 am
Full Name: Morten Boegeskov
Post by MOBO »

from Veeam Community Forums Digest for MOBO [Jan 23 - Jan 29, 2017]
THE WORD FROM GOSTEV
Major data corruption warning for those of you who have already jumped on the much improved Windows Server 2016 deduplication for production use (the rest can take a deep breath). Last week, we started to receive multiple reports of corruption of backup files hosted on Windows Server 2016 NTFS volumes with the Data Deduplication feature enabled. Luckily, the issue was easy to spot due to the system event log event (rarely the case, by the way, as most storage-level corruptions go undetected, which is why it is extremely important to have storage-level corruption guard enabled in the advanced backup job settings, at least when you are trying out new things).

I've already received official confirmation from Microsoft that this is a known issue (ID 10165851), which is scheduled to be addressed in the next Windows Server 2016 servicing update. There are actually two separate issues, both leading to file corruption when using deduplication on very large files. One issue occurs when files grow to 2.2TB or larger, and the other causes loss of checksums for files with "smaller sizes" (this is the actual wording of the official note, so I have no idea how small). As such, I highly recommend assuming that all your existing backups may be damaged, and performing an active full backup to a repository backed by a volume without the deduplication feature enabled. Needless to say, since those of you who are affected already have a Windows Server 2016 based repository, I highly recommend that you use ReFS.
veeam-backup-replication-f2/corrupted-f ... 40406.html
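Since one of the two issues is tied to files reaching 2.2TB, it may help to know which backup files on a repository are at or above that size. Below is a minimal Python sketch. It assumes the threshold means decimal terabytes (the note quoted above does not say), and `find_large_files` is a hypothetical helper name, not part of any Veeam or Microsoft tool.

```python
import os

# Assumption: "2.2TB" is read as decimal terabytes; the official note does not specify.
THRESHOLD_BYTES = 2_200_000_000_000

def find_large_files(root, threshold=THRESHOLD_BYTES):
    """Yield (path, size_in_bytes) for every file under root at or above threshold."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            if size >= threshold:
                yield path, size
```

For example, `for path, size in find_large_files(r"D:\Backups"): print(size, path)` would list the candidates on a repository drive. The size reported is the logical file size, which is what matters here rather than the space used after deduplication.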
SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am
Post by SeandG »

@MOBO

Thanks, we went through all that pain. These symptoms are different, though I guess they could be related. According to MS, KB3216755 is a preview, but it does fix the corruption issues.
As much as Veeam advises ReFS, there is no deduplication on ReFS, and with the number of times we have had to do Active Full backups our storage repositories would have been full a very long time ago.
We are left to wonder if 2016 is actually ready for production. I think not, but it is a bit too late for us.
SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am
Post by SeandG »

Microsoft pointed me to this article:
https://support.microsoft.com/en-us/hel ... for-file-5

Does the Veeam backup copy use VSS? Could this be the cause of the problems we are having?

I guess it could be that the preview KB3216755 doesn't cover all deduplication corruption issues...

Thanks
Mike Resseler
Product Manager
Posts: 8191
Liked: 1322 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium
Post by Mike Resseler »

I might be completely wrong here, but MSFT pointed you to this article? You are running 2016, right? That hotfix is only for older OSes; the issue was fixed long ago and the fix is already built into the current OSes.
SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am
Post by SeandG »

Yes, MSFT pointed me to that article, and I also noticed it was for older OSes. I was just wondering if this fault could somehow have been reintroduced with Server 2016. Anyway, I'm in touch with MS; hopefully something will come up.

It is just weird that we fix the disk with CHKDSK and sometimes by the next check it already has issues again. The only things that currently write to the disk are Veeam backups and, of course, the deduplication process.

Thanks
Gostev
Chief Product Officer
Posts: 31814
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Post by Gostev »

"Volume Bitmap is incorrect" error message when using CHKDSK
If anyone else is seeing this issue, let me know as I need to put you in contact directly with the dedupe development team at Microsoft through the special channel - they are currently investigating and need more real-world environments to research this potential bug.

When posting, please note if you're OK with me sharing your forum registration email with the deduplication team at Microsoft - I won't share anyone's contact by default.
VladV
Expert
Posts: 224
Liked: 25 times
Joined: Apr 30, 2013 7:38 am
Full Name: Vlad Valeriu Velciu
Post by VladV »

Hi Gostev,

On 19/12/2016 I experienced this issue and opened a case with Veeam, case # 02013712. When running chkdsk on the affected drive without any switches, at stage 3 ("Examining security descriptors ...") I got "The Volume Bitmap is incorrect" without any other errors.

The reason for opening the case was this error:

19/12/2016 11:30:55 :: Processing XXXXX Error: All instances of the storage metadata are corrupted. Failed to restore file from local backup. VFS link: [summary.xml]. Target file: [MemFs://frontend::CDataTransferCommandSet::RestoreText_{fb87560a-f686-444a-add0-96e012ea3897}]. CHMOD mask: [721]. Agent failed to process method {DataTransfer.RestoreText
This volume is backed by a Nexenta ZFS appliance with end-to-end CRC checking and the "Sync always" option enabled, so we have ruled out any issues with the storage itself. The dedup logs are also clean.

Let me know if you need anything else.

Thanks
SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am
Post by SeandG »

For us, running CHKDSK /SCAN doesn't report the error, but if we run it without switches or with /F we get "Volume Bitmap is incorrect" on a regular basis.
To detect this fault you may have to run CHKDSK without any switches.
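If you want to script that check, a rough Python sketch could look like the one below. It assumes Windows, an elevated prompt, and English CHKDSK output; the function names are made up for illustration. CHKDSK run without switches is read-only, so this only detects the error and does not repair anything.

```python
import subprocess

BITMAP_ERROR = "The Volume Bitmap is incorrect"

def has_bitmap_error(chkdsk_output):
    """Return True if the CHKDSK output text contains the volume bitmap error."""
    return BITMAP_ERROR.lower() in chkdsk_output.lower()

def check_volume(drive):
    """Run CHKDSK with no switches (read-only) on a drive such as 'F:'.

    Windows only, needs administrator rights, and can take a long time
    on large volumes.
    """
    result = subprocess.run(
        ["chkdsk", drive],
        capture_output=True, text=True, errors="replace",
    )
    return has_bitmap_error(result.stdout)
```

Splitting the text check from the actual CHKDSK call means the same check can also be run against saved CHKDSK output.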

Cheers
Sean
Gostev
Chief Product Officer
Posts: 31814
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Post by Gostev »

@Vlad are you OK with me sharing your contact details with the deduplication team at Microsoft?
VladV
Expert
Posts: 224
Liked: 25 times
Joined: Apr 30, 2013 7:38 am
Full Name: Vlad Valeriu Velciu
Post by VladV »

Sure.
AISD
Lurker
Posts: 1
Liked: never
Joined: Feb 15, 2016 6:56 pm
Full Name: Nicholas Rutherford
Post by AISD »

We have this issue as well. I sent an email to dedupfeedback@microsoft.com earlier today. I opened a support request on 1/25 related to this issue and closed it after installing KB3216755, which seemed to keep things calm for a while. However, I’m seeing this occur again. I’m ready to go back to 2012 R2 for greater reliability, but I will stay on 2016 if you need another person to do some testing for any reason.

3/6/2017 4:42:22 AM :: Full backup file merge failed Error: Agent: Failed to process method {Transform.Patch}: Data error (cyclic redundancy check).
Failed to flush file buffers. File: [D:\Backups\File Servers Job\FILESHARE3.vm-96D2017-01-09T171528.vbk].
3/6/2017 5:22:27 AM :: Error: Agent: Failed to process method {Transform.Patch}: Data error (cyclic redundancy check).
Failed to flush file buffers. File: [D:\Backups\File Servers Job\FILESHARE3.vm-96D2017-01-09T171528.vbk].

Today approximately 12:00 pm

Event ID 12805
Data Deduplication service found 3 corruption(s) on volume D:\. 0 corruption(s) are fixed. 1 user file(s) are corrupted. 0 user file(s) are fixed. For the corrupted file list, see the Microsoft/Windows/Deduplication/Scrubbing events.

Event ID 12800
Data Deduplication service detected corruption in "D:\Backups\File Servers Job\FILESHARE3.vm-96D2017-01-10T171514.vbk". The corruption cannot be repaired.

Event ID 12802
Data Deduplication service detected a corrupted item (Bad checksum - 8, 0x190000, 0x0, 0x4B9, recall bitmap body) in Deduplication Chunk Store on volume D:\. See the event details for more information.
Data Deduplication service detected a corrupted item (Bad checksum - 7, 0x190000, 0x0, 0x4B9, recall bitmap body) in Deduplication Chunk Store on volume D:\. See the event details for more information.
Data Deduplication service detected a corrupted item (Bad checksum - Recall bitmap, 0x190000, 0x0, 0x4B9, recall bitmap body) in Deduplication Chunk Store on volume D:\. See the event details for more information.
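To pull the affected files and the summary counts out of messages like these, a small Python sketch along these lines might help. The patterns are written against the message text exactly as quoted above, so treat them as an assumption rather than a documented event format; the function names are hypothetical.

```python
import re

# Matches the Event ID 12800 message text as quoted above.
CORRUPT_FILE = re.compile(r'detected corruption in "([^"]+)"')

# Matches the Event ID 12805 summary text as quoted above.
SUMMARY = re.compile(
    r"found (\d+) corruption\(s\) on volume (\S+?)\. "
    r"(\d+) corruption\(s\) are fixed\. "
    r"(\d+) user file\(s\) are corrupted"
)

def corrupted_files(messages):
    """Return the sorted, de-duplicated file paths named in 12800-style messages."""
    found = set()
    for message in messages:
        found.update(CORRUPT_FILE.findall(message))
    return sorted(found)

def parse_summary(message):
    """Return the counts from a 12805-style summary, or None if it does not match."""
    match = SUMMARY.search(message)
    if match is None:
        return None
    total, volume, fixed, user_files = match.groups()
    return {
        "volume": volume,
        "corruptions": int(total),
        "fixed": int(fixed),
        "corrupted_user_files": int(user_files),
    }
```

Feeding it the message text exported from the Deduplication Scrubbing event channel gives a quick list of which backup files need an active full.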

Running chkdsk gives the error:
“The Volume Bitmap is incorrect.”
sdis68
Novice
Posts: 3
Liked: never
Joined: Jun 02, 2016 9:08 am
Full Name: Nicolas Riss
Post by sdis68 »

Hi,

Same problem here, with the same event ID. We have one job that produces a VBK larger than 2.2 TB, and after deduplication it was corrupted.

I will try installing KB3216755 and see if it solves the problem.
SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am
Post by SeandG » 1 person likes this post

Microsoft have acknowledged this to be a bug, but they are not sure whether it is a bug in deduplication or in NTFS. I will let you know what happens.
Mike Resseler
Product Manager
Posts: 8191
Liked: 1322 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium
Post by Mike Resseler »

@nicolas,

Please don't install KB3216755 if that server also holds a SQL installation. It could really mess up your environment :-(

@Sean: Keep us informed, that would be highly appreciated!
somethingsomething
Novice
Posts: 6
Liked: never
Joined: Feb 20, 2017 3:22 pm
Post by somethingsomething »

We're running Server 2016 with deduplicated NTFS and we've been bitten by "All instances of the metadata are corrupted" 3 times now.
I also ran a chkdsk and I see the "The Volume Bitmap is incorrect" error. But if I do a chkdsk /scan, all I get is "Windows has scanned the file system and found no problems. / No further action is required."
KB3216755 did nothing to fix the problem.

I don't see anything in the event logs that appears to be dedup related.

@gostev Please feel free to share details with MS if I can help.

Case # 02104227
DonZoomik
Service Provider
Posts: 372
Liked: 120 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Post by DonZoomik »

somethingsomething wrote:"All instances of the metadata are corrupted"
It seems to go away by itself. After a few hours it works fine.
I've seen 2012R2 dedup lock VBM files, which causes backups to fail. Could be related...
RGijsen
Expert
Posts: 127
Liked: 29 times
Joined: Oct 10, 2014 2:06 pm
Post by RGijsen »

Same issues here. But another thread for this exists: veeam-backup-replication-f2/corrupted-f ... 06-30.html
somethingsomething
Novice
Posts: 6
Liked: never
Joined: Feb 20, 2017 3:22 pm
Post by somethingsomething »

DonZoomik wrote:
It seems to go away by itself. After a few hours it works fine.
I've seen 2012R2 dedup lock VBM files, which causes backups to fail. Could be related...
Before I applied the patch, I ran into the problem, and it did not work fine later. Even on subsequent backups I still had the corruption.
It might have gone away after the third time I ran into the problem (post-hotfix), but I didn't want to meddle anymore.

I still have the scandisk error, so I'm done taking risks for now.

Interestingly enough, on another volume that is deduped but not used by Veeam, I've had no issues. Scandisk doesn't report any errors.

ReFS, here I come. It's had some bugs, but so far it's been more stable.
Delo123
Veteran
Posts: 361
Liked: 109 times
Joined: Dec 28, 2012 5:20 pm
Full Name: Guido Meijers
Post by Delo123 »

We have seen these locks on 2012R2 only in the beginning, when first experimenting with dedupe. In 100% of the cases (at our site) they were caused by deduplication being set to VDI Server instead of General Purpose File Server. In 2016, Virtualized Backup Server should be the option to choose.
RGijsen
Expert
Posts: 127
Liked: 29 times
Joined: Oct 10, 2014 2:06 pm
Post by RGijsen »

We have our repository set to general purpose file server, yet we do face the metadata corruption issue. So far the only option that worked for me is to start a new active full. Waiting / rebooting never made the issue 'disappear'. I currently have a Veeam ticket on this, 02112363, but as expected it all points at Microsoft dedupe being the culprit here.

One thing I have to mention, though: at the moment we are running out of space on our repository, which means we dedup files at 0 days of age. Also, as we start the offsite copy as soon as a job is finished, the actual file would be in use, so I had enabled OptimizeInUseFiles. I've just disabled that to see if it makes a difference. However, if I'm not mistaken, we had that disabled initially. Will keep you informed.
RGijsen
Expert
Posts: 127
Liked: 29 times
Joined: Oct 10, 2014 2:06 pm
Post by RGijsen »

Update: we ran for about a week and a half without dedupe, and everything worked perfectly well. I enabled dedupe this weekend, and tonight I already had the corrupt metadata issue again. So it really seems to be dedupe that is at fault here. Yet another MS problem. They really rushed out Server 2016 way too soon.
SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am
Post by SeandG »

MS provided us with a private hotfix, and since we implemented it about 2 weeks ago we have stopped having NTFS corruption. So far, CHKDSK on our volumes has stopped reporting errors.

Cheers,
Sean
Mike Resseler
Product Manager
Posts: 8191
Liked: 1322 times
Joined: Feb 08, 2013 3:08 pm
Full Name: Mike Resseler
Location: Belgium
Post by Mike Resseler »

Sean,

That sounds like good news! Any idea if MSFT will put that hotfix in the next rollup update? Or is it still under review?
SeandG
Influencer
Posts: 15
Liked: 1 time
Joined: Jan 30, 2017 10:38 am
Post by SeandG »

Hi Mike,

This is the last message I have from Microsoft:
We have started the processing of generating a public fix of this change. We can let you know when it will be available once the update release date is determined

If I get more info, I will place it here

Thanks
zuldan
Enthusiast
Posts: 45
Liked: 5 times
Joined: Feb 15, 2017 9:51 am
Post by zuldan »

Looks like the NTFS bug causing corruption when sparse files are used by deduplication (aka "The Volume Bitmap is incorrect") will be patched next Tuesday.
Addressed issue where NTFS sparse files were unexpectedly truncated (NTFS sparse files are used by Data Deduplication—deduplicated files may be unexpectedly corrupted as a result). Also updated chkdsk to detect which files are corrupted.
https://support.microsoft.com/en-au/help/4032188
zuldan
Enthusiast
Posts: 45
Liked: 5 times
Joined: Feb 15, 2017 9:51 am
Post by zuldan »

Well, this is really weird. Only the latest Windows 10 CU contains the NTFS fix, not the latest 2016 CU.

Dedup is used mostly on Server 2016, not Windows 10. Totally weird.

https://support.microsoft.com/en-us/help/4025334
mih
Novice
Posts: 8
Liked: never
Joined: Nov 03, 2014 11:25 am
Post by mih »

The update for Server 2016 is the one mentioned here:
https://blogs.technet.microsoft.com/fil ... kb4025334/

but yeah, the text "Addressed issue where NTFS sparse files were unexpectedly truncated (NTFS sparse files are used by Data Deduplication—deduplicated files may be unexpectedly corrupted as a result). Also updated chkdsk to detect which files are corrupted. " is missing in the update for Server 2016

We have KB4025334 installed, and have not encountered the problem since.
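To check whether a given repository server reports that KB, something like the following Python sketch could be used. It assumes Windows with PowerShell available and relies on the `Get-HotFix` cmdlet; the helper names are hypothetical. Because cumulative updates supersede each other, a server on a later cumulative update may not list KB4025334 by that number, so a miss here is a prompt to look closer rather than proof the fix is absent.

```python
import re
import subprocess

def installed_kbs(hotfix_text):
    """Extract the set of KB identifiers (e.g. 'KB4025334') from arbitrary text."""
    return {kb.upper() for kb in re.findall(r"KB\d+", hotfix_text, flags=re.IGNORECASE)}

def query_hotfixes():
    """Ask PowerShell for the installed hotfix IDs (Windows only)."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command",
         "Get-HotFix | Select-Object -ExpandProperty HotFixID"],
        capture_output=True, text=True, errors="replace",
    )
    return installed_kbs(result.stdout)
```

On a repository server, `"KB4025334" in query_hotfixes()` then gives a quick yes or no.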