Standalone backup agent for Microsoft Windows servers and workstations (formerly Veeam Endpoint Backup FREE)
solmssen
Influencer
Posts: 17
Liked: never
Joined: Jan 31, 2019 8:33 pm
Full Name: Andrew Solmssen
Contact:

Corrupt Data - Case #04723261

Post by solmssen »

Hi - I am getting warnings from Veeam that there is bad data in a backup, but I've run chkdsk numerous times, and it's not finding any issues.

Code:

3/26/2021 4:00:02 AM :: Initializing
3/26/2021 4:00:25 AM :: Preparing for backup
3/26/2021 4:00:25 AM :: Backup file will be encrypted
3/26/2021 4:00:30 AM :: Creating VSS snapshot
3/26/2021 4:00:47 AM :: Calculating digests
3/26/2021 4:01:02 AM :: EFI system partition (disk 0) (100.0 MB) 100.0 MB read at 10 MB/s
3/26/2021 4:01:12 AM :: (C:) (1.8 TB) 18.0 GB read at 280 MB/s [CBT]
3/26/2021 4:02:18 AM :: Recovery partition (disk 0) (505.0 MB) 419.0 MB read at 419 MB/s
3/26/2021 4:02:20 AM :: Finalizing
3/26/2021 4:02:23 AM :: Incremental backup created
3/26/2021 4:02:29 AM :: Backup files health check has been completed
3/26/2021 6:50:04 AM :: Disk 0 of machine BITBOY-I9 is corrupted, possible reason: Storage I/O issue. Corrupted data is located in the following backup files: Backup Job BITBOY-I92021-03-18T042742.vbk
3/26/2021 6:50:16 AM :: Email notification was sent
3/26/2021 6:50:14 AM :: Processing finished with errors at 3/26/2021 6:50:14 AM
3/26/2021 6:50:19 AM :: Health Check Retry: 1
3/26/2021 6:50:19 AM :: Initializing
3/26/2021 6:50:45 AM :: Preparing for backup
3/26/2021 6:50:46 AM :: Backup file will be encrypted
3/26/2021 6:50:51 AM :: Creating VSS snapshot
3/26/2021 6:51:08 AM :: Calculating digests
3/26/2021 6:51:23 AM :: EFI system partition (disk 0) (100.0 MB) 100.0 MB read at 10 MB/s
3/26/2021 6:51:33 AM :: (C:) (1.8 TB) 1.1 TB read at 1 GB/s
3/26/2021 7:09:52 AM :: Recovery partition (disk 0) (505.0 MB) 421.0 MB read at 421 MB/s
3/26/2021 7:09:54 AM :: Finalizing
3/26/2021 7:09:57 AM :: Incremental backup created
3/26/2021 7:10:04 AM :: Email notification was sent
3/26/2021 7:10:03 AM :: Processing finished at 3/26/2021 7:10:03 AM

What should I be looking for here? Backup probably stresses storage I/O as much as anything else I do on this machine, so I'm curious whether you see signs of a real storage problem.

Gostev
SVP, Product Management
Posts: 28162
Liked: 4971 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Corrupt Data - Case #04723261

Post by Gostev »

How did you run chkdsk?

Re: Corrupt Data - Case #04723261

Post by solmssen »

From an elevated prompt, I ran chkdsk c: /f. It reported that the disk could not be locked and that the check would run at the next boot, which it did. The results were written to the event log (source: Wininit), as shown below:

Code:

Checking file system on C:
The type of the file system is NTFS.

A disk check has been scheduled.
Windows will now check the disk.                         

Stage 1: Examining basic file system structure ...
  947200 file records processed.                                                         
File verification completed.
 Phase duration (File record verification): 4.83 seconds.
  13815 large file records processed.                                    
 Phase duration (Orphan file record recovery): 0.00 milliseconds.
  0 bad file records processed.                                      
 Phase duration (Bad file record checking): 0.67 milliseconds.

Stage 2: Examining file name linkage ...
  43888 reparse records processed.                                       
  1238186 index entries processed.                                                        
Index verification completed.
 Phase duration (Index verification): 13.30 seconds.
  0 unindexed files scanned.                                         
 Phase duration (Orphan reconnection): 1.38 seconds.
  0 unindexed files recovered to lost and found.                     
 Phase duration (Orphan recovery to lost and found): 517.35 milliseconds.
  43888 reparse records processed.                                       
 Phase duration (Reparse point and Object ID verification): 58.80 milliseconds.

Stage 3: Examining security descriptors ...
Cleaning up 6479 unused index entries from index $SII of file 0x9.
Cleaning up 6479 unused index entries from index $SDH of file 0x9.
Cleaning up 6479 unused security descriptors.
Security descriptor verification completed.
 Phase duration (Security descriptor verification): 53.56 milliseconds.
  145494 data files processed.                                            
 Phase duration (Data attribute verification): 0.72 milliseconds.
CHKDSK is verifying Usn Journal...
  35666592 USN bytes processed.                                                            
Usn Journal verification completed.
 Phase duration (USN journal verification): 90.94 milliseconds.

Windows has scanned the file system and found no problems.
No further action is required.

1952875578 KB total disk space.
1217868656 KB in 703169 files.
    404140 KB in 145495 indexes.
         0 KB in bad sectors.
   1116946 KB in use by the system.
     65536 KB occupied by the log file.
 733485836 KB available on disk.

      4096 bytes in each allocation unit.
 488218894 total allocation units on disk.
 183371459 allocation units available on disk.
Total duration: 20.34 seconds (20347 ms).

Internal Info:
00 74 0e 00 17 f3 0c 00 b0 a5 16 00 00 00 00 00  .t..............
fe aa 00 00 72 00 00 00 00 00 00 00 00 00 00 00  ....r...........

Windows has finished checking your disk.
Please wait while your computer restarts.

An online scan (chkdsk c: /scan) just now showed similar results:

Code:

C:\Windows\system32>chkdsk c: /scan
The type of the file system is NTFS.

Stage 1: Examining basic file system structure ...
  947200 file records processed.
File verification completed.
 Phase duration (File record verification): 6.53 seconds.
  13199 large file records processed.
 Phase duration (Orphan file record recovery): 0.00 milliseconds.
  0 bad file records processed.
 Phase duration (Bad file record checking): 0.68 milliseconds.

Stage 2: Examining file name linkage ...
  43914 reparse records processed.
  1245990 index entries processed.
Index verification completed.
 Phase duration (Index verification): 14.26 seconds.
  0 unindexed files scanned.
 Phase duration (Orphan reconnection): 1.57 seconds.
  0 unindexed files recovered to lost and found.
 Phase duration (Orphan recovery to lost and found): 0.92 milliseconds.
  43914 reparse records processed.
 Phase duration (Reparse point and Object ID verification): 74.48 milliseconds.

Stage 3: Examining security descriptors ...
Security descriptor verification completed.
 Phase duration (Security descriptor verification): 22.65 milliseconds.
  149396 data files processed.
 Phase duration (Data attribute verification): 0.68 milliseconds.
CHKDSK is verifying Usn Journal...
  34233480 USN bytes processed.
Usn Journal verification completed.
 Phase duration (USN journal verification): 96.01 milliseconds.

Windows has scanned the file system and found no problems.
No further action is required.

1952875578 KB total disk space.
1222991264 KB in 717912 files.
    413232 KB in 149397 indexes.
         0 KB in bad sectors.
   1115278 KB in use by the system.
     65536 KB occupied by the log file.
 728355804 KB available on disk.

      4096 bytes in each allocation unit.
 488218894 total allocation units on disk.
 182088951 allocation units available on disk.
Total duration: 22.57 seconds (22573 ms).

Re: Corrupt Data - Case #04723261

Post by Gostev »

A total process duration of a mere 20 seconds means you did not perform a full disk scan, which is essential to detect an issue with particular disk blocks.
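
A rough sanity check of the 20-second point, as a sketch only: take the in-use figure and the total duration from the chkdsk c: /f output earlier in the thread and compute the read rate that a scan of all file data would have implied. The function name is illustrative, and treating chkdsk's "KB" as 1024 bytes is an assumption.

```python
def implied_read_rate_gb_per_s(used_kb, duration_s):
    """Read rate in GB/s (decimal) needed to read used_kb of data in duration_s."""
    bytes_read = used_kb * 1024  # assumption: "KB" in chkdsk output means 1024 bytes
    return bytes_read / duration_s / 1e9

# Figures copied from the chkdsk c: /f output:
# "1217868656 KB in 703169 files" and "Total duration: 20.34 seconds".
rate = implied_read_rate_gb_per_s(used_kb=1217868656, duration_s=20.34)
print(f"{rate:.1f} GB/s")
```

The result is on the order of 60 GB/s, while the job log earlier in the thread reports about 1 GB/s for a full read of the same volume, so a 20-second run cannot have read the file data itself.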

Re: Corrupt Data - Case #04723261

Post by Gostev »

I just read quickly through the chkdsk /? output, and it looks like checking the entire disk surface for bad sectors is only performed when using the /R parameter.

Re: Corrupt Data - Case #04723261

Post by solmssen »

I did a chkdsk c: /r and got this output:

Code:

Checking file system on C:
The type of the file system is NTFS.

A disk check has been scheduled.
Windows will now check the disk.                         

Stage 1: Examining basic file system structure ...
  947200 file records processed.                                                         
File verification completed.
 Phase duration (File record verification): 4.99 seconds.
  13196 large file records processed.                                    
 Phase duration (Orphan file record recovery): 0.00 milliseconds.
  0 bad file records processed.                                      
 Phase duration (Bad file record checking): 0.66 milliseconds.

Stage 2: Examining file name linkage ...
  43914 reparse records processed.                                       
  1245972 index entries processed.                                                        
Index verification completed.
 Phase duration (Index verification): 13.51 seconds.
  0 unindexed files scanned.                                         
 Phase duration (Orphan reconnection): 1.41 seconds.
  0 unindexed files recovered to lost and found.                     
 Phase duration (Orphan recovery to lost and found): 55.27 milliseconds.
  43914 reparse records processed.                                       
 Phase duration (Reparse point and Object ID verification): 94.15 milliseconds.

Stage 3: Examining security descriptors ...
CHKDSK is compacting the security descriptor stream
Cleaning up 223 unused security descriptors.
  149387 data files processed.                                            
 Phase duration (Data attribute verification): 0.72 milliseconds.
CHKDSK is verifying Usn Journal...
  34945632 USN bytes processed.                                                            
Usn Journal verification completed.
 Phase duration (USN journal verification): 84.53 milliseconds.

Stage 4: Looking for bad clusters in user file data ...
  947184 files processed.                                                                
File data verification completed.
 Phase duration (User file recovery): 25.16 minutes.

Stage 5: Looking for bad, free clusters ...
  182197648 free clusters processed.                                                        
Free space verification is complete.
 Phase duration (Free space recovery): 0.00 milliseconds.
Correcting errors in the Volume Bitmap.

Windows has made corrections to the file system.
No further action is required.

1952875578 KB total disk space.
1222561688 KB in 717657 files.
    412204 KB in 149390 indexes.
         0 KB in bad sectors.
   1111094 KB in use by the system.
     65536 KB occupied by the log file.
 728790592 KB available on disk.

      4096 bytes in each allocation unit.
 488218894 total allocation units on disk.
 182197648 allocation units available on disk.
Total duration: 25.50 minutes (1530373 ms).

Internal Info:
00 74 0e 00 e4 3a 0d 00 56 17 17 00 00 00 00 00  .t...:..V.......
14 ab 00 00 76 00 00 00 00 00 00 00 00 00 00 00  ....v...........

Windows has finished checking your disk.
Please wait while your computer restarts.

Re: Corrupt Data - Case #04723261

Post by Gostev »

Are you checking the correct drive? You should be checking the volume where this file is located: BITBOY-I92021-03-18T042742.vbk

Re: Corrupt Data - Case #04723261

Post by solmssen »

Then this error message is not very clear?
>>> 3/26/2021 4:02:29 AM :: Backup files health check has been completed
>>> 3/26/2021 6:50:04 AM :: Disk 0 of machine BITBOY-I9 is corrupted, possible reason: Storage I/O issue. Corrupted data is located in the following backup files: Backup Job BITBOY-I92021-03-18T042742.vbk

Disk 0 of this machine is the main storage drive with the boot partition and the C drive. The named file in question is on the external backup drive, which is listed as Disk 1 in the Disk Management MMC. I thought this message was saying that it was comparing data from source to backup and finding problems with corrupt data on disk 0. If the corrupt data is in fact on the backup disk, this is the second or third time this has happened to me with multiple computers and multiple drives.

Can you clarify here what this error actually means? Is Veeam checking backup health and finding internal corruption in the large VBK file, or noting that data on the source drive is corrupt somehow?

Also, note the timestamps: it says the health check is completed at 4:02:29, and then complains almost 3 hours later about the corruption. What is actually happening here?
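
For reference, the gap between those two log lines can be computed directly. This is a minimal sketch that parses the timestamp format used in the job log above; the helper name is illustrative, and the second log line is abbreviated here.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # e.g. "3/26/2021 4:02:29 AM"

def log_time(line):
    """Parse the timestamp that precedes ' :: ' in a job log line."""
    stamp = line.split(" :: ", 1)[0]
    return datetime.strptime(stamp, LOG_TIME_FORMAT)

completed = log_time("3/26/2021 4:02:29 AM :: Backup files health check has been completed")
reported = log_time("3/26/2021 6:50:04 AM :: Disk 0 of machine BITBOY-I9 is corrupted")

gap = reported - completed
print(gap)  # 2:47:35
```

So the corruption message is logged about 2 hours 48 minutes after the "completed" line.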

Mildur
Veeam Legend
Posts: 674
Liked: 307 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian Kessler
Contact:

Re: Corrupt Data - Case #04723261

Post by Mildur »

>>> Corrupted data is located in the following backup files: Backup Job BITBOY-I92021-03-18T042742.vbk

Your backup file contains the corrupted data, not your production machine. Veeam detected corrupted blocks inside your backup file.

Most likely, some blocks on your backup repository disk are corrupted.
Therefore, run chkdsk against your backup repository disk, not your production machine.

Re: Corrupt Data - Case #04723261

Post by solmssen »

I just did a chkdsk on my backup drive, and it didn't find an error. My concern here is that this is a brand-new backup disk. I was getting similar but not identical errors with the previous backup disk, which I tested every way I could think of and found no errors. I replaced it anyway, and here we are again. It worries me. Is there a way to run a health check manually and perhaps learn more? There is a KB about this, but the named executable isn't on my machine running just the Agent in free mode. I do have a full Veeam B&R 10.x community edition install in my lab, and I could move the backup files there to test them if necessary. I've done several test restores, including of a large VHDX file, but have not done a full image restore to a physical disk - is there a way to restore to a VHDX? Thanks to all for your help!

chkdsk results:

Code:

C:\Windows\system32>chkdsk x: /r
The type of the file system is NTFS.

Chkdsk cannot run because the volume is in use by another
process.  Chkdsk may run if this volume is dismounted first.
ALL OPENED HANDLES TO THIS VOLUME WOULD THEN BE INVALID.
Would you like to force a dismount on this volume? (Y/N) y
Volume dismounted.  All opened handles to this volume are now invalid.
Volume label is BACKUP-5TB.

Stage 1: Examining basic file system structure ...
  256 file records processed.
File verification completed.
 Phase duration (File record verification): 510.00 milliseconds.
  0 large file records processed.
 Phase duration (Orphan file record recovery): 1.05 milliseconds.
  0 bad file records processed.
 Phase duration (Bad file record checking): 2.20 milliseconds.

Stage 2: Examining file name linkage ...
  12 reparse records processed.
  300 index entries processed.
Index verification completed.
 Phase duration (Index verification): 150.67 milliseconds.
  0 unindexed files scanned.
 Phase duration (Orphan reconnection): 0.67 milliseconds.
  0 unindexed files recovered to lost and found.
 Phase duration (Orphan recovery to lost and found): 0.84 milliseconds.
  12 reparse records processed.
 Phase duration (Reparse point and Object ID verification): 2.09 milliseconds.

Stage 3: Examining security descriptors ...
Security descriptor verification completed.
 Phase duration (Security descriptor verification): 38.82 milliseconds.
  23 data files processed.
 Phase duration (Data attribute verification): 0.97 milliseconds.
CHKDSK is verifying Usn Journal...
  12810304 USN bytes processed.
Usn Journal verification completed.
 Phase duration (USN journal verification): 145.88 milliseconds.

Stage 4: Looking for bad clusters in user file data ...
  240 files processed.
File data verification completed.
 Phase duration (User file recovery): 5.89 hours.

Stage 5: Looking for bad, free clusters ...
  620373795 free clusters processed.
Free space verification is complete.
 Phase duration (Free space recovery): 0.00 milliseconds.

Windows has scanned the file system and found no problems.
No further action is required.

  4769256 MB total disk space.
  2345699 MB in 59 files.
  92 KB in 24 indexes.
  0 KB in bad sectors.
  228111 KB in use by the system.
  65536 KB occupied by the log file.
  2423335 MB available on disk.

  4096 bytes in each allocation unit.
1220929791 total allocation units on disk.
 620373796 allocation units available on disk.
Total duration: 5.89 hours (21219550 ms).

Re: Corrupt Data - Case #04723261

Post by Gostev »

Bad news about your backup storage then, as it has the worst type of corruption: silent bit rot. This kind of corruption is not detectable at the file system level or without special measures, because the storage simply returns wrong data (different from what was written to it) for whatever reason.

Veeam is able to detect this sort of issue because for each data block we write, we also store a checksum of the content we wrote into the given block. Then, the backup file health check process (aka storage-level corruption guard) reads backup data from disk and compares what was returned with the corresponding checksum. This process cannot be started manually, but you can schedule it to run more often in the backup job settings.
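
The idea can be sketched as follows. This is an illustration only, not Veeam's implementation: the block size, the use of CRC32, and the function names are all assumptions chosen to keep the example short.

```python
import zlib

BLOCK_SIZE = 4  # assumption: tiny blocks so the example is easy to follow

def write_blocks(data):
    """Split data into blocks and record a checksum for each block at write time."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    checksums = [zlib.crc32(block) for block in blocks]
    return blocks, checksums

def health_check(blocks, checksums):
    """Re-read every block and return the indexes whose checksum no longer matches."""
    return [
        index
        for index, (block, expected) in enumerate(zip(blocks, checksums))
        if zlib.crc32(block) != expected
    ]

blocks, checksums = write_blocks(b"backup data payload")
clean_result = health_check(blocks, checksums)  # nothing has changed yet

# Simulate silent bit rot: the storage returns different bytes for block 2.
blocks[2] = b"\x00" + blocks[2][1:]
rotten_result = health_check(blocks, checksums)
print(clean_result, rotten_result)  # [] [2]
```

A file system check never looks at these stored checksums, which is consistent with chkdsk reporting no problems while the health check flags the file.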

There's no point in doing test restores at the moment, since the error in the original post shows that some data blocks are definitely corrupted. But yes, Veeam Backup & Replication supports exporting Veeam Agent backups to VHDX.

I recommend you return and replace your backup storage.

Re: Corrupt Data - Case #04723261

Post by solmssen »

Hi Gostev - thanks for your help. I ran a backup health check last night and it was fine?

Code:

4/6/2021 4:00:03 AM :: Initializing 
4/6/2021 4:00:47 AM :: Preparing for backup 
4/6/2021 4:00:47 AM :: Backup file will be encrypted 
4/6/2021 4:00:54 AM :: Creating VSS snapshot 
4/6/2021 4:01:17 AM :: Calculating digests 
4/6/2021 4:01:32 AM :: EFI system partition (disk 0) (100.0 MB) 100.0 MB read at 10 MB/s
4/6/2021 4:01:42 AM :: (C:) (1.8 TB) 20.1 GB read at 241 MB/s [CBT]
4/6/2021 4:03:07 AM :: Recovery partition (disk 0) (505.0 MB) 419.0 MB read at 419 MB/s
4/6/2021 4:03:09 AM :: Finalizing 
4/6/2021 4:03:15 AM :: Incremental backup created 
4/6/2021 4:03:24 AM :: Backup files health check has been completed 
4/6/2021 6:52:11 AM :: Email notification was sent 
4/6/2021 6:52:09 AM :: Processing finished at 4/6/2021 6:52:09 AM 

Re: Corrupt Data - Case #04723261

Post by Gostev »

This is because, to save time, the health check verifies only the data blocks belonging to the latest restore point, as opposed to every data block in every existing backup file. So a clean result in the last pass does not mean you can safely continue using your backup storage, as we already know it misbehaves.
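
A toy model of that scope difference, purely for illustration: the restore point names, block IDs, and the corrupted block are invented, and the real on-disk layout is not described in this thread.

```python
# Hypothetical layout: each restore point needs a set of block IDs to be restored.
restore_points = {
    "2021-03-18 full": {1, 2, 3, 4},
    "2021-04-05 incremental": {1, 2, 5},  # blocks 3 and 4 superseded by block 5
    "2021-04-06 incremental": {1, 5, 6},  # latest restore point
}
corrupted_blocks = {3}  # assumption: block 3 reads back wrong from storage

def check_latest_only(points, corrupted):
    """Corrupted blocks seen by a check limited to the newest restore point."""
    latest = list(points)[-1]  # dicts preserve insertion order
    return points[latest] & corrupted

def check_everything(points, corrupted):
    """Corrupted blocks seen by a check that reads every block of every point."""
    every_block = set().union(*points.values())
    return every_block & corrupted

print(check_latest_only(restore_points, corrupted_blocks))  # set()
print(check_everything(restore_points, corrupted_blocks))   # {3}
```

In this model the latest-only pass comes back clean even though an older restore point still depends on a damaged block.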
