We are currently doing DR testing.
Scenario: restoring application items (all SQL DBs from a source Windows Agent on a physical host) to a new physical Windows Agent host.
If I perform this restore from the performance tier, all 7 TB of DBs restore without any issues.
If I perform the same restore after putting the performance tier into maintenance mode, forcing the restore to come from the capacity tier, the job always fails in the same way, but at a different point in the restore each time.
The failure can occur while restoring one database or another, 1:30m into the job or 8h into the job.
What is consistent is that the target server for the restore logs a series of warnings and errors to the System log, then the job fails at whatever point it has reached and hard fails the rest of the DBs.
All of the events are in the System log. Here is an example of each (event IDs 140, 137, and 50):
Log Name: System
Source: Microsoft-Windows-Ntfs
Date: 05/05/2021 22:34:28
Event ID: 140
Task Category: None
Level: Warning
Keywords: (8)
User: SYSTEM
Computer: x
Description:
The system failed to flush data to the transaction log. Corruption may occur in VolumeId: C:\VeeamFLR\Ux_e6c6f41a\Volume0, DeviceName: \Device\HarddiskVdkVolume33.
({Drive Not Ready}
The drive is not ready for use; its door may be open. Please check drive %hs and make sure that a disk is inserted and that the drive door is closed.)
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Microsoft-Windows-Ntfs" Guid="{3ff37a1c-a68d-4d6e-8c9b-f79e8b16c482}" />
<EventID>140</EventID>
<Version>0</Version>
<Level>3</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8000000000000008</Keywords>
<TimeCreated SystemTime="2021-05-05T22:34:28.008137000Z" />
<EventRecordID>14777</EventRecordID>
<Correlation />
<Execution ProcessID="11832" ThreadID="9600" />
<Channel>System</Channel>
<Computer>x</Computer>
<Security UserID="S-1-5-18" />
</System>
<EventData>
<Data Name="VolumeId">C:\VeeamFLR\x_e6c6f41a\Volume0</Data>
<Data Name="DeviceName">\Device\HarddiskVdkVolume33</Data>
<Data Name="Error">0xc00000a3</Data>
</EventData>
</Event>
Log Name: System
Source: Ntfs
Date: 05/05/2021 22:34:28
Event ID: 137
Task Category: (2)
Level: Error
Keywords: Classic
User: N/A
Computer: x
Description:
The default transaction resource manager on volume C:\VeeamFLR\x_e6c6f41a\Volume0 encountered a non-retryable error and could not start. The data contains the error code.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Ntfs" />
<EventID Qualifiers="49156">137</EventID>
<Level>2</Level>
<Task>2</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2021-05-05T22:34:28.009429900Z" />
<EventRecordID>14778</EventRecordID>
<Channel>System</Channel>
<Computer>x</Computer>
<Security />
</System>
<EventData>
<Data>
</Data>
<Data>C:\VeeamFLR\x_e6c6f41a\Volume0</Data>
<Binary>1C0004000200300002000000890004C000000000A30000C000000000000000000000000000000000A30000C0</Binary>
</EventData>
</Event>
Log Name: System
Source: Ntfs
Date: 05/05/2021 22:34:50
Event ID: 50
Task Category: None
Level: Warning
Keywords: Classic
User: N/A
Computer: x
Description:
{Delayed Write Failed} Windows was unable to save all the data for the file . The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Ntfs" />
<EventID Qualifiers="32772">50</EventID>
<Level>3</Level>
<Task>0</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2021-05-05T22:34:50.167607900Z" />
<EventRecordID>14863</EventRecordID>
<Channel>System</Channel>
<Computer>x</Computer>
<Security />
</System>
<EventData>
<Data>
</Data>
<Data>
</Data>
<Binary>04000400020030000000000032000480000000006E0200C0000000000000000000000000000000006E0200C0</Binary>
</EventData>
</Event>
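For anyone reading the dumps above, the Binary payloads in events 137 and 50 carry NTSTATUS codes stored as little-endian 32-bit words. Below is a minimal Python sketch that pulls them out. The helper names (`decode_dwords`, `status_names`) and the tiny lookup table are my own for illustration, and the sketch simply scans aligned 32-bit words rather than parsing any documented structure, so treat it as a rough aid rather than a proper decoder.

```python
import struct

# NTSTATUS values seen in the events above (names as in ntstatus.h).
# Deliberately a tiny illustrative table, not a complete list.
NTSTATUS_NAMES = {
    0xC00000A3: "STATUS_DEVICE_NOT_READY",
    0xC000026E: "STATUS_VOLUME_DISMOUNTED",
}

# <Binary> payloads from events 137 and 50. Spaces are added between
# 32-bit words for readability; bytes.fromhex() ignores whitespace.
EVENT_137 = ("1C000400 02003000 02000000 890004C0 00000000 A30000C0 "
             "00000000 00000000 00000000 00000000 A30000C0")
EVENT_50 = ("04000400 02003000 00000000 32000480 00000000 6E0200C0 "
            "00000000 00000000 00000000 00000000 6E0200C0")


def decode_dwords(hex_blob):
    """Interpret the hex string as consecutive little-endian 32-bit words."""
    raw = bytes.fromhex(hex_blob)
    count = len(raw) // 4  # ignore any trailing partial word
    return list(struct.unpack("<%dI" % count, raw[: count * 4]))


def status_names(hex_blob):
    """Return known NTSTATUS names found in the blob, in order, without repeats."""
    names = []
    for word in decode_dwords(hex_blob):
        name = NTSTATUS_NAMES.get(word)
        if name and name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    print("Event 137:", status_names(EVENT_137))
    print("Event 50: ", status_names(EVENT_50))
```

Event 137 carries the same 0xC00000A3 (device not ready) that event 140 reports in its Error field, while event 50 carries 0xC000026E (volume dismounted). That reads to me as the mounted C:\VeeamFLR volume going away mid-restore, but that is only my interpretation of the codes, not a confirmed cause.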
Can anyone shed any light on this?