-
- Influencer
- Posts: 13
- Liked: never
- Joined: Nov 24, 2009 12:43 am
- Full Name: Francois Conil
- Contact:
Replica hung after network connectivity issues
Hi,
We are running a replica over a WAN (2 Mbps dedicated link). It usually runs fine (completing in roughly 3-4 hours), but last week it failed midway due to a timeout.
I didn't notice at the time, and the retry kicked in.
Now the replica job is stuck at 40%, showing "checking for previous backups".
I can't stop the job, and I have up to 10 delta files for a given disk (although vSphere shows only two chained snapshots, "consolidate helper" and "veeam backup"). That makes me think more than twice before attempting anything rash like restarting the backup server or the target.
Is there a way to stop the job and consolidate my disks without taking down my virtual server completely? As the need for offsite replicas might suggest, this is quite a critical server. There has been no activity on the target server (replica site) since the connection died midway.
-
- Chief Product Officer
- Posts: 31775
- Liked: 7274 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Replica hung after network connectivity issues
Hello Francois, it would be best to work with our technical support directly on issues like this. They will be able to understand what the job is doing after reviewing the logs, and recommend the best course of action.
Generally, restarting the Veeam server in any situation should not cause any issues for the original VMs, as Veeam Backup performs read-only access to production storage, reading the snapshot data. Of course, this may leave the snapshots behind, as Veeam does not get a chance to issue the command to remove them, but you should be able to remove them manually.
Also, can you let me know if you are running the latest Veeam Backup version? I believe the issue with hung replication due to a network drop/timeout existed before but was fixed in one of the more recent releases.
-
- Influencer
- Posts: 13
- Liked: never
- Joined: Nov 24, 2009 12:43 am
- Full Name: Francois Conil
- Contact:
Re: Replica hung after network connectivity issues
We're running 4.0.
There has been no write access to the destination server since the network failure.
No creation or deletion of snapshots since the incident.
-
- Chief Product Officer
- Posts: 31775
- Liked: 7274 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Replica hung after network connectivity issues
That's what I thought... this issue is listed as resolved in version 4.1 released October last year.
http://www.veeam.com/files/release_note ... _notes.pdf
• Replication and backup jobs hang if SSH connection to the target server drops.
-
- Influencer
- Posts: 13
- Liked: never
- Joined: Nov 24, 2009 12:43 am
- Full Name: Francois Conil
- Contact:
Re: Replica hung after network connectivity issues
That seems spot on.
Here are the last lines of my job log (no modification since job failure):
[16.07.2010 01:02:07] <17> Info (Server) Service output: [2010-07-16 01:02:07.348 04772 info 'App'] Successfully released all resources.\n
[16.07.2010 01:02:07] <17> Info (Server) Service output: [2010-07-16 01:02:07.364 04772 trivia 'SOAP'] Sending soap request to [TCP:127.0.0.1:443]: logout\n
[16.07.2010 01:02:07] <05> Info (Server) Service error: An existing connection was forcibly closed by the remote host
[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Cannot write data to the socket. Data size: [1048576].
[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Failed to serialize data area. ID: [289]. Offset: [2336227328].
[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Failed to send next file block. Block identity: [Data block. Start offset: [2336227328], Length: [1048576], Area ID: [289].].
[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Unable to asynchronously write data block. Block identity: [Data block. Start offset: [2336227328], Length: [1048576], Area ID: [289].].
[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Processing of asynchronous write requests has failed. Output file: [File blocks transmission channel (sender).].
[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Failed to process conveyored task.
[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:FIB uploader: Unable to upload FIB. FIB path: [BLOCKS_READER: DISK=VDDK:[disk] server/server.vmx, CTK=VSPHERE_CTK://viConn=127.0.0.1/VM=vm-2031/Snapshot=snapshot-3163].
[16.07.2010 01:02:07] <05> Info (Server) Service error: Failed to process VM disk backup. VMDK path: [vddk://<vddkConnSpec><viConn name="127.0.0.1" authdPort="443" vicPort="443" /><vmxPath vmRef="vm-2031" datacenterRef="datacenter-2" datacenterInventoryPath="Datacenter" snapshotRef="snapshot-3163" datastoreName="disk" path="server/server.vmx" /><vmdkPath datastoreName="disk" path="server/server-000007.vmdk" /><transports seq="san;nbd" /><readBuffer size="2097152" /></vddkConnSpec>].
[16.07.2010 01:02:08] <04> Info (Server) Service: closed
[16.07.2010 01:02:25] <04> Info [Ssh] Connection::Error, Error: An existing connection was forcibly closed by the remote host, Message: An existing connection was forcibly closed by the remote host
[16.07.2010 01:02:28] <32> Warning SSH2 WatchDog is stopped: An existing connection was forcibly closed by the remote host
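
For anyone who needs to pull the same details out of a longer job log, here is a minimal Python sketch. It assumes every line follows the layout seen in the excerpt above (timestamp in brackets, thread ID in angle brackets, then a level and the message). The names `parse_line` and `first_failed_block` are hypothetical helpers for illustration, not part of any Veeam tool, and the regular expressions are inferred from these few lines only.

```python
import re
from datetime import datetime

# Layout inferred from the excerpt: "[DD.MM.YYYY HH:MM:SS] <NN> Level message".
# The leading "[" is optional because it can get lost when copying a log.
LINE_RE = re.compile(
    r"^\[?(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2})\] <(\d+)>\s+(\w+)\s+(.*)$"
)
# Matches the "Start offset: [N], Length: [M]" pair in the block identity.
BLOCK_RE = re.compile(r"Start offset: \[(\d+)\], Length: \[(\d+)\]")


def parse_line(line):
    """Split one log line into time, thread, level and message, or return None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    stamp, thread, level, message = match.groups()
    return {
        "time": datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S"),
        "thread": thread,
        "level": level,
        "message": message,
    }


def first_failed_block(lines):
    """Return (offset, length) in bytes of the first data block mentioned, or None."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        found = BLOCK_RE.search(entry["message"])
        if found:
            return int(found.group(1)), int(found.group(2))
    return None


# Two lines copied from the excerpt above.
SAMPLE = [
    "[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Cannot write data to the socket. Data size: [1048576].",
    "[16.07.2010 01:02:07] <05> Info (Server) Service error: --tr:Failed to send next file block. Block identity: [Data block. Start offset: [2336227328], Length: [1048576], Area ID: [289].].",
]

offset, length = first_failed_block(SAMPLE)
print(f"failed at byte {offset}, block size {length}, block index {offset // length}")
```

On these lines the offset is an exact multiple of the 1 MiB block size, so the transfer of that data area stopped on a block boundary, which is consistent with the socket being closed between writes.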