pacmantravis
Novice
Posts: 6
Liked: never
Joined: Dec 24, 2009 8:32 pm
Full Name: Travis Nieves

WAN Replication failing

Post by pacmantravis » Apr 08, 2010 5:03 am

We are trying to replicate VMs from one data center to another (vSphere ESXi hosts on both sides) and the replication job keeps failing after around 70-90GB of transfer.

The speed and latency between the two data centers are actually pretty decent: average throughput is around 20-30 Mbps each way, and latency is around 10-12 ms. The connection goes through an IPsec VPN tunnel.

We have tried replicating different VMs using all replication methods and all of them have the same error:

Client error: TCP error, code: [10054].An existing connection was forcibly closed by the remote host

I've opened a ticket with support, but haven't really gotten too far with them (they are working their butts off to try and help us out though).

I was just wondering if anyone here had the same issues and if so, what you did to fix it.

We have tried large FTP and large FastSCP and WinSCP transfers and the transfers work fine; it's just the replication jobs that aren't working.

Vitaliy S.
Product Manager
Posts: 23886
Liked: 1756 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: WAN Replication failing

Post by Vitaliy S. » Apr 08, 2010 9:39 am

Hello Travis,

Have you tried setting up the replication job for one VM only? Does it also fail? Since this might be some kind of connectivity issue, I would also suggest trying the replica seeding option, so that only changed blocks are transferred over the link between your two datacenters on the next run.

And yes, please continue working with support.

Thank you!

Gostev
SVP, Product Management
Posts: 26122
Liked: 4067 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: WAN Replication failing

Post by Gostev » Apr 08, 2010 12:40 pm

Travis, did you try to replicate the same VM locally (not across the WAN) to see if the issue goes away? If it does, this would indicate that the issue is somehow connected with your WAN/VPN link. It would be a good test in any case, to locate where the issue sits.

Re: WAN Replication failing

Post by pacmantravis » Apr 08, 2010 8:33 pm

Hi All,

The replication job is set for just one VM at the moment. We have successfully replicated locally, so the issue does seem to be isolated to the transfer between the two hosts over the VPN link.

However, it's odd that while large file transfers over other protocols work, the replication process does not. Also, since these are both ESXi hosts, there doesn't seem to be much that can be done on the hosts themselves to help fix this issue.

We may contact some WAN optimization vendors to try and see if their products can help out.

Re: WAN Replication failing

Post by Gostev » Apr 08, 2010 9:13 pm

Travis, have you seen this thread? It sounds exactly like your issue:
Problems with making replication work across a WAN/VPN?

Re: WAN Replication failing

Post by pacmantravis » Apr 08, 2010 11:31 pm

Yes, I have.

Unfortunately, most of the recommendations I have seen are for ESX servers only, not ESXi. We have reduced the MTU setting on the Veeam server itself to 1400, but it is still a no-go.

Ideally we would want to change the MTU settings on the ESXi hosts themselves, but that does not seem possible.
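
For anyone wanting to check what MTU the tunnel path actually passes, a common approach is to ping with the Don't Fragment bit set and the largest payload that should fit. Below is a minimal sketch of the arithmetic, assuming IPv4 with a 20-byte IP header (no options) and an 8-byte ICMP header. The helper names and the target address 192.0.2.10 are hypothetical; `-f` and `-l` are the Windows `ping` flags for "don't fragment" and payload size. Keep in mind that an IPsec tunnel adds its own overhead, so the usable size across the VPN is normally smaller than the interface MTU.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def ping_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError("MTU is smaller than the IPv4 + ICMP headers")
    return payload


def windows_ping_command(host: str, mtu: int) -> list:
    """Build a Windows ping command with Don't Fragment set (-f) and payload size (-l)."""
    return ["ping", "-f", "-l", str(ping_payload_for_mtu(mtu)), host]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; use a host on the far side of the VPN.
    for mtu in (1500, 1400):
        print(mtu, " ".join(windows_ping_command("192.0.2.10", mtu)))
```

If the ping with the larger payload is reported as needing fragmentation while the smaller one gets through, the path MTU lies somewhere between the two values.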

dreddaway
Novice
Posts: 6
Liked: never
Joined: Mar 10, 2010 11:02 pm
Full Name: David Reddaway
Contact:

Re: WAN Replication failing

Post by dreddaway » Jun 08, 2010 8:10 pm

Check out this thread:
VEEAM with Riverbed

joel0920
Novice
Posts: 8
Liked: 1 time
Joined: Jan 13, 2011 3:45 pm
Full Name: Jonny Elvelin
Contact:

An existing connection was forcibly closed by the remote hos

Post by joel0920 » Dec 28, 2011 8:15 am

[merged]

Replicating between two ESXi 4.1 hosts over a 3 Mbit WAN link gives this error every time:


2011-12-28 02:26:46 :: Error: Client error: An existing connection was forcibly closed by the remote host
Unable to retrieve next block transmission command. Number of already processed blocks: [120821].
Failed to replicate content of the disk [vddk://<vddkConnSpec><viConn name="192.168.0.8" authdPort="902" vicPort="443" /><vmxPath vmRef="48" datacenterRef="ha-datacenter" datacenterInventoryPath="ha-datacenter" snapshotRef="48-snapshot-617" datastoreName="xxxxxxxxxx" path="Windows Server 2003 Standard Edi/Windows Server 2003 S

The replication was set up with Veeam 5 and worked (slowly) for several months.

Now I am trying to speed it up with Veeam 6.
The job starts and runs 10 times faster, but stops before it finishes, after a different amount every time.
At most I have processed 194 GB out of 200 (14 GB of changes transferred).

I have installed a clean Windows 7 VM at the source end to function as a proxy,
and tried with both a 2003 backup server and a 2008 R2 SP1 Veeam backup server at the destination side.
The backup servers run as VMs on a separate ESX host.
I have a previously copied version of the VM in the destination datastore that I use for replica mapping,
so only the changes need to be transferred over the WAN.

I have verified that I have a 1500 MTU over the line both ways,
and nothing is blocked whatsoever: all TCP, UDP and ICMP are allowed.

help needed
/Jonny

Re: WAN Replication failing

Post by Vitaliy S. » Dec 28, 2011 8:36 am

It looks like your WAN connection is pretty unstable, which is why your replication job stops randomly. Try using WAN accelerators (search these forums for Hyper-IP or Riverbed); that should help.

Re: WAN Replication failing

Post by joel0920 » Dec 28, 2011 8:45 am

I don't know about others, but my WAN is stable. I have a vCenter connection to the source (over the WAN link), and that never fails, even when the replication stops.

foggy
Veeam Software
Posts: 19064
Liked: 1696 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: WAN Replication failing

Post by foggy » Dec 28, 2011 9:12 am

Jonny, please continue investigating this with our technical support team. It's hard to troubleshoot such issues via the forums, and full logs are definitely required. Thanks.

Re: WAN Replication failing

Post by Gostev » Dec 28, 2011 10:48 am

The connection does not drop; it is actively closed by one of the agents. We need to see the logs to find out why.

Re: WAN Replication failing

Post by joel0920 » Dec 28, 2011 3:50 pm

When the head is stupid...

I'm on my way to solving this. It seems it was the Windows 7 proxy that disconnected the session.
OK, in a standard Windows 7 installation, what is on by default? Yes, sleep mode...
Even when it's installed as a VM, it goes to sleep whenever it pleases,
and then it disconnects everything.
But when I went in through the vCenter console and checked... yes, it was there, alive and kicking.
The wake-up from sleep is so quick that you don't see it.

So now sleep mode is OFF and the replication has been started. I'll get back on whether this solved the problem; it will take 10-15 hours to know for sure.

HAPPY NEW YEAR!

Re: WAN Replication failing

Post by Gostev » Dec 28, 2011 4:09 pm

Ouch :D

Re: WAN Replication failing

Post by joel0920 » Jan 02, 2012 12:35 pm

Nope :-(

That was NOT the issue.
It stopped with the same error.

I left the job running and it was disconnected 2-3 times (retries),
and then it suddenly ran OK?
This was during the holidays, so there were no config changes...
and no sleeping.

Now I'm confused: it has now worked 2-3 times during business hours.

Re: WAN Replication failing

Post by Gostev » Jan 02, 2012 2:41 pm

Hi Jonny, this issue is environmental - some process actively terminates the network connection (OS shutdown, antivirus, etc). Thanks.
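
One way to narrow down this kind of environmental problem is to hold a single, long-lived TCP connection open between the two machines outside of the backup software and log when it gets closed or reset. The following is only a rough sketch of such a probe, not anything Veeam ships: the function names, heartbeat payload and intervals are made up for illustration, and it uses nothing but the Python standard library. The echo server part would run on one side and `probe` on the other; here both run on loopback just to show the mechanics.

```python
import socket
import threading
import time


def run_echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo bytes back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data)


def probe(host: str, port: int, beats: int, interval: float) -> int:
    """Send heartbeats over one TCP connection; return how many were echoed back."""
    echoed = 0
    with socket.create_connection((host, port), timeout=10) as sock:
        for _ in range(beats):
            try:
                sock.sendall(b"ping")
                if not sock.recv(1024):
                    print(time.strftime("%H:%M:%S"), "connection closed by remote side")
                    break
            except OSError as exc:  # covers connection resets and timeouts
                print(time.strftime("%H:%M:%S"), "connection failed:", exc)
                break
            echoed += 1
            time.sleep(interval)
    return echoed


if __name__ == "__main__":
    # Loopback demo; in practice the server would run at the remote site.
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]
    threading.Thread(target=run_echo_server, args=(listener,), daemon=True).start()
    print("heartbeats echoed:", probe("127.0.0.1", port, beats=3, interval=0.1))
```

If a probe like this survives for many hours while the replication job still fails, that points away from the link itself and toward something on the proxies or hosts; if the probe dies at the same moments, the timestamps give something to compare against firewall, VPN and antivirus logs.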

mazcredi
Influencer
Posts: 11
Liked: never
Joined: Dec 13, 2011 5:34 pm
Full Name: Mazen Credi
Contact:

Re: WAN Replication failing

Post by mazcredi » Jan 30, 2012 3:33 pm

I am having the EXACT same problem as Joel, even the WAN link is the same speed. If I do a BACKUP over the WAN, it works fine over a 20-hour period but then a subsequent replication of the exact same VM fails with this connection issue. I know it is tempting to say, well the WAN is not stable, but frankly we have a host of other automated transfer jobs over the WAN that never fail due to connectivity issues. I am working with support on this but it has been a few weeks now and after trying a few different things, we still have not come to a resolution.

@Joel, I see your last post from earlier this month, were you ever able to get this straightened out?

tsightler
VP, Product Management
Posts: 5617
Liked: 2429 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: WAN Replication failing

Post by tsightler » Jan 30, 2012 3:37 pm

Do you have multiple replications running? Are your proxies used for other tasks?

Re: WAN Replication failing

Post by mazcredi » Jan 30, 2012 3:49 pm

No, just a single one; both source and target proxies are dedicated to this one job.

Re: WAN Replication failing

Post by tsightler » Jan 30, 2012 5:28 pm

Have you tried using Network mode on the target proxy?

Re: WAN Replication failing

Post by mazcredi » Jan 30, 2012 10:55 pm

That is the mode configured for the target.

mysticjay
Novice
Posts: 3
Liked: never
Joined: Mar 03, 2012 2:14 pm
Full Name: Matthias
Contact:

Re: WAN Replication failing

Post by mysticjay » Mar 17, 2012 10:45 am

Is there any final solution for this?
My (stopped) Ubuntu VM fails on subsequent replication runs.
The first run is OK; when I start the job again (5 restore points), it fails with "Eine vorhandene Verbindung wurde vom Remotehost geschlossen. Failed to process [sendSignature]" (an existing connection was closed by the remote host).

Mystic

Re: WAN Replication failing

Post by Vitaliy S. » Mar 18, 2012 10:03 pm

Matthias, try replicating this server locally. If that works, then most likely there is some kind of environmental issue that actively terminates the network connection when your job is targeted at an offsite location.

PK_GAA
Influencer
Posts: 11
Liked: never
Joined: Jun 14, 2012 1:51 pm
Full Name: Peter K.
Contact:

Re: WAN Replication failing

Post by PK_GAA » Aug 08, 2012 7:59 am

The "An existing connection was forcibly closed by the remote host" problem occurred on our system too. It does not occur regularly, only now and then, so I couldn't find out why the connection got closed.
The wireless LAN connection to the building where the replication hardware is set up was online the whole time. According to the LAN monitor, the connection did not drop. Any ideas what we could check to find out what happened?

Some information about our system:
* 3 ESXi hosts: 2 of them in HA mode; the third is the backup host, directly connected to the iSCSI backup storage.
* On the production system we run 10 VMs (including 1 storage server, 1 SQL server, 1 vCenter client (which is also the Veeam B&R server), 1 domain controller).
* Production and backup hardware are set up in two buildings, connected via a direct point-to-point wireless LAN link (the connection is monitored and up 99.9% of the time).
* Backup proxies are: the VBR server itself, the backup vCenter client, 1 VM on the backup side, 1 VM on the production side, 1 physical machine. (Would it be better to have more physical machines as proxies?)

Re: WAN Replication failing

Post by Gostev » Aug 08, 2012 8:33 am

One idea I have: since you are using a wireless LAN connection, it is possible that there was an extended period of packet loss due to interference. We are making some changes in Patch 1 for 6.1 (to be released in the next couple of weeks) that should make our engine more resistant to such packet loss.

Re: WAN Replication failing

Post by PK_GAA » Aug 08, 2012 8:50 am

Sounds promising. We are looking forward to that.

tietzjd
Influencer
Posts: 16
Liked: 2 times
Joined: Nov 09, 2011 3:00 pm
Full Name: Joe Tietz
Contact:

Re: WAN Replication failing

Post by tietzjd » Sep 18, 2012 7:27 pm

Still seeing this error on 6.1. I set up a proxy at the remote site with 8 CPU cores, and we have a 100 Mb direct-connect QMOE (fiber line) linking the 2 sites that is dedicated to backups.

So far, even with WAN optimization (within Veeam) and a proxy on each side, I am not achieving more than 4 MB/s, and I keep getting this: Error: Client error: End of file Unable to retrieve next block transmission command. Number of already processed blocks: [157002].

I know I am trying a rather large backup (500 GB), but with a 100 Mb direct connection from switch to switch and sub-1 ms latency, I would not expect to lose the connection.

dellock6
Veeam Software
Posts: 5893
Liked: 1715 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: WAN Replication failing

Post by dellock6 » Sep 18, 2012 9:53 pm

Hi Joe,
please remember that a 100 Mbit/s line gives you a theoretical maximum speed of 12.5 MB/s. Since WAN optimization uses the highest compression level, it could actually slow replication down, because the data has to be heavily compressed before it is sent. So, regarding speed, you can also try changing to LAN compression.
The line dropping is more worrying, though. Can you try a simple copy/paste of a large file to check whether you can reproduce the error, or whether it happens with Veeam only?
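
To put the 12.5 MB/s figure in context, here is the arithmetic as a small sketch. It assumes decimal units (1 GB = 1000 MB), ignores protocol overhead and compression, and the function names are purely illustrative; the 500 GB and 4 MB/s values are the ones reported earlier in the thread.

```python
def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert a line rate in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8


def transfer_hours(size_gb: float, rate_mbyte_per_s: float) -> float:
    """Hours needed to move size_gb at a steady rate, assuming 1 GB = 1000 MB."""
    seconds = size_gb * 1000 / rate_mbyte_per_s
    return seconds / 3600


if __name__ == "__main__":
    line_rate = mbit_to_mbyte_per_s(100)  # theoretical ceiling of a 100 Mbit/s line
    print(f"line ceiling: {line_rate} MB/s")
    print(f"500 GB at the line ceiling: {transfer_hours(500, line_rate):.1f} h")
    print(f"500 GB at the observed 4 MB/s: {transfer_hours(500, 4):.1f} h")
```

Under these assumptions, a full 500 GB pass takes roughly 11 hours at the theoretical ceiling and closer to 35 hours at 4 MB/s, so the job has to keep its connection alive for a long time either way.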

Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2020
Veeam VMCE #1

Re: WAN Replication failing

Post by Gostev » Sep 18, 2012 10:51 pm

Hi Joe, do you have patch 1 installed?

Re: WAN Replication failing

Post by tietzjd » Sep 19, 2012 1:01 am

Gostev: I will add SP1 on the next pass or failure. Luca: I will run that test after this job; I'd rather let this one run to completion. My only worry is what will happen when the normal backups kick off tonight. We will see.
