bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

replication and ESXi

Post by bc07 »

I am trying to get VM replication working at a reasonable speed from one ESXi 4.1 host over WAN to another ESXi 4.1 host.
WAN bandwidth is 10Mbit and average latency 110ms.
At the source location I have Veeam installed on a physical Win 2008 x64 server with quad core CPU and 8GB RAM.
The Replication Job is configured with Processing Mode Network, Compression Optimal, Storage WAN Target, CBT and application awareness.
The VM is a Win2003 server and the vmdk file has a size of 15GB.

The first replication took 7 hours and the WAN bandwidth usage for the replication maxed out at 4Mbit/s. I'm not sure why it did not use more (there was more available and no QoS limit configured), maybe because of the latency and block size the replication process uses.
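
One way to sanity-check the latency theory is the bandwidth-delay product: a single TCP stream can carry at most one receive window per round trip. The sketch below assumes the 110ms figure is the round-trip time and assumes a 64 KiB window (the classic limit when TCP window scaling is not in effect). Neither assumption is confirmed for this setup, and the function names are only illustrative.

```python
def max_tcp_throughput_mbit(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP stream: one full window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


def window_needed_bytes(link_mbit: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_mbit * 1_000_000 * rtt_seconds / 8


RTT_SECONDS = 0.110               # 110ms from the post, assumed to be round-trip
LINK_MBIT = 10                    # 10Mbit WAN link from the post
ASSUMED_WINDOW_BYTES = 64 * 1024  # assumption: 64 KiB window, no window scaling

print(round(max_tcp_throughput_mbit(ASSUMED_WINDOW_BYTES, RTT_SECONDS), 2))  # 4.77
print(round(window_needed_bytes(LINK_MBIT, RTT_SECONDS)))                    # 137500
```

Under those assumptions a single stream tops out near 4.8Mbit/s, which is in the same range as the 4Mbit/s observed, while filling the 10Mbit link would need roughly 137KB in flight. That is consistent with a latency-bound transfer, but it does not prove it.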

The next replication took 5 hours and 45 minutes. WAN bandwidth usage was minimal (somewhere below 1Mbit, but hard to tell).
When I check the job runs for this replication job, the .vrb file with the date/time of the first run shows a data size of 980MB and a restore point size of 512MB. The .vbk file shows a data size of 15GB and a restore point size of 8MB.

I read in the forum about ESXi being a bad/slow backup target (VMware's fault). I also read somewhere in the forum that a Linux VM should be used as the target instead, but the question is how to get the data/VMs from the Linux VM to the datastore.

So I thought about running the replication from the other site (pulling the data instead of pushing it). I installed Win7 x64 in a VM at the other site, installed Veeam, and configured the replication with the same settings, except that storage is LAN target.
But replication fails after 7 minutes with the message:
GetLocalText failed
Client error: File does not exist or locked. VMFS path: [[cx3_1] ****.vmx].
Please, try to download specified file using connection to the ESX server where the VM registered.
Failed to create NFC download stream. NFC path: [nfc://conn:172.16.0.78,nfchost:host-110,stg:datastore-30@*****.vmx].

Server error: End of file

The initial replication I could do with removable media. But if the incremental replication takes hours for a small VM then it would take days for file and mail server. Moving to ESX 4.1 (the one with the full service console) is not really an option.

What other options are out there besides spending $10k+ on deduplication appliances? Does HyperIP speed up the "incremental" replication, even though replication uses only a small amount of the available bandwidth?

Enrico
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

Hello Enrico,
bc07 wrote:At the source location I have Veeam installed on a physical Win 2008 x64 server with quad core CPU and 8GB RAM.
The Replication Job is configured with Processing Mode Network, Compression Optimal, Storage WAN Target, CBT and application awareness.
I don't think a quad core CPU should be an issue, but could you please tell me what the CPU load on this server is while the replication job runs? Have you considered using Virtual Appliance replication mode? It should be much faster than Network mode.
bc07 wrote:I read in the forum about ESXi being a bad/slow backup target (VMware's fault). I also read somewhere in the forum to use a Linux VM as target instead but the question is how to get the data/VM's from the Linux VM to the data store.
This recommendation refers to backup jobs, not replicas. For replication jobs the equivalent recommendation would be to use a full-blown ESX host, but that is not an option for you (if I understood correctly).
bc07 wrote:So I thought maybe doing the replication for the other site (instead of pushing the data, pulling it). Installed Win7 x64 in a VM at the other site and installed Veeam and configured the replication with the same settings except storage is LAN target.
But replication fails after 7minutes with the message:
GetLocalText failed
Client error: File does not exist or locked. VMFS path: [[cx3_1] ****.vmx].
Please, try to download specified file using connection to the ESX server where the VM registered.
Failed to create NFC download stream. NFC path: [nfc://conn:172.16.0.78,nfchost:host-110,stg:datastore-30@*****.vmx].
That might give you some performance benefit, but not a significant one. Judging by the error message, either your connection is unreliable or the VMX file is indeed locked for some reason. I suggest contacting our support team to assist you with further investigation.
bc07 wrote:Does HyperIP speed up the "incremental" replication, even though replication uses only a small amount of the available bandwidth?
From the posts on these forums, I can say that it does speed up the replication job, so HyperIP should definitely help here.

Thank you!
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

Hello Vitaliy,

Thank you for your response.

When the replication is running on the Veeam backup server it uses around 30% CPU time. Total memory usage is around 2.8GB (most of it for database server and the OS).

Well, as for the full-blown ESX host, VMware won't release any new versions, which means we have to migrate to ESXi anyway. The host we replicate to currently runs ESXi and is in use, so migrating it to ESX would take some time and resources (this ESXi host is alone at a remote site).
How much faster would replication be using ESX as target compared to ESXi ?

I'm running a replication right now with HyperIP optimizing it. It is faster (not done yet): it has been running for 2 hours and is around 50% done. But that is not as fast as it should be, and it still uses only a fraction of the possible bandwidth. Waiting around 4 hours to replicate the changes (maybe 1GB or less) of a 15GB VM is not feasible.

Would it matter, at the current speed of 0.5-1MB/s, to have ESX instead of ESXi as the source and/or to use Virtual Appliance instead of Network mode?

Enrico
lobo519
Veteran
Posts: 315
Liked: 38 times
Joined: Sep 29, 2010 3:37 pm
Contact:

Re: replication and ESXi

Post by lobo519 »

I am replicating an average of 1GB of changes (500MB restore point size) between ESX hosts in about 10-15 minutes, using HyperIP throttled to 18Mbps and Virtual Appliance mode. I am putting in 3 new hosts at a remote site and will keep one as full ESX so I can keep my replication speed until something changes with ESXi.
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

How was (or would be) your replication speed without HyperIP?
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

bc07 wrote:Well, as for the full-blown ESX host, VMware won't release any new versions, which means we have to migrate to ESXi anyway. The host we replicate to currently runs ESXi and is in use, so migrating it to ESX would take some time and resources (this ESXi host is alone at a remote site).
Yes, we are aware of VMware's plans, and we are planning to enhance the replication architecture in the next release to better support ESXi as a replication target.
bc07 wrote:How much faster would replication be using ESX as target compared to ESXi ?
I cannot give you exact numbers because there are too many variables, but using full ESX as the replication target (as opposed to ESXi) would reduce the traffic by more than a factor of 3, because we would be able to use a temporary local agent running in the target ESX service console.
bc07 wrote:Would it matter, at the current speed of 0.5-1MB/s, to have ESX instead of ESXi as the source and/or to use Virtual Appliance instead of Network mode?
Changing source host to ESX wouldn't help much here, though Virtual Appliance would be able to process source VM data much faster. In VA mode, VM data is retrieved directly from storage through the ESX(i) I/O stack, which improves performance. As soon as data gets to the backup server, the overall performance will depend on your WAN link.
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

Thank you.

The incremental replication (pushing it with Veeam backup at the source host to the ESXi host at the destination) finished with HyperIP optimizing it. It took 4 hours and did not use the whole available bandwidth; data size was 1GB and restore point size 450MB. That is faster than the roughly 6 hours before, but still unacceptable.

I configured HyperIP to optimize traffic from the source host to the Veeam backup server at the destination site. This "pulling" replication over WAN works when I use HyperIP to optimize the traffic; as soon as I take HyperIP out of the path, the replication still fails with the error mentioned in my first post.

With this pulling replication I'm replicating another VM (WinXP, 8GB disk size). The initial replication over WAN took 133 minutes and used the whole bandwidth of 8Mbit assigned through HyperIP. A subsequent incremental replication took 9 minutes and also used the whole bandwidth (8Mbit); Veeam Backup & Replication shows a data size of 532MB and a restore point size of 257MB.
This is more than 12 times faster than "pushing" replication.
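
The "more than 12 times" figure can be roughly reproduced from the numbers quoted in this post. The sketch below is only a back-of-the-envelope comparison: the two runs replicated different VMs, 1GB is taken as 1024MB (an assumption), and the variable and function names are illustrative.

```python
def mb_per_minute(size_mb: float, minutes: float) -> float:
    """Average amount of data moved per minute of job run time."""
    return size_mb / minutes


# Figures quoted in the post; 1GB is assumed to mean 1024MB.
push = {"minutes": 4 * 60, "data_mb": 1024, "restore_point_mb": 450}
pull = {"minutes": 9, "data_mb": 532, "restore_point_mb": 257}


def speedup(key: str) -> float:
    """How many times more MB per minute the pull run moved than the push run."""
    pull_rate = mb_per_minute(pull[key], pull["minutes"])
    push_rate = mb_per_minute(push[key], push["minutes"])
    return pull_rate / push_rate


for key in ("data_mb", "restore_point_mb"):
    print(key, round(speedup(key), 1))
```

By either measure (data size or restore point size) the pull run moves well over 12 times as much data per minute as the push run.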

I opened a support ticket about the "pulling" replication failing, when that is fixed I'll see how much difference HyperIP makes at replication to ESXi target.
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

Enrico, thanks for the update. We'll be waiting for more good news from you.
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

I was able to fix the problem. It was my own stupid mistake: I had a wrong default route on the source host (with HyperIP it worked because I had to add a static route to the HyperIP appliance).

I did a full "pulling" replication of the VM with 8GB disk without HyperIP, that took 200 minutes and used around 7Mbit bandwidth (not limited by anything) of around 10Mbit available bandwidth.
That means full replication with HyperIP is more than 40% faster in my case.
I did an incremental replication after it; that took 19 minutes (also using around 7Mbit), with data size 570MB and restore point size 280MB.
It seems that having the Veeam Backup server at the destination site is the way to go when doing WAN replication to ESXi. HyperIP speeds things up further; whether it is worth buying depends on whether you can transfer all your data within the backup window. For optimizing a 10Mbit connection HyperIP does not come cheap, but it is cheaper than Riverbed and most WAN optimization appliances. I also tested HyperIP for SMB file transfer (Win2003 to Windows 2003); it does not do any good there.
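
How large the HyperIP gain looks depends on whether it is counted as extra throughput or as time saved. The sketch below compares the two full replications of the 8GB VM reported in this thread (200 minutes without HyperIP, 133 minutes with it); the function name and the returned keys are illustrative.

```python
def compare_runs(baseline_min: float, optimized_min: float) -> dict:
    """Compare two runs that moved the same data in different amounts of time."""
    throughput_gain = baseline_min / optimized_min - 1  # extra data per minute
    time_saved = 1 - optimized_min / baseline_min       # shorter wall-clock time
    return {
        "throughput_gain_pct": round(throughput_gain * 100, 1),
        "time_saved_pct": round(time_saved * 100, 1),
    }


# 200 minutes without HyperIP, 133 minutes with HyperIP (both from this thread).
print(compare_runs(baseline_min=200, optimized_min=133))
```

Counted as throughput, the HyperIP run is about 50% faster; counted as run time, it is about a third shorter. Both are compatible with "more than 40% faster", depending on which measure is meant.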

I'll do some tests with full ESX next week to find out if it will be faster than ESXi (without HyperIP).

One question remains: how do I do the initial replication to removable storage when the Veeam backup server is on the other side of the WAN? I would have to somehow make the initial replication to removable storage at the source site and then somehow import it on the Veeam server at the destination site.

Enrico
Gostev
Chief Product Officer
Posts: 31707
Liked: 7212 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: replication and ESXi

Post by Gostev »

Replica seeding with the Veeam server located at the target site is not possible in v5 (it works, but does not make sense, since the data will go to the Veeam server through the WAN anyway).
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

Well as you see in my first tests, Veeam server at the source site with ESXi as target is not feasible.
There are only two options:
1st: get it to work one way or another for the purpose you want to use the product for and that it was created for (in my case backup AND replication)
2nd: it does not work, and you stick with what you have or go for another product

There are always ways to use a product that the manufacturer does not support. That does not mean they are wrong (from the user's point of view).
I have been waiting for a usable AND affordable VM replication product for years.....
Gostev
Chief Product Officer
Posts: 31707
Liked: 7212 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: replication and ESXi

Post by Gostev »

If you can wait until v6, it will support seeding regardless of Veeam server placement. But as of today, the only available option is option 2. Thanks.
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

What is the release date for v6? In a few months, the end of 2011?

I just tested replication of a VM with 8GB disk from ESXi to a full blown ESX host over WAN. The Veeam backup server was at the source site.
I get only around 210KB/s processing speed and it seems to use only around 2.5 - 3Mbit bandwidth. For me that looks like an issue with the replication process when the latency to the target is high and it is not only related to ESXi.

Enrico
Gostev
Chief Product Officer
Posts: 31707
Liked: 7212 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: replication and ESXi

Post by Gostev »

It is more like towards the end of 2011. Pretty huge release there in terms of all new functionality we are adding, specifically multiple replication enhancements. Hopefully most of currently planned features will make it to RTM code ;)

Have you specified service console connection properties for target ESX? Full backup speed may not change significantly going from ESXi to ESX target if your performance is mostly WAN link speed bound.
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

Maybe WAN replication between Veeam backup servers (acting as agent) would probably cut the overhead and speed things up ... :)

Yes, the SSH console connection properties are configured (it does not work without them). The WAN bandwidth is 10Mbit down/up, and at the time the replication ran there was plenty of bandwidth left to use more than just 3Mbit.
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

The first replication test to the full blown ESX target was without HyperIP.
I just tested it with HyperIP, it used the max bandwidth (8Mbit limited by HyperIP) and got a processing speed of 1MB/s.
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

bc07 wrote:One question remains. How do I do the initial replication to removable storage when the Veeam backup server is on the other site of the WAN? I have to somehow make the initial replication to the removable storage at the source site and then somehow import it on the Veeam server at the destination site.
As initial replication to removable storage is not possible in Veeam Backup v5 with the server located at the target site, you can perform it at the source site and then move the replica files as well as the Veeam backup server to your target site. That should do the trick.
ckd5150
Influencer
Posts: 10
Liked: never
Joined: Dec 05, 2009 7:27 am
Full Name: Carl
Contact:

Re: replication and ESXi

Post by ckd5150 »

I have an open support case on this now. ESXi 4.1U1 source with ESXi 4.1U1 target across a 10Gbps connection. Veeam server at the target site pulling (per veeam recommendation) and only getting 4MB/s transfer speed. ESX is not an option. The move to ESXi by vmware hasn't been a secret. Is Veeam planning any minor patches to address this now or is it just telling customers "hate that for you"?
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

We are planning to enhance replication architecture to better support ESXi replication target in the next major release. By the way what is the upload speed when you try to upload a file to a local ESXi host with vSphere Client?
ckd5150
Influencer
Posts: 10
Liked: never
Joined: Dec 05, 2009 7:27 am
Full Name: Carl
Contact:

Re: replication and ESXi

Post by ckd5150 »

I'll run some tests and do a follow-up post. If you're aware of a way to speed up the esxi mgmt interface for pumping data in, please post. I'll give anything a shot at this point.
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

Charles, check how fast you can copy files with vSphere Client, and/or try other applications (SMB2, FTP) to see whether you get higher throughput.
Also try HyperIP; that should improve the throughput.

Do you have 10Gbit connection the whole way (source, veeam server and target)? What latency do you have between source, target and veeam server?
ckd5150
Influencer
Posts: 10
Liked: never
Joined: Dec 05, 2009 7:27 am
Full Name: Carl
Contact:

Re: replication and ESXi

Post by ckd5150 »

25ms latency fixed over the WAN.
In theory it's 10Gbps all the way through.
SMB copy of a 603MB file from a Windows guest at the source DC to a Windows guest at the target DC: 53 seconds
VMFS datastore upload of the same 603MB file across the WAN to the target datastore: 3m19s
Upload of the same 603MB file from a guest at the target location to the ESXi VMFS LUN at the same location: 7.8 seconds
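
Converted to throughput, the three tests above look as follows. For the two WAN transfers the sketch also multiplies throughput by the round-trip time to estimate how much data was in flight per round trip, which hints at whether a transfer is window-bound rather than link-bound. It assumes the 25ms figure is the round-trip time and uses decimal units (1MB = 1000KB); the names are illustrative.

```python
RTT_SECONDS = 0.025  # 25ms from the post, assumed to be round-trip
FILE_MB = 603

# (description, duration in seconds, crosses the WAN?)
TESTS = [
    ("SMB guest-to-guest over WAN", 53.0, True),
    ("datastore upload over WAN", 3 * 60 + 19, True),
    ("datastore upload at target site", 7.8, False),
]


def throughput_mb_s(seconds: float) -> float:
    """Average throughput for the 603MB test file."""
    return FILE_MB / seconds


def in_flight_kb(seconds: float) -> float:
    """Throughput x RTT: data in flight per round trip (decimal KB)."""
    return throughput_mb_s(seconds) * RTT_SECONDS * 1000


for name, seconds, over_wan in TESTS:
    line = f"{name}: {throughput_mb_s(seconds):.1f} MB/s"
    if over_wan:
        line += f", ~{in_flight_kb(seconds):.0f} KB in flight per round trip"
    print(line)
```

The datastore upload over the WAN works out to roughly 3MB/s and about 76KB in flight per round trip, in the neighbourhood of a 64 KiB window, while the SMB copy keeps several times more data in flight. That would be consistent with a window or protocol limit on the datastore upload path rather than a link limit, though these numbers alone cannot confirm it.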
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: replication and ESXi

Post by bc07 »

ckd5150 wrote:25ms latency fixed over the WAN.
VMFS Datastore upload of same 603MB file across WAN to target datastore took 3m19s
That looks like an issue with the WAN link or with the protocol used for the copy over the WAN connection.
I would recommend ruling out storage as the issue: test what throughput you get on the internal network at both locations. Then check the WAN connection; maybe one of the devices on it is interfering with this kind of traffic.
With 25ms latency on 10Gbit it should be faster than 4MB/s. With HyperIP you can compensate for the latency over the WAN, and more.
ckd5150
Influencer
Posts: 10
Liked: never
Joined: Dec 05, 2009 7:27 am
Full Name: Carl
Contact:

Re: replication and ESXi

Post by ckd5150 »

unless hyperIP is free i can't use it as a solution at the moment. Replication was working prior to the esxi 4.1u1 upgrade and getting decent speeds at that.

From the support tech working the ticket with me "As far as I know there are no known issues with replicating from ESXi 4.1 U1 to ESXi 4.1 U1"
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

ckd5150 wrote:Replication was working prior to the esxi 4.1u1 upgrade and getting decent speeds at that.
The upgrade to 4.1 U1 shouldn't be the reason for the performance drop; something else must have changed as well.
ckd5150
Influencer
Posts: 10
Liked: never
Joined: Dec 05, 2009 7:27 am
Full Name: Carl
Contact:

Re: replication and ESXi

Post by ckd5150 »

nope. replication on ESX 4 U1 worked fine. After upgrading to ESXi 4.1U1 it went into the toilet. I've confirmed with the networking group that nothing has changed between the two DCs. Veeam was upgraded to troubleshoot this issue but hasn't shown any improvement. The disks that it is writing to are configured as RAID 10 and are dedicated to these replicas. Only one replica runs at a time so there is no I/O contention.
Bunce
Veteran
Posts: 259
Liked: 8 times
Joined: Sep 18, 2009 9:56 am
Full Name: Andrew
Location: Adelaide, Australia
Contact:

Re: replication and ESXi

Post by Bunce »

Check what port it's replicating over. We had something similar, which was causing it to 'fall back' to network mode.

Can't remember the specifics, but the quick check was to see what port it was using: if it's using 902 then something's not right. I think 4.1 included some changes to SSH authentication (root elevation etc.) and we needed to re-enter the SSH credentials in Veeam or something.

Edit: This might be the post: http://www.veeam.com/forums/viewtopic.p ... 983#p24644
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

Bunce wrote:Check what port it's replicating over. We had something similar, which was causing it to 'fall back' to network mode.
As far as I can see from the description above, Charles is already using network mode because of the recommendation to place the backup server at the target site. With an ESXi host as the replication target, this is always the recommended deployment.
ckd5150 wrote:replication on ESX 4 U1 worked fine. After upgrading to ESXi 4.1U1 it went into the toilet.
Yes, that's expected. Unfortunately, I'm not aware of any other tricks or tweaks you can use to make the ESXi management interface work faster.
Oletho
Enthusiast
Posts: 67
Liked: 2 times
Joined: Sep 17, 2010 4:37 am
Full Name: Ole Thomsen
Contact:

Slow replication after changing target server to ESXi

Post by Oletho »

[merged]

In a small installation running Veeam in virtual appliance mode, initial replication speed has dropped from 10MB/s to 2MB/s after moving to ESXi on the remote site.

I have found explanations in this forum about ESXi running agentless replication which is slower, and the solution should be installing Veeam on remote site for the replication jobs.

But still 2MB/s on a 100Mb/s connection sounds strange to me. Can anyone explain why, and confirm that the customer should install a remote Veeam appliance?

Ole Thomsen
Vitaliy S.
VP, Product Management
Posts: 27325
Liked: 2778 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: replication and ESXi

Post by Vitaliy S. »

Hello Ole,

Yes, when an ESXi host is used as the target, you should deploy the Veeam backup server at the remote site.

To get more details, please take a look at this post: Replication High Network Utilization

Thanks.