Does anyone have replication with Veeam6 functional?

vertices · Post by **vertices** » Feb 22, 2012 4:15 pm this post

We have been trying for weeks to get this to work. Everything was fine with Veeam5. Upgraded to Veeam6 and nothing but nightmares. So far have got nowhere with my support case (ID#5172531) so posting here in hopes anyone has any suggestions. All ESX hosts are ESXi 4.1. Veeam 6 is patched with latest patch as of a couple days ago.

We have 2 sites with a 20mbps pipe between them, I'll simply call them source and destination. Source site has 3 ESX hosts connected to a SAN. Also has a physical Veeam server that only performs backups. All servers are joined to the same Windows domain and all are using the same domain admin credentials. This physical Veeam server has no problem backing up the VMs in the source site. It backs them up between 5:30pm and 8:00pm. Everything is perfect with this.

In the destination site we have another physical Veeam server with 6 cores, 16GB of RAM, nice system, and there is a single ESXi host connected to a small iSCSI SAN. The Veeam server has a fresh install of Veeam 6 on it (not upgraded from 5). We also have a fresh dedicated 2008R2 VM at the source site I'll call prox1. The replication jobs on the Veeam server in the destination site are configured to use prox1 as the source proxy, and the local server as the destination proxy. I have it set to use existing replicas as a seed and they automap just fine. Everything is left to automatic, other than which proxies to use.

So 2 Veeam servers, identical configs. One is for backing up in the source site and works fine, the other is for pulling replicas to the destination site which doesn't work at all.

We are plagued with nondescript failures such as "An existing connection was forcibly closed by the host" and "An established connection was aborted by the software in your host machine".

I watched the jobs run one night and saw some things. Almost always when I see a failure with "An established connection was aborted by the software in your host machine" it fails at the snapshot removal point, either during or just after. I also sometimes see an error in VMware about "Unable to access file <unspecified filename> since it is locked" in regards to a snapshot removal. However using the same credentials that Veeam is configured with, I can manually remove it just fine.

I have done everything support has wanted me to do, recreating jobs, deleting all vbk files in the replica repository. Nothing seems to improve the situation at all. I am strongly considering deeming Veeam6 not ready for production and going back to 5 which worked just fine, albeit not very efficient.

Does anyone have replication working or any suggestions for us? So far we have had our case open for 9 days, and no DR site for 14 days. I can't let this go on much longer without dropping down to Veeam5 to fix things.

By the way, my case# is 5172531 in case anyone from Veeam chimes in and wants to take a look.

SunkistDavid · Post by **SunkistDavid** » Feb 22, 2012 6:13 pm this post

Is the remote DR ESXi server added in your server list? Check its credentials, since it's an ESXi server. Are you using root or the equivalent for the username?

Also try

Please connect to the host\VC that the guest OS sits on WITH Vsphere FROM the Veeam Server USING the soap connection provided to Veeam for connection to the above host\vc and try downloading server.nvram from the same datastore that the guest is on to the Veeam server. If this does not work its a permission issue as the remote proxy is calling back to the local Veeam server and doesn't have the correct permissions.

Also please verify that port 902 is open on the VC\Host and Veeam Server.
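A quick way to check the port from the Veeam server or a proxy is a plain TCP connect test. The sketch below is only an illustration: the function name `tcp_port_open` and the host name in the usage note are made up, and a successful connect only shows that TCP 902 is reachable from where you run it, not that every kind of Veeam traffic will pass.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests reachability
    from the machine it runs on.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it from each side, for example `tcp_port_open("esxi-host.example.local", 902)` with your own host name, once from the Veeam server and once from the proxy at the other site.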

SunkistDavid · Post by **SunkistDavid** » Feb 22, 2012 6:17 pm this post

Also, if you are unable to download the nvram, contact support and give them case number 5167894 as a reference to help you finish solving the problem.

vertices · Post by **vertices** » Feb 22, 2012 6:31 pm this post

All ESXi hosts are licensed hosts and part of the same vCenter.

These VMs do occasionally successfully replicate. Like maybe once a week each VM is randomly successful which leads me to believe it's not a permissions issue being that it does sometimes work. It just usually doesn't. To be honest, I didn't quite follow 100% what you want me to do in regards to:

"Please connect to the host\VC that the guest OS sits on WITH Vsphere FROM the Veeam Server USING the soap connection provided to Veeam for connection to the above host\vc and try downloading server.nvram from the same datastore that the guest is on to the Veeam server. If this does not work its a permission issue as the remote proxy is calling back to the local Veeam server and doesn't have the correct permissions."

SunkistDavid · Post by **SunkistDavid** » Feb 22, 2012 6:51 pm this post

Connect to your remote Veeam server. From that server, connect to the ESX server that is being backed up locally and download the VI Client. Run the VI Client and connect to the local host, browse the datastore of the VM that is being backed up, select the nvram file, and download it to your local disk. If it errors out, then it's a permission issue. This is a bug VMware found. BUT I don't think this is your issue, as you indicate they do occasionally replicate.

Do you have a firewall between the sites that may be closing down the connection?

kellison · Post by **kellison** » Feb 22, 2012 8:37 pm this post

I had the same problem when I upgraded to Ver 6. I was trying to replicate VMs from my production site over to the DR site and sometimes it would work, but most of the time it would not, and I saw the same type of errors. I also tried everything under the sun with no luck.

But instead of running the replication from my production Veeam server using the "push" method, I set up the replications on my Veeam server over at my DR site and have not had a problem since. Guess here you can say it uses the "pull" method.

Both my Veeam servers are VMs, Windows 2008 R2 with 16GB of RAM and 8 CPUs allocated to each. My backups running at both sites are working fine. Maybe try to run the replication from your DR site? Thanks, Ken

marty9876 · Post by **marty9876** » Feb 22, 2012 8:53 pm this post

No, and I'm completely sick of this. I'm really close to throwing in the towel and being done with Veeam.

Bought matching firewalls for VPN tunnels, sunk the money into a DR ESXi host to replicate to in a colo and it's been a complete bust. $15k+ and hundreds of hours of my time during nights and weekends to get this working. Just a total failure.

Well back to seeding a USB drive to ship off to the colo since I can't replicate over WAN or use the seed I shipped the server with. What's that they say about doing the same thing over again expecting a different result...rhymes with Veeam I think...

SunkistDavid · Post by **SunkistDavid** » Feb 22, 2012 8:59 pm this post

Marty, I have a pretty complicated network at our DR site that includes 3 different subnets, with two matching my primary site but blocked by the FW. It took a while but it's working flawlessly. Take a deep breath, and by the way, I don't work for Veeam. Check your firewall and make sure there is a rule with a route back from the Veeam proxy at the remote site to your primary site. Also, did you test the NVRAM process that I recommended? This is what helped me determine I had a firewall issue between the remote site and the primary site.

marty9876 · Post by **marty9876** » Feb 22, 2012 9:26 pm this post

For me, I've had a few different problems to address. Yes, networking was an issue. Brand new WatchGuard XTMs have a bug in the latest version which stops the Veeam SOAP traffic at a given point.

I have now set up a virtual pfSense firewall (on the host) and use that as a VPN endpoint. The small replication jobs are OK I think, but the large job fails every time. Sometimes it runs for 36 hours, sometimes for 48 hours (I now have the latest Veeam patch 4 with the 48 hour bug fixed), and then throws the error. Most of the time it bombs out after a few hours. I had to grow the drives on the big job (600 GB VM) and that seems to have invalidated the seed, because things blow up after the first pass of the seed job (the job inflates from backup, then comes the actual replication pass to grab the changes).

It's been 2.5 months of just grinding away at this. I've had firewall bugs, Veeam bugs and shot myself in the foot bugs all stacked on each other.

I was able to grab the nvram file no problem.

vertices · Post by **vertices** » Feb 22, 2012 10:18 pm this post

Ok well now I'm being told to contact VMware to find out what is going on. Which strikes me as odd as we've had no VMware troubles at all, and again, everything was fine with Veeam5. All the troubles started the day we put in Veeam6. What a mess. I don't think my client is going to enjoy Veeam anymore after all of this work at $150 an hour with no resolution in sight. I wouldn't be surprised if he ditched Veeam Enterprise altogether.

SunkistDavid · Post by **SunkistDavid** » Feb 22, 2012 10:23 pm this post

What is the reason given that it was pushed to VMWare?

vertices · Post by **vertices** » Feb 22, 2012 10:36 pm this post

"If that's the case (and it was what I was afraid was going to happen), then I'll have to request that you contact VMWare directly to have them look into what is happening on the host. Something is being triggered in the host that is making it drop the communication."

Which I also find odd because a Veeam6 server at the source site has NO PROBLEM backing up all VMs that are having replication problems. I mean VMware as well as Veeam6 are working perfectly together in the source site. It's only when I put Veeam6 in a destination site and tell it to replicate instead of backup that it fails, and even then, it worked fine with Veeam5.

marty9876 · Post by **marty9876** » Feb 22, 2012 10:56 pm this post

I feel like the client: at a certain point I need to respect the business needs and move on to something else. Regardless of whether it's the product's issues or my ability to get the product working, at the end of the day the business's needs are not being met.

vertices · Post by **vertices** » Feb 22, 2012 11:17 pm this post

I wonder if there is any possibility that the seeds are not working out right due to the fact that they are being created by a different Veeam 6 server.

I think I'm going to try one last thing. We have a new physical Veeam server at the source site for backups that is not in production yet. I'm thinking I will make this server simply a proxy and repository and have the jobs themselves managed by the Veeam server in the destination site. If I can get that working with backups, then I have completely isolated it to everything fully functioning with the single Veeam management server. Then I will use those same backup jobs as seeds. This way the Veeam server in the management site controls all jobs, both backup and replication, and if backups work but not replication it will be easier to point directly at Veeam without this runaround and trying to push me back to VMware.

Post by **Jfmoots** » Feb 22, 2012 11:34 pm this post

Rob,

Check your PMs.

I'm currently working a problem with similar traits and I'd like to compare notes to see how similar they really are.

vertices · Post by **vertices** » Feb 22, 2012 11:48 pm this post

Reply sent. I won't be able to start testing replication jobs for a couple of days, until I get this new config squared away and brand new seeds brought over. Which really is our main intent in the end anyway; this is just a slight tweak on it. May as well get the rest going. Client is not happy though and has told me this is my last shot to get it working or I have to go back to Veeam 5, and he won't be happy about it.

Post by **Jfmoots** » Feb 22, 2012 11:54 pm this post

Not a problem. I understand your situation.

integis · Post by **integis** » Feb 23, 2012 10:56 pm this post

I have a setup where one of the source hosts is connected via SonicWALL VPN over a 100mb pipe. It took a few tries before I got a seed to go: it would fail after 2-3 hours, occasionally after 30 minutes. I have had especially bad luck while using the "WAN" compression setting.

How long do your jobs run before being dc'd? Instantly, hours, days, or always random? Have you tried various VM sizes? (You could easily create a <2GB Windows XP VM for testing purposes.)

If you can replicate a tiny VM, you could expand its size until you find a breaking point.
If you have available resources, you could also try throttling a local test to 20mb/s, similar to your pipe size.

Good luck!
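To pick test sizes, it can help to estimate how long a given amount of data should take over the link. A rough sketch, assuming the 20mbps figure means 20 megabits per second, decimal gigabytes, and no compression or deduplication; `transfer_hours` and its `efficiency` parameter are hypothetical names for illustration, not anything from Veeam.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move size_gb over a link of link_mbps.

    size_gb    -- data size in decimal gigabytes (1 GB = 8e9 bits)
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed fraction of the nominal speed actually usable
                  (a guess; measure your own link to refine it)
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    bits = size_gb * 8e9
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600
```

With `efficiency=1.0` on a 20 Mbps link, a 2 GB test VM works out to 800 seconds (a bit over 13 minutes) and 90 GB to 10 hours. Real runs will differ, since only changed blocks are sent after seeding and compression reduces the volume.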

vertices · Post by **vertices** » Feb 26, 2012 5:12 pm this post

Ok well I fixed it. In the end, only 2 differences from before.

1. The source proxy is now a physical server, and not a VM living on the infrastructure that was being replicated.
2. The seeds were made using the same Veeam6 server that owns the replication jobs.

Our old source proxy VM was a brand new dedicated VM with 4 cores and 8GB of RAM. So I really have no idea why it didn't work. I know it doesn't make a whole lot of sense that only changing the 2 above items would fix it. I suspect there are still some weird undiscovered bugs with Veeam6 in certain configurations.

What I do know is that now we have 2 identical physical dedicated servers, one on each side. The one on the destination side has full-blown Veeam, proxy, and repository. The one on the source side is a proxy and repository. We now use the server on the destination side to control all backup and replication jobs. The server on the destination side is the same server we were using before, when it wouldn't work. We just dropped another identical one on the source side and moved the proxy off that VM to it, then reseeded, and it all worked.

I don't know why it works now, but I do know Veeam6 is pretty awesome once you actually figure out what configuration will work in your environment. It dropped our replication windows from 8-10 hours to around 2 hours. Huge improvement.

We have SonicWALL NSA firewalls with WXA appliances as well. I was messing with all of that trying to figure it out before and nothing helped. In the end, it had nothing to do with our firewalls and ultimately just somehow ended up being a brand new proxy VM and seeds I guess. Weird.

Post by **Gostev** » Feb 26, 2012 7:17 pm this post

Thanks for the update. Glad to hear everything works fine now, and that you are having great results with v6 replication!

vertices · Post by **vertices** » Feb 27, 2012 3:07 pm this post

Well I guess I spoke too soon. Last night the dreaded "Client error: An established connection was aborted by the software in your host machine
Failed to process [sendSignature]" returned on several VMs.

Post by **Gostev** » Feb 27, 2012 3:25 pm this post

Hmm, looks like some process actively terminates the network connections.

vertices · Post by **vertices** » Feb 27, 2012 3:54 pm this post

The issue that I have with that is that if something else was terminating connections, then I would expect to see more random results. However, now when a failure occurs, it is at the exact same spot for each VM every time. Right after it completes sending Hard Disk 1, which is the longest part of the jobs, it then successfully removes the snapshot, then promptly fails with the error I posted above.

If it was network problems, why would it always fail after removing the snapshot?

Post by **Gostev** » Feb 27, 2012 3:57 pm this post

I have literally never seen this issue in 4 years, so have no ideas on possible explanation for this behavior whatsoever...

Post by **rbrambley** » Feb 27, 2012 5:14 pm this post

Have you checked the windows logs on the veeam server and proxies for any other clues to this mystery? Just an idea.

burkerust · Post by **burkerust** » Feb 27, 2012 5:27 pm this post

I had exactly the same problem on my replication job after the upgrade from 5 to 6. Ended up opening a port on the ESX 4.1 destination: "VeeamAgent : port 2500 tcp.in". No more failures after that. Not sure if the opened port will stay after a reboot, as we are upgrading to ESXi 5 which has no firewall. We have been replicating for 25 days without failure!!! YMMV.

-----------------edit------------------------------------------------------------
After reading the thread more carefully, I see that this is happening on ESXi servers, so my fix is not the same. I will be paying close attention to this thread for our future upgrade to ESXi. I won't upgrade until Veeam has this straightened out.

ipm · Post by **ipm** » Feb 27, 2012 7:00 pm this post

We could not get replication to work with Veeam 5 or 6 on our ESXi cluster, and support wasn't the best...
Higher-ups decided to drop Veeam and we're now looking into better alternatives such as AppAssure.

We're probably going to drop Veeam on other client sites as well; at our biggest client (50 VMs) it didn't work very well...

Post by **Gostev** » Feb 27, 2012 7:41 pm this post

@ipm we are sad to see you go, however please note that your negative experience with Veeam replication is more of an exception to the rule; see the thread "v6 WAN Replication - Wow!". As you can see, feedback from multiple other users is overwhelmingly positive. Of course, when there are environmental issues outside of our control, there is little we can do.

As far as alternatives go, you will find that they significantly lack the depth and breadth of features compared to our v6. AppAssure specifically is not even designed for virtualization (it requires a data mover agent in each VM). Even the slowest incumbent vendors have now gone away from this approach, because of how inefficient it is for virtualization. Moreover, it is fully built on top of Microsoft VSS, and as such does not support any non-Microsoft OSes. And there are many more useful replication features missing... We never really see it as competition, although yes, in the last half-year they have completely copied our website and our messaging.

Also, check out their own customers' feedback...

vertices · Post by **vertices** » Feb 28, 2012 2:27 am this post

rbrambley wrote: Have you checked the windows logs on the veeam server and proxies for any other clues to this mystery? Just an idea.

I've checked logs everywhere. I can't seem to nail it down.

Tonight I'm going to run without WAN acceleration provided by the WXA appliances. We had no trouble with them under Veeam5 but who knows maybe Veeam6 is more picky. We'll see how it goes.

If it doesn't work well tonight I'm going to try that firewall tip provided above on the target ESXi host.

vertices · Post by **vertices** » Feb 28, 2012 2:34 am this post

Well checking on the firewall for ESXi 4.1, it seems it doesn't have one.

"ESXi 4.0/ESXi 4.1 does not include a firewall because it runs a limited set of well-known services and prevents the addition of further services. With such restrictions, the factors that necessitate a firewall are significantly reduced. As such, no firewall is integrated in to ESXi. "
http://kb.vmware.com/selfservice/micros ... Id=1003634

So I guess that can't be it.
