-
- Novice
- Posts: 3
- Liked: never
- Joined: Jun 12, 2012 6:20 pm
- Full Name: Keith Sutherland
- Contact:
Large replication question (Putting the feelers out)
Hi All,
I'll be setting up a demo of Veeam B&R soon, but wanted to put the feelers out in case someone has a similar setup to mine and can vouch for the speed, reliability and my thinking.
Currently I have a VMware 4.1 farm consisting of around 25 HP BL465c G7 blades, fully populated (24 cores), attached to an F-Class 3PAR storage system hosting around 450 Windows VMs: a mix of Exchange, SQL, Citrix VDIs, DCs and file & print.
I have a 100Mbit connection to our dedicated DR site, which comprises roughly 9 HP G6 blades and an HP EVA 6400.
Ideally I would like to replicate around 20 of my core VMs, which is around 40TB in total, of which possibly 200GB changes on a daily basis.
Currently we use a product called DoubleTake to replicate a couple of DCs, Exchange boxes, a SQL cluster etc., but this product has never been used in anger so is untested.
I've a couple of questions.
Firstly, has anyone else out there got a similar setup working with Veeam successfully? If so, what have been your thoughts on the whole process, and have there been any issues?
Secondly, even though I only want to replicate 20-ish of my VMs, I assume I still have to purchase a licence for each socket in my HP cluster?
Thanks in advance for any thoughts, posts etc.
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Large replication question (Putting the feelers out)
Uhm, 12-core AMD processors require Tier B licensing from Veeam, so licensing 50 sockets is for sure going to be a huge expense.
I do not know which level of license you have on your vSphere environment, but if you have DRS you can try a "should" VM-to-host affinity rule and license only the hosts those VMs run on.
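For illustration only, here is a minimal pyVmomi sketch of such a "should" rule; the group names, rule name, and the cluster/VM/host objects are hypothetical placeholders, not anything from this environment (connecting to vCenter and looking up the objects is omitted):

# Minimal sketch: create DRS groups and a non-mandatory ("should")
# VM-to-host affinity rule so the replicated VMs prefer a subset of hosts.
# All names below are hypothetical; pass in already-retrieved managed objects.
from pyVmomi import vim

def add_should_rule(cluster, vms, hosts):
    """cluster: vim.ClusterComputeResource, vms/hosts: lists of managed objects."""
    spec = vim.cluster.ConfigSpecEx(
        groupSpec=[
            vim.cluster.GroupSpec(
                operation='add',
                info=vim.cluster.VmGroup(name='veeam-replicated-vms', vm=vms)),
            vim.cluster.GroupSpec(
                operation='add',
                info=vim.cluster.HostGroup(name='veeam-licensed-hosts', host=hosts)),
        ],
        rulesSpec=[
            vim.cluster.RuleSpec(
                operation='add',
                info=vim.cluster.VmHostRuleInfo(
                    name='run-replicated-vms-on-licensed-hosts',
                    enabled=True,
                    mandatory=False,  # "should" rule: preferred, not enforced
                    vmGroupName='veeam-replicated-vms',
                    affineHostGroupName='veeam-licensed-hosts')),
        ],
    )
    # modify=True merges this with the existing cluster configuration
    return cluster.ReconfigureComputeResource_Task(spec, modify=True)

Because the rule is non-mandatory, DRS can still place those VMs on other hosts under contention, so it is worth checking where they actually run before settling on the socket count.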
As for the setup, it sounds good. I don't have exactly the same design at any customer, but its general lines are similar to many designs I've seen.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Veteran
- Posts: 261
- Liked: 29 times
- Joined: May 03, 2011 12:51 pm
- Full Name: James Pearce
- Contact:
Re: Large replication question (Putting the feelers out)
As said, affinity rules will be needed to control costs. A 100Mbps line should easily be enough for the delta rate, although seeding is probably the way to go given the underlying data size.
Rangler wrote: I have a 100Mbit connection to our dedicated DR site... I would like to replicate around 20 of my core VMs... possibly 200GB changes on a daily basis... Currently we use a product called DoubleTake to replicate a couple of DCs, Exchange boxes, a SQL cluster etc., but this product has never been used in anger so is untested.
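As a back-of-the-envelope check of those figures, assuming the 100Mbit pipe is the only constraint and ignoring compression, dedupe and WAN overhead (illustrative only):

# Rough sanity check of the 200GB/day delta and the 40TB initial copy
# over an ideal 100 Mbit/s link.
def transfer_hours(data_gb, link_mbps):
    """Hours to push data_gb gigabytes over a link_mbps megabit/s link."""
    return data_gb * 8 * 1000 / link_mbps / 3600

daily_delta = transfer_hours(200, 100)         # ~4.4 hours for the daily change
initial_seed = transfer_hours(40 * 1000, 100)  # ~890 hours for the 40TB full copy
print(f"daily delta: {daily_delta:.1f} h, initial seed: {initial_seed / 24:.0f} days")

On those assumptions the daily delta fits in a few hours, while pushing the full 40TB over the wire would take over a month, which is why seeding the initial copy makes sense.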
DoubleTake has a horrific impact on disk IO at the primary in my experience; our Exchange server IOPS dropped by 70% when we replaced DT with Veeam. And of course the DT failover method is cumbersome at best. That said, with Exchange 2010 I'd argue it's better to use a cross-site DAG instead of either.
You might consider a LAN extension to the DR site to avoid re-addressing, if it's only 20 core VMs. Also, replicating DCs is never ideal unless the plan is to fail over all the DCs serving the domain. It could be better to deploy dedicated DCs at the DR site instead (and of course consider FSMO role placement).
-
- Veteran
- Posts: 315
- Liked: 38 times
- Joined: Sep 29, 2010 3:37 pm
- Contact:
Re: Large replication question (Putting the feelers out)
There is a lot of discussion out there about backing up/replicating DCs and, to be honest, I still don't feel that I fully understand the proper way to restore. If you had two domain controllers and replicated the one that has all the FSMO roles, would you still need the second DC to properly fail over/restore from a disaster?
J1mbo wrote: Also, replicating DCs is never ideal unless the plan is to fail over all the DCs serving the domain. It could be better to deploy dedicated DCs at the DR site instead (and of course consider FSMO role placement).
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Large replication question (Putting the feelers out)
Yes, of course you do. Domain controllers are designed to stop the NETLOGON service if they cannot contact any replication partner within a certain time after restore. I don't remember exactly how long, though.
-
- Enthusiast
- Posts: 85
- Liked: 8 times
- Joined: Jun 11, 2012 3:17 pm
- Contact:
Re: Large replication question (Putting the feelers out)
Replication targets do not impact your licensing; you only pay licensing on your source hosts. If you set up affinity rules as mentioned above, you only have to purchase licensing for the hosts that those VMs live on. Not sure on the size of those VMs, but I would guess you'd be able to get away with licensing half of your infrastructure with a little bit of affinity wizardry.
As far as that much data goes, it's probably feasible, depending upon how many snapshots you want to maintain. Remember that you cannot maintain more than 32 snapshots of a VM. If you're looking at 200GB of daily change, that could be a boatload of retained data if you want to keep more than a few days' worth. For your purposes, I would look at taking more frequent snapshots and doing 2-3 days of retention. If you're using this for DR, you're not going to want to go back to data that is two weeks old anyway, right?
By running maybe 4-6 snapshots per day, you'll reduce the peak network bandwidth needed by spreading that load out over the day. The downside is that you'll impact production storage 4-6 times per day doing the replication, rather than just once. If your primary storage is up to the task, I'd look at doing it that way. Other than that, I don't see anything wrong with your config; just remember to pre-seed that data, or it'll take you a month to get the first run over there.
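As a quick sketch of how run frequency, retention and the 32-restore-point cap interact, assuming the daily change is spread evenly and the 100Mbit link is the constraint (the function and figures below are illustrative only; the 200GB/day and 32-point numbers come from this thread):

# Illustrative schedule check: restore points kept vs. the 32-point cap,
# and per-run transfer size/time for a given number of runs per day.
def schedule(runs_per_day, retention_days, daily_change_gb=200, link_mbps=100):
    points = runs_per_day * retention_days        # snapshots kept on each replica
    per_run_gb = daily_change_gb / runs_per_day   # delta moved per replication run
    per_run_hours = per_run_gb * 8 * 1000 / link_mbps / 3600
    return points, per_run_gb, per_run_hours

for runs in (1, 4, 6):
    pts, gb, hrs = schedule(runs, retention_days=3)
    status = "ok" if pts <= 32 else "over the 32-point limit"
    print(f"{runs} runs/day x 3 days = {pts} points ({status}), "
          f"~{gb:.0f}GB and ~{hrs:.1f}h per run")

With 4-6 runs a day and 2-3 days of retention you stay well under the 32-point cap, and each run's delta fits comfortably within its window.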
-
- Novice
- Posts: 3
- Liked: never
- Joined: Jun 12, 2012 6:20 pm
- Full Name: Keith Sutherland
- Contact:
Re: Large replication question (Putting the feelers out)
Thanks for the info.
I've currently just set up a test lab with a 100Mbit connection; I'm going to do a full backup of our Exchange servers, then wait 24 hours and see what the incremental size turns out to be.
I've also asked our network guy what the DoubleTake replication data usage to our DR site is, which will give me a closer figure on what the daily delta is likely to be.
Won't bother replicating DCs as we have two at our DR site, plus probably another 50 or so dotted around the globe.
It's the SQL side of things which might kill us; however, the DBAs have separate log shipping jobs over to DR, so as long as we have an image of the SQL boxes and the DBAs have done their job right by keeping the maintenance backup plans and log shipping up to date, we really should only lose 15-30 minutes of production time.
I've got a meeting with different parts of the company to get an agreed statement of what they need to keep the business trading, so I can get the VM list together and then look at affinity on the various hosts.
Seeding/initial backup shouldn't be an issue as I have a couple of spare SANs I can back up to and ship over to DR.
Curious what kind of speed increase I can expect if I install a Veeam proxy into a VM on the cluster?
Cheers in advance.
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Large replication question (Putting the feelers out)
If you mean the proxy at the DR site, it's not about "IF" you install it, you HAVE TO install it.
Also, a proxy in a VM at the replication target is even better, as it can use hot-add for writing into the DR storage.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Enthusiast
- Posts: 85
- Liked: 8 times
- Joined: Jun 11, 2012 3:17 pm
- Contact:
Re: Large replication question (Putting the feelers out)
Actually, my testing has indicated that Hot-Add on the remote side for replication is terrible; network mode is much preferred. Hot-Add seems to exponentially increase the load on the target storage. Veeam Engineering is looking into this and has reproduced it, but has not gotten back to me on a resolution. Select network mode for the transport type, or you'll see your remote SAN crushed.
dellock6 wrote: If you mean the proxy at the DR site, it's not about "IF" you install it, you HAVE TO install it. Also, a proxy in a VM at the replication target is even better, as it can use hot-add for writing into the DR storage.
Luca.
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Large replication question (Putting the feelers out)
Uhm, this is strange, because in my deployments hotadd, when I was able to use it (customer constraints, usually), has had better results than network mode.
Keep us updated on the findings from the Veeam engineers; I'm interested.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Expert
- Posts: 162
- Liked: 15 times
- Joined: Nov 15, 2011 8:47 pm
- Full Name: David Borden
- Contact:
Re: Large replication question (Putting the feelers out)
I just finished testing and hot-add at the DR site is waaaaaaaaaay faster than network mode. By double. Not sure why you would say your DR SAN is crushed; something must not be configured correctly.
Using FC SAN on both sides, a 100Mbit WAN link from prod to the DR site, and gigabit switches and network connections, hot-add is way faster than network mode for replication. I have a virtual Veeam proxy on both ends using hot-add mode because network mode just wasn't cutting it.
I am running the replication jobs from a Veeam server in the DR site rather than the production site. I keep the metadata repository on the actual virtual Veeam proxy at the production site, to keep the metadata close to the source data on the SAN.
-
- Enthusiast
- Posts: 85
- Liked: 8 times
- Joined: Jun 11, 2012 3:17 pm
- Contact:
Re: Large replication question (Putting the feelers out)
Very interesting. Working with a 1Gbit WAN and using hot-add mode with a proxy on both sides caused excessive queue depth and horrible amounts of read IO. Switching to network mode completely resolved the issue for me. Must be some sort of special case.
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Large replication question (Putting the feelers out)
We are aware of the issue with hotadd causing large amounts of extra I/O on the target side, but so far this doesn't appear to be a widespread issue; it seems specific to certain configurations, although we haven't yet determined the exact situations where it happens. I've been able to reproduce this issue in my personal lab using a simple iSCSI target, but it doesn't seem to occur when using local disks (at least I can't tell that it does). One of the most interesting things is that the I/O does not show up in the vCenter graphs, but if you monitor the storage you can see it easily.
Can you share with us your target storage configuration (model, connectivity) and also how you are monitoring? That would be useful. We do have an ongoing investigation into this issue; I'm actually running some tests in my lab environment right now on this very case. In the interim, network mode is a workaround for customers that see the extra I/O when using hotadd.
BTW, I've only seen this extra load with incrementals; the initial full always seems OK with hotadd. Are you seeing the same behavior?
-
- Novice
- Posts: 3
- Liked: never
- Joined: Jun 12, 2012 6:20 pm
- Full Name: Keith Sutherland
- Contact:
Re: Large replication question (Putting the feelers out)
Glad this has generated interest.
Just finished chatting to our network guy and it appears that DoubleTake traffic currently goes down a 1Gbit fibre to our DR site, and normal day-to-day traffic travels across the 100Mbit connection.
So I have even more bandwidth to play with than I thought.
Should know Exchange delta sizes tonight and SQL tomorrow, so I should have better figures to do the sums with.
Regards and thanks.
-
- Influencer
- Posts: 22
- Liked: never
- Joined: Nov 02, 2011 5:19 pm
- Full Name: Bill Doellefeld
- Location: Colorado, USA
- Contact:
Re: Large replication question (Putting the feelers out)
I don't readily see another thread specifically about this problem, although I've seen some references to it. (Admin: please move if needed; OP: sorry.) I ran into the same problem seen by mbreitba, with huge I/O on the target side, and I see it overwhelm the storage. I've had to force NBD at the target. In my case I only noticed it over time, after upgrading from v5 and moving my "legacy" jobs over to new jobs. I assume I did not notice right away because v5 jobs couldn't use hotadd at the target(?)
tsightler wrote: Can you share with us your target storage configuration (model, connectivity) and also how you are monitoring? That would be useful. We do have an ongoing investigation into this issue; I'm actually running some tests in my lab environment right now on this very case. In the interim, network mode is a workaround for customers that see the extra I/O when using hotadd.
My configuration is: Veeam proxy on both sides, EVA 6400 source to an EVA 4400 target, 30Mbit WAN link, monitoring using the EVA hooks in perfmon. Same as mbreitba, I see very excessive queue depth and very high reads.
-
- Influencer
- Posts: 22
- Liked: never
- Joined: Nov 02, 2011 5:19 pm
- Full Name: Bill Doellefeld
- Location: Colorado, USA
- Contact:
Re: Large replication question (Putting the feelers out)
Forgot to add... I can confirm seeing this as well. It flies on the full, and the incremental just cranks out I/O. An incremental that takes maybe 15 minutes over NBD goes to hours using hotadd and leaves you scratching your head.
tsightler wrote: BTW, I've only seen this extra load with incrementals; the initial full always seems OK with hotadd. Are you seeing the same behavior?
-
- Influencer
- Posts: 22
- Liked: never
- Joined: Nov 02, 2011 5:19 pm
- Full Name: Bill Doellefeld
- Location: Colorado, USA
- Contact:
Re: Large replication question (Putting the feelers out)
tsightler: I was curious if you ever came across any resolution to this issue (hotadd on the target side causing an abnormal amount of target I/O).
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Large replication question (Putting the feelers out)
I have worked with our development team; we have gathered many logs, and they have performed their own testing as well. Unfortunately, at this point there is no resolution. This appears to be a VMware issue and not really related to Veeam. The problem still occurs with B&R 6.5, at least in the latest build that I've tested. For now the only resolution is to use network mode on the target, although ideally I'd like to have customers open support cases and provide their configuration and storage logs.