Veeam vs. vSphere Replication

Post by **pcantoni** » Sep 12, 2012 10:21 am

Hi.
Can someone post the key differentiators between Veeam replication and vSphere 5.1 Replication?

Thanks in advance.

Pier

Sep 12, 2012 10:32 am

The #1 limitation is that it provides a single restore point only. I could stop at that: this is an immediate show stopper for most customers I have discussed vSphere Replication with. Multiple restore points are absolutely essential because, just like "good" data, any corruption, virus or data loss on the source VM is immediately replicated to the target VM. If you don't spot the problem and fail over to the replica fast enough (before the next replication cycle), which is going to be impossible in most cases, then you are done.
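To make the restore point argument concrete, here is a minimal sketch. It is a toy model, not the behaviour of either product: one replication cycle per tick, the replica keeps the last `keep` restore points, and the names `can_recover`, `corrupted_at`, `detected_at` and `keep` are made up for illustration.

```python
from collections import deque


def can_recover(corrupted_at: int, detected_at: int, keep: int) -> bool:
    """Return True if a clean restore point still exists at detection time.

    Toy model: the source is corrupt from cycle `corrupted_at` onwards and
    every cycle replicates whatever state the source is in.
    """
    restore_points = deque(maxlen=keep)  # oldest points fall off automatically
    for cycle in range(detected_at + 1):
        state = "corrupt" if cycle >= corrupted_at else "clean"
        restore_points.append((cycle, state))
    return any(state == "clean" for _, state in restore_points)


# Corruption hits at cycle 10 and is noticed three cycles later.
print(can_recover(corrupted_at=10, detected_at=13, keep=1))  # False
print(can_recover(corrupted_at=10, detected_at=13, keep=7))  # True
```

With a single restore point the only copy left is already corrupt; with seven, cycles 7 to 9 are still available to fail over to.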

Other limitations
• No failback
• No traffic compression
• No traffic throttling
• No swap exclusion
• No network customization (network mapping)
• No re-IP upon failover
• Minimum possible RPO is 15 minutes
• Basic VSS quiescing (no application-aware processing)
• Works within single vCenter only
• No ability to create container-based jobs (explicit VM selection only)
• Limited seeding options: cannot seed from backup, or using different VM as a seed (disk IDs have to match)
• Different ports for initial and incremental sync required
• No good reporting

Also, be aware that the biggest marketing push around vSphere Replication is a technically incorrect statement!
“Unlike other solutions, enabling vSphere replication on a VM does not impact I/O load, because it does not use VM snapshots”

It is simply impossible to transfer a specific state of a running VM without some sort of snapshot, even in theory! In reality, during each replication cycle they do create a hidden snapshot to keep the replicated state intact. It is just a different type of snapshot (the exact same concept as Veeam reversed incremental).

PROS: No commit required, snapshot is simply discarded after replication cycle completes.
CONS: While replication runs, there is 3x I/O per each modified block that belongs to the replicated state. This is the I/O impact that got lost in marketing.

I still consider this better than regular snapshots in some cases, but definitely not better by a mile.

Post by **Berniebgf** » Sep 12, 2012 11:26 am

So VMware's doing "hidden" Copy on Write snapshots instead of traditional "Redirect-on-Write" snapshots? Interesting...

Bernie.

Sep 12, 2012 2:07 pm

Correct. They never mentioned this helper snapshot in their presentation at VMworld, which kept me wondering about their "no I/O impact" claim, something completely impossible even in theory. So I had to interrogate their developers to find out how it really works.

In any case, I still consider this approach way better than the native Hyper-V 2012 replica. Hyper-V uses a journaling approach, meaning double write I/O at all times for protected VMs. Also, I assume Hyper-V replica does not transfer just the final state of each block, but essentially all block states since the last replication (the whole journal is transferred and then replayed on the target). I guess the only "benefit" of this approach is a consistent, and therefore predictable, I/O impact (it does not change when actual replication cycles are running).
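A rough way to see the difference in transferred data, assuming (as the post itself only assumes) that a journal-based replica ships every logged write while a changed-block approach ships only the final state of each block. The function name and the workload below are hypothetical.

```python
def blocks_transferred(writes: list[int]) -> dict[str, int]:
    """Compare how many blocks two replication styles would ship.

    `writes` is the sequence of block numbers the VM wrote since the
    last replication cycle (a made-up workload).
    """
    return {
        # Journal replay: every logged write is shipped and replayed.
        "journal": len(writes),
        # Changed-block tracking: only the final state of each block.
        "final_state": len(set(writes)),
    }


# A log file rewriting the same two blocks, plus one unrelated write.
workload = [7, 8, 7, 8, 7, 8, 42]
print(blocks_transferred(workload))  # {'journal': 7, 'final_state': 3}
```

The more often a workload rewrites the same blocks, the wider the gap between the two numbers.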

Post by **averylarry** » Oct 23, 2012 8:27 pm

If I understand from this page:

http://www.vladan.fr/vmware-srm-5-1-whats-new/

The new vSphere Replication (VR) has been improved over since the 1st VR with SRM. The VR’s main improvements is the VSS integration (through VMware Tools) and doesn’t merely request OS quiescence, but flushes app/db writers if present. Can create application consistent copies of entire VMs.

It will do application aware processing. Unless it's something in-between . . ?

Boy it would be nice if Veeam could use those "Lightweight Delta" (LWD) files (snapshots).
It's strange to me that it cannot replicate the VM if it is powered off.

Post by **Gostev** » Oct 23, 2012 9:35 pm

Even your quote itself does not say anything about application-aware processing. Nothing beyond non-application aware quiescence, all the same old stuff added back in ESX 3.5 U2 (VMware Tools VSS).

We will not use LWD, because VMware does not provide any API around this that would allow 3rd party vendors to leverage this functionality. Besides, LWD has drawbacks too, for example it triples I/O for each modified block belonging to the replicated state, which results in significant I/O impact on production storage when replicating over slow links. So, it’s not necessarily as cool as VMware marketing makes you think it is.

We do have much better technology in the works though.

Post by **ventura5150** » Jan 07, 2013 4:00 am

Hi..
I am a newbie in virtualization and want to ask: what is the difference between vSphere Host Replication (included in VMware vSphere 5.1) and Veeam Backup & Replication?

Thx

Post by **bbosak** » Apr 09, 2013 4:27 pm

Berniebgf wrote: So VMware's doing "hidden" Copy on Write snapshots instead of traditional "Redirect-on-Write" snapshots? Interesting...
When using vSphere Replication we never see the snapshot in the snapshot manager EXCEPT if there's already a snapshot in place. Then "vSphere Replication snapshot" appears in the snapshot manager.

Post by **Dima P.** » Apr 10, 2013 9:55 am

Hello Brian,

Yes, as mentioned above:

It is simply impossible to transfer specific state of running VM without some sort of snapshot even in theory! In reality, during each replication cycle they do create hidden snapshot to keep the replicated state consistent, just different type of snapshot (exact same concept as Veeam reversed incremental).

Post by **Gostev** » Apr 27, 2013 9:08 pm

The snapshot vSphere replication creates is a special type of snapshot that is not visible in the Snapshot Manager.

Post by **pcantoni** » Sep 06, 2013 7:16 am

Hi.
Can someone post the key differentiators between Veeam 7 replication and vSphere 5.5 Replication?

Thanks in advance.

Pier

Post by **Vitaliy S.** » Sep 06, 2013 10:52 am

Hi Pier,

vSphere Replication now has an option to save multiple restore points and to replicate between two different vCenter Servers, but basically all core differentiators have remained the same.

Hope this helps.

Post by **andersonts** » Sep 08, 2013 1:04 am

We did see SureReplica added to V7 as well which allows you to automate testing of replicas at the DR side. This is a big difference.

Post by **vmKen** » Oct 07, 2013 7:40 pm

There is some misleading information about how vSphere Replication works in here. There is NO snapshot of any sort created at the primary location, no "hidden" snapshot, no impact whatsoever to the running virtual machine. During VSS quiescing where VSS is used for W2k8/W2k12 there will be a snapshot done, because there is no other way to make those OSes quiescent due to the VSS implementation for those OSes. But there is absolutely no impact to the VM doing its writes, no write filter, no CoW, etc. Blocks are written to the vmdk exactly as they would for non-replicated VMs.
With non-VSS replicas, VR simply grabs the blocks that have changed from the VMDK that have already been written and flagged as changed since the last replication took place. No snapshot, no CoW, etc.
If a block is changed on the primary vmdk while it is in transit and has not yet been written to disk at the recovery location, that one block will be redirected to a CoW until the block is acknowledged as written at the recovery site.
Without VSS there is no 'hidden snapshot'. With VSS for w2k8/w2k12 there is a snapshot because of the way MS does their VSS.

There is no triple I/O per block, that is misleading. The blocks for the primary vmdk are written to disk exactly once and that is it. It is then copied to the recovery site at time of replication and written to a redo log which then gets committed once the redo log is complete. The impact to the production VM is nothing as it is just simply a block write like any other block write.

averylarry · Oct 07, 2013 9:30 pm

vmKen -- So what do you think LWD is?

http://thesaffageek.co.uk/2012/08/27/vs ... plication/

Oct 07, 2013 9:52 pm

Hi Ken, thanks for taking time to register on our forums to post this.

I see you work for VMware, so I recommend that you just talk to vSphere Replication developers, because this is where I got all my information from. They actually had a great session at VMworld 2012 with lots of technical details about how it works in depth, with lots of pictures and long Q&A afterwards.

The "hidden" snapshot is called LWD in VMware terminology (Light Weight Delta), and it does exist. Again, you don't have to take my word for it, just talk to your devs directly.

You have provided a very long and confusing explanation, so instead of trying to address individual misstatements, I think it will be best for me to approach this from a different angle, which will be easy to understand for everyone even without having to know specifics of the particular implementation.

Here is how I like to explain this. It is impossible to perform asynchronous replication without some sort of snapshot created for the duration of data transfer, because you must have means of protecting a replicated state of the VM image, while replication of that state takes place (which can easily take minutes). And this is not possible without some sort of snapshot even in theory. Simple as that! Now, synchronous replication is the whole other story, blocks are replicated immediately as they are modified, but this is NOT what vSphere Replication does.
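The argument can be illustrated with a small simulation. This is only a sketch under simple assumptions: one block is copied per step, `writes` maps a step number to the VM writes that land just before that step, and `freeze=True` stands in for any mechanism (snapshot, CoW, redirect) that preserves the state being replicated. All names are hypothetical.

```python
def replicate(disk, writes, freeze):
    """Copy `disk` block by block while the VM keeps writing to it."""
    source = list(disk) if freeze else disk  # protected view vs live disk
    replica = []
    for step in range(len(disk)):
        for block, value in writes.get(step, []):
            disk[block] = value  # VM write lands on the live disk
        replica.append(source[step])
    return replica


# Blocks 0 and 3 belong together (say, data and its checksum).
# The VM updates both after block 0 has already been copied.
writes = {2: [(0, "v2"), (3, "v2")]}
print(replicate(["v1", "-", "-", "v1"], writes, freeze=False))  # ['v1', '-', '-', 'v2']
print(replicate(["v1", "-", "-", "v1"], writes, freeze=True))   # ['v1', '-', '-', 'v1']
```

Without protection the replica holds a mix of old and new blocks, a state the source VM was never in. With protection it holds a state that did exist at one point in time.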

As it comes to marketing papers (LinkedIn says you have a marketing role at VMware), it is perfectly acceptable to state there that vSphere Replication does not use snapshots (because most users think VM snapshots when they hear "snapshot"). However, we are not discussing technologies on the marketing level on these forums, but rather a few levels below that

Now, don't get me wrong: no one here says the LWD approach is bad. As I've said above, LWD is better than using regular VM snapshots, and is much better than the approach Microsoft implemented for Hyper-V replica. But every technology has its pros and cons, and it is important for VMware to be very clear about both with the technical audience.

Gostev wrote: PROS: No commit required, snapshot is simply discarded after replication cycle completes.
CONS: While replication runs, there is 3x I/O per each modified block that belongs to the replicated state.

Also, while you are here, do you care to comment why VMware would not open LWD API for the 3rd party vendors to use? As you can see above, your users would like 3rd party vendors like Veeam to be able to leverage this technology.

Here at Veeam, we've put our bets on integrating with storage snapshots, as only this can help to completely eliminate I/O overhead on replicated VMs during the data transfer window... but the obvious CONS of our approach is that it is limited to certain supported storage devices only. And even though we are constantly expanding this list, having access to LWD would enable us to deliver universal engine we could failover to in case of incompatible storage, thus enabling 100% of our joint customers to have better VMware-based data protection strategy.

Thank you!

Anton Gostev
VMware vExpert 2013

Post by **averylarry** » Oct 07, 2013 10:20 pm

I think Ken is suggesting that it's really an extended fancy version of a synchronous replication.

vmKen · Oct 07, 2013 10:20 pm

Yes, I'm one of the guys who presented that session at VMworld 2012 (and 2013), I'm an architect in the product management group who does the technical marketing for VR and SRM. I speak with the developers of the product on a daily basis.
The LWD is not a hidden snapshot, the LWD is a collection of blocks that is created dynamically at time of replication. We create pointers to the blocks as they change, in a memory bitmap and in a file called the PSF file (persistent state file). They are simply pointers that get updated as the blocks change. No snapshot, no intrusion.
When the scheduler determines it is time to replicate we refer to those pointers to know which blocks need to be sent, and read a copy of the blocks at their current state for replication. Those blocks are read into buffers and shipped to the recovery site where a network file copy writes them to a redo log on the recovery site. There is no intrusion or interaction with the production VM at all.
There are two scenarios in which there might be interaction with the objects on the replicated side. 1) If a block that is being replicated changes at the *exact* moment we are reading and sending it, we need to protect the block until we are assured it has been written at the recovery location. In that one instance alone we redirect the current write to the persistent state file until the replicated block is written and acknowledged, and then we commit it to the original vmdk. No snapshot takes place, no intrusion to the VM, no stun, nothing, but this may be considered a CoW for *individual* blocks that are changing only while they are being sent and the send has not completed. This does not interact with the VM or its writes directly, except in those rare scenarios where that particular block is changing *during* its replication. And even then the VM is unaware of it and is not interacted with in any fashion like a snapshot.
The LWD is the bundle of blocks that we treat as a unit for replication, all the blocks that are read for replication at one time.
2) The only other scenario where there *is* an actual snapshot is if the VM is set up to replicate with VSS quiescing and the OS is 2k8/2k12. For those systems the only way Microsoft implements VSS is through snapshots, and even then we do this through an interesting 'forked snapshot' that is very temporary and discarded after replication is complete.
For normal run of the mill operation there is no snapshot. We behave more like a 'delayed synchronous' where we *track* them as they are modified, but ship them as a bundle (the LWD) asynchronously. If we were using a filter in-line to the writes that forked the write, you'd be correct that we'd need to stun the VM in some fashion, but instead we allow every write to take place directly and simply track which blocks are changed, then grab a read of those blocks non-intrusively.
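For readers who find the description easier to follow as code, here is a toy model of what this post describes: writes go straight to disk and are only tracked, a bundle of changed blocks is read at replication time, and a write to a block that is still in flight is parked until the recovery site acknowledges it. This is an illustrative sketch only. The class, its methods and the in-memory dictionaries are made up and say nothing about how vSphere Replication is actually implemented.

```python
class TrackedDisk:
    """Toy changed-block tracker with protection for in-flight blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.dirty = set()      # stands in for the bitmap / PSF pointers
        self.in_flight = set()  # blocks sent but not yet acknowledged
        self.redirected = {}    # writes parked while their block is in flight

    def write(self, index, data):
        if index in self.in_flight:
            self.redirected[index] = data  # protect the copy being sent
        else:
            self.blocks[index] = data      # normal write, straight to disk
        self.dirty.add(index)              # track it for the next cycle

    def start_cycle(self):
        """Read the changed blocks as one bundle (the LWD in the post's terms)."""
        self.in_flight = set(self.dirty)
        self.dirty.clear()
        return {i: self.blocks[i] for i in sorted(self.in_flight)}

    def acknowledge(self, index):
        """Recovery site confirmed the block: commit any parked write."""
        self.in_flight.discard(index)
        if index in self.redirected:
            self.blocks[index] = self.redirected.pop(index)


disk = TrackedDisk(["a", "b", "c"])
disk.write(0, "A")
disk.write(2, "C")
bundle = disk.start_cycle()  # {0: 'A', 2: 'C'}
disk.write(0, "A2")          # block 0 is in flight: parked
disk.write(1, "B")           # block 1 is not: written directly
disk.acknowledge(0)
disk.acknowledge(2)
print(bundle, disk.blocks, sorted(disk.dirty))
```

In this model only the write to block 0 costs extra I/O (park, then commit). The write to block 1 goes to disk once and is simply remembered for the next cycle.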

Post by **Gostev** » Oct 07, 2013 10:24 pm

averylarry wrote:I think Ken is suggesting that it's really an extended fancy version of a synchronous replication.

I wish it was, but the story falls apart as soon as you copy a 2GB ISO to a VM synchronized by vSphere Replication to an offsite location over a 1Mbps link.
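The arithmetic behind this test is quick to check. A minimal sketch, assuming decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and ignoring protocol overhead; the 15-minute figure is the minimum RPO quoted earlier in the thread.

```python
def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Hours needed to push `size_gb` over a `link_mbps` link."""
    bits = size_gb * 8 * 1000**3
    seconds = bits / (link_mbps * 1000**2)
    return seconds / 3600


hours = transfer_hours(2, 1)
print(f"{hours:.1f} h")                             # 4.4 h
print(f"{hours * 60 / 15:.0f} RPO windows of 15 min")  # 18 RPO windows of 15 min
```

A synchronous replica could not let the VM run hours ahead of the target, so a VM that keeps running normally during such a copy is being replicated asynchronously.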

Post by **averylarry** » Oct 07, 2013 10:30 pm

"read a copy of the blocks at their current state for replication." "There is no intrusion or interaction with the production VM at all."

I do not understand how these 2 statements are not direct contradictions?

Post by **averylarry** » Oct 07, 2013 10:32 pm

Gostev wrote: I wish it was, but the story falls apart as soon as you copy a 2GB ISO to a VM synchronized by vSphere Replication to an offsite location over a 1Mbps link.

Not if you have enough RAM as a local buffer, and enough overall bandwidth to eventually catch up. Right?

Post by **Gostev** » Oct 07, 2013 10:32 pm

Hi Ken, now with the above explanation, it sounds like we are on the same page.

To me, that persistent state file you referenced is a type of "hidden snapshot" in my definition, an entity used to protect the replicated VM state, and what is causing the extra I/O. You are perfectly correct this time, stating that PSF file is a CoW type of storage - and when there is CoW, there is an extra I/O.

Write to CoW, then read from CoW and commit into VMDK gives that 3x I/O per each modified block that belongs to the replicated state, just as I've said above.

Can we agree that we agree with each other?

Thanks!

Post by **vmKen** » Oct 07, 2013 10:36 pm

Sorry, didn't see the rest of this down below:

Gostev wrote: As it comes to marketing papers (LinkedIn says you have a marketing role at VMware), it is perfectly acceptable to state there that vSphere Replication does not use snapshots (because most users think VM snapshots when they hear "snapshot"). However, we are not discussing technologies on the marketing level on these forums, but rather a few levels below that

When I wrote the technical material you called it confusing.

Happy to get as detailed as you like, that's my job. I've posted a lot of material on this topic at http://blogs.vmware.com/vsphere/uptime.

Gostev wrote: PROS: No commit required, snapshot is simply discarded after replication cycle completes.
CONS: While replication runs, there is 3x I/O per each modified block that belongs to the replicated state.

Not sure where this 3x I/O is coming from. There is no intercept of the writes, they write out as normal. Each block is read once for replication (as all replication needs to do), and that's it. So it's a single write and a single read for the changed blocks on the protected site. If you're including the write I/O to the redo log and commit at the recovery site, that's a bit unfair as every replication technology needs to do writes at the target...

Gostev wrote: Also, while you are here, do you care to comment why VMware would not open LWD API for the 3rd party vendors to use? As you can see above, your users would like 3rd party vendors like Veeam to be able to leverage this technology.

There is no published API at all, to partners or customers, it's using a fundamental call within the kernel itself, so it's hard to expose that gracefully to the outside world. Trust me, we get beat up on APIs about this all the time, but securing the kernel is important. So we have a few calls we can make to it internally (via CLI) to configure the replication and that's about it. Lots of people want API access and lots of people want expanded CLI. We're always looking at how to do that though! Some other vendors in the world are doing... inappropriate things to gain access to things like the vSCSI filters without an API, and the problem there is if we change anything at all on those internal calls the whole house of cards might come down. People don't like it when their replication stops working for DR.

So we're looking at potentially writing a published API for this, but since that hasn't been in scope from the start it's something we're going to have to retrofit.

Gostev wrote: Here at Veeam, we've put our bets on integrating with storage snapshots, as only this can help to completely eliminate I/O overhead on replicated VMs during the data transfer window... but the obvious CONS of our approach is that it is limited to certain supported storage devices only. And even though we are constantly expanding this list, having access to LWD would enable us to deliver universal engine we could failover to in case of incompatible storage, thus enabling 100% of our joint customers to have better VMware-based data protection strategy.

Sure, if we get an API developed for this or rolled into the SDK or the like it'll be much easier to layer on top of this, but my goal in coming here was strictly to clear up a few things, not solve all the difficulties of VMware partnership.

Post by **Gostev** » Oct 07, 2013 10:37 pm

averylarry wrote:Not if you have enough RAM as a local buffer, and enough overall bandwidth to eventually catch up. Right?

Yes, albeit there will be a few cycles of missed RPOs, but eventually it should catch up. I simply referenced the most basic test everyone can perform to realize vSphere Replication is not synchronous replication (or an extended fancy version of it).

Post by **averylarry** » Oct 07, 2013 10:39 pm

Gostev wrote:...
To me, that persistent state file you referenced is a type of "hidden snapshot" in my definition, an entity used to protect the replicated VM state, and what is causing the extra I/O.

...

Ken specifically stated there is no CoW (with an exception). The PSF file is more like a second CBT file, containing only pointers to changed blocks, not the changed data itself.

If I understand him . . .

Post by **vmKen** » Oct 07, 2013 10:41 pm

Gostev wrote:Hi Ken, now with the above explanation, it sounds like we are on the same page.
To me, that persistent state file you referenced is a type of "hidden snapshot" in my definition, an entity used to protect the replicated VM state, and what is causing the extra I/O. You are perfectly correct this time, stating that PSF file is a CoW type of storage - and when there is CoW, there is an extra I/O.

Write to CoW, then read from CoW and commit into VMDK gives that 3x I/O per each modified block that belongs to the replicated state, just as I've said above.

Well I guess I'm getting picky - the PSF file isn't a snapshot, it's just a file that in no way looks like a snapshot or interacts with the snapshot tree, that's where I was getting caught up.
And that CoW in the PSF is pretty rare... It's not affecting every write, it's only for changed blocks that 1) have changed while the replication for that VMDK is taking place, 2) have not already been sent from the current LWD, 3) have been sent but not written and acknowledged by the recovery site.
In that case there is a write, a redirect, then a write. So 2 extra writes just for those blocks. How is that different than populating them into a snapshot then committing the snapshot though?
I think we're close to agreement.

Post by **vmKen** » Oct 07, 2013 10:43 pm

Great conversation, I'll come back to chat more, right now I've got to run. If anyone's in VMworld in Barcelona, come say hi!
-Ken

Post by **Gostev** » Oct 07, 2013 10:48 pm

averylarry wrote:Ken specifically stated there is no CoW (with an exception)

OK, however I have always been talking about the exception (modified blocks belonging to the replicated state).

Anyway, I think the latest post from Ken sums it up quite nicely.
This is exactly the overhead I/O I was talking about, confirmed:

vmKen wrote:CoW ... for changed blocks that
1) have changed while the replication for that VMDK is taking place,
2) have not already been sent from the current LWD,
3) have been sent but not written and acknowledged by the recovery site.

I think we are all in agreement now (except for the definition of "snapshot", haha).

Thank you both.

Post by **averylarry** » Oct 07, 2013 10:55 pm

I don't think so. Ken claims that only the very small subset of changed blocks that are changed again during the replication cycle are CoW, whereas Veeam, because it uses a VMware snapshot, will CoW ANY and ALL blocks that are changed during the replication cycle.

If I understand correctly, Veeam using a VMware snapshot will have 3X I/O for ALL changed data during the replication cycle, whereas VMware will have 2X I/O for all changed data and 3X I/O only for changed data that is changed again.
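Put as numbers, the model in this post looks like the sketch below. It encodes only this post's reading (3 I/Os per changed block with a VM snapshot; 2 per changed block with vSphere Replication, rising to 3 for blocks changed again while in flight), which the thread has not settled. The function name and the sample figures are hypothetical.

```python
def io_ops(changed: int, rechanged: int) -> dict[str, int]:
    """Count I/O operations under the model in the post above.

    changed   -- blocks modified during the replication cycle
    rechanged -- subset of those modified again while still in flight
    """
    if not 0 <= rechanged <= changed:
        raise ValueError("rechanged must be between 0 and changed")
    return {
        "vm_snapshot": 3 * changed,
        "vsphere_replication": 2 * (changed - rechanged) + 3 * rechanged,
    }


print(io_ops(changed=1000, rechanged=50))
# {'vm_snapshot': 3000, 'vsphere_replication': 2050}
```

Under this reading the two approaches only converge when every changed block is changed again during the cycle.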

Post by **averylarry** » Oct 07, 2013 10:57 pm

Ken -- I'd still like you to address this:

averylarry wrote:"read a copy of the blocks at their current state for replication." "There is no intrusion or interaction with the production VM at all."

I do not understand how these 2 statements are not direct contradictions?
