sjutras
Service Provider
Posts: 19
Liked: never
Joined: Oct 14, 2009 4:23 am
Contact:

Strange behaviour - EQL Snapshots and CBT

Post by sjutras » Mar 01, 2013 4:31 pm

Hi,

I would just like to share a small yet annoying problem that I came across recently, and see if anyone can explain this behaviour.

Context: 1 VM with 1 x 500 GB VMDK, 85% full. This VMDK resides on a 700 GB LUN on the EqualLogic.

When there is no snapshot at the LUN level, the Veeam reverse incremental backup job reads about 15 GB of the 500 GB VMDK. The resulting .VRB file is about 10 GB.
Then I take a snapshot at the LUN level and wait 24 hours. The snapshot space used after 24 hours is about 300 GB (due to the 15 MB page size of EQL). So far, everything is as expected.
Now Veeam starts its reverse incremental backup job of this very same VM. Instead of reading the usual 15 GB of the 500 GB VMDK, it reads about as much data as the snapshot space in use at the LUN level, so about 300 GB. The time required for this incremental is multiplied by almost 10. Yet the resulting .VRB is still only 10 GB.
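
For anyone trying to picture how 15 GB of changed data can end up as roughly 300 GB of snapshot space, here is a rough sketch in Python. It assumes the array preserves a whole 15 MB page whenever any part of that page is overwritten. The write pattern (20,480 writes of 768 KiB, one every 20 MiB) is made up purely so the totals line up with the figures above, and `snapshot_bytes_pinned` is a hypothetical helper, not an EQL or Veeam tool.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB
PAGE_SIZE = 15 * MIB  # EQL page size quoted in the post


def snapshot_bytes_pinned(writes, page_size=PAGE_SIZE):
    """Bytes of snapshot space used if every touched page is kept whole.

    `writes` is an iterable of (offset, length) pairs in bytes.
    Assumption: one changed byte pins the entire page it lives in.
    """
    pages = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // page_size
        last = (offset + length - 1) // page_size
        pages.update(range(first, last + 1))
    return len(pages) * page_size


# Illustrative pattern only: small writes scattered across 400 GiB of the VMDK.
scattered = [(k * 20 * MIB, 768 * 1024) for k in range(20480)]
# The same amount of data written as one contiguous run.
contiguous = [(0, 15 * GIB)]

if __name__ == "__main__":
    for label, writes in (("scattered", scattered), ("contiguous", contiguous)):
        changed = sum(length for _, length in writes) / GIB
        pinned = snapshot_bytes_pinned(writes) / GIB
        print(f"{label}: {changed:.0f} GiB changed, {pinned:.0f} GiB of pages touched")
```

With these made-up numbers the scattered case touches 300 GiB of pages for 15 GiB of changes, while the contiguous case touches only 15 GiB. It only illustrates why the snapshot reserve grows the way it does; it says nothing about why Veeam would read that much.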

If we remove all the snapshots at the LUN level and rerun the Veeam backup job, everything returns to normal.

It is as if the snapshot at the LUN level affects vSphere Changed Block Tracking (CBT). I don't think Veeam is at fault in this story, but it is really strange. We ended up disabling snapshots at the SAN level for the time being.

Does anyone have an idea?

Thanks

joergr
Expert
Posts: 386
Liked: 39 times
Joined: Jun 08, 2010 2:01 pm
Full Name: Joerg Riether
Contact:

Re: Strange behaviour - EQL Snapshots and CBT

Post by joergr » Mar 06, 2013 10:00 pm

Hi,

This sounds extremely strange, and thus all the more interesting to me ;)
So I have quite a lot of questions regarding your case:

a) EQL firmware?
b) Is this a multi-member group, and is the LUN spanned across multiple members?
c) Is the LUN thick or thin?
d) Is the Veeam server accessing the LUN directly via SAN mode or indirectly via NBD?
e) ESXi version?
f) iSCSI initiator: hardware or software?
g) Which NICs access the EQL in the ESXi machine?
h) vSphere licensing edition? Are special features active (storage resources related)?
i) Is EQL MEM 1.1.x installed on the ESXi host?

Best regards,
Joerg

Re: Strange behaviour - EQL Snapshots and CBT

Post by sjutras » Mar 07, 2013 2:53 am

Hello,

Indeed, this is rather strange. Here are the answers to your questions:

a) 6.0.2
b) Yes, 2 members, LUN load balanced across both members, which are 2 x PS4000XV
c) The LUN is thick; the VMDKs, though, are thin
d) Using SAN Appliance (hot-add, so neither SAN mode nor NBD)
e) 5.1 build 914609
f) Software, configured per best practices with the latest version of Dell MEM
g) vmnic3,7,10,11 - vmk1,2,3,4
h) Enterprise
i) Yes, and working well with excellent IOmeter results

Edit: Also using Veeam 6.5 with the latest available patch

Thanks

Re: Strange behaviour - EQL Snapshots and CBT

Post by joergr » Mar 07, 2013 12:06 pm

OK, thanks. Now - I honestly think this could be something huge and really worth examining. Reason: anything accessing a LUN should never see even the slightest difference between a LUN as such and the same LUN with snapshots. This should be completely transparent to everything - the host, the initiator and the CBT engine. The only exception would be tools or plugins that specifically make use of deeply integrated SAN snapshot features. So this is of extreme interest to me, and I will build a test lab with this scenario in the next few days.

Would it be OK for you if I forward/show this discussion to the Dell online community forum, so that the EQL experts from Nashua can take a look at it, too?

Best regards,
Joerg

Re: Strange behaviour - EQL Snapshots and CBT

Post by sjutras » Mar 08, 2013 1:10 am

Sure, you can forward this. If you could send me the discussion link, I'd like to follow it.

Thanks

Re: Strange behaviour - EQL Snapshots and CBT

Post by joergr » Mar 08, 2013 9:11 am

Done ;-)

Best regards,
Joerg

Re: Strange behaviour - EQL Snapshots and CBT

Post by joergr » Mar 11, 2013 6:43 pm

Hi,

A question came up in the Dell community: are you using VSM to take the initial snapshot? If so, does the same thing happen if you create the snapshot in Group Manager instead?

Best regards,
Joerg

Re: Strange behaviour - EQL Snapshots and CBT

Post by sjutras » Mar 13, 2013 12:48 pm

Yes, I was using VSM. I haven't tried without VSM, as we needed crash-consistent VMs in our snapshots. Since this is a production environment, running this test is going to be difficult, but I will try to reproduce it in my lab.

Re: Strange behaviour - EQL Snapshots and CBT

Post by joergr » Mar 13, 2013 12:53 pm

Oh my... I should have asked this from the beginning... thanks, Dell guys ;-)

Please read this: http://kb.vmware.com/selfservice/micros ... Id=1020128

I was under the assumption that the VM was clean of any vSphere snapshots; I was thinking only about EQL snapshots. If you use VSM, it's only natural that vSphere snapshots came into play before the Veeam snapshot was taken, so CBT could not be used as it should.

Best regards,
Joerg

Re: Strange behaviour - EQL Snapshots and CBT

Post by sjutras » Mar 13, 2013 12:58 pm

Hi, oh..

We made sure that the VSM schedule does not conflict with the Veeam schedule, so that the VM is ALWAYS snapshot-clean before a VSM job or a Veeam job.

So is this expected behaviour? For example:
6:00pm, VSM snapshots the VM, snapshots the LUN, removes the VM snapshot
8:00pm, Veeam snapshots the VM, backs up the VM, removes the VM snapshot

This way the VM never has a pending snapshot when one or the other job begins, so CBT should work fine, shouldn't it?

Re: Strange behaviour - EQL Snapshots and CBT

Post by joergr » Mar 13, 2013 1:01 pm

It doesn't matter. If a snapshot is taken somewhere in between, CBT might not be able to track all changes successfully. Please check out the VMware KB doc.

Best regards,
Joerg

Re: Strange behaviour - EQL Snapshots and CBT

Post by sjutras » Mar 13, 2013 1:07 pm

Ok thanks.

dwilliam62
Lurker
Posts: 2
Liked: never
Joined: Jul 21, 2012 3:35 pm
Full Name: Don Williams
Contact:

Re: Strange behaviour - EQL Snapshots and CBT

Post by dwilliam62 » Mar 13, 2013 3:30 pm

Good morning,

My name is Don, from Dell EqualLogic.

I was wondering if you could confirm that the VMware snapshot is always deleted after VSM runs. There have been reports that VSM v3.5 can fail to remove it if there's a problem with the VMware snapshot creation. I don't have Veeam, so I can't test this in house.
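
One way to check this from a script rather than by eye is sketched below in Python. It is only a sketch under assumptions: it expects an object shaped like pyVmomi's `vim.VirtualMachine` (using its `snapshot.rootSnapshotList`, `childSnapshotList` and `config.changeTrackingEnabled` properties), it leaves out the vCenter connection entirely, and the stub VMs and the snapshot name `vsm-temp` are invented for the demo. `cbt_precheck` is a hypothetical helper name.

```python
from types import SimpleNamespace


def snapshot_names(vm):
    """Return the names of all vSphere snapshots on the VM, parents first."""
    info = getattr(vm, "snapshot", None)  # None when the VM has no snapshots
    if info is None:
        return []

    names = []

    def walk(nodes):
        for node in nodes or []:
            names.append(node.name)
            walk(node.childSnapshotList)

    walk(info.rootSnapshotList)
    return names


def cbt_precheck(vm):
    """Summarise the two things worth confirming before a backup job runs."""
    return {
        "cbt_enabled": bool(vm.config.changeTrackingEnabled),
        "leftover_snapshots": snapshot_names(vm),
    }


# Stub objects standing in for real pyVmomi VMs (all names are invented).
clean_vm = SimpleNamespace(
    snapshot=None,
    config=SimpleNamespace(changeTrackingEnabled=True),
)
stuck_vm = SimpleNamespace(
    snapshot=SimpleNamespace(
        rootSnapshotList=[SimpleNamespace(name="vsm-temp", childSnapshotList=[])]
    ),
    config=SimpleNamespace(changeTrackingEnabled=True),
)

if __name__ == "__main__":
    print(cbt_precheck(clean_vm))
    print(cbt_precheck(stuck_vm))
```

Running something like this between the VSM job and the Veeam job, and logging the result, would show whether a snapshot is ever left behind.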

In your post, you mention you want "crash consistent" snapshots. On EqualLogic arrays, snapshots taken via the EQL GUI are considered crash consistent, since anything in the array cache or on disk is included in the snapshot. Only cached writes on the host are excluded, which is a very small window of vulnerability. This is no different from pulling the plug on a physical server, or from a hypervisor host failing. VSM provides greater than crash consistency. ASM/ME provides what we consider to be application consistency when dealing with MS SQL, Exchange, and SharePoint, since it integrates tightly with those products. ASM/ME requires that the MS OS connects directly to the volume with its MS iSCSI initiator, not through a VMFS volume or RDM. So you can consider the snapshot options as "good, better, best".

Regards,

Don

Re: Strange behaviour - EQL Snapshots and CBT

Post by sjutras » Mar 13, 2013 3:45 pm

Hi Don!

I can confirm that VSM removes the VMware snapshot correctly before Veeam proceeds with its backup. That is why I thought it was strange.

By crash consistent, I really only meant putting all the odds on our side of keeping the VMDK valid. I think this may be why VSM snapshots the VM(s) residing on the LUN being snapshotted. For application-level protection, we aren't using this, since there is no native NTFS LUN in this particular configuration and Veeam takes care of it in the backups.

It is true, though, that a snapshot of a VMFS LUN taken directly via the EQL GUI is, from a VMDK standpoint, no worse than a host failure.

Thanks
