mrstorey
Influencer
Posts: 20
Liked: never
Joined: Jun 24, 2013 11:11 am
Full Name: Alex Storey
Contact:

Lefthand Snapshots

Post by mrstorey »

Hi,

I’ve already logged a support case regarding this issue (Case 00502827), but seeing as I don’t think it’s a Veeam issue per se, I thought I’d throw it out to the forum.

Do any of you have experience with Lefthand Array Snapshots?

I excitedly tried out the ‘Backup From Storage Snapshot’ feature last night, after a couple of successful runs against one of our newer Lefthand clusters. Here’s a brief rundown of our environment:

ESXi 5.1 / vCenter 5.1
Cluster 1 = 4 x P4300 G3 running SAN/iQ 9.5 (needs upgrading, I know)
Cluster 2 = 2 X StoreVirtual 4350 + FOM running Lefthand OS v11
All storage 10Gb iSCSI
Windows 2008 R2 Backup Proxy and Repo - 10Gb iSCSI, Direct SAN Mode
Veeam v7 R2a

However, the job ended up failing with the error 'The underlying connection was closed: A connection that was expected to be kept alive was closed by the server'. I presume this was because the array snaps took up a disproportionately large amount of free space in the cluster: snapping two 500GB VMFS LUNs on Cluster 1 as part of a production Windows backup run immediately generated capacity utilisation alerts.

Cluster 1 has around 11TB of RAW space, with about 2.5TB free. All VMFS LUNs are 500GB provisioned with 2-way network RAID10.

I watched the snaps of 2 x 500GB data stores take place in the CMC - the moment they did, nearly 2TB of available space was stripped from the cluster - generating an alert for exceeding 95% capacity utilisation.
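
The alert is consistent with simple arithmetic on the figures above. A rough sketch in Python, assuming (this is not confirmed at this point in the thread) that a snapshot of a fully provisioned volume reserves about the volume's provisioned size on each of the two Network RAID10 copies, and using decimal units (1TB = 1000GB). The function names are illustrative only.

```python
def utilisation(raw_tb: float, free_tb: float) -> float:
    """Fraction of raw cluster capacity that is not free."""
    return (raw_tb - free_tb) / raw_tb


def snapshot_reservation_tb(volume_gb: float, volumes: int, copies: int = 2) -> float:
    """Assumed worst case: each snapshot reserves the full provisioned
    volume size on every Network RAID copy (an assumption, not HP's formula)."""
    return volume_gb * volumes * copies / 1000


raw_tb, free_tb = 11.0, 2.5  # Cluster 1 figures quoted above
reserved_tb = snapshot_reservation_tb(500, volumes=2)  # two 500GB LUNs

before = utilisation(raw_tb, free_tb)
after = utilisation(raw_tb, free_tb - reserved_tb)

print(f"reserved by snaps: {reserved_tb:.1f}TB")
print(f"utilisation before: {before:.1%}, after: {after:.1%}")
```

Under that assumption the two snaps reserve about 2TB, which is enough to push the cluster just past the 95% alert threshold.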

The VMs never started backing up and I then watched the snaps get deleted - immediately freeing up the space, and the job finished after removing all the VM snapshots taken in vCenter.

VMs on Cluster 2 seem to process fine - however since it’s a newer cluster and has plenty more available space, no disk space alerts were generated. The amount of space the snaps took though still seemed very excessive.

Has anyone else seen this? It looks like the array snaps aren’t even Thin Provisioned or something - rather than growing over time they just seem to demand an enormous chunk of disk immediately.

I’d really like to use this Enterprise Plus Veeam feature, but if the disk space requirements for Lefthand array snaps are so high then it doesn’t seem cost effective to use - simple, tried and tested Direct SAN over iSCSI is working really well for us after all.

Any advice from Lefthand users out there?

Thanks!
Alex
Gostev
Chief Product Officer
Posts: 32746
Liked: 7962 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Lefthand Snapshots

Post by Gostev »

Hi Alex, I've checked with one heavy LeftHand user, and he did not observe any storage space issues like the ones you mentioned with this functionality... he noted LeftHand snapshots are thin, so he only ran into a disk space issue once, after creating a dozen snapshots at the same time on a 60TB volume with 2TB of free disk space.
dellock6
Veeam Software
Posts: 6208
Liked: 1995 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Lefthand Snapshots

Post by dellock6 »

Hi Alex,
thanks to Anton for making me aware of this thread, I am the "heavy Lefthand user" :) (thanks Anton for the praise, I owe you a beer or two)

Anyway, yes, I can totally confirm Lefthand snapshots are thin. There are some "issues" in the CMC, which mean the information you get is not consistent and depends on where you are looking. I usually check used space in the cluster summary: select the cluster on the left, then "Use Summary" in the tabs.

We had some problems with excessive snapshot usage because, after all, LH uses copy-on-write, so both a large number of snapshots and, most of all, their commit are going to kill I/O if you go too far. But we never had a problem because of space.

Just to complete the scenario, I used Storage Snapshots on LeftHand OS 10.5, and we started the upgrades to 11. Never with 9.5, so I don't know if there are differences in the observed behaviour.

Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
mrstorey

Re: Lefthand Snapshots

Post by mrstorey »

Hi Chaps,

Thanks for your quick replies - I'll carry out some more testing today to hopefully shed a little more light on the issue.

I'm using v11 CMC with both v9.5 and v11 clusters, which are in separate management groups.

We don't actually have a great many VMs to back up from the Lefthands at the moment - only around 4.5TB / 70 VMs, and it generally completes within 1.5hrs using Direct SAN, so I don't expect the array snaps to grow very large during this backup window.

Ok - I've just tried creating a snap of a 700GB VMFS LUN that's 80% utilised on Lefthand OS v11. As you pointed out, there is a difference between the 'Details' tab and the 'Summary' tab:

Details Tab Before - Available 2949.99GB, Used 1421.88GB, Util 67%
Details Tab After - Available 2490.21GB, Used 1422.46GB, Util 72%

Summary Tab Before - Provisioned 6201.49GB, Available 2942.99GB
Summary Tab After - Provisioned 6654.27GB, Available 2490.21GB

This suggests to me that the snap instantly uses around 450GB of space? Seems like a lot... multiply that up by a few LUNs during parallel processing and you end up with a large space requirement.

I'll try the same test again on a v9.5 Cluster - this time a 500GB VMFS LUN that's 85% utilised:

Details Tab Before - Available 2443.24GB, Used 8217.51GB, Util 78%
Details Tab After - Available 1454.73GB, Used 8217.71GB, Util 87%

Summary Tab Before - Provisioned 8881.75GB, Available 2443.24GB
Summary Tab After - Provisioned 9870.26GB, Available 1454.73GB

In this case, it looks like the snap used 989GB !!

Even though in both cases the amount of 'Used' space barely changed, the utilisation shot up, generating utilisation alerts.

Maybe I should just ignore the alerts, and rest assured that they're merely reporting the fact I'm in danger of overprovisioning storage on the array? In which case, if this is expected behaviour, maybe there's another reason my Veeam backup job failed?
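
For reference, the size of each jump can be read straight off the Summary tab figures. A small Python sketch; the class and function names are made up for illustration, and the only inputs are the readings posted above.

```python
from dataclasses import dataclass


@dataclass
class SummaryReading:
    provisioned_gb: float
    available_gb: float


def snapshot_delta(before: SummaryReading, after: SummaryReading) -> tuple[float, float]:
    """Return (growth in provisioned space, drop in available space) in GB."""
    grew = round(after.provisioned_gb - before.provisioned_gb, 2)
    dropped = round(before.available_gb - after.available_gb, 2)
    return grew, dropped


# Summary tab readings copied from the two tests above
v11 = snapshot_delta(SummaryReading(6201.49, 2942.99), SummaryReading(6654.27, 2490.21))
v95 = snapshot_delta(SummaryReading(8881.75, 2443.24), SummaryReading(9870.26, 1454.73))

print("v11 cluster:", v11)
print("v9.5 cluster:", v95)
```

In both tests the growth in 'Provisioned' matches the drop in 'Available' exactly, so the CMC appears to be treating the snapshot as a reservation taken out of available space, whatever is physically written.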

Thanks for your interest and help with this thread - much appreciated!

Alex
dellock6

Re: Lefthand Snapshots

Post by dellock6 »

Hi Alex,
thanks for your tests, good to see some real numbers. I'm not sure if the excessive used space is some form of protection Lefthand uses to avoid completely filling the free space; the only people who can confirm this are the HP guys. Probably Anton or others at Veeam can get in touch with them and obtain a reply.

I'm almost sure the initial size of the snap is 1GB and then it grows from there. I saw that size (funnily enough) more clearly in the Veeam console, browsing the Lefthand storage, than in the CMC :)

Luca.
Gostev

Re: Lefthand Snapshots

Post by Gostev »

Hi Alex, please check if you have the same issue when creating the LUN snapshot manually. If yes, then it might be worth opening a support case with HP and asking if this is expected, or if there are some settings around this. You may get a quicker answer this way. Thanks!
mrstorey

Re: Lefthand Snapshots

Post by mrstorey »

Sorry I should have said - the numbers I mentioned in my earlier post were from creating snaps directly in the CMC, not via Veeam.

OK thanks for your replies - I'll raise a call with HP tomorrow to try and understand what's going on.

Alex
mrstorey

Re: Lefthand Snapshots

Post by mrstorey »

The more I delve into this, the more confused I get! Apologies - this may be a longish reply.....

I got this response from HP:

When you create a Full provisioned Volume, the entire space gets reserved for the Volume. So a 2 TB volume would be consuming 4TB of Space in the Cluster as it is a Network RAID 10 Volume. It would not matter how much Data is present in the Volume. When you take a Snapshot, All the Data from the Volume moves to a Layer below the Volume. As the Snapshot is Thin Provisioned, it will consume the amount of Data present in the Volume. But the Top Layer (Volume which is Full Provisioned) will still consume 4 TB of Space in the Cluster. So the Total Space Consumed would be Adding the Space Consumed by Volume and Snapshot.

So It will be 4TB (Total Space Consumed by Volume in the Cluster) + Snapshot (Space Consumed by Data, which will again be Network RAID 10).

Let say you have a volume which has 1.5TB of data in it. So the Volume will be 4TB + 1.5 TB (Data - present in the Snapshot) = 5.5 TB.

If you make the Volume Thin Provisioned, then the Top Layer (Volume) would only consume the amount of Space consumed by Data.

So for the volume (if it was Thin Provisioned) Total Space Consumed would have been 0.5 GB (reserved space for a Volume) + 1.5TB = 2TB

1.5 is the size of the initial snapshot (The size of data that you have in volume in the beginning). The subsequent snapshots will only consume the space of the deltas.

Let say, you add 100MB of data after you take the first snapshot, then the size of new snapshot will be 100MB. It wouldn’t be 1.5TB+100MB, as 1.5TB snapshot was already taken.
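
HP's worked example for the fully provisioned case can be written out as a short Python sketch. This only restates the arithmetic in the reply above; the function name and the `snapshot_copies` parameter are illustrative, and the second call is a hypothetical variant, not something HP stated as a figure.

```python
def consumed_after_first_snapshot(top_layer_tb: float, data_tb: float,
                                  snapshot_copies: int = 1) -> float:
    """Space consumed once the first snapshot exists: the volume's top layer
    plus the data that moved into the snapshot layer."""
    return top_layer_tb + data_tb * snapshot_copies


# Fully provisioned 2TB volume on Network RAID10: top layer reserves 2 x 2TB.
full_top_layer_tb = 2.0 * 2

# HP's worked figure adds the 1.5TB of data once.
print(consumed_after_first_snapshot(full_top_layer_tb, 1.5))     # 5.5

# Hypothetical variant: count the snapshot data on both mirror copies too.
print(consumed_after_first_snapshot(full_top_layer_tb, 1.5, 2))  # 7.0
```

The variant is included because the reply says the snapshot data "will again be Network RAID 10", yet the 5.5TB total counts that data only once.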


However, the way I'm reading that is that no matter how the volumes are provisioned on the lefthands, the amount of space an initial snap will take will be at least equal to the amount of used space on the volume. So yes, your combined volume + snap consumed space will be larger with thick provisioned volumes, but the snap will always be equal to the size of the data on that volume.

Interestingly though, the last sentence suggests that *subsequent* snaps only consume the changes since the last snap - this is fine, but why doesn't the *first* snap work in this way?? This is how I expected / wanted it to work in the first place.

Surely this means that even if you are thin provisioned, you will never be able to use more than 50% of your cluster - otherwise if you take just one snap of all volumes you'll consume all available space?
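
The 50% ceiling follows directly if the first snapshot really costs as much as the data it covers. A minimal sketch in Python, where `first_snap_ratio` is an illustrative parameter (1.0 means "snap equals used data", as read from HP's reply) and the 10TB cluster size is a made-up example value.

```python
def max_safe_data_tb(usable_tb: float, first_snap_ratio: float = 1.0) -> float:
    """Largest amount of data that still leaves room for one first snapshot
    of everything, if that snapshot costs first_snap_ratio x the data."""
    return usable_tb / (1.0 + first_snap_ratio)


usable_tb = 10.0  # hypothetical cluster size, not a figure from this thread
print(max_safe_data_tb(usable_tb))        # 5.0 -> the 50% ceiling
print(max_safe_data_tb(usable_tb, 0.25))  # 8.0
```

If the real cost of a first snapshot is well below the size of the data, the ceiling rises accordingly, which is why the actual ratio matters.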

I asked HP the above, and was asked to carry out some simple tests to help me understand:

Thin volume space

1. Create a Test volume with 10G Size (Networks Raid 10 which is by default), and check the Consumed space and actual space of the volume in CMC
2. Add a File to the volume ( File size <=10G).
3. Take a snapshot of the volume
4. Note down the Size of the volume and the snapshot.

Full Volume

Perform the above steps, except that when you create the volume, change the mode from Thin to Full from the Advance Menu.
Note down the size of both snapshot and volume.

Compare the size of Thin and Full volumes/snapshots.
...and these were my results:

----------------------------------

Thin Provision

- Created a 10GB Thin Provisioned volume, presented it to ESXi and formatted with VMFS5. Consumed = 1GB
- Created a VM on the volume, 500MB Thick Provisioned Eager Zeroed disk. Consumed = 8.43GB (seems like a lot, but this is maybe how VMFS and VMDKs work on a thin provisioned volume)
- Created snapshot, Consumed = 9.46GB
- Deleted snapshot ... interesting ... Consumed now = 8.00GB. Less than before the snap was taken?

Thick Provision

- Converted volume to Thick provisioned, Consumed = 20GB
- Created snapshot, Consumed = 28.01GB
- Deleted snapshot, Consumed = 20GB

More Tests:

- Increased VMDK to 5GB
- Converted volume to Thin, Consumed = 7.62GB
- Created snapshot, Consumed = 8.64GB
- Deleted snapshot, Consumed = 7.25GB

----------------------------------

All a bit confusing.....

Anyway, I think the takeaway from all this is that snapping reasonably utilised volumes with Lefthand (most of ours are about 70-80%) will require large amounts of available space in the cluster. It seems crazy to spend £25k on 4.5TB of usable storage when you can only *actually* safely use half of it when using array snaps?

Am I missing something here? Bit gutted I won't be able to safely back up from storage snapshots here, even though both our clusters are <80% utilised :(
Gostev

Re: Lefthand Snapshots

Post by Gostev »

Here is one note from Luca in our offline conversation that might be relevant for the test above:
Don’t get confused by the CMC: it states the whole size of the LUN. IIRC it shows double the size of the LUN because of the snapshots, but if you go to the cluster summary, the one with the column chart, it shows the correct usage. One of the (many) bugs and annoyances of the CMC...
mrstorey

Re: Lefthand Snapshots

Post by mrstorey »

Thanks - all this info from experienced Lefthand users is really great!

I think though, that my numbers from a previous post seem to unfortunately back up what HP were telling me - the initial snap will be equal to the consumed space on the volume:

I'll try the same test again on a v9.5 Cluster - this time a 500GB VMFS LUN that's 85% utilised:

Details Tab Before - Available 2443.24GB, Used 8217.51GB, Util 78%
Details Tab After - Available 1454.73GB, Used 8217.71GB, Util 87%

Summary Tab Before - Provisioned 8881.75GB, Available 2443.24GB
Summary Tab After - Provisioned 9870.26GB, Available 1454.73GB

In this case, it looks like the snap used 989GB !!
The case above - a 500GB volume with about 430GB of used space - produced a snap that's about 990GB, which is roughly what you'd expect once Network RAID10 mirroring is accounted for (430GB x 2 = 860GB). The results are consistent whether you look in the CMC's 'Details' or 'Summary' tab.

I still have the case open with HP, so if I find out anything new I'll post back here.

Luca - if you get the chance (and only if you have an opportunity), can you try snapping one of your volumes and letting me know the difference in consumed / provisioned / available cluster space? Also - are your VMFS luns thick or thin? 2 Way RAID?

Sorry for all the questions - there really is no urgency around this - our backups are running absolutely fine with Direct SAN - I'm just always looking for ways to improve, and to exploit the features we've paid for.

Thanks everyone!
dellock6

Re: Lefthand Snapshots

Post by dellock6 »

Hi Alex,
thanks again for the further investigation, it is much appreciated.
They say numbers don't lie, but this is starting to sound strange to me, because I've always seen Lefthand using thin snapshots; otherwise we would have had huge problems on our volumes, given the level of overprovisioning we use. Honestly, we never use thick volumes on Lefthand, ours are all thin.
Could it be that the system allocates the same size as the used space, as a form of protection, but does not actually consume it? It surely requires more investigation. Sadly I do not have spare time at the moment to test it myself in my lab, but I'm making a note to check it when I have time.
Luca Dell'Oca
mrstorey

Re: Lefthand Snapshots

Post by mrstorey »

Ah - I think that's the key bit of info right there - all your lefthand volumes are Thin. All ours are thick, and around 80% utilised.

It seems that snaps of thin volumes take up considerably less space than snaps of thick volumes. I did some more tests, creating two 100GB volumes, 1 thin and 1 thick, and created a 5GB eager zeroed disk on each - here are the results:

----

Thin Volume

- Created 100GB volume - consumed, 1GB
- Created 5GB eager zeroed VM - consumed, 37.50GB
- Snap volume - consumed, 38.51GB
- Removed snap - consumed, 35.62GB

Thick Volume

- Created 100GB volume - consumed, 200GB
- Created 5GB eager zeroed VM - consumed, 200GB
- Snap volume - consumed, 218.82GB
- Removed snap - consumed, 200GB

----
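
The difference between the two runs is easier to see as deltas. A short Python sketch that only subtracts the consumed figures listed above; the names are illustrative.

```python
# (label, consumed before snap, with snap, after snap removed), all in GB
results = [
    ("thin", 37.50, 38.51, 35.62),
    ("thick", 200.00, 218.82, 200.00),
]


def snapshot_overhead(before_gb, with_snap_gb, after_gb):
    """Return (space added by the snap, net change once it is removed)."""
    return round(with_snap_gb - before_gb, 2), round(after_gb - before_gb, 2)


overheads = {label: snapshot_overhead(b, w, a) for label, b, w, a in results}
for label, (added, net) in overheads.items():
    print(f"{label}: snap added {added}GB, net change after removal {net}GB")
```

On these figures the thin volume's snap adds about 1GB while the thick volume's snap adds nearly 19GB, and only the thin volume ends up smaller than it started once the snap is removed.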

Thick volumes behave a little more as you'd expect, but I'm not sure I understand disk space consumption with thin volumes. For example, the consumed space reduces each time you create and delete a snap - strange.

Maybe I should pluck up some courage to start using thin volumes - I just want to clearly understand how they consume space first!

Thanks for all your help Luca, Gostev - appreciate it.

Alex
dellock6

Re: Lefthand Snapshots

Post by dellock6 »

In our experience, and according to HP, there is no performance impact from using thin instead of thick volumes; it is more about management. With thin you need to carefully monitor the growth of your LUNs before it's too late.

Luca.