StoreOnce Catalyst performance not as expected


by Regnor » Wed Jun 13, 2018 7:54 am

We're currently evaluating a StoreOnce appliance at a customer site.
The appliance is connected via 8 Gb/s FC to the backup server and we're using StoreOnce Catalyst for the backups.
During backups we see a throughput of 150-300 MB/s, which is not high but OK. Additional full backups of VMs that are already on the appliance run at the same rate.
I would expect much higher performance with Catalyst, as there's no need to read/write the data again; with every run the dedupe rate increases, but performance stays at the same level.
We've tested both Low and High Bandwidth mode, but didn't notice any difference.
Bottleneck according to Veeam is Source 40%, Proxy 24%, Network 13%, Target 99%.
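For anyone reading these statistics for the first time: each figure is the share of time that stage of the data path was busy, so the stage with the highest value is the one the others are waiting on. A minimal sketch of that reading, using the numbers from this job (the helper name is made up for illustration):

```python
# Busy percentages reported for the job in this post.
busy = {"Source": 40, "Proxy": 24, "Network": 13, "Target": 99}


def find_bottleneck(stats):
    """Return the stage with the highest busy percentage (hypothetical helper)."""
    return max(stats, key=stats.get)


print(find_bottleneck(busy))  # Target
```

Here the target (the StoreOnce side) is busy almost all the time, which is why changes on the source or network side barely move the result.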

Am I missing something here?
Regnor
Veeam ProPartner
 
Posts: 148
Liked: 28 times
Joined: Mon Jan 31, 2011 11:17 am

Re: StoreOnce Catalyst performance not as expected

by taurus1978 » Wed Jun 13, 2018 1:41 pm

Hello,
Maybe you need to configure more streams for the Fibre Channel adapters on the StoreOnce.
Look at this guide on page 85:

https://support.hpe.com/hpsc/doc/public/display?docId=a00043535en_us

I got that hint from an HPE pre-sales guy.

It removes the stream limit on the FC cards, so you can use more parallelisation.
And remember to connect both the StoreOnce AND the backup server to the FC SAN. If there is a LAN connection anywhere in between, you will surely lose bandwidth.

Regards,
Patrick
taurus1978
Technology Partner
 
Posts: 7
Liked: never
Joined: Mon May 11, 2015 11:51 am
Full Name: Patrick Huber

Re: StoreOnce Catalyst performance not as expected

by Regnor » Wed Jun 13, 2018 4:48 pm

I've already increased the logins per port to 16 but I'll try 256 and see if anything happens.

What do you mean by connecting the StoreOnce to the FC SAN?
Isn't the data flow SAN -> Backup Server -> StoreOnce?
Regnor
Veeam ProPartner
 
Posts: 148
Liked: 28 times
Joined: Mon Jan 31, 2011 11:17 am

Re: StoreOnce Catalyst performance not as expected

by Regnor » Thu Jun 14, 2018 6:57 am

So with 256 and no limitation from the repository the performance stays the same; a single task/stream will not pass over 160MB/s.
I'm still not sure when Catalyst should/would kick in...

One thing I've noticed: if I uncheck "decompress backup data" on the repository, performance almost doubles per stream; perhaps the proxy/gateway server is too slow?
Regnor
Veeam ProPartner
 
Posts: 148
Liked: 28 times
Joined: Mon Jan 31, 2011 11:17 am

Re: StoreOnce Catalyst performance not as expected

by Gostev » Thu Jun 14, 2018 12:20 pm

Regnor wrote: a single task/stream will not pass over 160MB/s

My understanding is that this is perfectly normal and by design for ANY deduplicating storage. You can only scale throughput by increasing the number of streams, and this is the case for both writes and reads (backup and restore, that is).
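As a back-of-the-envelope model of that scaling: aggregate throughput grows with the number of streams until some shared component caps it. The sketch below uses the 160 MB/s single-stream figure from this thread; the 800 MB/s ceiling is purely an assumed number for illustration, and real scaling is rarely perfectly linear.

```python
def aggregate_throughput(streams, per_stream_mb_s, ceiling_mb_s):
    """Linear scaling with stream count, capped by the slowest shared component."""
    return min(streams * per_stream_mb_s, ceiling_mb_s)


PER_STREAM = 160  # MB/s, single-stream rate reported in this thread
CEILING = 800     # MB/s, ASSUMED ceiling of the shared path (illustrative only)

for n in (1, 2, 4, 8):
    print(n, aggregate_throughput(n, PER_STREAM, CEILING))
```

Under these assumptions a single stream can never show what the appliance is capable of; only the concurrent task count does.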
Gostev
Veeam Software
 
Posts: 22209
Liked: 2628 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: StoreOnce Catalyst performance not as expected

by Regnor » Thu Jun 14, 2018 3:02 pm

I thought that if the data/blocks were already on the deduplication appliance then we wouldn't have to send them over again and therefore increase throughput (for a single stream). At the moment it looks like we're always sending over the data to the appliance, which only discards the blocks.

Regarding the performance boost of unchecking "decompress backup data":
I've set the power management settings on the backup proxy to high performance, and now the rates are equal with and without that option; a single stream now reaches 300 MB/s.
Regnor
Veeam ProPartner
 
Posts: 148
Liked: 28 times
Joined: Mon Jan 31, 2011 11:17 am

Re: StoreOnce Catalyst performance not as expected

by nefes » Thu Jun 14, 2018 4:22 pm

Regnor wrote: I thought that if the data/blocks were already on the deduplication appliance then we wouldn't have to send them over again and therefore increase throughput (for a single stream). At the moment it looks like we're always sending over the data to the appliance, which only discards the blocks.

It depends on your Catalyst Store setting. If it is set to high bandwidth, all data is sent to the device as is, and deduplication happens on the device itself.
If it is set to low bandwidth, deduplication happens in the Catalyst library (on the gateway where the Veeam target agent runs) and only new blocks are sent to the device.
You can try both and see which one gives better overall job performance; in my experience, however, high bandwidth is usually faster.
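The difference between the two modes can be shown with a toy model. This is only a sketch of the general idea (hash chunks, skip the ones already stored); the fixed 4 KiB chunk size, the SHA-256 hash, and the in-memory dictionary standing in for the appliance are all assumptions and say nothing about how Catalyst is actually implemented.

```python
import hashlib

CHUNK = 4096  # ASSUMED fixed chunk size, for illustration only


def split_chunks(data, size=CHUNK):
    return [data[i:i + size] for i in range(0, len(data), size)]


def send_high_bandwidth(data, store):
    """Target-side dedupe: every byte crosses the link, the device drops duplicates."""
    for c in split_chunks(data):
        store.setdefault(hashlib.sha256(c).hexdigest(), c)
    return len(data)  # bytes sent over the link


def send_low_bandwidth(data, store):
    """Source-side dedupe: hash on the gateway, send only chunks the device lacks."""
    sent = 0
    for c in split_chunks(data):
        digest = hashlib.sha256(c).hexdigest()
        if digest not in store:
            store[digest] = c
            sent += len(c)
    return sent  # bytes sent over the link


# Four distinct chunks, then a second "full" where only the last chunk changed.
first_full = b"".join(bytes([i]) * CHUNK for i in range(4))
second_full = first_full[:3 * CHUNK] + bytes([9]) * CHUNK

store = {}
print(send_low_bandwidth(first_full, store))   # everything is new on the first run
print(send_low_bandwidth(second_full, store))  # only the changed chunk is sent
```

Both modes end up storing the same unique chunks; what changes is where the hashing work happens and how many bytes cross the link.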
nefes
Veeam Software
 
Posts: 577
Liked: 136 times
Joined: Mon Dec 10, 2012 8:44 am
Full Name: Nikita Efes

Re: StoreOnce Catalyst performance not as expected

by Regnor » Fri Jun 15, 2018 11:12 am

There's almost no difference when comparing high bandwidth to low bandwidth mode; at least in that case.
Regnor
Veeam ProPartner
 
Posts: 148
Liked: 28 times
Joined: Mon Jan 31, 2011 11:17 am

Re: StoreOnce Catalyst performance not as expected

by Gostev » Fri Jun 15, 2018 11:40 am

Regnor wrote: I thought that if the data/blocks were already on the deduplication appliance then we wouldn't have to send them over again and therefore increase throughput (for a single stream).

That is correct, so you will see the difference when the bandwidth is limited. Otherwise, there would not be any performance gain, as the process of identifying whether a data block already exists on the dedupe appliance slows processing down, balancing out the benefit of not sending some blocks over. Naturally, on a high-speed network connection it could be faster to just shoot everything to the dedupe appliance and let it deal with the data locally.
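That trade-off can be put into a small pipelined model in which the slowest stage sets the rate. All numbers below (lookup rate, share of new blocks, link speeds) are assumptions chosen only to show the crossover, and the model ignores the appliance's own ingest limit.

```python
def high_bandwidth_rate(link_mb_s):
    """All data crosses the link, so the link sets the logical rate."""
    return link_mb_s


def low_bandwidth_rate(link_mb_s, lookup_mb_s, new_fraction):
    """Only new blocks cross the link, but every block pays for the lookup."""
    if new_fraction <= 0:
        return lookup_mb_s
    return min(lookup_mb_s, link_mb_s / new_fraction)


LOOKUP = 300  # MB/s, ASSUMED hashing/lookup rate on the gateway
NEW = 0.10    # ASSUMED share of blocks that are actually new

for link in (100, 1000):  # MB/s: a constrained link vs. a fast one (both ASSUMED)
    print(link, high_bandwidth_rate(link), low_bandwidth_rate(link, LOOKUP, NEW))
```

With the constrained link the low bandwidth mode comes out ahead; with the fast link the lookup becomes the limit and sending everything is quicker, which matches the explanation above.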
Gostev
Veeam Software
 
Posts: 22209
Liked: 2628 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: StoreOnce Catalyst performance not as expected

by FedericoV » Sun Jun 17, 2018 12:12 am

Backup performance for a single stream is roughly 30% faster when you keep source-side dedupe disabled (i.e. "High Bandwidth" in the Catalyst Store GUI). Despite this, I keep suggesting source-side dedupe (i.e. Low Bandwidth mode) in production environments.
With source-side dedupe, when your backup job contains about 10 VMs and you process them concurrently, you can easily see the throughput rise above 1500 MB/s even across a single 10GbE link. When you add more jobs and proxy/gateway servers to the mix, you can raise the throughput even further without being limited by the network connectivity.
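A quick bit of arithmetic shows why this is only possible with source-side dedupe: a 10GbE link carries at most 1250 MB/s raw, so 1500 MB/s of logical backup data cannot cross it unreduced. The sketch below borrows the 10:1 and 30:1 reduction ratios quoted later in this post for incrementals, purely as example values.

```python
LINK_GBIT = 10
line_rate_mb_s = LINK_GBIT * 1000 / 8  # 1250 MB/s raw, before protocol overhead


def wire_rate(logical_mb_s, reduction_ratio):
    """MB/s that actually cross the link for a given logical backup rate."""
    return logical_mb_s / reduction_ratio


logical = 1500  # MB/s, the figure quoted above
print(logical > line_rate_mb_s)  # True: not possible without reduction
for ratio in (10, 30):           # example ratios, see lead-in
    print(ratio, wire_rate(logical, ratio))
```

At those ratios the link carries only a small fraction of its capacity, which is why adding more jobs and gateways keeps scaling.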
It is important to make sure the entire data path is designed for your expected throughput, not just the last link to StoreOnce. At these speeds, the primary storage could easily become a bottleneck as well.
An important suggestion: make sure that your Veeam Proxy, which is selected by the backup job, runs on the same server (physical or VM) as your Veeam Gateway, which is selected by the backup repository. If the two services run on different servers, your backup data does not go straight to StoreOnce via Catalyst; there is an extra hop over the LAN, and that connection is not deduped.
In the past, I thought Catalyst could achieve high bandwidth reduction only for full backups, and my lab tests seemed to validate that behavior. Then I saw production environments achieving bandwidth reduction in the range of 10:1 to 30:1 even for CBT incremental backups. That surprised me, so I went back to my lab to check what was wrong... and the wrong part was my workload generator, which was based on many 50MB files. Indeed, most production workloads consist of many small write operations and few large ones. When I changed my workload generator to produce many small and few large files, I saw a rising dedupe effect for incremental backups as well. Tests with just 1% of new data distributed over a lot of very small files were able to generate a .vib as big as 15% of the full. This happens because even a file of a few KB forces the CBT engine to mark an entire 1MB-wide segment of its VMDK as changed. StoreOnce dedupe works at a much more granular level: it identifies the real changed data inside the larger segment and so avoids sending the unchanged parts.
All this to say that source-side dedupe helps for incremental backups too, not only for full ones.
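The CBT effect is easy to reproduce on paper. The sketch below assumes a worst case in which every small write lands in a different 1 MB segment; the 100 GB disk size and the 64 KB write size are made-up inputs, not values from the tests described above.

```python
SEGMENT_KB = 1024  # CBT granularity described above: 1 MB per changed segment


def incremental_sizes(full_gb, new_percent, write_kb):
    """Worst case: every small write lands in a different 1 MB segment.

    Returns (real_changed_mb, cbt_marked_mb).
    """
    total_segments = full_gb * 1024
    real_changed_kb = full_gb * 1024 * 1024 * new_percent // 100
    writes = real_changed_kb // write_kb
    marked_segments = min(writes, total_segments)
    return real_changed_kb / 1024, marked_segments * SEGMENT_KB / 1024


# ASSUMED inputs: 100 GB disk, 1% new data, written as 64 KB files.
real_mb, marked_mb = incremental_sizes(full_gb=100, new_percent=1, write_kb=64)
print(real_mb, marked_mb, marked_mb / real_mb)
```

With these inputs, 1 GB of really changed data marks 16 GB of segments, i.e. 16% of the full, which is the same order of magnitude as the .vib size reported above. Dedupe at a finer granularity only has to send roughly the really changed part.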
FedericoV
Technology Partner
 
Posts: 7
Liked: 7 times
Joined: Mon Aug 21, 2017 3:27 pm
Full Name: Federico Venier

Re: StoreOnce Catalyst performance not as expected

by johna8 » Mon Jun 18, 2018 4:29 am

Just to let you know:

We run 3540s over 10GbE to the StoreOnce.
FC for storage.

We need to get on average around 170MB/s for writing to the 3540.
The peak at another site does go up to 365 MB/s, though, with a processing rate of 231 MB/s for example.
Depending on the model, though, I would've thought 150-300 MB/s was OK?
johna8
Novice
 
Posts: 9
Liked: never
Joined: Tue Oct 11, 2016 8:23 am

Re: StoreOnce Catalyst performance not as expected

by sys-adm » Mon Jun 18, 2018 6:26 am

Hi,

We have a StoreOnce 4900 here and also had performance issues over FC. After we switched to 2x 10GbE in an LACP setup, performance was twice as high as before.
But the performance is still not good enough, and we had some other trouble with this device.
Now we are evaluating another backup appliance, from ExaGrid. ExaGrid has a different strategy, with a landing zone. The idea is to back up to non-deduplicated storage and transfer the older backup data to the deduplicated storage.
We start the proof of concept in a few weeks.
I think it's highly recommended to check some other products besides the HPE StoreOnce.

Br sys-adm
sys-adm
Lurker
 
Posts: 1
Liked: 1 time
Joined: Mon Jun 11, 2018 5:59 am
Full Name: Andreas Buetler

Re: StoreOnce Catalyst performance not as expected

by Regnor » Mon Jun 18, 2018 7:40 am

@Federico: Thanks for your input; it's really interesting to see how the StoreOnce works with different workloads.
I'll play with both modes and see how they compare with more VMs.

@Johna8: The performance is really OK from my point of view; I just thought it would increase more when dedupe kicks in.

@sys-adm: When using a dedup appliance as primary backup storage I see advantages in having a landing zone. We're using them as a secondary storage so performance isn't that critical.

To come to a conclusion: the more streams/tasks we put on the StoreOnce, the better the performance we get.
Regnor
Veeam ProPartner
 
Posts: 148
Liked: 28 times
Joined: Mon Jan 31, 2011 11:17 am

