Discussions specific to object storage
glamic26
Service Provider
Posts: 26
Liked: 10 times
Joined: Apr 21, 2015 12:10 pm
Contact:

S3 Capacity Tier Deduplicating Appliance

Post by glamic26 » Sep 11, 2019 1:59 pm

Is anyone using the offload to S3 Capacity Tier with an object storage deduplication appliance? We are doing a POC with a Pure ObjectEngine (formerly StorReduce), which deduplicates object storage. However, I hadn't used S3 offload to another S3 target before this POC, so I didn't realise that Veeam does an indexing process that reduces the amount of data being sent to S3 by only sending changed/unique data.

Does this indexing work on a per-VM, per-job, or per-SOBR basis? For example, if we have several Windows 2016 VMs across multiple jobs going to the same SOBR, which is offloading sealed files to S3, will Veeam's indexing send the Windows operating system files common to all of the VMs to the S3 Capacity Tier only once? If so, I'm assuming the indexing works on a per-SOBR basis. Or would it send one copy per job writing to that SOBR? Or is it per-VM, so there would still be many copies of the same Windows system files sent to S3? I'm trying to understand how much further deduplication an appliance might achieve on data sent to the S3 Capacity Tier from the same SOBR.

HannesK
Veeam Software
Posts: 3764
Liked: 454 times
Joined: Sep 01, 2014 11:46 am
Location: Austria
Contact:

Re: S3 Capacity Tier Deduplicating Appliance

Post by HannesK » Sep 11, 2019 2:38 pm

Hello,
I'm not sure what you mean, because indexing happens during backup (application-aware processing settings).

Can you describe what kind of "indexing" you mean?

Best regards,
Hannes

Gostev
SVP, Product Management
Posts: 24610
Liked: 3458 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: S3 Capacity Tier Deduplicating Appliance

Post by Gostev » Sep 11, 2019 3:01 pm

I understand this question. What you call "indexing" happens on a per-job basis, as it is based on the same dedupe engine used by backup jobs. So, further storage-level deduplication may still help. However, unlike with regular Backup and Backup Copy jobs, don't expect significant additional savings, because those come from deduping multiple full backups with one another, which is something our object storage integration engine already effectively does (by preventing them from appearing in the first place). Thanks!
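To make the per-job behaviour concrete, here is a minimal sketch (the class and method names are hypothetical, purely for illustration — this is not Veeam's actual engine): each job keeps its own index of block hashes, so a block shared by two VMs in the same job is uploaded once, while the identical block processed by a different job is uploaded again.

```python
import hashlib

class JobOffloader:
    """Hypothetical per-job dedupe index: a duplicate block within the
    same job is offloaded only once, but jobs do not share indexes."""

    def __init__(self, name):
        self.name = name
        self.seen = set()   # hashes of blocks already in object storage
        self.uploaded = 0   # count of blocks actually sent

    def offload(self, block: bytes):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in self.seen:
            self.seen.add(digest)
            self.uploaded += 1  # only previously unseen blocks travel to S3

os_block = b"windows-2016-system-data" * 1024  # same OS data on every VM

job_a = JobOffloader("Job A")
job_b = JobOffloader("Job B")

# Two VMs in Job A share the OS block: it is uploaded once for the job.
job_a.offload(os_block)
job_a.offload(os_block)

# Job B has its own index, so the identical block is uploaded again.
job_b.offload(os_block)

print(job_a.uploaded, job_b.uploaded)  # 1 1 -> one copy per job, not per SOBR
```

This is where a downstream appliance could still help: it would collapse the per-job copies of identical blocks into one.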

Re: S3 Capacity Tier Deduplicating Appliance

Post by glamic26 » Sep 12, 2019 7:07 am

Thanks Gostev. So the indexing essentially gives me per-job deduplication, and the only further deduplication I can expect after that would be between jobs?

Does the indexing feature have any performance overhead for offload or restores?

Not that it is necessarily the right thing to do, just exploring options, but could the indexing be turned off so that I could get my deduplication appliance to do all of the deduplication?

Re: S3 Capacity Tier Deduplicating Appliance

Post by Gostev » Sep 12, 2019 12:42 pm

It only helps with performance, for example by allowing us to use data from Performance Tier when matching blocks are available there - instead of pulling everything from the object storage.

And no, it absolutely cannot be turned off, because it's an integral part of the architecture. But again, that would be a bad idea to do anyway, as for example offloading data to object storage would start to take 20x longer on full backups (as now, there would be 20x more blocks to upload).
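Gostev's 20x figure can be illustrated with back-of-envelope arithmetic. The block count and unique-data fraction below are assumptions chosen to produce a 20x ratio, not measured values:

```python
# Illustration only (assumed numbers): if a full backup contains
# 1,000,000 blocks but only ~5% of them are not already present in
# object storage, the dedupe-aware offload uploads 50,000 blocks
# instead of the whole million.
total_blocks = 1_000_000
unique_fraction = 0.05                 # assumed fraction of new/unique blocks

with_dedup = int(total_blocks * unique_fraction)
without_dedup = total_blocks           # turning indexing off would send it all

print(without_dedup // with_dedup)     # 20 -> offload takes ~20x longer
```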

fentor
Lurker
Posts: 1
Liked: never
Joined: Sep 12, 2019 7:50 pm
Full Name: Rich Fenton
Contact:

Re: S3 Capacity Tier Deduplicating Appliance

Post by fentor » Sep 12, 2019 8:00 pm

Thanks Gostev for the info above.

Can you confirm whether, when the SOBR offload job runs, the data is automatically compressed as it is sent to the Capacity Tier? From my testing it appears to be compressed, regardless of the settings applied on the job.

See below... my simple job here is processing 35.3GB and then transferring 26GB (a 1.4x saving), which leads me to believe some compression is happening, despite compression being turned off in the job:
https://ibb.co/sqL1cwP
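For what it's worth, the 1.4x figure in the job statistics checks out arithmetically from those two numbers:

```python
# Ratio of data processed to data transferred, from the job stats above.
processed_gb = 35.3
transferred_gb = 26.0

ratio = processed_gb / transferred_gb
print(round(ratio, 1))  # 1.4 -> matches the "1.4x saving" shown by the job
```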

You can probably guess my next question: if the SOBR offload is compressing, can that be turned off?

Thanks

Re: S3 Capacity Tier Deduplicating Appliance

Post by Gostev » Sep 13, 2019 1:34 pm

Correct, it is always compressed and currently there's no way to disable compression.
