Discussions specific to object storage
glamic26
Service Provider
Posts: 26
Liked: 10 times
Joined: Apr 21, 2015 12:10 pm
Contact:

S3 Capacity Tier Deduplicating Appliance

Post by glamic26 » Sep 11, 2019 1:59 pm

Is anyone using the offload to S3 Capacity Tier with an object storage deduplication appliance? We are doing a POC with a Pure ObjectEngine (formerly StorReduce), which deduplicates object storage. However, I hadn't used S3 offload to another S3 target before this POC, so I didn't realise that Veeam does an indexing process that reduces the amount of data being sent to S3 by only sending changed/unique data.

Does this indexing work on a per-VM, per-job, or per-SOBR basis? For example, if we have several Windows 2016 VMs across multiple jobs going to the same SOBR, which is offloading sealed files to S3, will Veeam's indexing send the Windows operating system files common to all of the VMs to the S3 Capacity Tier only once? If so, I'm assuming the indexing works on a per-SOBR basis. Or would it send one copy per job writing to that SOBR? Or is it per-VM, so there would still be many copies of the same Windows system files sent to S3? I'm trying to understand how much further deduplication an appliance might achieve on data sent to the S3 Capacity Tier from the same SOBR.

HannesK
Veeam Software
Posts: 3764
Liked: 454 times
Joined: Sep 01, 2014 11:46 am
Location: Austria
Contact:

Re: S3 Capacity Tier Deduplicating Appliance

Post by HannesK » Sep 11, 2019 2:38 pm

Hello,
I'm not sure what you mean, because indexing happens during backup (application-aware processing settings).

Can you describe what kind of "indexing" you mean?

Best regards,
Hannes

Gostev
SVP, Product Management
Posts: 24610
Liked: 3458 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: S3 Capacity Tier Deduplicating Appliance

Post by Gostev » Sep 11, 2019 3:01 pm

I understand this question. What you call "indexing" happens on a per-job basis, as it is based on the same dedupe engine used by backup jobs. So, further storage-level deduplication may still help. However, unlike with regular Backup and Backup Copy jobs, don't expect significant additional savings, because those come from deduping multiple full backups with one another, which is something our object storage integration engine already effectively does (by preventing them from appearing in the first place). Thanks!
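To make the per-job behaviour concrete, here is a minimal sketch (the class and method names are hypothetical, purely for illustration — this is not Veeam's actual engine): each job keeps its own index of block hashes, so a block shared by two VMs in the same job is uploaded once, while the identical block processed by a different job is uploaded again.

```python
import hashlib

class JobOffloader:
    """Hypothetical per-job dedupe index: a duplicate block within the
    same job is offloaded only once, but jobs do not share indexes."""

    def __init__(self, name):
        self.name = name
        self.seen = set()   # hashes of blocks already in object storage
        self.uploaded = 0   # count of blocks actually sent

    def offload(self, block: bytes):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in self.seen:
            self.seen.add(digest)
            self.uploaded += 1  # only previously unseen blocks travel to S3

os_block = b"windows-2016-system-data" * 1024  # same OS data on every VM

job_a = JobOffloader("Job A")
job_b = JobOffloader("Job B")

# Two VMs in Job A share the OS block: it is uploaded once for the job.
job_a.offload(os_block)
job_a.offload(os_block)

# Job B has its own index, so the identical block is uploaded again.
job_b.offload(os_block)

print(job_a.uploaded, job_b.uploaded)  # 1 1 -> one copy per job, not per SOBR
```

This is where a downstream appliance could still help: it would collapse the per-job copies of identical blocks into one.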

Re: S3 Capacity Tier Deduplicating Appliance

Post by glamic26 » Sep 12, 2019 7:07 am

Thanks Gostev. So the indexing essentially gives me per-job deduplication, and the only further deduplication I can expect after that would be between jobs?

Does the indexing feature have any performance overhead for offload or restores?

Not that it is necessarily the right thing to do, just exploring options, but could the indexing be turned off so that I could get my deduplication appliance to do all of the deduplication?

Re: S3 Capacity Tier Deduplicating Appliance

Post by Gostev » Sep 12, 2019 12:42 pm

It only helps with performance, for example by allowing us to use data from Performance Tier when matching blocks are available there - instead of pulling everything from the object storage.

And no, it absolutely cannot be turned off, because it's an integral part of the architecture. But again, that would be a bad idea to do anyway, as for example offloading data to object storage would start to take 20x longer on full backups (as now, there would be 20x more blocks to upload).
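Gostev's 20x figure can be illustrated with back-of-envelope arithmetic. The block count and unique-data fraction below are assumptions chosen to produce a 20x ratio, not measured values:

```python
# Illustration only (assumed numbers): if a full backup contains
# 1,000,000 blocks but only ~5% of them are not already present in
# object storage, the dedupe-aware offload uploads 50,000 blocks
# instead of the whole million.
total_blocks = 1_000_000
unique_fraction = 0.05                 # assumed fraction of new/unique blocks

with_dedup = int(total_blocks * unique_fraction)
without_dedup = total_blocks           # turning indexing off would send it all

print(without_dedup // with_dedup)     # 20 -> offload takes ~20x longer
```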

fentor
Lurker
Posts: 1
Liked: never
Joined: Sep 12, 2019 7:50 pm
Full Name: Rich Fenton
Contact:

Re: S3 Capacity Tier Deduplicating Appliance

Post by fentor » Sep 12, 2019 8:00 pm

Thanks Gostev for the info above.

Can you confirm whether, when the SOBR offload job runs, the data is automatically compressed as it is sent to the Capacity Tier? From my testing it appears to be compressed, regardless of the settings applied on the job.

See below... my simple job here is processing 35.3GB and then transferring 26GB (a 1.4x saving), which leads me to believe some compression is happening, despite compression being turned off in the job:
https://ibb.co/sqL1cwP
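For what it's worth, the 1.4x figure in the job statistics checks out arithmetically from those two numbers:

```python
# Ratio of data processed to data transferred, from the job stats above.
processed_gb = 35.3
transferred_gb = 26.0

ratio = processed_gb / transferred_gb
print(round(ratio, 1))  # 1.4 -> matches the "1.4x saving" shown by the job
```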

You can probably guess my next question: if the SOBR offload is compressing, can that be turned off?

Thanks

Re: S3 Capacity Tier Deduplicating Appliance

Post by Gostev » Sep 13, 2019 1:34 pm

Correct, it is always compressed and currently there's no way to disable compression.
