Discussions related to using object storage as a backup target.
gstrouth
Novice
Posts: 4
Liked: never
Joined: Nov 08, 2021 10:41 pm
Full Name: Gerard Strouth
Contact:

Backup Job Settings

Post by gstrouth »

To help reduce the size consumed in cloud object storage, do the backup job settings matter as far as compression level, dedup on/off, and block size? We use a dedup appliance, so we usually have dedup and compression off in the backup job settings, but we are starting to play with object storage for the capacity tier and are wondering how these settings affect the size.
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Backup Job Settings

Post by HannesK »

Hello,
and welcome to the forums.

Yes, we apply compression when transferring data from a deduplication appliance to object storage.

Best regards,
Hannes
gstrouth
Novice
Posts: 4
Liked: never
Joined: Nov 08, 2021 10:41 pm
Full Name: Gerard Strouth
Contact:

Re: Backup Job Settings

Post by gstrouth »

So basically none of the backup job storage settings matter then?
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Backup Job Settings

Post by HannesK »

correct
bytewiseits
Service Provider
Posts: 54
Liked: 32 times
Joined: Nov 23, 2018 12:23 am
Full Name: Dion Norman
Contact:

Re: Backup Job Settings

Post by bytewiseits »

Not technically from a total-size-consumed perspective, but from what I understand block size will definitely matter here, as the block size is carried over to the object storage as well. If you reduce the block size of a job below the default, it also increases the number of API calls/PUTs etc., which can be costly with providers that charge per API call. Increasing the block size above the default reduces API calls further, but incremental backups also get larger. The default block size (1 MB) is usually a good trade-off between incremental backup size and the overall number of blocks, though it depends on the source data, change rates etc.

Block size may not hugely affect the overall size of the stored object data (depending on the source data), but it can account for a large portion of the cost through API calls if the cloud provider charges for them, in addition to the total data stored. If S3 immutability is enabled, you also have the regular API calls to update immutability on each relevant stored block.
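As a rough illustration of the API-call side, here is a quick Python sketch. The change rate and per-request price are made-up numbers for the example, not figures from Veeam or any provider:

# Rough estimate of monthly PUT-request cost for Capacity Tier offload.
# All input numbers are assumptions for illustration only.

def put_cost_per_month(daily_changed_gb, block_size_mb, price_per_1000_puts_usd):
    """Estimate PUT requests and cost for offloading daily incrementals."""
    blocks_per_day = (daily_changed_gb * 1024) / block_size_mb  # objects uploaded per day
    puts_per_month = blocks_per_day * 30
    cost = puts_per_month / 1000 * price_per_1000_puts_usd
    return puts_per_month, cost

# Example: 200 GB changed per day, $0.005 per 1,000 PUTs (assumed price)
for block_mb in (0.5, 1, 4):  # roughly LAN target, Local target (default), large blocks
    puts, cost = put_cost_per_month(200, block_mb, 0.005)
    print(f"{block_mb} MB blocks: ~{puts:,.0f} PUTs/month, ~${cost:.2f}")

Halving the block size roughly doubles the request count (and the request cost), while larger blocks cut it down at the price of larger incrementals, which matches the trade-off described above.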

Just something to keep in mind if you are just starting out with Capacity Tier.
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Backup Job Settings

Post by HannesK »

Yes, that's correct (I missed block size in my answer).

I just double-checked the default job settings with Data Domain: compression is set to Optimal by default, block size is Local target (large blocks), and "Enable inline data deduplication" is off. In the repository settings, "Decompress backup file data blocks before storing" is enabled.
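For reference, the storage optimization labels map to these commonly documented source block sizes (the stored objects end up smaller, since compression is applied when offloading). Below is a small Python sketch of what that means for the number of blocks per amount of source data; the 10 TB figure is just an assumed example:

# Commonly documented source block sizes per "Storage optimization" setting.
# Actual objects in the bucket are smaller because compression is applied at offload.
BLOCK_SIZE_KB = {
    "WAN target": 256,
    "LAN target": 512,
    "Local target": 1024,                  # the job default (1 MB)
    "Local target (large blocks)": 4096,   # typical choice for dedupe appliances
}

source_tb = 10  # assumed amount of source data, for illustration
for setting, kb in BLOCK_SIZE_KB.items():
    blocks = source_tb * 1024 * 1024 * 1024 / kb  # KB per TB divided by block size
    print(f"{setting:<28} -> ~{blocks:,.0f} blocks for {source_tb} TB of source data")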
gstrouth
Novice
Posts: 4
Liked: never
Joined: Nov 08, 2021 10:41 pm
Full Name: Gerard Strouth
Contact:

Re: Backup Job Settings

Post by gstrouth »

So what about the backup job itself? For Veeam dedup to work, all the VMs have to be put into the same job; does that apply to object storage as well? Or can you have multiple jobs and the space consumed would be the same?
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Backup Job Settings

Post by HannesK »

For Veeam dedup to work all the VM's have to be put into the same job
Veeam inter-machine deduplication only works with per-job chains (i.e. with the per-machine backup files option unchecked). With per-machine backups, even VMs within one job are only deduplicated inside each machine (no inter-machine deduplication).

Per-machine chains are always recommended and will become the new default in V12. For dedupe appliances you probably use per-machine files anyway, because otherwise they work even slower than they already are. The inter-machine deduplication effect you might have seen on normal repositories for the first backup becomes more and more irrelevant over time, as data blocks are rarely identical between different machines.

Long story short: compression has the most impact on backup size. Deduplication is relatively irrelevant and not worth worrying about.

Space consumption on object storage is comparable to ReFS / XFS file systems. The only difference is that active full backups only take up incremental space on object storage. Per-job vs. per-machine backup files have a small influence (same as on normal repositories). But as I said, we always recommend per-machine files, as the space savings are irrelevant and the advantages are with per-machine files.
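To put that in rough numbers, here is a minimal sketch of forever-incremental consumption in the Capacity Tier when fulls only add changed blocks. The compression ratio and change rate are assumptions for illustration, not measured values:

# Very rough Capacity Tier space estimate, assuming block reuse as described above.
# All figures are illustrative assumptions.
source_gb = 5000          # source data size
compression_ratio = 0.5   # assumed ~2x compression on offload
daily_change = 0.03       # assumed 3% daily change rate
retention_points = 30     # offloaded restore points kept

initial_full = source_gb * compression_ratio
per_point = source_gb * daily_change * compression_ratio
total = initial_full + per_point * retention_points

print(f"Initial offload:              ~{initial_full:,.0f} GB")
print(f"Per additional restore point: ~{per_point:,.0f} GB")
print(f"With {retention_points} restore points kept:   ~{total:,.0f} GB")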