Discussions related to using object storage as a backup target.
squebel
Service Provider
Posts: 127
Liked: 11 times
Joined: Sep 27, 2019 5:06 pm

Object counts and sizes

Post by squebel »

We have NetApp StorageGrid deployed in-house and are starting to work with the object storage options in Veeam. I've set up a test SOBR with a capacity tier that has both copy and move enabled, and a very simple job to back up two VMs to this SOBR. After the first copy to the object bucket, I noticed there are over 114K objects in the bucket for these two full backups, and that many of these objects are extremely small. These two discoveries have me asking questions:
1. Why are there roughly 55K objects for a single full backup per VM?
2. Why are so many of the objects so small?
3. Are there any parameters that can be set to increase the object size?
I ask about object size specifically because, for capacity planning in our StorageGrid environment, we need to know whether we'll be able to erasure code most of the objects or not. NetApp recommends not applying erasure coding to objects smaller than 1MB. If we end up having to use object replication (100% overhead), that will have to be factored into our capacity planning, versus erasure coding, which is closer to 66% overhead.
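
To put rough numbers on that tradeoff, here's a quick sketch. The 100TB figure is just a hypothetical amount of backup data, and the overhead percentages are the ones above:

Code: Select all

# Rough capacity estimate: replication vs. erasure coding on StorageGrid.
# Overhead figures are from the discussion above (100% for replication,
# ~66% for erasure coding); adjust to match your actual ILM rules.

def raw_capacity_tb(logical_tb: float, overhead: float) -> float:
    """Raw storage consumed for a given amount of logical data."""
    return logical_tb * (1 + overhead)

logical_tb = 100  # hypothetical amount of backup data in the bucket

print(f"Replication:    {raw_capacity_tb(logical_tb, 1.00):.0f} TB raw")  # 200 TB
print(f"Erasure coding: {raw_capacity_tb(logical_tb, 0.66):.0f} TB raw")  # 166 TB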
chris.arceneaux
VeeaMVP
Posts: 668
Liked: 359 times
Joined: Jun 24, 2019 1:39 pm
Full Name: Chris Arceneaux
Location: Georgia, USA

Re: Object counts and sizes

Post by chris.arceneaux »

Great questions! As you'd expect, the number of objects is directly tied to two things:

1. The amount of data backed up
2. The block size set by the storage optimization option in the job configuration

The documentation link above describes where in the job configuration the block size is configured. Here's an additional link for further reading that will help you determine which block size is best for your use case.
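
For a rough sense of scale, here's a sketch of how each storage optimization setting translates to object counts, using the pre-compression block size for each setting and 1TB of data read as an example:

Code: Select all

# Approximate object count per TB of data read, by storage optimization
# setting. Block sizes are the pre-compression values for each setting.

BLOCK_SIZE_MB = {
    "Local target (large blocks)": 4.0,
    "Local target": 1.0,
    "LAN target": 0.5,
    "WAN target": 0.25,
}

source_mb = 1024 * 1024  # 1TB of data read, as an example

for setting, block_mb in BLOCK_SIZE_MB.items():
    print(f"{setting:28s} -> ~{source_mb / block_mb:,.0f} objects per TB")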
squebel
Service Provider
Posts: 127
Liked: 11 times
Joined: Sep 27, 2019 5:06 pm

Re: Object counts and sizes

Post by squebel »

Checked the backup job and it is set to "Local Target".

First run (active full) stats:
VM count: 2
Processed: 119GB
Read: 112GB
Transferred: 77GB

So, where do the 114K objects come from, and why are most of them well under 1MB?
chris.arceneaux
VeeaMVP
Posts: 668
Liked: 359 times
Joined: Jun 24, 2019 1:39 pm
Full Name: Chris Arceneaux
Location: Georgia, USA

Re: Object counts and sizes

Post by chris.arceneaux » 1 person likes this post

Thanks for the additional data. Using this, we can perform a rough analysis to understand where the 114K objects come from:

As the job's Storage optimization is set to Local Target, you'll have a block size of 1MB pre-compression.

If we convert the Read value you provided from GB to MB, we get: 112GB x 1024 = 114,688MB.

As 1MB is your block size, that works out to ~114K objects. On top of that, the 1MB blocks are compressed at roughly a 50% reduction on average, so the blocks uploaded to object storage are typically much smaller than their 1MB pre-compression size.
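
Here's that math as a minimal sketch, with the 50% reduction treated as an assumed average rather than a guarantee:

Code: Select all

# Reproduce the rough analysis: 112GB read, 1MB blocks, ~50% compression.

read_gb = 112      # "Read" value from the job statistics
block_mb = 1.0     # Local Target block size, pre-compression
reduction = 0.5    # assumed average compression reduction

print(f"Expected objects: ~{read_gb * 1024 / block_mb:,.0f}")    # ~114,688
print(f"Avg object size:  ~{block_mb * (1 - reduction):.2f}MB")  # ~0.50MB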

I hope this makes things clearer. :D
squebel
Service Provider
Posts: 127
Liked: 11 times
Joined: Sep 27, 2019 5:06 pm

Re: Object counts and sizes

Post by squebel »

That's a great explanation, and I think it does make sense. If I'm following this, the thousands of smaller objects I'm seeing were originally 1MB blocks that had a significant amount of compression applied to them?
chris.arceneaux
VeeaMVP
Posts: 668
Liked: 359 times
Joined: Jun 24, 2019 1:39 pm
Full Name: Chris Arceneaux
Location: Georgia, USA

Re: Object counts and sizes

Post by chris.arceneaux »

Glad it helped! The compression level is also configurable, right above where you configured the block size in the job configuration. Here's more info on the available options in our compression settings.

I'll add that, for most use cases, we recommend the default compression setting of Optimal. It provides the best balance of size reduction without greatly increasing the time and resources required for a job to complete.

Answering your final question, yes, most of those objects are 1MB blocks that had their size reduced.
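
To illustrate how this plays into your StorageGrid erasure-coding threshold, here's a hedged sketch. The reduction percentages are rough assumptions for illustration only; actual ratios depend entirely on your data:

Code: Select all

# How the compression level affects whether a 1MB block stays at or above
# StorageGrid's recommended 1MB erasure-coding minimum.

BLOCK_MB = 1.0           # Local Target block size, pre-compression
EC_THRESHOLD_MB = 1.0    # NetApp's recommended minimum for erasure coding

assumed_reduction = {    # illustrative guesses, not measured ratios
    "None": 0.00,
    "Dedupe-friendly": 0.25,
    "Optimal": 0.50,
    "High": 0.60,
    "Extreme": 0.65,
}

for level, reduction in assumed_reduction.items():
    size_mb = BLOCK_MB * (1 - reduction)
    verdict = "EC eligible" if size_mb >= EC_THRESHOLD_MB else "below threshold"
    print(f"{level:16s} -> ~{size_mb:.2f}MB object ({verdict})")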
squebel
Service Provider
Posts: 127
Liked: 11 times
Joined: Sep 27, 2019 5:06 pm

Re: Object counts and sizes

Post by squebel » 1 person likes this post

This is all great info. What we can assume, then, is that we're going to end up with a lot of objects smaller than 1MB, which means the large majority of objects on the StorageGrid platform won't be able to have erasure coding applied to them. That means we have to account for more storage usage than we had hoped. I realize this isn't necessarily a Veeam problem, but if there were a way to create larger objects when writing to the capacity tier, it would help in our case. Changing the compression or block size on the backup job would change it for the local repo as well, which would have implications for data consumption there, where storage is more expensive, so I'm not sure it's worth it just to get larger blocks going to S3.