-
- Service Provider
- Posts: 108
- Liked: 14 times
- Joined: Jan 01, 2006 1:01 am
- Full Name: Dag Kvello
- Location: Oslo, Norway
- Contact:
Performance of Object Storage vs Block
Is there any published information on backup, restore, and Instant Recovery performance when writing directly to S3/object storage, compared to a standard Linux XFS repository?
-
- Chief Product Officer
- Posts: 31836
- Liked: 7327 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Performance of Object Storage vs Block
It really only makes sense to compare two specific storage devices; otherwise it's like comparing the speed of randomly picked cars to figure out whether a diesel engine is "faster" than a gas one... there are really fast and really slow cars with both engine types...
-
- Service Provider
- Posts: 108
- Liked: 14 times
- Joined: Jan 01, 2006 1:01 am
- Full Name: Dag Kvello
- Location: Oslo, Norway
- Contact:
Re: Performance of Object Storage vs Block
OK, I'll bite.
So, I have a specific all-flash storage system that provides 10TiB of block storage to a physical Veeam V12 Ubuntu (XFS) repository: one 10TiB volume.
Then I create a single-node MinIO deployment on Ubuntu, carving six block volumes (to get erasure coding, immutability, etc.) out of the exact same storage system, giving me a 10TiB S3 repository.
So everything is "the same": the same physical backend storage and the same hardware spec for the physical repository host (one block, one MinIO S3), with the same net storage available.
Same 32Gb FC storage network and 25Gb data network.
Which would perform best? I'm especially curious about Instant Recovery performance from the object storage repository.
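For context, a rough single-stream PUT/GET baseline against the MinIO endpoint (run before the actual Veeam jobs) might look something like the sketch below; the endpoint, bucket and credentials are placeholders, and this obviously says nothing about Veeam's own I/O pattern.

```python
# Hypothetical raw S3 throughput smoke test against a MinIO endpoint (boto3).
# Endpoint, bucket and credentials are placeholders; the bucket must already exist.
import time
import boto3

ENDPOINT = "https://minio.example.local:9000"   # placeholder endpoint
BUCKET = "veeam-perf-test"                      # placeholder bucket

s3 = boto3.client(
    "s3",
    endpoint_url=ENDPOINT,
    aws_access_key_id="ACCESS_KEY",             # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)

payload = b"\0" * (8 * 1024 * 1024)             # 8 MiB per object
count = 64                                      # 512 MiB in total

start = time.time()
for i in range(count):
    s3.put_object(Bucket=BUCKET, Key=f"perftest/obj-{i:04d}", Body=payload)
write_mib_s = count * 8 / (time.time() - start)

start = time.time()
for i in range(count):
    s3.get_object(Bucket=BUCKET, Key=f"perftest/obj-{i:04d}")["Body"].read()
read_mib_s = count * 8 / (time.time() - start)

print(f"PUT ~{write_mib_s:.0f} MiB/s, GET ~{read_mib_s:.0f} MiB/s (single stream)")
```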
-
- Chief Product Officer
- Posts: 31836
- Liked: 7327 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Performance of Object Storage vs Block
In the case of MinIO, I expect XFS to win by far. I don't believe MinIO is highly optimized for backup and restore performance the way, for example, Object First is.
-
- Service Provider
- Posts: 108
- Liked: 14 times
- Joined: Jan 01, 2006 1:01 am
- Full Name: Dag Kvello
- Location: Oslo, Norway
- Contact:
Re: Performance of Object Storage vs Block
I'll do some testing with MinIO, Wasabi and Cloudian to see how DR/Instant Recovery performs compared to block. I've helped several customers move off DD and other dedupe appliances over the last couple of years because of their terrible I/O performance during Instant Recovery.
I need to figure out whether on-prem S3/object storage is a viable alternative to block (in both price and performance) in critical scenarios.
-
- Technology Partner
- Posts: 7
- Liked: 11 times
- Joined: Aug 19, 2020 2:43 pm
- Full Name: Eco Willson
- Contact:
Re: Performance of Object Storage vs Block
Dag,
I think the issue lies with the setup of MinIO. MinIO is software defined and built to run on commodity hardware; it is optimized to run as fast as the underlying hardware will allow. There are benchmarks here: https://blog.min.io/minio-veeam-backup- ... -benchmark (these were performed jointly with Veeam).
We respectfully disagree with Gostev on the optimization for backup and restore. MinIO has a number of optimizations that make it ideal from an RTO and RPO perspective. Further, it does not lock you into the appliance model.
We would be happy to hop on a call with you. Just let us know.
-
- Chief Product Officer
- Posts: 31836
- Liked: 7327 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Performance of Object Storage vs Block
Please keep in mind that the context of my response is OP asking about a very specific config: a single physical server and a mere 10TB of storage.
Sure, I've personally seen MinIO perform very well on an 8-node NVMe storage cluster with 100GbE interconnect; the only question is how many Veeam customers can afford that for backup storage...
And my Object First reference was likewise specific to the scenario in question. While they do single-node storage really well in terms of backup and restore performance, I would also not recommend them to customers looking for a fully redundant 8-node cluster for their backup storage...
-
- Novice
- Posts: 3
- Liked: 3 times
- Joined: Jul 18, 2023 12:35 am
- Full Name: Ainsley Edwards
- Contact:
Re: Performance of Object Storage vs Block
In a simple setup on a single node, where you have the same number and type of disks serving as the backup repository, block storage will usually be faster than object storage in most cases. The story changes as you scale the storage requirements: complexity and volume management start to influence the conversation, and this is where "cheaper" object storage, with its ability to provide better throughput per dollar, starts to shine.
-
- VP, Product Management
- Posts: 7098
- Liked: 1517 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Performance of Object Storage vs Block
I see it like Ainsley does.
If you manage backup target storage of hundreds of TB or multiple PB, block storage is not that flexible when it comes to changes, scale-out capacity and scale-out performance. If you want to change any of these, it is a ton of work: creating volumes, adding them to servers, formatting the drives, deciding what to move, and so on.
For that reason I have seen scale-out filers like Isilon, Ceph or Spectrum Scale used a lot for backup purposes: one large filesystem spanning many servers. If you need more capacity or throughput, you just add another node, and nothing at all needs to change on the backup infrastructure. But all of them were built for a different workload (unstructured data and hosting a ton of small files, not a few large backup files), and they scale much better with many parallel work streams. The operational model was also such that you buy one filer per purpose/department/datacenter.
So here the new, bright and shiny object storage comes in to save the day. The idea is to operate it like a service, with everyone consuming the storage through a tenant-like model. It scales out the same way the filers do, but with fewer downsides for backup workloads. An important planning consideration is that backup also means delete operations at nearly the same rate as the data you write into the storage daily. The more traditional workloads that used object storage were more of the write-everything-and-never-delete kind.
With all three storage types, it is important to size well not only for capacity but also for IO and delete operations.
On a single server with just a few volumes, I would always go direct, without any additional overhead: HW RAID + XFS/ReFS and write straight to it. The overhead of object storage is cost, IO overhead, management overhead, complexity, and additional components in the mix if something goes wrong.
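To make the delete-churn point concrete, here is a back-of-envelope sketch with made-up inputs (the daily backup volume and average object size are assumptions; real object sizes vary with compression):

```python
# Rough object churn estimate for a backup bucket (all inputs are assumptions).
# With a fixed retention, roughly as many objects are deleted per day as written,
# so the storage must be sized for delete/list rates, not just for capacity.
DAILY_BACKUP_TIB = 20        # assumed daily backup data landing in the bucket
AVG_OBJECT_MIB = 0.5         # assumed average object size (1 MiB blocks, compressed)

objects_per_day = DAILY_BACKUP_TIB * 1024 * 1024 / AVG_OBJECT_MIB
print(f"~{objects_per_day / 1e6:.0f} million objects written per day, "
      f"and roughly the same number deleted once retention kicks in")
```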
-
- Service Provider
- Posts: 7
- Liked: 1 time
- Joined: Jun 21, 2021 9:10 am
- Full Name: Jake Ellingham
- Contact:
Re: Performance of Object Storage vs Block
MinIO only supports the Veeam workload on flash-based storage.
We have run into many issues with Veeam S3 on HDD-backed MinIO; Veeam creates and deletes too many folders for it to perform well.
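For anyone wanting to gauge that churn on their own repository, a rough object count per bucket/prefix with boto3 could look like the sketch below (endpoint, credentials, bucket and prefix are placeholders):

```python
# Hypothetical object/size count under a backup prefix using boto3 pagination.
# Endpoint, credentials, bucket and prefix are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://minio.example.local:9000",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

objects = 0
total_bytes = 0
for page in s3.get_paginator("list_objects_v2").paginate(Bucket="veeam-repo", Prefix="Veeam/"):
    for obj in page.get("Contents", []):
        objects += 1
        total_bytes += obj["Size"]

if objects:
    print(f"{objects} objects, {total_bytes / 2**40:.2f} TiB total, "
          f"~{total_bytes / objects / 2**20:.1f} MiB average object size")
else:
    print("no objects found under that prefix")
```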
-
- Veeam Legend
- Posts: 411
- Liked: 232 times
- Joined: Apr 11, 2023 1:18 pm
- Full Name: Tyler Jurgens
- Contact:
Re: Performance of Object Storage vs Block
@jakeellingham
Do you have a MinIO statement supporting that?
I do hope that Veeam uses the *already provided* SOSAPI implementations to automatically choose the S3 vendor's recommended block sizes, as the 1 MB blocks Veeam sets by default are not ideal. Nowhere in my dealings with MinIO have I heard them say they don't support Veeam on HDD-backed MinIO clusters.
Tyler Jurgens
Veeam Legend x3 | vExpert ** | VMCE | VCP 2020 | Tanzu Vanguard | VUG Canada Leader | VMUG Calgary Leader
Blog: https://explosive.cloud
Twitter: @Tyler_Jurgens BlueSky: @explosive.cloud
-
- Chief Product Officer
- Posts: 31836
- Liked: 7327 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Performance of Object Storage vs Block
Actually, I don't believe there's a facility in the current version of SOSAPI to exchange recommended block sizes with the object storage vendor just yet.
More importantly, Veeam specifically warns against the use of 4MB blocks because they lead to 2x disk space consumption as well as slower granular and instant restores. It would be wrong for Veeam to apply explicitly discouraged settings by default, leaving customers in the dark about spending 2x more on storage than needed.
I strongly believe this should be a discussion between MinIO and its customers. For example, MinIO could openly explain/document that their storage does not scale to 1MB objects on HDDs, so customers should either double the HDD storage space and use 4MB blocks, or use SSDs. And (just fantasizing here) in return MinIO could provide some discount for Veeam users to absorb the extra costs caused by their architectural limitations. Then a prospect would be able to compare the offer with alternatives and make an educated decision. The object storage market is really hot these days anyway; there are many vendors to choose from!
And this is not something totally out of this world to expect, by the way. For example, Wasabi has a 90-day minimum storage policy for all users, but they reduce it to 30 days for Veeam users upon a support request. I guess it works for them financially because, unlike other workloads, Veeam brings LOTS of backup data to store, while they charge per TB.
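As a purely illustrative toy model of the space effect (made-up disk size and change pattern, not Veeam's actual numbers), the sketch below shows how coarser blocks drag more untouched data into every incremental, since a small change anywhere inside a block forces the whole block into the next backup:

```python
# Toy model only: worst-case, fully random 4 KiB changes on a virtual disk.
# A change anywhere inside a backup block dirties the whole block, so larger
# blocks mean larger incrementals (and higher space consumption over time).
import random

DISK_GIB = 1024                # assumed 1 TiB virtual disk
CHANGED_4K_PAGES = 500_000     # assumed ~2 GiB of scattered 4 KiB changes per day

def incremental_gib(block_size_mib: int) -> float:
    blocks_total = DISK_GIB * 1024 // block_size_mib     # backup blocks on the disk
    pages_per_block = block_size_mib * 256                # 4 KiB pages per block
    dirty_blocks = set()
    for _ in range(CHANGED_4K_PAGES):
        page = random.randrange(blocks_total * pages_per_block)
        dirty_blocks.add(page // pages_per_block)          # whole block goes into the incremental
    return len(dirty_blocks) * block_size_mib / 1024

random.seed(1)
for size_mib in (1, 4, 8):
    print(f"{size_mib} MiB blocks -> ~{incremental_gib(size_mib):.0f} GiB incremental")
```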
-
- Veeam Legend
- Posts: 411
- Liked: 232 times
- Joined: Apr 11, 2023 1:18 pm
- Full Name: Tyler Jurgens
- Contact:
Re: Performance of Object Storage vs Block
Maybe I'm missing something, but isn't the point of SOSAPI to provide additional information and recommendations to Veeam on how that object storage wants to interact with Veeam? I was under the impression that SOSAPI has a recommended object size in the spec, but I cannot find the SOSAPI spec myself to see what's in it. I believe the recommended block size is included because I've seen multiple references to it, both from S3 providers and in various blog posts. While I believe the information is there, I also understand Veeam doesn't do anything with it today.
Maybe it's not ideal for customers to use 4 MB or 8 MB blocks because of the storage increase. That's a fair statement. No one is saying to make Veeam *unable* to modify that setting; I'm only suggesting that Veeam use the SOSAPI-recommended block size by default - nothing more.
From a service provider perspective, I would rather cut Veeam customers a deal on price when they're seeing extra data consumption due to larger block sizes than switch away from MinIO because *one* backup vendor has a harder time than others. The deal we make with MinIO isn't in question here; it's the deal we make with Veeam customers that matters. If that causes us to make a different deal with MinIO, then so be it. I'll concede I may be trying to fit an elephant into a refrigerator, but god damn it, I love that elephant and it's hot outside.
I can understand your point of view if you only look at it as one company, one backup vendor, and one S3 cluster. As much as I love and advocate *heavily* for Veeam, that scenario is not the reality for me and likely not for many other service providers.
Tyler Jurgens
Veeam Legend x3 | vExpert ** | VMCE | VCP 2020 | Tanzu Vanguard | VUG Canada Leader | VMUG Calgary Leader
Blog: https://explosive.cloud
Twitter: @Tyler_Jurgens BlueSky: @explosive.cloud
-
- Chief Product Officer
- Posts: 31836
- Liked: 7327 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Performance of Object Storage vs Block
The "vendor recommendation" part is one of the ideas for SOSAPI roadmap indeed, it just was not a part of SOSAPI v1. The whole idea came too late to implement in V12. But it could already be in the current version of the SOSAPI spec indeed, Andreas would know for sure.
Appreciate you sharing a service provider's perspective. From Veeam's perspective, defaults are extremely important because this is how our prospects will be doing POC/trial (Next>Next>Next>done). So if they happen to have some object storage handy that "recommends" 4-8MB blocks and point Veeam there for a test, backing up the same set of machines with Veeam to that storage and with some competing backup appliance to its local disks, Veeam will show such a poor backup footprint results that no sane prospect will ever choose us. Even if said prospect never intended to use this particular object storage as a production backup repository in the first place!
-
- Veeam Legend
- Posts: 411
- Liked: 232 times
- Joined: Apr 11, 2023 1:18 pm
- Full Name: Tyler Jurgens
- Contact:
Re: Performance of Object Storage vs Block
That's a totally fair perspective and I appreciate your thoughts.
That said, hasn't Veeam been battling this scenario for years already with NTFS vs ReFS vs XFS vs <insert dedupe storage appliance here>? If Object Storage Vendor X recommends 8 MB blocks and Object Storage Vendor Y recommends 256 KB blocks, you'd see a significant change in storage consumption. IMO, this isn't any different from someone doing a POC/trial on an old piece of hardware running NTFS versus someone else running a POC on a top-of-the-line dedupe appliance or a beautifully built XFS repo. I'm sure there are a few, uh, provisos, a, a couple of quid pro quos that already get talked about: "You can, and it would work, but you should know..."
Tyler Jurgens
Veeam Legend x3 | vExpert ** | VMCE | VCP 2020 | Tanzu Vanguard | VUG Canada Leader | VMUG Calgary Leader
Blog: https://explosive.cloud
Twitter: @Tyler_Jurgens BlueSky: @explosive.cloud
-
- Chief Product Officer
- Posts: 31836
- Liked: 7327 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Performance of Object Storage vs Block
No, actually this whole block size story is completely new and specific to object storage only. Block size has little impact on block/file storage, because it works differently and doesn't care much how many blocks Veeam sends to it.
The only purpose of this whole dance with large block sizes is an attempt by some vendors to artificially reduce the number of objects reaching the storage by making them larger, instead of solving the actual architectural issue behind the problem. And if you think about it, this approach makes no sense overall: object storage is designed to store objects, so it is counterintuitive for storage vendors to wish their customers ingested FEWER objects (i.e. the very things their storage is designed to store). This is a path to nowhere, as even with large blocks the backup footprint will continue to grow at each customer, and sooner or later they will face the "too many blocks for the storage to handle" issue regardless.
-
- VP, Product Management
- Posts: 7098
- Liked: 1517 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Performance of Object Storage vs Block
SOSAPI is an API definition shaped by input from a lot of object storage vendors, from our support organization and from general Veeam feedback. Instead of changing SOSAPI along the way and asking vendors to implement changes over time, we decided to put everything into the first API definition and implement the ideas in the Veeam code over time. Currently, capacity reporting and smart entities are implemented in Veeam Backup & Replication. The handover of recommended settings was included mainly to address recent support issues where customers overload the storage. For example, the number of parallel tasks used to offload data to the storage can be controlled by the storage: when a customer adds additional nodes, the parallel-task value can be increased via SOSAPI, while smaller installations remain protected. All vendors that have implemented the standard can be found here (Veeam Ready is mandatory for SOSAPI): https://www.veeam.com/alliance-partner- ... api&page=1 . For the recommendation settings, we work with the vendors during Veeam Ready processing so that the settings that get applied actually make sense and do not harm customers over time.
Regarding the block size recommendation:
Some deduplication appliances also like to operate with a 4MB block size, but the negative effect of additional storage consumption is mitigated by the deduplication engine.
To not waste space, ExaGrid, for example, with its undeduplicated landing zone, uses 1MB blocks instead of 4MB. With object storage, a larger block size means that all random-IO-based restores, including Instant VM Recovery, file-level recovery and application item restores, have to read much more data than needed from the object storage, and therefore processing is slower.
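To put the read-amplification point in perspective, here is a trivial calculation (the 64 KiB guest read size is just an assumed example; actual restore IO patterns vary), under the assumption stated above that each random restore IO has to fetch a whole backup block from object storage:

```python
# Toy read-amplification numbers: assume every random restore IO (64 KiB here,
# an arbitrary example) must fetch one whole backup block from object storage.
GUEST_READ_KIB = 64

for block_mib in (1, 4, 8):
    amplification = block_mib * 1024 / GUEST_READ_KIB
    print(f"{block_mib} MiB blocks: one {GUEST_READ_KIB} KiB random read fetches "
          f"{block_mib} MiB -> ~{amplification:.0f}x read amplification")
```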