Just to provide an update on all this.
We reinstalled our backup host with VMware ESXi and added it back into vCenter.
I then redeployed a Rocky Linux 8.9 VM with 96GB RAM and 8 vCPU cores, using the same IP that I used for MinIO when it was on bare metal (so that I wouldn't lose any of the data in the S3 buckets).
I configured the VM with PCI passthrough so I could pass the Dell HBA330 controller directly through to the VM.
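For anyone doing the same, a quick way to sanity-check the passthrough from inside the guest (the HBA330 is an LSI/Broadcom SAS3008-based card, so it should show up under lspci; commands below are just a sketch):
Code:
# run inside the Rocky VM after the passthrough device is added
lspci -nn | grep -i sas    # should list the Dell HBA330 / SAS3008 controller
lsblk                      # the 22 spindles should appear as raw sd* devices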
I then vMotioned the 4x 8GB RAM Linux proxies back onto the host.
Our throughput increased significantly, as HotAdd attaches the disks directly over the 2x 16Gb fibre ports on that host and then pushes the data to the MinIO appliance, keeping all the traffic on the backup host rather than traversing our 10GbE core.
Disappointingly, though, the throughput on an Active Full backup run is only about 485MB/s on a 22x 8TB disk set.
The 22-disk set is split into 2x 11-drive erasure sets in MinIO, with 2 parity drives per erasure set.
Code:
# mc admin info minio1
●  ?????.com:9000
   Uptime: 3 hours
   Version: 2024-03-10T02:53:48Z
   Network: 1/1 OK
   Drives: 22/22 OK
   Pool: 1

Pools:
   1st, Erasure sets: 2, Drives per erasure set: 11

22 TiB Used, 1 Bucket, 12,110,496 Objects
22 drives online, 0 drives offline
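For anyone wanting to reproduce that layout, a minimal sketch of how the parity can be set (paths and alias are illustrative, assuming the standard systemd deployment; MinIO works out the 2x11 erasure set split itself from the drive count):
Code:
# /etc/default/minio
MINIO_VOLUMES="/mnt/disk{1...22}"
# EC:2 = 2 parity drives per erasure set (default would be EC:4)
MINIO_STORAGE_CLASS_STANDARD="EC:2"

# or set it at runtime and restart
mc admin config set minio1 storage_class standard=EC:2
mc admin service restart minio1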
When we ran the same configuration, except with the Rocky 8.9 VM running Linux md on the same hardware in a 22-disk RAID 10 configuration, the throughput on the same job was 979MB/s.
(Interestingly, during tests yesterday on tin we also tried SAN mode and found it to be roughly half the speed of HotAdd in our configuration, despite it being presented as the optimal solution.)
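For reference, the md layout in that test was roughly equivalent to the sketch below (device names and mount point are illustrative, not our exact commands):
Code:
mdadm --create /dev/md0 --level=10 --raid-devices=22 /dev/sd[b-w]
mkfs.xfs /dev/md0
mount /dev/md0 /mnt/veeam-backup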
So on RAID 10 we lose half the capacity of the disks, giving usable storage of 11x8TB=88TB, but it's twice as fast for Active Fulls compared to a MinIO node running the same disk/RAM configuration.
Using MinIO in our config means we lose 4 disks, giving usable capacity of (22-4)x8TB=144TB.
The new unit we will order will be kitted out with 24x 16TB spindles, so we should end up with the following usable space:
RAID 10: (12*16TB)=192TB.
MinIO: (24-4)*16TB=320TB.
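The same sums scripted, in case anyone wants to plug in their own drive counts/sizes (assumes 2 erasure sets with 2 parity drives each, as per our layout):
Code:
drives=24; size_tb=16
echo "RAID 10 usable:    $(( drives / 2 * size_tb ))TB"        # 192TB
echo "MinIO EC:2 usable: $(( (drives - 2*2) * size_tb ))TB"    # 320TB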
I'm not sure how well traditional RAID 6 scales at these sorts of drive capacities/set sizes, as rebuild times would likely be horrendous (from past experience), and a rebuild would put undue load on the remaining spindles, increasing the risk of a further failure.
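As a rough back-of-envelope (purely illustrative, assuming ~150MB/s sustained rebuild rate, which is optimistic on a loaded array):
Code:
# 16TB drive at ~150MB/s sustained rebuild rate
echo "$(( 16 * 1000 * 1000 / 150 / 3600 )) hours"    # ~29 hours per spindle, before production load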
The other approach we could take is to go all-flash/NVMe on the primary backup unit and then use MinIO as a secondary target for storage tiering, but going all-flash on the primary unit would dramatically increase our operating costs.
I guess we could also scale out sideways, but again that would significantly impact our cost and, in our case, presents connectivity challenges as our backup unit is currently connected back-to-back into spare ports on our EMC Unity 680F all-flash SAN.
Hopefully some of this might help others consider some of the tradeoffs of using object stores as a primary target.