-
- Influencer
- Posts: 11
- Liked: 2 times
- Joined: Jun 28, 2021 1:35 pm
- Full Name: Fredrik Kristensen
- Location: Norway
- Contact:
Single or multiple buckets - best practices?
Hi guys,
I'm setting up on-prem S3-compatible object storage with MinIO for our backup of MS365. We have multiple proxies, hence multiple repositories. My question is whether I should create individual buckets for each repository or use a single bucket shared among them?
I have looked at the deployment guide but couldn't find a specific answer. If using one bucket is feasible, what are the pros and cons? For example, is data migration between repositories easier if they share the same bucket? I want to do it right from the start, so I don't make a decision now that will have a long-term negative impact. I am completely new to object storage, so I apologize if the answers to my questions are obvious. I appreciate any input anyone has!
IT adviser
Data Center Operations
-
- Product Manager
- Posts: 8735
- Liked: 2294 times
- Joined: May 13, 2017 4:51 pm
- Full Name: Fabian K.
- Location: Switzerland
- Contact:
Re: Single or multiple buckets - best practices?
Hi Fredrik
I would personally prefer multiple buckets instead of a single one. Some object storage appliances have a limit on how many objects they can manage. Utilizing dedicated buckets reduces the likelihood of potential performance impacts in the future due to object count limitations.
We also recommend separate buckets in our best practice guide:
https://bp.veeam.com/vb365/guide/buildc ... orage.html
"For example, is data migration between repositories easier if they share the same bucket?"
Data migration between object storage repositories is not supported, so this won't have an effect on the chosen design.
May I ask what sort of data migration you are looking for on the same repository? What data do you expect to move from one repository to another in the future? Normally, our customers ask us for a feature to migrate data from one object storage provider to another, or from one bucket to a bucket in another account.
Best,
Fabian
Product Management Analyst @ Veeam Software
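As an illustration of the one-bucket-per-repository layout discussed above, here is a minimal Python sketch that derives a dedicated bucket name for each repository and checks it against a conservative subset of the S3 bucket naming rules. The `vb365` prefix, the repository names, and the helper functions are hypothetical and only serve the example; the code does not talk to Veeam or MinIO, it only plans names.

```python
import re

# Conservative subset of the S3 bucket naming rules: 3-63 characters,
# lowercase letters, digits and hyphens, starting and ending with a
# letter or digit. (Dots are also allowed by S3 but are left out here.)
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def bucket_name(prefix: str, repository: str) -> str:
    """Build a bucket name for one repository (hypothetical naming scheme)."""
    # Lowercase the name and collapse anything non-alphanumeric into hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", repository.lower()).strip("-")
    name = f"{prefix}-{slug}"
    if not _BUCKET_RE.match(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    return name


def plan_buckets(prefix: str, repositories: list[str]) -> dict[str, str]:
    """Map each repository to its own bucket and reject name collisions."""
    plan: dict[str, str] = {}
    for repo in repositories:
        name = bucket_name(prefix, repo)
        if name in plan.values():
            raise ValueError(f"bucket name collision: {name!r}")
        plan[repo] = name
    return plan


if __name__ == "__main__":
    for repo, bucket in plan_buckets("vb365", ["Proxy01 Repo", "Proxy02 Repo"]).items():
        print(f"{repo} -> {bucket}")
```

Planning the names up front makes it easier to see that every repository gets exactly one bucket before anything is created on the storage side.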
-
- Service Provider
- Posts: 25
- Liked: 5 times
- Joined: Feb 16, 2023 2:11 am
- Full Name: Luke Marshall
- Contact:
Re: Single or multiple buckets - best practices?
Hi Fredrik / Fabian
MinIO won't have a limit on buckets. From a quick Google:
"MinIO recommends no more than 500,000 buckets per deployment as a general guideline."
So unless you are going to take over the world with M365 backups, I don't think you will have any issues on that end.
Issues will, however, arise when you don't have adequate hardware for the job. If you do a MinIO cluster deployment, follow their best practices if you can (and yes, using anything other than SSDs is not following their BPs).
When restoring large sets of data, you should (someone please correct me if my understanding is wrong) get better performance if you split jobs into one bucket per job. This is due to the way Veeam reads and writes data from the S3 repository. You may not notice this at a small repository level, but you should see improvements in search speed when splitting out the jobs.
We have had long-running issues with long restore times from S3 with on-prem repos, and have semi given up trying to narrow them down with support. It took 4 weeks to search across an entire company's set of data, and the total restored data turned out to be about 600 MB.
On the S3 object storage migration side:
1. Disable any changes to S3; delete it from M365 if you can, to prevent Veeam from running any retention jobs in the background.
2. Move the data.
3. Remap the jobs to the new location.
Theoretically, this should result in a like-for-like migration, but mileage may vary. Make sure you do restore tests and validate that there are no inconsistencies between the data before putting it back in prod.
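The consistency check mentioned above could be sketched as a comparison of object listings taken from the old and the new location. This is a minimal Python sketch and assumes you have already exported each listing as a mapping from object key to (size, ETag) with an S3 client of your choice; the function name and the sample keys are hypothetical. Matching listings do not replace an actual restore test.

```python
from typing import Dict, List, Tuple

# object key -> (size in bytes, ETag), as exported from each location
Listing = Dict[str, Tuple[int, str]]


def compare_listings(source: Listing, target: Listing) -> Dict[str, List[str]]:
    """Return the keys that are missing, unexpected, or different in the target."""
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    changed = sorted(k for k in source if k in target and source[k] != target[k])
    return {"missing": missing, "extra": extra, "changed": changed}


if __name__ == "__main__":
    # Made-up sample data, purely for illustration.
    old = {"a/1": (10, "e1"), "a/2": (20, "e2"), "a/3": (30, "e3")}
    new = {"a/1": (10, "e1"), "a/2": (21, "e2x"), "a/4": (5, "e4")}
    for kind, keys in compare_listings(old, new).items():
        print(kind, keys)
```

Note that ETags can differ for objects copied as multipart uploads with a different part size, so a "changed" entry with an identical size is a prompt to look closer rather than proof of corruption.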
-
- Influencer
- Posts: 11
- Liked: 2 times
- Joined: Jun 28, 2021 1:35 pm
- Full Name: Fredrik Kristensen
- Location: Norway
- Contact:
Re: Single or multiple buckets - best practices?
Thank you both, Fabian and Luke, for your replies!
Mildur wrote: We also recommend separate buckets in our best practice guide:
https://bp.veeam.com/vb365/guide/buildc ... orage.html
I only read the installation guide and not the best practice guide, which of course is the first place I should have checked. My bad!
Mildur wrote: Data migration between object storage repositories is not supported. This won't have an effect on the chosen design.
May I ask, what sort of data migration are you looking for on the same repository? What data do you expect to move in the future from one repository to another? Normally our customers are requesting us to deliver a feature to migrate data from one object storage provider to another. Or from one bucket to a bucket in another account.
I am talking about changing the proxy server of a backup job, which in turn would also change the repository used for that job. I see now that Move-VBOEntityData does not support moving data between object storage repositories, as you also point out. So, if we need to change the proxy/repository, the only option is to run a new full backup using the new object storage repository, correct? Are there any benefits in this scenario if the old and new repositories use the same bucket?
JustBackupSomething wrote: Issues will however arise when you don't have adequate hardware for the job. If you do a MinIO cluster deployment, then follow their best practices if you can (and yes, using anything other than SSDs is not following their BPs).
We will deploy a single-node, single-drive setup, as this is the only hardware we have available. The compute resources will be more than what MinIO recommends, but the storage will unfortunately be HDDs. I say "single-drive", but the storage will be backed by an HDD SAN over 25 Gbps iSCSI, so hopefully it will be adequate. When we refresh our backup infrastructure in a couple of years, I will be able to design it with MinIO best practices in mind; this is just to get us up and running with object storage in the meantime.
JustBackupSomething wrote: When restoring large sets of data you should (someone please correct if wrong on my understanding) get better performance if you split jobs into one bucket per job. This is due to the way Veeam reads / writes data from the S3 repository. You may not notice this at a small repository level but you should see improvements in search speed when splitting out the jobs.
Due to the size of our company (25 000 users), we have a large number of jobs to limit the object count for each job. We only do small restore jobs, and never any large-scale restores. Creating one bucket per job would mean a lot more management load on the backup admin (i.e. me), so hopefully we can get away with one bucket per repository.
JustBackupSomething wrote: To the S3 Object Storage Migration side,
1. Disable any changes to s3, delete it from M365 if you can to prevent veeam from running any retention jobs in the background.
2. move data
3. remap jobs to new location
Theoretically this should result in a like for like migration but mileage may vary. Make sure you do restore tests and validate there is no inconsistencies between data before putting back in prod.
As mentioned above, when I said "migration" I meant moving backup objects between proxies and repositories. I have done a migration between JET-based repositories before, following steps similar to the ones you outline, with success. So, it's good to hear that a similar approach should be possible with S3 storage.
Again, thank you both for your time and input. This is very valuable to me and is of great help to ensure that I do it right from the start.
IT adviser
Data Center Operations
-
- Product Manager
- Posts: 8735
- Liked: 2294 times
- Joined: May 13, 2017 4:51 pm
- Full Name: Fabian K.
- Location: Switzerland
- Contact:
Re: Single or multiple buckets - best practices?
Hi Fredrik
"I am talking about if I were to change the proxy server of a backup job, which in turn would also change the repository used for that job. I see now that Move-VBOEntityData does not support moving data between object storage, as you also point out. So, if we need to change proxy/repository, the only option is to have a new full backup run using the new object storage repository, correct? Are there any benefits in this scenario if the old and new repository use the same bucket?"
You can remove this repository and recreate it with the same bucket/folder for the new proxy. It won't be a full backup. The first job run will rescan all source items, but only download items which weren't backed up since the last run.
Our upcoming version 8 of the product will have new features which will make this removal step unnecessary. Stay tuned for more updates in the upcoming months.
"Due to the size of our company (25 000 users), we have a large number of jobs to limit the object count for each job. We only do small restore jobs, and never any large scale restore. Creating one bucket per job would mean a lot more management load on the backup admin (i.e. me), so hopefully we can get away with one bucket per repository."
I strongly recommend getting our system engineers/architects involved in designing this backup environment. Our SEs/SAs will be happy to guide you or review the design. Please reach out to a regional system engineer if you want to have a design meeting with them.
Best,
Fabian
Product Management Analyst @ Veeam Software
-
- Influencer
- Posts: 11
- Liked: 2 times
- Joined: Jun 28, 2021 1:35 pm
- Full Name: Fredrik Kristensen
- Location: Norway
- Contact:
Re: Single or multiple buckets - best practices?
Mildur wrote: I strongly recommend to get our system engineers/architects involved to design this backup environment. Our SE/SA's will be happy to guide you or review the design. Please reach out to a regional system engineer if you want to have a design meeting with them.
I've reached out to our customer success contact so that we can get some assistance.
IT adviser
Data Center Operations