Discussions related to using object storage as a backup target.
theviking84
Expert
Posts: 119
Liked: 11 times
Joined: Nov 16, 2020 2:58 pm
Full Name: David Dunworthy
Contact:

Archive tier questions

Post by theviking84 »

I'm trying to decide how much to keep in S3 versus Glacier. It's tricky because restore requests are random. My thought is that anything older than 6 months goes to the archive tier.

My questions are

1. Will Veeam use expedited restore requests from Glacier to the S3 bucket when you need to restore data from the archive tier? Or will it use standard retrievals, which I think take 3 to 5 hours? Or will we have the option? Are there notifications in Veeam of what to expect, with a waiting period, and then I'm emailed or told that it's ready to restore, etc.?

2. Will only the absolutely needed blocks be transferred from Glacier back to the performance tier?

Example: using the default delta mode, so each Glacier GFS point is not entirely standalone. Say I need a file from 10 months back. Will Veeam use logic similar to what v10 does now, where it knows some blocks are already in the capacity tier, so there's no need to retrieve those from the archive tier?

I believe it already works this way for capacity tier to performance tier restores.

3. Moving data between the capacity tier and the archive tier does not lose the space-saving benefits (block referencing, etc.)?

4. This last one is not about archive, but specific to the capacity and performance tiers. I believe when v10 first launched, block cloning was lost when moving data from the capacity tier back to the performance tier, so XFS or ReFS data explodes in size in that scenario. Has this been fixed yet, or is it still an issue?
dalbertson
Veeam Software
Posts: 492
Liked: 175 times
Joined: Jul 21, 2015 12:38 pm
Full Name: Dustin Albertson
Contact:

Re: Archive tier questions

Post by dalbertson »

1. You have the three retrieval options that AWS offers: Expedited, Standard, and Bulk (see the sketch at the end of this post).

1-5 minutes for expedited retrievals
3-5 hours for standard retrievals
5-12 hours for bulk retrievals

2. Yes. There are two options for data stored in Glacier. One is like the capacity tier, where the data is stored in a forever-forward-incremental fashion; the other is to store each backup as its own standalone chain.

The latter is very handy for long retention periods, like years, or for compliance reasons where you need each chain separate.

3. The data stored in Glacier will still have the same space savings you are used to. The only difference is that the data stored in S3 and the data stored in Glacier are their own unique chains.

4. I'm not familiar with what you are referring to here, but I'll dig around.
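
For reference on point 1, this is roughly what those retrieval tiers look like at the S3 API level. A minimal boto3 sketch with hypothetical bucket and key names; Veeam issues the equivalent request on your behalf, this only illustrates the underlying AWS call:

```python
# Illustrative boto3 sketch of the three retrieval tiers. Bucket and key
# names are hypothetical; this only shows the underlying AWS call that a
# retrieval triggers.
import boto3

s3 = boto3.client("s3")

s3.restore_object(
    Bucket="my-veeam-archive-bucket",        # hypothetical bucket
    Key="backups/vm01/restore-point.blk",    # hypothetical object key
    RestoreRequest={
        "Days": 7,  # how long the temporary copy remains readable
        "GlacierJobParameters": {
            # "Expedited" (1-5 min), "Standard" (3-5 h), or "Bulk" (5-12 h)
            "Tier": "Standard",
        },
    },
)
```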
Dustin Albertson | Director of Product Management - Cloud & Applications | Veeam Product Management, Alliances
theviking84
Expert
Posts: 119
Liked: 11 times
Joined: Nov 16, 2020 2:58 pm
Full Name: David Dunworthy
Contact:

Re: Archive tier questions

Post by theviking84 »

Thank you. So on 1: will Veeam present the choice of retrieval method each time we need data from Glacier? Or how is it handled?

If it's going to take 5 hours, does Veeam just run a process for all that time to move the data to the capacity tier, and then are we notified when it's ready to actually restore to production? Or do we just choose a method, kick it off, and it does everything end to end, taking however long it needs?
dalbertson
Veeam Software
Posts: 492
Liked: 175 times
Joined: Jul 21, 2015 12:38 pm
Full Name: Dustin Albertson
Contact:

Re: Archive tier questions

Post by dalbertson » 1 person likes this post

It will be a separate process from the restore job. A retrieval job runs to pull the data into the capacity tier. When it's done, the data stays there for the period you specified in the retrieval wizard. You will also have the ability to extend that time, but it will automatically time out and send the data back to archive.

It’s a really awesome process.
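
For the curious, the state of that temporary copy is visible on the object itself at the S3 API level. A minimal boto3 sketch (hypothetical bucket and key names) of checking whether a retrieval has completed:

```python
# Illustrative boto3 sketch: check whether a Glacier retrieval is done.
# S3 exposes the state via the "Restore" field on a HEAD request, e.g.
# 'ongoing-request="false", expiry-date="Fri, 21 May 2021 00:00:00 GMT"'.
import boto3

s3 = boto3.client("s3")
resp = s3.head_object(
    Bucket="my-veeam-archive-bucket",        # hypothetical bucket
    Key="backups/vm01/restore-point.blk",    # hypothetical object key
)

restore_state = resp.get("Restore", "")
if 'ongoing-request="false"' in restore_state:
    print("Temporary copy is ready:", restore_state)  # includes expiry-date
else:
    print("Retrieval still in progress (or never requested)")
```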
Dustin Albertson | Director of Product Management - Cloud & Applications | Veeam Product Management, Alliances
theviking84
Expert
Posts: 119
Liked: 11 times
Joined: Nov 16, 2020 2:58 pm
Full Name: David Dunworthy
Contact:

Re: Archive tier questions

Post by theviking84 »

Can't wait for v11. I'm banking on it to keep my S3 bucket retention and costs from getting too high.

The only tough thing is deciding between forever incremental and standalone fulls in Glacier. I think the standalone route would still be very costly for a lot of companies, as we are talking a full-sized backup for every single GFS point.

Sure, Glacier is cheaper than an S3 bucket, but if you explode your storage five times or more, is five-times-cheaper meaningful at that point? You're kind of stuck back where you were.
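
To put rough numbers on that trade-off, here's a back-of-the-envelope sketch. The per-GB prices and the 10% delta ratio are illustrative assumptions (ballpark S3 Standard vs. Glacier rates), not quotes:

```python
# Back-of-the-envelope comparison; all figures are illustrative assumptions.
S3_PRICE = 0.023       # USD per GB-month, ballpark S3 Standard
GLACIER_PRICE = 0.004  # USD per GB-month, ballpark Glacier

full_gb = 1000         # one full backup, e.g. 1 TB
monthly_gfs = 12       # one year of monthly GFS points
delta_ratio = 0.10     # assumed 10% changed data per increment

# Forever-forward incremental: one full plus small deltas.
ffi_gb = full_gb + (monthly_gfs - 1) * full_gb * delta_ratio

# Standalone fulls: a full-sized copy for every GFS point.
standalone_gb = monthly_gfs * full_gb

print(f"FFI in Glacier:        ${ffi_gb * GLACIER_PRICE:7.2f}/month")
print(f"Standalone in Glacier: ${standalone_gb * GLACIER_PRICE:7.2f}/month")
print(f"FFI in S3:             ${ffi_gb * S3_PRICE:7.2f}/month")
# Under these assumptions, standalone fulls in Glacier cost about the same
# as keeping the FFI chain in S3 -- the "stuck back where you were" concern.
```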

From what I can read online, Glacier should still be designed for eleven nines (99.999999999%) of durability. Even if there is corruption, I'm hoping it would be small enough that a file or folder (which is all I would ever need from backups that old) would still have a chance of being intact.
dalbertson
Veeam Software
Posts: 492
Liked: 175 times
Joined: Jul 21, 2015 12:38 pm
Full Name: Dustin Albertson
Contact:

Re: Archive tier questions

Post by dalbertson » 2 people like this post

My only recommendation would be to plan out your GFS strategy very carefully at first and see what makes sense.

What I mean by that is: if you plan to keep years of data, it makes more sense to store independent backups, just to protect yourself from an issue with a very long chain.

But if you are only keeping a handful of restore points, then yes, the FFI approach may be better.

It won't be that bad to cost out. API calls will be minimal, and part of what we are doing when we move data to archive is making the objects (blobs) larger to cut API costs and improve performance.
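
As a rough illustration of why larger objects matter, assuming a per-request price on the order of $0.03 per 1,000 archive requests (an assumption for illustration, not a quote):

```python
# Rough illustration only; the per-request price is an assumption
# (on the order of $0.03 per 1,000 archive requests), not a quote.
REQ_PRICE = 0.03 / 1000     # USD per PUT/transition request (assumed)

data_gb = 1000              # 1 TB of backup data

for block_mb in (1, 8):     # smaller vs. larger object sizes
    objects = data_gb * 1024 // block_mb
    print(f"{block_mb} MB objects: {objects:>9,} requests -> "
          f"${objects * REQ_PRICE:,.2f} in request charges")
```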

I plan on updating the cloud tier deep dive very soon, so keep an eye out for it.
Dustin Albertson | Director of Product Management - Cloud & Applications | Veeam Product Management, Alliances