-
- Veteran
- Posts: 323
- Liked: 25 times
- Joined: Jan 02, 2014 4:45 pm
- Contact:
V11 Archive Tier Question - How will it save money?
Looking at how V11 archive tier functions, it looks like you simply set an option to move data older than X days to the archive tier. This setting is configured when you add the archive tier to a SOBR. I assume this means that data is moved from the capacity tier (NOT performance tier) so the data needs to be in the capacity tier first.
Since backups have to live in the capacity tier first for X days/months, AND THEN be moved to the archive tier, how does this yield any cost savings? Or is there a way to move data directly to the archive tier, bypassing the capacity tier?
For some background, this is an environment with a short-term on-disk retention policy (e.g. 30 days), an off-site on-disk backup copy policy (e.g. 120 days), and a completely separate DR solution. With V10, "cold" tier storage was used as the capacity tier for multi-year retention to move away from tape. If I enable the archive tier here, I am not seeing how it can save money, since the data needs to live in the capacity tier first.
Apologies if I have a fundamental misunderstanding here. Thanks in advance!
-
- Chief Product Officer
- Posts: 31754
- Liked: 7259 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: V11 Archive Tier Question - How will it save money?
The Archive Tier is designed for long-term retention (years), and the savings come from keeping all the unique data that GFS backups spanning many years hold on much cheaper storage.
Consider the most typical long-term retention policy, where yearly backups are kept for 3 years, which is something most regulations require. For a 1 TB yearly backup, the (greatly simplified) total storage cost across 3 years would be:
Amazon S3 > USD 828
Amazon S3 Glacier > USD 144 (5x cheaper)
Amazon S3 Glacier Deep Archive > USD 36 (23x cheaper)
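To make the comparison concrete, here's a back-of-the-napkin sketch of those numbers. The per-GB monthly prices are approximate US list prices at the time of writing, so treat this as illustrative rather than a quote:

```python
# Rough 3-year storage cost for a 1 TB yearly backup, using approximate
# per-GB monthly list prices (illustrative only; check current pricing).
PRICES_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "S3 Glacier": 0.004,
    "S3 Glacier Deep Archive": 0.00099,
}

SIZE_GB = 1000  # 1 TB yearly full
MONTHS = 36     # kept for 3 years

for tier, price in PRICES_PER_GB_MONTH.items():
    total = SIZE_GB * MONTHS * price
    print(f"{tier}: ~USD {total:.0f}")
```

Multiplying size by months by the per-GB rate reproduces the three figures above to within rounding.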
Possibly, your misunderstanding comes from thinking that the Capacity Tier and the Archive Tier will store the same data? This is not the case, of course: with 5% daily change on average, the bigger part of each monthly GFS restore point consists of unique data not found in the previous monthly backup. And the question now is, where do you want to store all this unique data for 3 years?
For reference, here's a useful data point I shared before in the forum digest on the actual daily change rates of our customers, based on our support big data:
In 35% of environments: 3% or less
In 50% of environments: 5% or less
In 70% of environments: 10% or less
In 90% of environments: 20% or less
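To see why a monthly GFS point ends up mostly unique, here's a simplistic estimate. It assumes changes hit blocks independently and uniformly, which real workloads (with their hot blocks) don't quite do, so treat it as a rough upper bound:

```python
# Estimate what share of a monthly GFS restore point is unique relative
# to the previous monthly, given a fixed daily change rate. Assumes
# changes hit blocks independently and uniformly (a simplification).
def unique_fraction(daily_change_rate: float, days: int = 30) -> float:
    # Probability a given block was touched at least once in `days` days.
    return 1 - (1 - daily_change_rate) ** days

for rate in (0.03, 0.05, 0.10, 0.20):
    print(f"{rate:.0%} daily change -> ~{unique_fraction(rate):.0%} unique after 30 days")
```

Even at a modest 5% daily change rate, well over half of a monthly restore point is data not present in the previous one, which is exactly the data that has to live somewhere for years.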
By the way, it's not very clear from your post whether you do long-term retention at all? You mentioned a 120-day retention policy, which is of course too short to even start talking about data archiving. A good reference point here is that Deep Archive has a minimum storage duration of 180 days!
-
- Expert
- Posts: 119
- Liked: 11 times
- Joined: Nov 16, 2020 2:58 pm
- Full Name: David Dunworthy
- Contact:
Re: V11 Archive Tier Question - How will it save money?
One quick question on this. Regarding the capacity tier and the archive tier not having the same data...
I was myself thinking about a similar chain with a full and incrementals both in capacity tier and also archive tier.
Is it instead that the original full in the capacity tier only exists there and never in the archive tier? So that the archive tier only ever gets the incremental pieces of the one chain in the capacity tier? In other words, does the archive tier need blocks from the capacity tier to actually restore data? (This is all if we leave it on the default setting of forever-incremental for GFS points in archive; otherwise it would obviously have several fulls.)
But I'm trying to gather whether specifically there is going to be one full in archive, or only incrementals which still need to combine with the full from the capacity tier. I could understand the savings more if there is no full in the capacity tier.
-
- Product Manager
- Posts: 20368
- Liked: 2288 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: V11 Archive Tier Question - How will it save money?
"Is it instead that the original full which is in capacity tier only exists there and never in archive tier?"
No, a GFS restore point first lands in the Capacity Tier, and after some time it gets transferred to the Archive Tier; the transferred restore points are deleted from the Capacity Tier.
"In other words the archive tier needs blocks from capacity tier to actually restore data?"
No, it does not. Whatever gets stored in the Archive Tier is fully self-sufficient.
"But I'm trying to gather if specifically there is going to be one full in archive or only incrementals which still need to combine with full from capacity tier. I could understand the savings more if there is no full in capacity tier."
Anton meant that GFS restore points consist mostly of unique data and thus cannot be deduplicated against each other. Storing a large amount of unique data in the Archive Tier is dramatically cheaper than in the Capacity Tier, and the longer you store it, the lower the overall price will be.
Thanks!
-
- Chief Product Officer
- Posts: 31754
- Liked: 7259 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: V11 Archive Tier Question - How will it save money?
I found this table very helpful for understanding restore point data placement in the various tiers depending on the SOBR policy.
-
- Expert
- Posts: 119
- Liked: 11 times
- Joined: Nov 16, 2020 2:58 pm
- Full Name: David Dunworthy
- Contact:
Re: V11 Archive Tier Question - How will it save money?
I'm getting ready to implement the Archive Tier. So far S3 is costing a bit more than I had calculated. I have the following configuration on my jobs: 6 weeklies / 12 monthlies / 2 yearlies. Does it still make sense to use it? My thought is I really need to get control of the cost before it grows too high.
So if I told it to archive ALL GFS points, every weekly and on, would that be OK? I know Glacier Deep Archive requires 180 or more days, but what about regular Glacier?
Also, since what goes into the archive tier is self-sufficient, wouldn't I be ADDING to the total object storage size and not really saving right off the bat? (Instead of reducing the price, I'm now adding a new service fee of Glacier charges...)
-
- Chief Product Officer
- Posts: 31754
- Liked: 7259 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: V11 Archive Tier Question - How will it save money?
It's not just a 3rd copy of your data. You're not only adding data to Glacier; you're also removing it from S3, which is a few times more expensive.
That said, it is not a good idea to archive weekly backups IMHO... remember, the Archive Tier only makes sense economically for long-term retention. Putting data into Glacier costs some $$$ due to API costs, so anything you put there needs to be stored long enough for the cheaper storage to offset those archival costs.
With your retention policy, I would only archive monthlies and yearlies, since you need to store each of them for at least 1 year. Archiving weeklies will be expensive due to the early deletion fee: you only need to store them for 6 weeks, but the minimum storage duration in Glacier is almost 13 weeks. Also, when archiving weeklies you WILL end up in a situation where a significant portion of the data in Glacier is just a copy of what is already in S3, since much of the data in adjacent weeklies is the same.
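To put a number on the early deletion fee, here's a hedged sketch. The per-GB rates are approximate list prices, and note that per-object archival API fees and the duplicate-data overhead described above come on top and are not modeled here:

```python
# A weekly kept 6 weeks: S3 bills the actual time stored, while Glacier
# bills at least the ~90-day minimum storage duration (deleting earlier
# incurs the early deletion fee for the remainder).
# Approximate per-GB monthly list prices; illustrative only.
S3_PER_GB_MONTH = 0.023
GLACIER_PER_GB_MONTH = 0.004
GLACIER_MIN_MONTHS = 3  # ~90-day minimum storage duration

def glacier_storage_cost(gb: float, months_kept: float) -> float:
    # Early deletion still bills up to the minimum duration.
    return gb * GLACIER_PER_GB_MONTH * max(months_kept, GLACIER_MIN_MONTHS)

months_kept = 6 * 7 / 30  # 6 weeks ~= 1.4 months
billed_months = max(months_kept, GLACIER_MIN_MONTHS)
print(f"storage-months needed: {months_kept:.1f}, billed: {billed_months:.1f}")
```

The point: a weekly pays for more than double the storage-months it actually needs, and once you add the per-object archival API charges and the fact that much of that data still sits in S3 anyway, the economics fall apart.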
-
- Expert
- Posts: 119
- Liked: 11 times
- Joined: Nov 16, 2020 2:58 pm
- Full Name: David Dunworthy
- Contact:
Re: V11 Archive Tier Question - How will it save money?
OK, so I can set it for maybe 60 days of GFS on the archive tier, and then basically monthlies 2 through 12 and the two yearlies will go?
-
- Chief Product Officer
- Posts: 31754
- Liked: 7259 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: V11 Archive Tier Question - How will it save money?
Yep, this works, as such an archiving window avoids processing your weeklies.
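As a sanity check of that 60-day window, here's a small sketch. It uses an idealized age model (weeks = 7 days, months = 30, years = 365), so it only illustrates the arithmetic, not Veeam's actual tiering logic:

```python
# Simplified check of which GFS points a 60-day archive window moves:
# a point is eligible once its age (in days) reaches the window.
ARCHIVE_WINDOW_DAYS = 60

gfs_points = (
    [("weekly", w * 7) for w in range(1, 7)]       # 6 weeklies, max 42 days
    + [("monthly", m * 30) for m in range(1, 13)]  # 12 monthlies
    + [("yearly", y * 365) for y in range(1, 3)]   # 2 yearlies
)

archived = [(kind, age) for kind, age in gfs_points if age >= ARCHIVE_WINDOW_DAYS]
# No weekly ever reaches 60 days, so only monthlies 2-12 and the yearlies go.
print([kind for kind, _ in archived].count("monthly"))  # 11
print(any(kind == "weekly" for kind, _ in archived))    # False
```

The oldest weekly retires at ~42 days, safely under the 60-day window, which is why the weeklies never get archived.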