Discussions related to using object storage as a backup target.
dtwiley
Enthusiast
Posts: 87
Liked: 5 times
Joined: Aug 26, 2014 1:33 pm
Full Name: Dave Twilley
Contact:

Azure Blob Single Instance / Deduplication?

Post by dtwiley »

If I use Azure Blob as a backup copy repository, will the data be single-instanced?

e.g. We have 10TB of VMs and we want a week-end copy retained for 4 weeks: would this consume 10TB in Azure Blob, or 40TB?
i.e. will it reference the blocks already uploaded if they haven't changed?

We accept there will be some additional storage cost based on the rate of change within this data.

Without this, you can see why we're worried: with 4 weekly, 12 monthly and 7 yearly backups, 10TB would grow massively out of control in the absence of single instancing / deduplication of blocks.

Many thanks.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Azure Blob Single Instance / Deduplication?

Post by Gostev »

Yes, the data is single-instanced thanks to the forever-incremental approach: only changes since the last backup are captured, transferred to, and stored in object storage. Weekly, monthly and yearly backups are therefore "spaceless", i.e. they don't consume any additional storage space. They are merely pointers to existing objects brought over by previous backup runs.

The actual consumption will depend solely on your data change rate: whether those VMs run workloads that are relatively static, or constantly produce large amounts of unique data. Across all of our customers, the average daily change rate is in the 3-5% range, so you can use that for an initial disk space estimate. Remember to divide the resulting number by 2 to account for the average backup compression ratio (2:1).
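
As a rough illustration of that math, here is a minimal sketch (the 10TB source size, 4% daily change rate and 28-day retention are assumptions taken from the example above, not fixed Veeam figures):

    # Hypothetical back-of-the-envelope estimate of object storage consumption.
    source_tb = 10.0        # protected VM data (from the example above)
    daily_change = 0.04     # 4% daily change rate, middle of the 3-5% range
    compression = 2.0       # average 2:1 backup compression ratio
    retention_days = 28     # 4 weeks of retained restore points

    full_tb = source_tb / compression                      # initial full backup
    increment_tb = source_tb * daily_change / compression  # each daily increment
    total_tb = full_tb + retention_days * increment_tb
    print(f"Estimated consumption: ~{total_tb:.1f} TB")    # ~10.6 TB, not 40 TB

So with single instancing, the 4-week copy lands near the source size rather than at four times it.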
dtwiley
Enthusiast
Posts: 87
Liked: 5 times
Joined: Aug 26, 2014 1:33 pm
Full Name: Dave Twilley
Contact:

Re: Azure Blob Single Instance / Deduplication?

Post by dtwiley »

Perfect, thanks. Veeam FTW :)
dtwiley
Enthusiast
Posts: 87
Liked: 5 times
Joined: Aug 26, 2014 1:33 pm
Full Name: Dave Twilley
Contact:

Re: Azure Blob Single Instance / Deduplication?

Post by dtwiley »

Do you have any guidance on the rate of change over a longer period of time? Obviously, using 3-5% × 31 days isn't realistic for a monthly rate of change, as in most businesses it's likely to be the same data that is changing.

With a 3-5% daily rate of change we can calculate x days of retention, but nothing past that. I guess we'd need average monthly and yearly rates of change too.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Azure Blob Single Instance / Deduplication?

Post by Gostev »

It's utterly unpredictable, as it's extremely workload-specific. But in any case, I would advise against planning for less than 3% per day across 31 days as an average across the entire environment.

Also, if you keep backups for many months or years, then you should definitely consider using the Archive Tier as well. It makes a huge difference when it comes to long-term storage costs.
dtwiley
Enthusiast
Posts: 87
Liked: 5 times
Joined: Aug 26, 2014 1:33 pm
Full Name: Dave Twilley
Contact:

Re: Azure Blob Single Instance / Deduplication?

Post by dtwiley »

When you say:
But in any case, I would advise against planning for less than 3% per day across 31 days as an average across the entire environment.
Are you therefore warning not to plan for less than 3% × 31, i.e. 93% per month?! That would suggest we should expect an entire environment's worth of data to change in a month.

I appreciate there are no guarantees and it's tied to the type of workload; I'm just looking for some ranges on which to base estimates. I'll always tell a customer it could be significantly more or less than the estimate.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Azure Blob Single Instance / Deduplication?

Post by Gostev »

That is correct. Unless your customer has a quiet environment with fairly static workloads, I would not plan for less. Remember, this is an average across the entire environment: active database servers will have almost their entire disk content changed every single day, and you will need a lot of "very quiet" VMs to balance that out and get down to a 3% daily change rate across the environment.
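
To make the averaging point concrete, here is a minimal sketch with made-up VM sizes and change rates (all numbers purely illustrative, not Veeam data):

    # Environment-wide daily change rate is a size-weighted average across VMs.
    # All names, sizes and rates below are hypothetical.
    vms = [
        ("db01",   2000, 0.60),  # busy database server: most of its disk churns daily
        ("file01", 4000, 0.01),  # quiet file server
        ("web01",  1000, 0.02),
        ("app01",  3000, 0.02),
    ]  # (name, size in GB, daily change rate)

    changed_gb = sum(size * rate for _, size, rate in vms)
    total_gb = sum(size for _, size, _ in vms)
    print(f"Average daily change rate: {changed_gb / total_gb:.1%}")  # ~13.2%

One busy database server pulls the whole-environment average far above 3%, which is exactly why many quiet VMs are needed to balance it out.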
