Using object storage as a backup target
New to Cloud

Post by nunciate »

I am just doing some data gathering and had a lot of questions about using cloud storage with Veeam. I am trying to get a better handle on how all this works. We have nothing in the cloud today and I have never really worked with cloud storage at all.

Mostly I am considering using cloud storage as a replacement for our tape libraries. We keep 30 days' worth of backups on disk onsite in both my production and disaster recovery sites. We back up in prod, replicate over 230 VMs to DR nightly, and then back up those same replicas in DR to disk and on to tape. I know, I know, it is overkill, but it was the requirement from high above and so that is what we do.

It is a lot of onsite storage and a lot of tape usage. Today I have 130 TB of backup data on my prod backup server. I write daily tapes that are kept for 30 days, weekly fulls kept for 30 days, monthly fulls kept for 3 years, and a yearly full backup that stays offsite forever. I have a lot of tapes stored in our vault. Thousands of them.

All that being the case what would a cloud storage setup look like for me? I am not even sure if it will be cost-effective.

So my questions are these:
What kind of network bandwidth is required for cloud backups to work effectively, given that I would have to find some way to get all this data offsite and then perform daily backups? Do I need a dedicated network connection out to the cloud? I can't imagine sending all that data across our regular internet connection. I am sure the network guys would not be happy.

Today I perform my full backups as synthetic fulls with reverse incremental. It uses less storage but is I/O heavy. Would that work in the cloud, or would I need to send full backups to the cloud once a week? If the latter, then again, what about network utilization? Is that even possible with 230+ VMs? I have a few VMs that are over 6 TB in size (file servers). I can tell you that today it takes me about 4 days to get all my full backups to tape. I schedule the full backups over a series of days, starting on Thursday and ending Sunday, just to get them all done by Monday.

Speaking of full backups, we would definitely want to use something like Amazon S3 immutable storage. Something that will help prevent ransomware from encrypting our backups. Yes, we are paranoid. Very. (See my backup posture above.) That being the case, I assume I would have no choice but to send weekly full backups, since any other backup files will be locked to read-only.

One other concern is restoring from the cloud. Can this be an effective replacement for my 30 days of backups onsite? We restore from backups frequently. Mostly file-level restores, but the occasional full VM restore as well. One thing I have noticed with my local storage is that it takes quite a long time to mount a backup of a large file server (6 TB). Sometimes as long as an hour. That is because my storage is always busy doing something. How good is the restore performance from the cloud? I assume it is mostly related to the network bandwidth once again.

Finally, we come to storage costs. Does anyone have any ideas on how that works with, say, Amazon S3? Do they charge by the GB/TB? Are there any charges for heavy usage? What about data growth? If I am using this as a replacement for tape, then the data is going to get huge over the next 3 years as my monthlies stay offsite.

Thanks in advance for any answers you can provide. I am sure I will have more but this is the big stuff I can think of right now.

Re: New to Cloud

Post by HannesK »


Bandwidth: uploads to object storage are forever incremental, so the question is how much data you have to upload every day and how much time you have to do it. There are countries on this planet where 10 Gbit/s internet is affordable, so it's hard to say whether you need an extra connection. Some cloud providers (Azure, Amazon) offer direct lines, while other cloud providers offer far cheaper storage (e.g. Wasabi).
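To put rough numbers on the bandwidth question, here is a minimal sketch of the arithmetic. The 130 TB figure comes from the original post; the 3% daily change rate and the 80% link efficiency are assumptions made up for illustration, so plug in your own measured values.

```python
def upload_hours(data_tb: float, link_mbit_s: float, efficiency: float = 0.8) -> float:
    """Hours needed to push data_tb (decimal terabytes) through a link.

    efficiency is an assumed fraction of the nominal line rate that is
    actually usable for backup traffic (protocol overhead, other users).
    """
    bits = data_tb * 1e12 * 8
    usable_bits_per_s = link_mbit_s * 1e6 * efficiency
    return bits / usable_bits_per_s / 3600


if __name__ == "__main__":
    total_tb = 130            # from the original post
    daily_change = 0.03       # ASSUMPTION: 3% of the data changes per day
    for link in (100, 1_000, 10_000):   # link speeds in Mbit/s
        seed = upload_hours(total_tb, link)
        daily = upload_hours(total_tb * daily_change, link)
        print(f"{link:>6} Mbit/s: initial seed {seed / 24:7.1f} days, "
              f"daily increment {daily:6.1f} hours")
```

Under these assumptions, the initial 130 TB seed over a 1 Gbit/s link takes roughly 15 days, which is why the daily incremental volume matters far more than the one-time seed when sizing the connection.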

IO: for your backup server, I recommend switching to block cloning (ReFS / XFS) to eliminate the IO issues. Upload is incremental forever. 230 VMs does not sound like much.

Also for immutable object storage: it's incremental forever. We work with "object lock" (not "bucket lock").

Reverse incremental does not have synthetic fulls. As you mention that you do not have block cloning, I assume that "defragment and compact full" would solve your mount issue. Most customers keep a local copy of the data they restore often. Veeam checks whether blocks are still available somewhere locally before transferring data from the cloud, to save egress charges (not every cloud provider has egress charges, by the way).

I will not give advice on pricing, as Amazon and Azure have made it as complicated as possible. For long-term backups both offer cheaper storage (infrequent access, cool). But be careful: read costs are high! Wasabi, Backblaze and others are a lot easier to calculate.
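For a first back-of-the-envelope estimate, the cost structure can be sketched as storage charged per TB per month plus a separate charge for data read back out. Every number below is a placeholder, not a real quote from any provider, and the growth model (a fixed amount of net new data retained each month, capped by the 3-year monthly retention) is a simplifying assumption.

```python
def stored_tb_after(months: int, base_tb: float, monthly_growth_tb: float,
                    keep_months: int = 36) -> float:
    """Data held after a number of months.

    ASSUMPTION: each retained monthly adds monthly_growth_tb of net new
    data, and monthlies older than keep_months are deleted.
    """
    return base_tb + min(months, keep_months) * monthly_growth_tb


def monthly_cost(stored_tb: float, price_per_tb_month: float,
                 restored_tb: float = 0.0, egress_per_tb: float = 0.0) -> float:
    """Storage charge plus read/egress charge for one month."""
    return stored_tb * price_per_tb_month + restored_tb * egress_per_tb


if __name__ == "__main__":
    # HYPOTHETICAL prices; look up your provider's current price list.
    price, egress = 7.0, 50.0
    for month in (1, 12, 36, 48):
        tb = stored_tb_after(month, base_tb=130, monthly_growth_tb=5)
        print(f"month {month:>2}: {tb:6.1f} TB, "
              f"{monthly_cost(tb, price, restored_tb=2, egress_per_tb=egress):8.2f} per month")
```

The point of separating the two terms is the warning above: on the cheaper "infrequent access" or "cool" tiers the per-TB storage price drops, but the read side of the bill goes up, so frequent restores can erase the saving.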

Also: do not change the block size to "optimize" deduplication ratios. That will increase put / get costs: ... ml?ver=100 - just keep the settings at "default".
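The reason a smaller block size raises request costs can be sketched with simple arithmetic. This assumes, purely for illustration, that each backup block is uploaded as one object (one PUT) and ignores compression; the block sizes listed are example values, not a statement of which options a given version offers.

```python
import math


def put_requests(data_tib: float, block_kib: int) -> int:
    """Number of PUT requests to upload data_tib tebibytes.

    ASSUMPTION: one object (one PUT) per block, no compression.
    """
    total_kib = data_tib * 1024 ** 3   # 1 TiB = 1024^3 KiB
    return math.ceil(total_kib / block_kib)


if __name__ == "__main__":
    for block_kib in (256, 512, 1024, 4096):   # illustrative block sizes
        n = put_requests(1, block_kib)
        print(f"{block_kib:>5} KiB blocks: {n:>10,} PUTs per TiB uploaded")
```

Halving the block size doubles the number of requests for the same amount of data, and the same multiplier applies to GET requests on restore.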

Re: New to Cloud

Post by rsoto »

I’ve spoken to companies looking for an alternative to tape backup, as you are (even for the storage solutions noted in this thread), for a variety of reasons. Storage solutions at the end of the day all do the same thing: store your data for your needs. But how they do this is the differentiator. From the information in your post, you get it.
You specifically asked about Amazon S3 cost. An internet search for the AWS pricing calculator will assist, but as HannesK noted, it can be complicated. There are AWS-specialized hosting/consulting companies you can hire to help. This is not meant to scare you away from AWS or out of malice towards AWS (we are partners, as we are ProPartners with Veeam); the cost is just not straightforward.
Getting back to your main consideration, which is cloud storage to replace tape backup: a Veeam VCSP may also be something to consider.
