Veeam 7 Reference Architecture

mschlott · Post by **mschlott** » Jul 02, 2013 6:53 pm this post

I know every situation is a bit different, and it can be hard to make recommendations that fit everyone, but I would find it useful if Veeam would publish some white papers on actual use cases of Veeam 7 in some common scenarios. Since you have partnered with HP for the storage snapshots, perhaps they could provide servers, tape drives and some man hours to help with this.

I personally am interested in backing up a data center to a local office and writing the backups to tape. I'd like to see what hardware is used for a proxy at the data center, and for the backup server. Since using Server 2012 de-duplication has been talked about in the past, I'd like to see if that made the Veeam recommended architecture in this case. Since I'm writing to tape, I need a physical server; how do I back up and recover that server?

There are a number of ways to get good backups and restores given this one scenario. What I'd like to see is a published recommendation for this scenario and others.

Post by **dellock6** » Jul 02, 2013 8:22 pm this post

Mike,
In my daily job I design data protection plans, and I can tell you there are so many variables that it is nearly impossible to give reference architectures for a Veeam infrastructure. It is not because the solution is complicated or hard to deploy (it is the opposite), but a lot of information is required to do a good design. Maybe you can start with some questions to understand what you need:
- total size of VMs
- daily change of data
- required / desired RPO and RTO, probably separated by groups of VMs (not all VMs require the same values)
- what kind of production storage you use, and what kind of connection to the hosts
- size and time schedule of your backup window
- spare compute power on the production clusters (so you can use virtual proxies, for example)
- how many restore points to keep locally
- how long is the long term retention
- do you want to use SureBackup? For how many VMs?
- any "strange" VM like ones with FT enabled, or with physical RDM?
- any separated network that would be hard to reach from the management network (or the one where you want to deploy Veeam)?
- any physical hardware that would be re-used?

And so on. Each of these parameters (and there are many more as the design goes on) can deeply change the architecture of the backup.
I know there are other consultants who have some kind of pre-built design and try to apply it to every situation; I do not feel it's a good service for the customer. Obviously this is my very personal opinion.

Luca.

mschlott · Post by **mschlott** » Jul 02, 2013 8:42 pm this post

Luca, I agree with you. There are so many variables, it can be difficult to know where to start; and Veeam gives you so many options of how to deal with those variables it can take some time and testing to find the options that are right for your site. That's why I'd like to see some white papers that talk about how Veeam was used to provide a solution in some given situations. I would expect that there would be some challenges, and decisions made to overcome or work around those challenges in that environment. It would be good to see an explanation of why those decisions were made.

When I first started using Veeam, it all seemed so straightforward, but it took me a few iterations of different installations to see why certain options just did not fit my needs. Veeam 7 has some huge changes that will make it a fit for many more customers. For me, I'm likely going to re-architect my Veeam infrastructure to be able to take advantage of backup to tape, or because of the backup copy option, I may decide to not use tapes any more. It would just be nice to have a few real world examples, including failures, to help me shape the new wheel I am inventing. I'm thinking triangle.

Post by **dellock6** » Jul 02, 2013 9:07 pm this post

If you want to talk about failures, in my personal experience the biggest one has always been customers underestimating the IOPS and speed required by the primary backup storage, no doubt.
With Veeam prior to 7, you would have a single backup storage, and this has to strike a balance (so, a compromise) between speed and capacity in order to keep the desired retention. I've seen at times customers going for a cheap and large NAS because price and capacity were the most important values. Unfortunately (for them...) the Veeam repository has some specific requirements that usual backup storage does not have:
- backups are not completely sequential writes since the files are compressed and deduped, and this is even more true with reverse incremental
- restores have random reads for the same reasons
- Instant Recovery is the peak of this situation. In order to run a VM from deduplicated and compressed storage, you need fast storage (compressed and deduped storage is usually a problem in a production SAN if you think about it)

If you fail in the performance sizing of the backup storage, you are going to miss the RPO and RTO you need.

Luca.

mschlott · Post by **mschlott** » Jul 02, 2013 9:26 pm this post

You are dead on about IOPS being a key factor, especially when doing synthetic full backups where you are doing reads and writes at the same time. I have ended up running monthly fulls with incrementals to 7 separate LUNs, with multiple backups to each LUN and fulls staggered throughout the month.

I'm happy with what I have. I back up about 11 TB on 200+ VMs in under 4 hours each night. I'm hoping Veeam 7 can improve on that and give me better retention management on tape.

Post by **Gostev** » Jul 03, 2013 1:14 pm this post

Unless tape is involved, Veeam v7 reference architecture is actually very simple: fast and small primary backup repository, and large low cost per TB secondary backup repository to hold the required retention. This is what I call "the ultimate VM backup architecture".

What drives this architecture is a few considerations. First, image-level VM backup is unique in the way that it operates with huge data sets in a very small backup window, so backup job performance is the king, and you need to design your backup infrastructure around this. Second, multiple backup repositories is the only way to meet the best practice of backup (keeping at least 3 copies of data - meaning production data and at least 2 backups - with backups sitting on different storage devices). Third, Veeam has a lot of unique functionality based around running VM from backup, for which backup repository performance is important.

Thus, hardware requirements for backup repositories are as follows:

1. Primary backup repository: fast, small capacity storage (just to hold a few latest restore points with enough retention for operational restores). Fast storage allows for smallest backup window (backup repository speed is almost always the main bottleneck), fastest restores from the most recent restore points (which is what you want to restore in 99% of cases anyway), and great performance for Veeam magic around running VMs from backup files.

2. Secondary backup repositories: storage with enough capacity to hold the required retention. Some good candidates are: decommissioned former production SAN, JBODs on Windows Server 2012 Storage Spaces, deduplicating storage devices. Capacity is important, but performance is not (since you are only using it for archival). Secondary repositories are populated from primary repository with Backup Copy jobs.

Now, since the 3-2-1 backup best practice rule dictates that you keep at least one copy of your backups offsite, you can also add a third backup repository offsite, and copy your backups there using Backup Copy jobs with built-in WAN acceleration enabled. That is unless you are using other approaches for offsite backup - such as tape, rotated drives, storage-based replication or cloud copy.
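
The 3-2-1 rule as described in this post can be sketched as a small check. This is only an illustration: the `DataCopy` class, its fields and the example repository names are hypothetical and are not part of any Veeam API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data (hypothetical model, for illustration only)."""
    name: str               # label for this copy
    media: str              # storage device holding the copy
    offsite: bool           # True if the copy lives outside the production site
    is_backup: bool = True  # False for the production data itself


def satisfies_3_2_1(copies):
    """Check the rule as stated in the thread: at least 3 copies in total
    (production plus at least 2 backups), backups on at least 2 different
    storage devices, and at least one backup offsite."""
    backups = [c for c in copies if c.is_backup]
    enough_copies = len(copies) >= 3 and len(backups) >= 2
    distinct_devices = len({c.media for c in backups}) >= 2
    one_offsite = any(c.offsite for c in backups)
    return enough_copies and distinct_devices and one_offsite


design = [
    DataCopy("production", "production SAN", offsite=False, is_backup=False),
    DataCopy("primary repository", "fast DAS", offsite=False),
    DataCopy("offsite repository", "capacity NAS", offsite=True),
]
print(satisfies_3_2_1(design))      # all three conditions hold
print(satisfies_3_2_1(design[:2]))  # production plus one local backup only
```

Dropping the offsite repository from the example fails both the copy count and the offsite condition, which is why a tape, rotated drive or remote repository is needed on top of the primary tier.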

mschlott · Post by **mschlott** » Jul 03, 2013 1:37 pm this post

Thanks for the reply Gostev. Can you elaborate on a few things?

How does the use of tape change this recommendation? The greatest way that I see it affecting me is that my Veeam server is no longer a VM and must reside at my office so I can change tapes.

I am concerned that running forever incremental will exceed the IOPS capability of the second tier storage. If I understand the way Veeam 7 will work, my oldest backup file will be a full vbk, and all the rest will be incrementals. Each day I will be creating a new synthetic full, which in the past took a very long time on my server. Is there something different about this process in Veeam 7?

I see that multiple disks on a VM can be processed at the same time. Are all of these disks written to the same backup file, or are they distributed across multiple files? Today I have my backups spread across multiple LUNs to prevent IO contention. Is there some new v7 magic which helps with this?

Post by **Gostev** » Jul 03, 2013 2:23 pm this post

With tape in the equation, you may not need secondary backup repositories at all.

I guess you are thinking in terms of legacy architecture here. Unlike with the primary storage, it does not matter how long synthetic full backup takes on secondary backup repositories (because backup copy is done outside of the backup window).

Same backup file for each backup job, no changes here.

Post by **dellock6** » Jul 03, 2013 5:36 pm this post

I'm liking this thread, really. Just saying

Luca.

AJ83 · Post by **AJ83** » Jul 03, 2013 9:23 pm this post

I use reverse incremental for all my backups; this way I'm able to offload the vbks to tape every day. This seems to be overkill, but if I used incrementals, I would need to load a lot of tapes in a disaster recovery scenario. This would mean more chance of restore failure and longer restore time.

What I miss in the reverse incremental backup scenario is a way to let Veeam place the vrb files in a different location from the vbk. This way I would be able to keep the vbk on fast storage (even SSD!) while it offloads the vrbs to a slower tier for long retention. This would speed up my backups and give me less chance of shoeshining on the tape drives.

Post by **dellock6** » Jul 03, 2013 9:44 pm this post

Well, in a certain way BackupCopy will allow you to do so. Just keep only 2-3 restore points in the fast primary storage, and offload older restore points to slow storage using BackupCopy. It's not exactly what you are asking for, but the result is pretty much the same...

Luca.

Post by **veremin** » Jul 04, 2013 8:16 am this post

AJ83 wrote:This way I would be able to keep the vbk on fast storage (even SSD!) while it offloads the vrbs to a slower tier for long retention.

To some extent it’s what Anton was talking about. So, utilize your fast SSD storage as a local repository for short-term retention goals, meanwhile, copy backup data to an offsite appliance for the archival purpose. With the introduction of version 7, it can either be tapes, or just slow devices with required capacity (to which data will be copied via Backup Copy Job) or you can even mix both.

Thanks.

AJ83 · Post by **AJ83** » Jul 04, 2013 10:08 am this post

dellock6 wrote:Well, in a certain way BackupCopy will allow you to do so. Just keep only 2-3 restore points in the fast primary storage, and offload older restore points to slow storage using BackupCopy. It's not exactly what you are asking for, but the result is pretty much the same...

Luca.

Sure, but then the older incrementals are out of the chain, and it would be a big hassle to use/search them. How would that work? Would I need the whole chain in one place to be able to go back to earlier backward incrementals? Because in that case, I would need to copy the vbk to the archive location to be able to use the backward incrementals.

I'm curious whether the new Storage Spaces feature of Server 2012 can be used to create what I want.

Post by **veremin** » Jul 04, 2013 10:54 am this post

AJ83 wrote:Sure, but then the older incrementals are out of the chain, and it would be a big hassle to use/search them. How would that work? Would I need the whole chain in one place to be able to go back to earlier backward incrementals? Because in that case, I would need to copy the vbk to the archive location to be able to use the backward incrementals.

Actually, a Veeam Backup Copy Job won’t copy backup files as a whole; instead, it uses its own logic to synthetically create the required restore points in the remote location. It doesn’t matter what backup mode is being used or what source is chosen for the Backup Copy Job, be it certain VMs or given jobs (which, from our perspective, are viewed as containers of VMs).

So there is no need to copy the whole backup chain, including .vrb/.vib or .vbk files, to the offsite location, since all the necessary operations are handled for you by VB&R. Moreover, there won’t be any issues with the restore process either, since files stored in the remote location are fully independent restore points. And yes, you will be provided with a corresponding tracking system, so that you can search and find the required restore points within seconds.

Thanks.

Post by **Gostev** » Jul 05, 2013 3:04 pm this post

Backup Copy is NOT literally a copy of the backup; it is selective in both VMs and restore points. The produced backup files are newly created according to the Backup Copy job settings; they are not just copied from the source backup repository. As a result, you get all the flexibility - for example, it does not matter how your primary backup jobs are organized, or how often they are scheduled to run.

Jul 08, 2013 5:10 am

Bookmarking this one

HDClown · Post by **HDClown** » Jul 09, 2013 10:31 pm this post

When you talk about "fast" disk for the primary repository, are you talking about something like 10K SAS disks, or more along the lines of SSDs? Having enough SSD space would mean consumer-grade drives, and flash wear rates would probably be a big issue there.

There is also the scenario of 2x as many 7.2K SATA disks compared to 10K SAS disks, assuming the same RAID level. For example, 6 10K SAS disks in RAID6 vs 12 7.2K SATA disks in RAID6. And if you throw another factor in here, different RAID levels, things get even more interesting. For example, 6x600GB 10K SAS disks in RAID6 vs 12x4TB 7.2K SATA disks in RAID10, or even vs 6x4TB 7.2K SATA disks in RAID10. The SATA configs in these scenarios will yield higher IOPS, and equivalent or higher throughput rates, compared to the smaller number of SAS disks or a higher-penalty RAID level on SAS disks (that scenario is based around the idea of meeting space requirements).
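
To put rough numbers on that comparison, here is a minimal sketch using the usual rule-of-thumb arithmetic. The per-disk IOPS values (140 for 10K SAS, 80 for 7.2K SATA) and the write penalties (2 for RAID10, 6 for RAID6) are commonly quoted approximations assumed for illustration, not measurements of any specific hardware, and controller caches are ignored.

```python
# Assumed rule-of-thumb values, NOT vendor specifications.
DISK_IOPS = {"10k_sas": 140, "7.2k_sata": 80}
WRITE_PENALTY = {"raid10": 2, "raid6": 6}


def effective_iops(disks, disk_type, raid, write_fraction):
    """Estimate front-end IOPS for an array.

    Each front-end write costs `penalty` back-end I/Os and each read costs
    one, so front-end IOPS = raw / (read_fraction + write_fraction * penalty).
    """
    raw = disks * DISK_IOPS[disk_type]
    read_fraction = 1 - write_fraction
    return raw / (read_fraction + write_fraction * WRITE_PENALTY[raid])


# The three layouts from the post, for a pure random-write workload.
layouts = [
    (6, "10k_sas", "raid6"),
    (12, "7.2k_sata", "raid10"),
    (6, "7.2k_sata", "raid10"),
]
for disks, disk_type, raid in layouts:
    iops = effective_iops(disks, disk_type, raid, write_fraction=1.0)
    print(f"{disks} x {disk_type} in {raid}: ~{iops:.0f} write IOPS")
```

Under these assumptions the 6-disk SAS RAID6 set comes out around 140 write IOPS, against roughly 480 for 12 SATA disks in RAID10 and 240 for 6 SATA disks in RAID10, which is consistent with the point made above. Real arrays will differ.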

chrisdearden · Jul 10, 2013 6:51 am

It's going to depend on your budget really - the advantage being that you don't have to size that primary tier of backup storage to hold a year of backups. If it only holds a full backup plus a couple of reverse incrementals, then it gives you a bit more choice of what you want to deploy.
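
As a rough sketch of why a short-retention primary tier leaves more hardware choice, the space for one full plus a couple of increments can be estimated as below. The function name, the 50% data reduction, the 5% daily change rate and the 20% headroom are all illustrative assumptions, not Veeam sizing guidance; only the 11 TB source size echoes a figure mentioned earlier in the thread.

```python
def primary_repo_size_tb(source_tb, daily_change_rate, increments,
                         data_reduction=0.5, headroom=0.2):
    """Estimate primary repository capacity in TB.

    source_tb:         size of the protected VMs
    daily_change_rate: fraction of the source that changes per day (assumed)
    increments:        number of increments kept next to the full
    data_reduction:    backup size as a fraction of source size (assumed)
    headroom:          extra free space kept as a safety margin (assumed)
    """
    full = source_tb * data_reduction
    incs = source_tb * daily_change_rate * data_reduction * increments
    return (full + incs) * (1 + headroom)


# One full plus two increments for an 11 TB environment.
print(round(primary_repo_size_tb(11, 0.05, 2), 2))
```

Changing `increments` from 2 to 365 in the same call shows how quickly a single tier sized for a year of retention grows compared with a small, fast primary tier.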

Personally I'm a fan of a big local DAS array with a high spindle count, but I have seen customers get pretty creative with their backup solutions in the past.

Post by **Gostev** » Jul 10, 2013 8:48 am this post

SSD is definitely overkill for holding backups; that's an incredible amount of IOPS you will never actually use.

chrisdearden · Post by **chrisdearden** » Jul 10, 2013 9:01 am this post

Gostev wrote:SSD is definitely overkill for holding backups; that's an incredible amount of IOPS you will never actually use.

I believe there was a customer using an SSD tier to run multiple virtual labs from, but that is a very specialist use case.

Post by **dellock6** » Jul 10, 2013 12:07 pm this post

Another use case coming with v7 could be heavy use of Instant Recovery, since running a VM from a deduped and compressed backup is heavier than running the same VM from the original production storage.
I'm using this thread as an inspiration for a new blog post to be honest, great information about the new way of designing multi-tier backup storage, backupcopy alone is a killer feature!

Luca.

HDClown · Post by **HDClown** » Jul 10, 2013 1:16 pm this post

What kind of I/O impact do backup copy jobs have on the primary backup repository? I was discussing with my co-worker a new backup infrastructure design and the introduction of off-site replication with backup copy jobs and WAN acceleration. We started to talk about the repositories themselves and the read and write impacts on them during various operations.

The primary repository will generally be heavy write with heavy read introduced if large restores are required.

A secondary repository that is being used as a target for a backup copy would be heavy write as well, and this would also introduce heavy read on the primary repository, correct? We then extended this to WHEN these jobs are occurring. With a typical "nightly, after business hours" backup window to primary repository, when would we use a backup copy for off-site replication?

We could do this during daytime hours with heavy bandwidth restrictions, but that means longer windows for the job.

If we run an offsite backup copy job at night, we can use more bandwidth (almost double) but now we are overlapping the jobs of backup to primary repository, introducing heavy read on top of write to the disk, which would slow down the primary backup job.

So, longer windows with less bandwidth, or more I/O on primary disk? We could buy more bandwidth, but that's more money. So how do we avoid spending more money and loading down disk even more?

We thought, what if we had a secondary repository that was used primarily as a source for offsite backup copy jobs? We could run backups during the nightly window to primary storage, and once the primary job completed, we could run the backup copy to the secondary storage in the same site. This secondary storage is what the offsite backup copy job runs from. As this is merely an "intermediate" seed type storage, I wouldn't care so much that it has simultaneous heavy write (the backup copy job from primary to intermediate storage) and heavy read (the backup copy job from intermediate storage to offsite).

Then we said, is this overkill? Is it necessary? Then we looked realistically at what this "intermediate" storage would need to be. It doesn't need to be super fast and doesn't need a ton of disk. Something like a Synology DS213+ and a pair of 2 or 3TB HDDs in RAID0 would be around $600-650 USD. Heck, I could do it with a single 2TB external drive for $150 all-in given my backup sizes.

Jul 10, 2013 10:45 pm

Uhm, it feels like you are going back to a "pre-v7" design, with storage that is neither fast nor large. The whole point of this new tiered approach is to have two dedicated storage tiers for different purposes, as already explained in this thread.
To me, the backup copy job is the real added value, since it allows for the creation of backup copies with longer retention without impacting the production storage (only one read from it), and without sacrificing the primary backup storage. That said, you can add WAN acceleration on top of backup copy if your secondary backup storage is remote from the primary, since it is going to save you a lot of transferred data. I would think that, once the cache is hot, the amount of transfers would be reduced significantly and you would probably not need the intermediate storage.

Luca.

Jul 11, 2013 12:30 am

It's important to remember that Backup Copy jobs only copy incremental data, no matter what mode you use locally, and are a simple block read operation, so they're actually pretty I/O friendly. In general, unless you have 10Gb links between sites, I/O is unlikely to be the bottleneck for copy jobs, especially when using WAN Acceleration, which will top out at around 200Mbps, which isn't enough to put significant load on the source repository storage, or the secondary target. In other words, I believe you're probably overthinking this a little bit. Backup Copy jobs will normally be linked to primary jobs and start copying points as soon as they become available, but you can schedule times to stop copies, and you can also apply throttling during times of day.
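
A quick back-of-envelope sketch of the point above: converting the roughly 200 Mbps ceiling quoted for WAN Acceleration into a disk read rate, and estimating how long a copy would take at that rate. The helper names and the 500 GB nightly incremental are hypothetical, and decimal units (1 GB = 1000 MB) are assumed.

```python
def mbps_to_mb_per_s(link_mbps):
    """Convert megabits per second to megabytes per second."""
    return link_mbps / 8


def transfer_hours(incremental_gb, link_mbps):
    """Hours needed to move `incremental_gb` at a sustained `link_mbps`."""
    megabytes = incremental_gb * 1000  # decimal GB -> MB
    seconds = megabytes / mbps_to_mb_per_s(link_mbps)
    return seconds / 3600


# Read load placed on the source repository at the quoted ceiling.
print(mbps_to_mb_per_s(200))               # MB/s
# Time to copy a hypothetical 500 GB incremental at that rate.
print(round(transfer_hours(500, 200), 2))  # hours
```

At 200 Mbps the source repository is read at only about 25 MB/s, which is small next to typical backup repository throughput, so the job duration is governed by the link rather than by the disks.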

Irene L. · Post by **Irene L.** » May 02, 2014 12:04 pm this post

I expect the answer to this question to be something like "impossible to calculate" and "it depends", but I still hope for better :) The question is from a partner who is working on quite a big project with 80 ESXi hosts and 150TB of production data. It is about sizing the backup repository in terms of IOPS; what he wants is the ability to restore 22TB of data (320 VMs) in 24 hours. I do have a lot of additional info from him, just not posting it here, because I don't believe in the success of the operation :) So if you can give some hint about this it would be much appreciated.
Thanks a lot!

Post by **dellock6** » May 02, 2014 12:56 pm this post

Hi Irina,
22TB/day means storage that can read at about 267 MB per second (22 x 1024 x 1024 MB over 86,400 seconds).
About IOPS, it depends first on which kind of backups they run. Let's say they use the default 1MB block deduplication; after adding compression we can assume the block size is going to be 512k. With this block size, 267 MB per second requires 534 IOPS. I would say even more: because of deduplication, a restore will need to read back and forth through the storage, so it's not completely sequential.
By the way, this is a design consideration that is often overlooked when sizing a repository, since numbers are usually calculated for backups more than restores, so thanks for bringing up this topic!

Luca.
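
The arithmetic in the post above can be reproduced with a short sketch. The function name is hypothetical; binary units (1 TB = 1024 x 1024 MB) are assumed because they match the 267 MB/s figure, and the 512 KB block size is the post's own assumption about 1 MB blocks after compression.

```python
def restore_requirements(data_tb, window_hours, block_kb):
    """Return (throughput in MB/s, IOPS) needed to read `data_tb`
    within `window_hours` using reads of `block_kb` kilobytes."""
    megabytes = data_tb * 1024 * 1024  # binary TB -> MB
    throughput = megabytes / (window_hours * 3600)
    iops = throughput / (block_kb / 1024)  # one I/O per block
    return throughput, iops


throughput, iops = restore_requirements(data_tb=22, window_hours=24, block_kb=512)
print(round(throughput))  # 267
print(round(iops))        # 534
```

This is a floor for purely sequential reads; as noted in the post, deduplicated backups are read non-sequentially, so the real requirement is likely to be higher.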

Irene L. · Post by **Irene L.** » May 06, 2014 11:21 am this post

Luca, thank you very much for this!
