Sanity Check Please

Post by **Kimboaticus** » Jun 21, 2013 3:37 pm

Hi All,

New to Veeam and liking what I see so far. I have been playing around with the trial for a couple of weeks now and what I would like to do is translate my backup and recovery goals into jobs and I am a little unclear about a few things. First, let me outline my environment and what I want to achieve:

I have two sites, one of which is the main site which contains the primary SAN and one vSphere 5.1 host. About half of my production VM's run on this host. All VM's at both sites reside on the datastores on the primary SAN. This SAN is a smaller but better performing SAN. The second site is about a mile away, linked to the main site by multiple private fiber links. The sites communicate at Layer 2 so response time between sites is instant. The second site has another vSphere 5.1 host in the same cluster and the backup SAN. This host runs the other half of my production VM's. The SAN at this site is about twice as big as the primary but has less performance. This SAN contains no live production VM's, only backup and replication data. This backup SAN was sized to hold the backups and replicas and operate as the live datastore in case of the loss of the primary.

Currently, we are using VDP for backup and VR to do replicas. VDP runs nightly, VR runs multiple times throughout the day. Incidentally, we are using the "free" version of VDP so it cannot do item level backup of SQL and Exchange, which is a big reason we are moving to Veeam. The backup job is set to retain data for 60 days. What we are protecting is a couple of domain controllers, Exchange 2010 (not currently in a DAG but we want to get there eventually), a couple of web servers, a SQL server, a couple of file servers, and a few assorted other VM's for a total of less than 30 VM's. Some of these VM's will be eliminated when their functionality is migrated to one of the existing VM's, so I would say 20-25 VM's will be what we need to look after. At the moment, I would say there is about 2TB of data to protect. I have 10TB of space allocated for backup storage on the secondary SAN. The Veeam server will run on the secondary host and will have an RDM to that 10TB space. Veeam will run on a 2012 Standard server with dedupe enabled on that 10TB volume, set to dedupe anything older than 7 days (should I change that?).

My main goals are to 1) recover all production VM's at the secondary site within 3-4 hours (faster is better, obviously), 2) recover VM's to a state as close in time as possible to the loss of the primary, and 3) retrieve files, email, and SQL data going back at least two months, preferably 6 months or more if storage space permits.

With Veeam I had in my mind that perhaps I don't need to do replication anymore. If I spread my VM's across a few jobs (one for domain controllers, one for Exchange, one for SQL, one for file servers, one for everything else) and have these jobs set to run every x hours (this will change depending on the job: a larger interval for things like miscellaneous servers and domain controllers, shorter for Exchange, SQL and file servers), this will give me backups plus the multiple recovery points in a day that I got with VR. I was thinking of running these jobs as forward incremental with Synthetic Fulls on Wednesday and Active Fulls on Saturday. The question then becomes the retention policy. If I have one of these jobs set to run every 4 hours and I want to be able to recover a file or an entire VM from up to 6 months ago, does that mean the retention policy would be (6 points per day) x (180 days) = 1080? Or is there a better way to do what I am trying to accomplish? Should I stick to a setup similar to what I have with VDP and VR and run the same jobs above only once per day, which would give a retention policy of 180 for 6 months, and rely on replication to be able to recover from multiple points in a day?
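The retention arithmetic in the question can be checked with a quick calculation. This is only a minimal Python sketch: the function name `restore_points_needed` is hypothetical, and it assumes every scheduled run produces exactly one restore point, which is a simplification and not a statement about how Veeam itself applies retention.

```python
def restore_points_needed(interval_hours: int, retention_days: int) -> int:
    """Count the restore points created over the retention window.

    Assumes one restore point per scheduled run and an interval that
    divides 24 evenly (an illustrative simplification).
    """
    if interval_hours <= 0 or 24 % interval_hours != 0:
        raise ValueError("interval_hours must be positive and divide 24 evenly")
    runs_per_day = 24 // interval_hours
    return runs_per_day * retention_days


# Every 4 hours for 180 days, as in the question.
every_4h = restore_points_needed(4, 180)
# Once a day for 180 days, the alternative mentioned in the question.
daily = restore_points_needed(24, 180)
print(every_4h, daily)  # 1080 180
```

Under these assumptions the 4-hour schedule needs six times as many restore points as the daily one for the same 180-day window.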

Anyway, that's a lot that I just wrote. I am sure other questions will come up over time, but the best way to set up the jobs is the big one right now. Can anyone give me any advice on what I am planning? Thanks in advance!

Cheers!

Post by **veremin** » Jun 21, 2013 4:27 pm

Kimboaticus wrote: What we are protecting is a couple of domain controllers, Exchange 2010 (not currently in a DAG but we want to get there eventually), a couple of web servers, a SQL server, a couple of file servers, and a few assorted other VM's for a total of less than 30 VM's.

Don’t forget to enable Application-Aware Image Processing, which is necessary when backing up or replicating VMs that run VSS-aware applications (such as Active Directory, Microsoft SQL Server, Microsoft Exchange, or SharePoint), since this functionality guarantees transactionally consistent backups of such VMs.

Kimboaticus wrote: I was thinking of running these jobs as forward incremental with Synthetic Fulls on Wednesday and Active Fulls on Saturday

That’s definitely overkill. If I were you, I would just stick with a weekly synthetic full and a monthly active full.

Alternatively, I would consider reverse incremental mode instead. It doesn't require regular full backups, and it also needs less space.

Kimboaticus wrote: Recover all production VM's at the secondary site within 3-4 hours (faster is better obviously)

Kimboaticus wrote: If I have one of these jobs set to run every 4 hours and I want to be able to recover a file or an entire VM from up to 6 months ago, does that mean the retention policy would be (6 points per day) x (180 days) = 1080?

Then you shouldn’t turn a blind eye to the replication functionality, which was designed specifically to guarantee minimal RPO/RTO in case of disaster. Even though Instant VM Recovery is a fantastic feature, it isn’t always advisable to run a large number of VMs simultaneously directly from a backup file, since that might have a negative effect on performance.

Kimboaticus wrote: retrieve files, email, SQL data going back at least two months and longer if possible, preferably 6 months or more if storage space permits.

You've mentioned that you’re eager to store months' worth of backup data on a deduped volume. Please be aware that any type of restore from a deduped volume will take some extra time due to the underlying nature of such a volume.

Additionally, you might want to configure a backup proxy in Direct SAN mode in order to increase backup performance.

Hope this helps.
Thanks.

Post by **Kimboaticus** » Jun 21, 2013 5:23 pm

Thanks for the comments Vladimir. A couple more questions based on what you said:

1)

v.Eremin wrote: Then you shouldn’t turn a blind eye to the replication functionality, which was designed specifically to guarantee minimal RPO/RTO in case of disaster. Even though Instant VM Recovery is a fantastic feature, it isn’t always advisable to run a large number of VMs simultaneously directly from a backup file, since that might have a negative effect on performance.

I see your point about using the replicas for a DR scenario. However, would I not be trading recovery points in the backup job and simply adding them to the replicas if I am looking to recover files from months ago, or are you suggesting long term backup jobs for file recovery and multiple replicas per day for DR purposes? I guess the real question is what is the best way to provide recent entire VM recovery and still recover files from months before? Use a backup job to do both functions, split the tasks into backup and replication jobs, or some other thing? If I stick with my original plan for file recovery, was my assumption about the retention policy correct?

2)

v.Eremin wrote: Additionally, you might want to configure a backup proxy in Direct SAN mode in order to increase backup performance.

As the Veeam Enterprise server (there is only one, which is also the only proxy) is virtualized, would I need to add virtual NICs to the server and attach them to the virtual switches used for iSCSI access to the SAN, or would I need physical NICs in the host that plug into the switch the SAN attaches to and then link those NICs to virtual switches that the Veeam server is attached to?

Post by **yizhar** » Jun 21, 2013 10:55 pm

Hi.

Regarding putting all your backups on the Windows 2012 dedup volume:
Please note that this is quite a new technology, and I wouldn't just go with it for such large volumes.
Do you know how reliable it would be (what are the chances of corruption)?
How would it work with your specific storage system?
Do you plan to store it on a RAID5 volume built from large capacity SATA (or nearline) disks?
How well would it perform?
What are expected side effects of the dedup process? (I expect it to create load on the datastore which might conflict with other activities such as replica/backup to the same set of disks on the DR storage).

You plan to have about 2TB of production data, and about 10TB for backups/replica. So I would go with a different approach.

1. Create a VMFS volume of about 2TB (with option to grow in future).
Use replicas (either with Veeam or with VR - each has some advantages over the other).
This will provide the needed low RTO/RPO for DR scenarios (and possibly also for planned maintenance of the production storage).
The replicas will run several times a day depending on the role and data of each server - some VMs will have a near-CDP schedule, while others will be replicated only daily or need no replica at all.

2. Create an NTFS volume (RDM or whatever) of about 3TB.
Use daily backups for short term recovery - back up once a day (remember that you have frequent replicas during the day), and use either forward or reverse incrementals. Keep the last 10-14 copies (days).

3. Create another NTFS volume of about 4TB.
Use it for weekly backups which will run only once per week.
Use reverse incremental and keep the last 30 copies (last 30 weeks).
If you wish to use dedup, then enable it only on this volume for long term weekly backups,
but as mentioned before, please note that the gain from Windows 2012 dedup will come with costs (performance and load) and risks (new technology).
BTW - You can keep using VDP for now, as it has a strong dedup engine and retention capabilities not present in the current version of Veeam (some enhancements, such as GFS rotation, are expected in the next version, v7).
I wouldn't destroy your VDP backups for now - just maybe schedule them weekly, until/if you decide that they are no longer needed/useful.

4. Please note that it is recommended to have backups on at least one other medium (it is not a good idea to keep all backups on a single storage device). Maybe use tape, external disks, or another server with SATA disks at the production site to keep an additional copy of the backups.
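The volume layout in points 1-3 can be sanity-checked against the 10TB allocation with a short calculation. This is a rough Python sketch only: the `Volume` class and helper names are hypothetical, the sizes and copy counts are the approximate figures from the list above (using the upper end of "10-14" daily copies), and it assumes one copy is kept per scheduled run.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    size_tb: float
    copies_kept: int = 0        # restore points retained (0 = not stated)
    days_between_runs: int = 0  # 1 = daily, 7 = weekly


TOTAL_TB = 10.0  # space allocated on the secondary SAN, per the first post

plan = [
    Volume("replicas (VMFS)", 2.0),
    Volume("daily backups (NTFS)", 3.0, copies_kept=14, days_between_runs=1),
    Volume("weekly backups (NTFS)", 4.0, copies_kept=30, days_between_runs=7),
]


def allocated_tb(volumes):
    """Total space claimed by all volumes in the plan."""
    return sum(v.size_tb for v in volumes)


def coverage_days(volume):
    """How far back the oldest retained copy reaches, in days."""
    return volume.copies_kept * volume.days_between_runs


used = allocated_tb(plan)
print(f"allocated {used} TB of {TOTAL_TB} TB, {TOTAL_TB - used} TB unassigned")
for v in plan:
    if v.copies_kept:
        print(f"{v.name}: about {coverage_days(v)} days of history")
```

With these figures the plan assigns 9TB of the 10TB, and 30 weekly copies reach back about 210 days, which is past the 6-month goal from the first post. Whether the backups actually fit in each volume depends on change rates and compression, which this sketch does not model.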

Some more questions and tips:
Which storage devices do you use?

Post by **yizhar** » Jun 21, 2013 11:11 pm

Hi.

Another tip about your VMware design:

How many fiber links do you have between sites?
Which type of SAN do you have (FC/ISCSI/Other)?

I would consider also the "traditional" design, even when you have those several fiber links:

Put both production servers at the production site.
Have a 3rd server at the DR site. This server will normally run only a virtual (online) DC and the Veeam backup VM, and will be used to run replicas only if needed in a DR scenario (if the whole production site is down for a long time).
OK, also for testing.

Some advantages -
If the first VMware host needs to go down (planned or due to an error), in your design this means moving all of its VMs to run at the DR site. How many links can you provide for vMotion? How many links does the remote production server have to the storage at the production site, and to the production LAN?
If you have both production servers at the production site, you can take advantage of more network links to storage/vMotion/LAN, and there are other benefits.
A server going down is a common scenario (you do need to upgrade VMware from time to time), unlike the DR situation of total site loss, which is rare.
You should plan for the best business continuity in both cases, and your design is more focused on the worst case than on common scenarios, IMHO.

What do you think?

Post by **veremin** » Jun 24, 2013 8:20 am

Kimboaticus wrote: The real question is what is the best way to provide recent entire VM recovery and still recover files from months before?

A daily backup job in whatever mode you prefer with 60 restore points, plus a replication job that runs throughout the day at specified intervals (every hour or every 2 hours).

With regard to Direct SAN proxy mode, if it’s possible, you might want to add a physical proxy server and configure it in Direct SAN mode, so that the required data is retrieved directly from production storage over the SAN fabric. More information regarding necessary configuration steps can be found here.

Thanks.
