patrickreid
Novice
Posts: 5
Liked: never
Joined: Nov 26, 2012 3:54 am
Full Name: Patrick Reid
Contact:

Disaster Recovery Test

Post by patrickreid » Jun 30, 2013 2:35 pm

We have a DR test coming up and this is the first time we have used Veeam during a test. Please review and point out anything I am missing or could improve upon. Thanks!

Production:
(5) IBM blades running ESXi 5.1
iSCSI SAN with 20TB
(30) VMs

Currently running nightly incremental backups M-F to a 10TB ReadyNAS on the LAN. On weekends, the full backup rollback from Friday is copied to a 3TB USB drive for offsite rotation on Monday.

DR:
(4) ESXi 5.1 hosts
8 TB SAN

Bare metal recovery plan:
Ship the USB drive to the DR site
From the vSphere Client on my laptop, attach to a DR ESXi host directly and carve up datastores.
Create a new VM and load Windows Server 2008.
Clone the VM
Load vCenter on the first VM and import all 4 hosts
Load Veeam on the clone VM and import vCenter into the infrastructure.
Add a datastore to the Veeam VM and configure it as a repo
Import the backups on the USB drive to the repo
Restore the production VMs

yizhar
Service Provider
Posts: 181
Liked: 48 times
Joined: Sep 03, 2012 5:28 am
Full Name: Yizhar Hurwitz
Contact:

Re: Disaster Recovery Test

Post by yizhar » Jun 30, 2013 4:14 pm

Hi.

Some tips:

1. You should consider the slow speed of external USB drives, which means that just copying (or restoring from) VBK files might take more than 24 hours.
So regardless of the specific steps you take, you'll get:
RTO = very long, depending on the total size of backups. I assume it will be more than 30 hours.
RPO = very long (several days), as your DR plan is based on weekly backups.
Please check with your management if those RPO and RTO values are acceptable for your organization.
What is the size of backup files on the USB drive?
How long does it take to copy from readynas to USB drive?
Do you have USB3 adapters (at primary and DR sites)?

2. I suggest that you move to physical backup servers on both primary and DR sites.
At the DR site you can convert one of the existing ESXi servers to a physical Windows server, or purchase a dedicated one.
Using a physical Veeam server will allow you to attach USB3 for faster transfer, and has other advantages.

3. Regarding the procedure you mention, well, I didn't fully follow or understand the steps you wrote,
but anyway:
The vCenter at the DR site can and should be up and running in place, managing the DR servers.

4. Do you have a WAN connection between sites?

5. Do you have a plan regarding user access?
Bringing the servers up at DR site is a good start, which should be followed by some means of access for users to be able to work on remote restored servers.

6. One of the scenarios you should plan for:
The main site is down for some reason, but the readynas is available.
In such a case you can ship the readynas to the DR site and use it to restore VMs, which should work faster than an external USB disk.

I guess there are many other things to improve, but in general I think that you should consider changes to your plan, because it is based on a slow USB disk,
and on weekly backups only.

Yizhar
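
To put a number on "several days", here is a minimal sketch of the worst-case RPO for the rotation described in the first post. It assumes (these are assumptions, not stated in the thread) that the on-site copies are lost in the disaster, that the Friday backup is taken at 22:00, and that the drive goes offsite on Monday at 09:00. The worst case is a disaster just before the following Monday's rotation, when the newest offsite copy is still the Friday backup from the week before.

```python
from datetime import datetime, timedelta


def worst_case_rpo(backup_taken: datetime, next_rotation: datetime) -> timedelta:
    """Age of the newest offsite backup just before the next drive goes offsite."""
    return next_rotation - backup_taken


# Hypothetical times: the thread only says "Friday" backup and "Monday" rotation.
friday_backup = datetime(2013, 6, 28, 22, 0)  # Friday night backup on the offsite drive
next_rotation = datetime(2013, 7, 8, 9, 0)    # the following week's Monday rotation

rpo = worst_case_rpo(friday_backup, next_rotation)
print(f"Worst-case RPO: {rpo.days} days, {rpo.seconds // 3600} hours")
```

Shipping time to the DR site comes on top of this and counts toward RTO, not RPO.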

Re: Disaster Recovery Test

Post by patrickreid » Jul 01, 2013 5:04 pm

yizhar wrote:
> Hi.
>
> Some tips:
>
> 1. You should consider the slow speed of external USB drives, which means that just copying (or restoring from) VBK files might take more than 24 hours.
> So regardless of the specific steps you take, you'll get:
> RTO = very long, depending on the total size of backups. I assume it will be more than 30 hours.
> RPO = very long (several days), as your DR plan is based on weekly backups.
> Please check with your management if those RPO and RTO values are acceptable for your organization.
>
> What is the size of backup files on the USB drive?
2.5TB
> How long does it take to copy from readynas to USB drive?
23 hours
> Do you have USB3 adapters (at primary and DR sites)?
No, USB 2.0 only.

> 2. I suggest that you move to physical backup servers on both primary and DR sites.
> At the DR site you can convert one of the existing ESXi servers to a physical Windows server, or purchase a dedicated one.
> Using a physical Veeam server will allow you to attach USB3 for faster transfer, and has other advantages.
We are considering moving to physical at the primary site when Veeam 7 is released, to attach a tape library. Obviously we would then adjust our DR contract accordingly to include another physical server and tape library.
I could allocate one of the 4 DR hosts to be a physical Veeam server. Don't I still need a proxy VM to restore from? I don't see what the advantage would be of having the Veeam server be physical, other than the repo being on local USB 3.0 instead of across the network.

> 3. Regarding the procedure you mention, well, I didn't fully follow or understand the steps you wrote,
> but anyway:
> The vCenter at the DR site can and should be up and running in place, managing the DR servers.
Our DR site is a shared recovery site with no access until a disaster or test. In this situation I build the entire environment from scratch. That is why I include the step of installing fresh VMs to run vCenter and Veeam.

> 4. Do you have a WAN connection between sites?
No, see #3.

> 5. Do you have a plan regarding user access?
> Bringing the servers up at DR site is a good start, which should be followed by some means of access for users to be able to work on remote restored servers.
Test users will establish a VPN connection to the DR site to perform testing.

> 6. One of the scenarios you should plan for:
> The main site is down for some reason, but the readynas is available.
> In such a case you can ship the readynas to the DR site and use it to restore VMs, which should work faster than an external USB disk.
I am planning for this now. The long-term goal is to back up to another SAN at the primary site, then copy backups across the WAN to another facility that hosts a ReadyNAS. We are also moving from USB to LTO5 for offsite storage when Veeam 7 is released.

> I guess there are many other things to improve, but in general I think that you should consider changes to your plan, because it is based on a slow USB disk,
> and on weekly backups only.
I totally agree. USB is the only method we could manage without dedicated hardware on a fast WAN link. I am curious how you and others maintain offsite backups, particularly what backup sizes and WAN links are used.
Thanks for the thorough review.

> Yizhar
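
As a rough check on the figures in this post, the sketch below derives the average throughput implied by copying 2.5TB in 23 hours and uses it to estimate how long the same copy would take at another rate. The 100 MB/s figure for USB 3.0 is a hypothetical illustration, not a measurement from the thread, and decimal units (1 TB = 1,000,000 MB) are assumed.

```python
def throughput_mb_per_s(size_tb: float, hours: float) -> float:
    """Average throughput in MB/s, using decimal units (1 TB = 1,000,000 MB)."""
    return size_tb * 1_000_000 / (hours * 3600)


def hours_to_copy(size_tb: float, mb_per_s: float) -> float:
    """Hours needed to copy size_tb terabytes at a sustained rate of mb_per_s."""
    return size_tb * 1_000_000 / mb_per_s / 3600


BACKUP_TB = 2.5        # size of the backup files on the USB drive (from the post)
COPY_HOURS = 23        # measured readynas-to-USB copy time (from the post)
USB3_MB_PER_S = 100.0  # hypothetical sustained USB 3.0 rate, for comparison only

usb2_rate = throughput_mb_per_s(BACKUP_TB, COPY_HOURS)
usb3_hours = hours_to_copy(BACKUP_TB, USB3_MB_PER_S)

print(f"Implied USB 2.0 rate: {usb2_rate:.1f} MB/s")
print(f"Same copy at {USB3_MB_PER_S:.0f} MB/s: {usb3_hours:.1f} hours")
```

If importing the backups from USB into the DR repository runs at a similar rate to the original copy, that step alone takes roughly a day before any VM restore starts, which is in line with the RTO estimate of more than 30 hours.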

Re: Disaster Recovery Test

Post by yizhar » Jul 01, 2013 5:45 pm

Hi.

> We are considering moving to physical at primary when 7 is released to attach tape library.
Please note that tape support differs depending on your Veeam edition.
Standard = copy files only (VBK or other), but not with all options.
Enterprise = more advanced options and tracking of VMs and restore points on the tapes.
So if you plan to purchase a tape library, I suggest that you also check whether your Veeam edition is Enterprise.

> Obviously we would then adjust our DR contract accordingly to include another physical server and tape library.
If you have a tape library at the primary site, you can consider a single tape drive (instead of a library) at the DR site to save costs, unless backups are going to span many tapes.
Note that restore from tape (and backup to tape) will be a two-step process, so even when using fast tape it will take some time for staging,
depending on backup size and local disk speed.
This can be acceptable, as you will probably be able to stage from tape to disk at high speeds (I assume about 400 GB/hour if you have fast local disks and controller), but it will be much slower if using a low-cost NAS device.

So you should plan for both backup servers (at primary and DR site) to have fast disk storage to store backups on disk and stage backups from/to tapes.
For example - when you plan the physical backup server, I recommend that it have a fast RAID controller with BBWC and local disk storage (several SATA/NearLine SAS disks are good candidates). This will replace the readynas and provide much better performance, stability and simplicity.

> Also moving from USB to LTO5 when veeam 7 releases for offsite storage.
If/when you plan for tapes, go with LTO6 instead of LTO5.
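
For a sense of scale, here is a small sketch of the staging step (tape to disk) only. It uses the 2.5TB backup size reported earlier in the thread and the 400 GB/hour figure assumed above. The 100 GB/hour rate for a low-cost NAS is a hypothetical number chosen for illustration, not something measured in this thread.

```python
def staging_hours(backup_gb: float, stage_gb_per_hour: float) -> float:
    """Hours needed for step 1: copying backup files from tape to the disk repository."""
    return backup_gb / stage_gb_per_hour


BACKUP_GB = 2500      # 2.5 TB of backup files, as reported earlier in the thread
FAST_DISK_RATE = 400  # GB/hour, the figure assumed in the post above
SLOW_NAS_RATE = 100   # GB/hour, hypothetical figure for a low-cost NAS

fast = staging_hours(BACKUP_GB, FAST_DISK_RATE)
slow = staging_hours(BACKUP_GB, SLOW_NAS_RATE)

print(f"Staging to fast local disk: {fast:.2f} hours")
print(f"Staging to low-cost NAS:    {slow:.2f} hours")
```

The restore from the disk repository to the datastore (step 2) is not included here and adds its own time on top.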

veremin
Product Manager
Posts: 17119
Liked: 1480 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Disaster Recovery Test

Post by veremin » Jul 02, 2013 10:55 am

My 2 cents:

As far as I can understand, at some point there will be a WAN connection between the two sites, and that will be the time for you to carefully consider different options:

1) Veeam Backup Copy Job. It will handle the process of copying backup data offsite for you. Depending on your VB&R edition, you may or may not get the special WAN Accelerator feature, which is specifically tuned for Veeam data transfers across the WAN.

2) Replication. This functionality was designed specifically to guarantee minimal RPO/RTO in case of disaster. Many of our customers successfully use replication over WAN as the means of protecting the most crucial VMs and have local backups for other generic machines. So, if I were you, I wouldn’t exclude replication from my DR scenario.

Thanks.

Re: Disaster Recovery Test

Post by patrickreid » Jul 05, 2013 6:53 pm

yizhar wrote:
> Hi.
>
> > We are considering moving to physical at primary when 7 is released to attach tape library.
> Please note that tape support differs depending on your Veeam edition.
> Standard = copy files only (VBK or other), but not with all options.
> Enterprise = more advanced options and tracking of VMs and restore points on the tapes.
> So if you plan to purchase a tape library, I suggest that you also check whether your Veeam edition is Enterprise.
>
> > Obviously we would then adjust our DR contract accordingly to include another physical server and tape library.
> If you have a tape library at the primary site, you can consider a single tape drive (instead of a library) at the DR site to save costs, unless backups are going to span many tapes.
> Note that restore from tape (and backup to tape) will be a two-step process, so even when using fast tape it will take some time for staging,
> depending on backup size and local disk speed.
> This can be acceptable, as you will probably be able to stage from tape to disk at high speeds (I assume about 400 GB/hour if you have fast local disks and controller), but it will be much slower if using a low-cost NAS device.
>
> So you should plan for both backup servers (at primary and DR site) to have fast disk storage to store backups on disk and stage backups from/to tapes.
> For example - when you plan the physical backup server, I recommend that it have a fast RAID controller with BBWC and local disk storage (several SATA/NearLine SAS disks are good candidates). This will replace the readynas and provide much better performance, stability and simplicity.
>
> > Also moving from USB to LTO5 when veeam 7 releases for offsite storage.
> If/when you plan for tapes, go with LTO6 instead of LTO5.
Thanks for the tips. We do have Enterprise and I will make a note to get an LTO6 tape library.
When you say two-step process, does that mean that when restoring from tape, you must first restore to disk before you then restore the VMs?

v.Eremin wrote:
> My 2 cents:
>
> As far as I can understand, at some point there will be a WAN connection between the two sites, and that will be the time for you to carefully consider different options:
>
> 1) Veeam Backup Copy Job. It will handle the process of copying backup data offsite for you. Depending on your VB&R edition, you may or may not get the special WAN Accelerator feature, which is specifically tuned for Veeam data transfers across the WAN.
>
> 2) Replication. This functionality was designed specifically to guarantee minimal RPO/RTO in case of disaster. Many of our customers successfully use replication over WAN as the means of protecting the most crucial VMs and have local backups for other generic machines. So, if I were you, I wouldn’t exclude replication from my DR scenario.
>
> Thanks.
There is no WAN link between the sites. This is a cold site contract. It is only available during a disaster or test.
We do, however, have the edition that includes WAN acceleration.

Replication is definitely something I am considering. Do you currently replicate across a WAN? What size VMs and bandwidth do you have? Typical replication times?

We tested replication on our LAN to a test environment and I was impressed with the speed, but I am not sure how it would work over our WAN.

Re: Disaster Recovery Test

Post by yizhar » Jul 06, 2013 7:57 pm

patrickreid wrote:
> Thanks for the tips. We do have Enterprise and I will make a note to get an LTO6 tape library.
> When you say two-step process, does that mean that when restoring from tape, you must first restore to disk before you then restore the VMs?
Yes, exactly.
Step 1 = Relevant backup files are copied from tape to the disk repository.
Step 2 = Restore from the disk repository to the VMware datastore.
Even if the Veeam interface makes it look like a single step (a single wizard to restore from tape), it will do the 2 steps under the hood.
To Veeam staff - please correct me if I'm wrong.

> There is no WAN link between the sites. This is a cold site contract. It is only available during a disaster or test.
> We do, however, have the edition that includes WAN acceleration.
> Replication is definitely something I am considering. Do you currently replicate across a WAN? What size VMs and bandwidth do you have? Typical replication times?
> We tested replication on our LAN to a test environment and I was impressed with the speed, but I am not sure how it would work over our WAN.
Yes, we do both replication and backup over WAN.
One client - replicating about 6 VMs over a 30 Mbps link. Total size about 800 GB.
One VM runs Oracle, and we replicate it every 2 hours.
Other VMs we replicate once a day.

Another client - backup over WAN of about 15 VMs.
WAN link = 20 Mbps.
Total size about 1000 GB (backup size about 500 GB).
2 SQL servers; the other servers have DC, TS and IIS roles.
We currently do only weekly offsite backups for them.
It takes about 12 hours (total for all VMs) each weekend.
If we were doing daily backups over WAN I assume it would take about 6 hours.
With the new WAN acceleration feature it should be much faster, and without snapshotting the production VMs.

Yizhar
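
To relate these link speeds to backup sizes, the sketch below computes the ideal (no protocol overhead, link fully available) transfer figures for a 20 Mbps link. It shows that a 12 hour window moves at most about 108 GB, so the weekly run described above presumably sends changed or compressed data and not the full 500 GB. That is an inference from the arithmetic, not something stated in the thread. It also shows how long a full 2.5TB copy, the size reported earlier, would take over the same link. Decimal units are assumed throughout.

```python
def wan_transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Ideal hours to move size_gb gigabytes over a link of link_mbps megabits/s."""
    bits = size_gb * 8e9
    return bits / (link_mbps * 1e6) / 3600


def max_gb_in_window(link_mbps: float, hours: float) -> float:
    """Ideal upper bound on gigabytes moved over the link within the window."""
    return link_mbps * 1e6 * hours * 3600 / 8e9


LINK_MBPS = 20     # WAN link of the second client (from the post)
WINDOW_HOURS = 12  # weekend backup duration (from the post)
FULL_COPY_GB = 2500  # 2.5 TB backup set reported earlier in the thread

ceiling_gb = max_gb_in_window(LINK_MBPS, WINDOW_HOURS)
full_copy_hours = wan_transfer_hours(FULL_COPY_GB, LINK_MBPS)

print(f"Most data a {LINK_MBPS} Mbps link can move in {WINDOW_HOURS} h: {ceiling_gb:.0f} GB")
print(f"Full {FULL_COPY_GB} GB copy over {LINK_MBPS} Mbps: {full_copy_hours / 24:.1f} days")
```

Real links carry other traffic and protocol overhead, so actual figures will be worse than these bounds.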

Re: Disaster Recovery Test

Post by veremin » Jul 08, 2013 8:37 am

> To Veeam staff - please correct me if I'm wrong.
Yep, it will indeed be a tape-to-disk-to-VI restore process, during which you will be asked which repository a given VM should be restored to first.
> With the new WAN acceleration feature it should be much faster, and without snapshotting the production VMs.
Yes, since the Veeam Backup Copy job works at the backup data level, it won’t have any effect on your production environment. Thanks.

Re: Disaster Recovery Test

Post by patrickreid » Jul 08, 2013 3:35 pm

I don't understand why snapshotting would not happen. Are you talking about the copy job alone? Snapshotting is still required when a backup runs, correct?

Re: Disaster Recovery Test

Post by veremin » Jul 08, 2013 3:40 pm

A normal Backup/Replication job will still require a snapshot to be taken. However, I was talking about the new Veeam Backup Copy Job, which will not work with your virtual infrastructure but with backup data instead. Thanks.

foggy
Veeam Software
Posts: 18538
Liked: 1605 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Disaster Recovery Test

Post by foggy » Jul 08, 2013 3:43 pm

patrickreid wrote:
> So I would need the storage capacity for the production VMs and their respective backups?
You will need enough space on the backup repository to store all backup files that contain the required VMs you are restoring from the tape. Also, you will need the space to restore the VM on the primary storage (or restore it to the original location, overwriting the original VM).
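
A minimal capacity check along these lines, assuming (as in the recovery plan from the first post) that the repository and the restored VMs share the same 8 TB DR SAN. The 2.5 TB backup size comes from earlier in the thread. The 5.0 TB figure for the restored VMs is hypothetical, since the thread does not state how much of the 20TB production SAN is actually used.

```python
def fits_on_dr_storage(san_tb: float, backup_files_tb: float, restored_vms_tb: float):
    """Return (fits, headroom_tb) when the repo and restored VMs share one SAN."""
    needed_tb = backup_files_tb + restored_vms_tb
    return needed_tb <= san_tb, san_tb - needed_tb


DR_SAN_TB = 8.0        # DR site SAN (from the first post)
BACKUP_FILES_TB = 2.5  # backup files to hold in the repository (from the thread)
RESTORED_VMS_TB = 5.0  # hypothetical total size of the restored VMs

fits, headroom = fits_on_dr_storage(DR_SAN_TB, BACKUP_FILES_TB, RESTORED_VMS_TB)
print(f"Fits: {fits}, headroom: {headroom:.1f} TB")
```

Running the same check with the real provisioned size of the 30 production VMs would show whether the DR SAN is large enough for a full restore.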

Re: Disaster Recovery Test

Post by yizhar » Jul 08, 2013 10:03 pm

patrickreid wrote:
> I don't understand why snapshotting would not happen. Are you talking about the copy job alone? Snapshotting is still required when a backup runs, correct?
You are correct.

With the current Veeam V6.x, if I do both local and remote backups,
each job takes a snapshot and pulls data from the production systems to its repository (either local or remote).

With the coming V7 and the copy job, only the local backup takes a snapshot and pulls data from production to the on-site repository;
the copy job will then be able to sync and create another copy of the backup from the on-site repository to the remote repository.

Yizhar
