BrettM
Novice
Posts: 9
Liked: never
Joined: Sep 01, 2011 10:03 pm
Full Name: Brett Mullins

Local and Remote Backups

Post by BrettM »

I know this is long and much of this has been covered in other topics, but I'm having trouble condensing it all into a comprehensive solution. The user manual is terrible and has only vague hints about the best ways, and even in the forums the info is scattered all over the place.

I need a way to have my VM backups be both local AND remote; local for the sake of fast restores and local DR, remote for site-level DR and enterprise archival needs. So how do I do this most efficiently? What uses the least bandwidth and/or takes the least time? And preferably uses the least space? (For the record I'm using ESXi 4.1u1)

Copy to Secondary Storage:
One way suggested in the forums is to simply take backups needing to be archived and copy them to portable drives and then store these offsite. In another forum topic Gustav suggests this is at least part of the reason why v5 came up with a new naming convention since apparently a number of folks on v4 did precisely this. But this solution relies on user intervention to copy backup files to secondary storage, send it offsite, keep track of it, and bring it back onsite or securely dispose of it once expired, all of which must be done manually. This is only marginally better than using legacy tape solutions and seems a poor compromise when a modern enterprise level business has the equipment and bandwidth necessary to move this stuff to a DR site automatically.

Multiple Backup Jobs:
Another way to handle this is with multiple backup jobs. However, Veeam can't write to multiple targets at once, nor can it run multiple jobs at the same time against the same VM. So one must run two consecutive backups, with one pointed offsite at a Linux target (which we must create ourselves, since no appliance is provided) if one doesn't want the backup to take forever. This not only creates additional production server overhead (IOPS for two snapshots), but also creates two separate sets of backups. Compared to replication (below), is this faster or slower, better or worse, in people's experience?

Replication:
A third way to do this is to replicate the backups, but Veeam v5 can't do it natively (which is perplexing since Veeam can replicate VMs), so I must rely on something else like DFS-R or Rsync for replication. I don't mind using either one, and Gustav recommends rsync in the forum FAQ, but each has its problems as documented in various forum topics. Even using third-gen DFS-R (2008 R2), the staging area must be sized big enough and progress monitoring is almost nonexistent, while rsync has problems with Veeam v5 filenames along with NTFS file permissions problems. Rsync with the "fuzzy" switch (mentioned by someone in the forums) and a way to monitor progress seems preferable, but the Veeam 5 naming convention is a high hurdle to get over considering Gustav's comments in another topic regarding upcoming v6's reliance on this new naming convention, while DFS-R can handle the name changes since it assigns unique file IDs to each replicated file regardless of file name. And while cross-file DFS-R (equivalent to the fuzzy switch) is only available on enterprise and datacenter versions, we have datacenter (as I'd expect anyone running more than 3 VMs per physical host processor would have, since it's by far the most efficient licensing model at higher densities). Even so, I'm inclined to use rsync if the filename hurdle can be jumped. Suggestions on rsync, anyone? Or another file replication solution?
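
For reference, the rsync variant described above (fuzzy matching plus progress monitoring) could be invoked roughly as follows. This is only a sketch: the paths and host name are hypothetical, the bandwidth cap is an optional extra, and whether `--fuzzy` copes well with the Veeam v5 file names is exactly the open question here. The options themselves (`--archive`, `--fuzzy`, `--partial`, `--progress`, `--bwlimit`) are standard rsync options. The script only builds and prints the command; it does not run it.

```python
import shlex


def build_rsync_command(source_dir, destination, bwlimit_kbytes=None):
    """Return an rsync argument list for pushing backup files offsite."""
    command = [
        "rsync",
        "--archive",   # recurse and preserve timestamps/permissions where possible
        "--fuzzy",     # use a similarly named file at the destination as the delta basis
        "--partial",   # keep partially transferred files if a run is interrupted
        "--progress",  # print per-file transfer progress
    ]
    if bwlimit_kbytes is not None:
        # Optional cap on transfer rate, in KBytes per second.
        command.append(f"--bwlimit={bwlimit_kbytes}")
    # A trailing slash on the source copies the directory contents, not the directory itself.
    command += [source_dir.rstrip("/") + "/", destination]
    return command


if __name__ == "__main__":
    # Hypothetical source path and DR host, for illustration only.
    cmd = build_rsync_command("/backups/veeam", "backup@dr-site.example:/backups/veeam", 5000)
    print(shlex.join(cmd))
```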

(For the record - it would be *really* nice if Veeam B&R could do this replication on its own in v6 or beyond, preferably with dedupe across ALL files going across the WAN, much like fuzzy rsync or cross-file DFS-R can already do.)


Backup Job Type:
In addition, what's the best backup method? Best for an offsite target if doing two backups? Best if I'm going to be using replication? My current observations are:

1. Local Incremental: These are supposedly the most efficient in terms of offsite replication due to small increment sizes via compression and dedupe - but then I either have to keep increments forever for archival purposes (not recommended), or I have to do new full backups or create new synthetic backups periodically, both of which I presume would be enormous hogs in terms of WAN bandwidth even using DFS-R or Rsync (though cross-file DFS-R or fuzzy rsync would help some with this).

2. Remote Incremental: Suggested in the manual as being the best way to get backups offsite, but not only do I have the problem of getting the initial backup seed offsite if it's big (no built-in backup staging), I still have the same problems with replication back to the local site, and I need a remote Linux target to efficiently process periodic remote synthetic backup creation. If I have to do periodic full backups this solution becomes very time consuming if not downright unusable since the new files would take a lot of time getting to the remote site and then an equal amount replicating back.

3. Local Reverse-Incremental: The full backup file is presumably merely "modified" by rolling the latest changes into it, rather than being completely recreated each day from scratch, in which case DFS-R or Rsync should be able to handle replicating over the changes. If I don't care about separate periodic fulls and I'm primarily concerned about local backups first with offsite a secondary, then I presume this is the best way to do it, though initial seed replication will take a while. But I am completely reliant on DFS-R or Rsync to efficiently figure out what part of the full backup file has changed and only replicate over those block changes. Plus, reverse incremental is not recommended by the user manual for off-site stuff, and many others doing off-site replication in the forums seem to be using forward incremental as well, so am I missing something here?

4. Remote Reverse-Incremental: Without a remote Linux target, this would be an unworkable solution since Veeam would be trying to locally roll incremental changes into the remote backup file. With remote Linux target the local Veeam B&R will send only the backup increments to it, and presumably(?) the remote target's Veeam agent then does the work of rolling the increments into the full backup file. If I don't care about periodic fulls and I'm primarily concerned about getting backups offsite fast first, then I presume this is the best way to do it since it combines the space-saving reverse-incremental backup with having changes rolled in at the remote site, though I'm still reliant on DFS-R or RSync to get changes replicated back locally. And the overall backup op will take longer, since increment info goes across the WAN twice, once to remote target then replicated back again to local.
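
Both reverse-incremental options above depend on the replication tool sending only the changed parts of a large full backup file. Here is a minimal sketch of that idea, assuming fixed-size blocks compared position by position. It is a deliberate simplification and not how rsync or DFS-R are implemented (rsync, for instance, uses a rolling checksum so it can also match data that has shifted position). The function names and the tiny block size are illustrative only.

```python
import hashlib


def block_digests(data, block_size):
    """Hash each fixed-size block of a byte string."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=4):
    """Return indexes of blocks in `new` that differ from the same block in `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        index
        for index, digest in enumerate(new_digests)
        if index >= len(old_digests) or digest != old_digests[index]
    ]


if __name__ == "__main__":
    yesterday = b"AAAABBBBCCCCDDDD"
    today = b"AAAAXXXXCCCCDDDDEEEE"  # one block rewritten, one block appended
    blocks = changed_blocks(yesterday, today)
    total = len(block_digests(today, 4))
    print(f"blocks to send: {blocks} ({len(blocks)} of {total})")
```

In this toy case only blocks 1 and 4 would need to cross the WAN. How close a real tool gets to that on an actual reverse-incremental full depends on how the backup file is laid out internally, which is something to test rather than assume.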

Remove Snapshots and other unneeded files:
Another way to reduce WAN backup times (and file sizes) would be to use Gustav's method of removing the pagefile and other temporary/unneeded files from the backup (by putting them on a separate VM disk and then excluding it from backup); are there other ways to reduce size? Any other files folks might recommend somehow excluding/relocating?
foggy
Veeam Software
Posts: 21133
Liked: 2140 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: Local and Remote Backups

Post by foggy »

Brett, you show a very deep understanding of backup peculiarities; this is a very good post describing multiple aspects of the different strategies. That said, I would say there are no direct answers to most of your questions (other members are welcome to correct me). To find the optimal backup method you need to test and determine what is most effective in your particular case. There is no universal answer, only lots of recommendations for one approach or another, each with its own pros and cons. You have described all of them and understand well where you win and where you lose in each case. Now you have to decide what matters most to you and what has lower priority, then find what suits you best through your own tests and/or by learning from other users' experience in similar threads on this forum.
It's hardly possible to say what will work best in your case, or to predict which solution as a whole will be the fastest and consume the least network and hardware resources, because it depends on multiple factors: the VMs themselves and the nature of the processed data, link type, and other parameters of your environment, which you do not describe in your post. We can only say what is generally preferable in a given case, but your post shows that you already know what affects backup speed and how. There is no universal or ideal solution; our customers successfully use different approaches (multiple backup jobs versus replicating backup files with rsync-like tools, for example).
NetWise
Influencer
Posts: 13
Liked: 3 times
Joined: Aug 26, 2009 5:52 pm
Full Name: Avram Woroch

Re: Local and Remote Backups

Post by NetWise »

While I can easily agree that there is no one best way, or a way that will work for all use cases, I think the above post by the initial poster is pretty typical. If nothing else, it exactly mirrors my requirements. I have 3 sites, 1 of which is at a CoLo (SiteA) and customer facing, so "more critical" than the other two, which are internal/corporate networks (SiteB and SiteC). Local backups in my tests of Veeam are working amazingly well, no complaints. SureBackup tested, U-AIR options working as hoped, etc. Now I'm getting into the replication part of the backups and having some issues. I don't think it's so much the product as that I'm not really finding the "best practice" or "typical use scenarios", etc. My needs, like the OP's, are:

* Backup locally to disk in SiteA, and SiteB/SiteC
* Have SiteA replicate to SiteC for DR purposes and offsite. This handles a loss of SiteA, so long as SiteC is up - which is almost always going to be the case.
* SiteB/SiteC would cross replicate. This handles the loss of either corporate site, and ignores the CoLo as really it is separate anyway.
* Backup Replication will do for almost all VMs. However, for some considered key critical, we would want to do Veeam Replication to a remote ESX(i) host so that they're available first and immediately without vPower. That works great, but for these VMs we'd make sure they're full VM replicas.

I think there are a lot of customers who would basically be looking at the same sort of setup. I agree I was a little surprised to find out that to do Replicas and Backups, I had to run two sequential jobs, and that the Replica could not be updated from the Backup job, etc. I'm happy to use DFSR (2008R2) but I suspect there are some caveats, recommendations and best practices that I should know about, before replicating 100GB-2TB files through my network......
averylarry
Veteran
Posts: 264
Liked: 30 times
Joined: Mar 22, 2011 7:43 pm
Full Name: Ted

Re: Local and Remote Backups

Post by averylarry »

I've had the same questions/issues. My answer is -- wait until about mid-November for Veeam version 6. It allows:

1) Multiple jobs to be run on the same VM.
2) Distributed architecture -- so most processing is done locally and only the least amount of necessary information is sent across a WAN.
3) Bandwidth limiting (I believe).
4) Run multiple VMs in 1 job at the same time.

Replications will have a much longer window with bandwidth limiting and the ability to run replications and backups at the same time, not to mention being able to run multiple VMs in 1 job at the same time. Distributed architecture should minimize the amount of data that needs to be pushed across a WAN.


Pagefile on a separate disk -- definitely. I also do that with my transaction logs -- Exchange and SQL. The local backup will back them up and delete them. They are excluded from replication (in a disaster, nobody (at my company) will care about what might be lost by not having transaction logs).


Backup job type --
I personally like incremental with synthetic fulls every week for local backup. I pretty much have the entire weekend available for backups/replications etc. My daily incrementals are quick and the synthetic fulls easily finish on Saturday. I like to use replication for the offsite WAN (you can just start the replica right up, no hassle with restoring or running direct from Veeam (I don't have storage vMotion)). Replication (currently?) only uses reverse incrementals.
chad5k1
Influencer
Posts: 14
Liked: never
Joined: Feb 26, 2009 10:55 am

Re: Local and Remote Backups

Post by chad5k1 »

You have perfectly summed up the problems that I see and that indeed my company has.
We tried rsync for a long time, but it never worked that well for the reasons you have already outlined.
Now we use robust file copy, aka robocopy.
We run the job on a daily basis, then once per week, after the synthetic fulls have run and replicated, we run it with the /mir switch so that it deletes the obsolete backups offsite.
This ensures that if the worst happens, we still have old backups we can refer to offsite, and only when the replication of the synthetic fulls has completed do those old backups get deleted.
We are talking about 1-2 TB of data once deduped and fully compressed.
If anyone has a better way I'd love to hear it, otherwise I hope this helps a little.
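
The two-phase schedule above (copy daily without deleting, then mirror once the weekly fulls have arrived) can be sketched in plain Python to show the ordering that keeps old backups offsite until their replacements exist. This is not robocopy itself and is much simpler than it: files are compared by size and modification time only, and the function names are made up for illustration.

```python
import shutil
from pathlib import Path


def copy_new_files(source, target):
    """Daily pass: copy new or changed files, never delete anything offsite."""
    copied = []
    for src in sorted(Path(source).rglob("*")):
        if not src.is_file():
            continue
        dst = Path(target) / src.relative_to(source)
        src_stat = src.stat()
        if (dst.exists()
                and dst.stat().st_size == src_stat.st_size
                and dst.stat().st_mtime >= src_stat.st_mtime):
            continue  # same size and not older: treat as already replicated
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(dst)
    return copied


def mirror(source, target):
    """Weekly pass: copy first, then delete offsite files that are gone from the source."""
    copy_new_files(source, target)
    removed = []
    for dst in sorted(Path(target).rglob("*")):
        if dst.is_file() and not (Path(source) / dst.relative_to(target)).exists():
            dst.unlink()
            removed.append(dst)
    return removed
```

Calling `copy_new_files` every day and `mirror` only after the weekly synthetic fulls have finished replicating gives the same safety property: nothing is removed offsite until the newer files are already there.
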
Regards
Richard
scott_mac
Enthusiast
Posts: 31
Liked: 1 time
Joined: Aug 18, 2011 2:35 pm
Full Name: Scott Mckenzie

Re: Local and Remote Backups

Post by scott_mac »

I'm going to assume that I'm fairly small fry in the grand scheme of the above solutions being touted, but I still have similar requirements. Whether what I do is scalable I don't know....

I use Netgear's Replicate functionality from their ReadyNAS devices. I believe it is basically rsync underneath, but thus far I've had no issues at all. I use Local Incremental to an onsite ReadyNAS that runs overnight. Then during the day, I use the Replicate software on the ReadyNAS to replicate the data to an offsite ReadyNAS. It's on a VLAN so it doesn't affect local traffic, and I get very little loss of internet bandwidth for the replication.

We do still have the issue of how to deal with the inevitable Periodic Full; for the moment I intend to tweak the schedule once I have a better understanding of the file sizes I'll be dealing with, to maybe get the offsite version to work over a weekend. As it stands, I back up around 2 TB of server data locally, which when compressed, deduped, etc. gives me around 18 GB, which is easily managed to an offsite location. The 'Full' backups will be around 500 GB, so a different thing to consider, but as I said, I'll deal with it when I have a fuller understanding. We're new to this!
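
To put rough numbers on the 18 GB versus 500 GB difference, a back-of-the-envelope transfer-time estimate helps. This is only a sketch: the 10 Mbit/s uplink is an assumed figure (no link speed is given in this thread), decimal units are used, and protocol overhead is ignored unless an `efficiency` factor below 1 is passed in.

```python
def transfer_hours(size_gb, link_mbps, efficiency=1.0):
    """Hours needed to move `size_gb` gigabytes over a `link_mbps` megabit/s link."""
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency in (0, 1]")
    megabits = size_gb * 1000 * 8  # decimal units: 1 GB = 1000 MB = 8000 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    link_mbps = 10  # assumed uplink speed; not stated anywhere in the thread
    for label, size_gb in [("compressed/deduped data", 18), ("periodic full", 500)]:
        print(f"{label} ({size_gb} GB): {transfer_hours(size_gb, link_mbps):.1f} h")
```

Under that assumed 10 Mbit/s link, 18 GB takes about 4 hours while 500 GB takes about 111 hours, which is well beyond a single weekend; a faster link scales both figures down proportionally.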

It may not be the best method, but the costs were relatively low and the system has worked faultlessly so far.
jimmymc
Service Provider
Posts: 34
Liked: 4 times
Joined: Dec 09, 2010 3:06 pm

Re: Local and Remote Backups

Post by jimmymc »

I'm a 'plus 1' with this scenario.

The most simple and bandwidth friendly way would seem to be a full, followed by incremental forever.. but this is trouble in the long run as not only would you eventually run out of space; but just as important - the more VIBs you have, the more chance you have of one of them becoming corrupt, thus ruining the chain.

I had some thoughts about how it would work from an MSP perspective:

- Local Veeam runs a full backup
- Resulting VBK can be sent to remote site via USB drive and copied into a 'customer folder'
- Local Veeam runs an incremental
- Resulting VIB can be sent to remote site and dropped into the 'customer folder' via some method over the Internet - FTP, SFTP, FTPS or whatever (but to be scalable, it would need to be over a 'standard internet protocol' as opposed to copying to a fileshare, which means a VPN (for example), and this isn't as scalable or as self-managing considering interoperability issues with different FW/VPN vendors)
- A 'standalone' Veeam service at remote site, creates a synthetic full out of the original VBK and any subsequent VIBs in the same folder (i.e. every new VIB that turns up in that folder, is injected into the VBK that also resides in said folder)
- Locally, you could still run synthetic fulls on a regular basis, but just as long as the VIB before the local synthetic full starts, is copied to the remote site (or a 'transfer/drop folder')
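
Since the worry above is a single corrupt VIB ruining the chain, one cheap safeguard for the drop-folder step is to ship a checksum manifest alongside the files and verify it at the remote site before anything is injected. A minimal sketch, assuming the sender builds the manifest and delivers it by some means not shown here; it only detects transfer damage at the file level and knows nothing about Veeam's internal file format. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backup files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each .vbk/.vib file name in `folder` to its SHA-256 digest."""
    return {
        path.name: file_sha256(path)
        for path in sorted(Path(folder).iterdir())
        if path.is_file() and path.suffix.lower() in {".vbk", ".vib"}
    }


def verify_folder(folder, manifest):
    """Return names that are missing from `folder` or differ from the sender's manifest."""
    received = build_manifest(folder)
    return sorted(name for name, digest in manifest.items() if received.get(name) != digest)
```

The local site would call `build_manifest` on its backup folder after each job, and the remote site would call `verify_folder` on the 'customer folder'; an empty result means every listed file arrived intact.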

Seems to me, there are no fundamental changes in the technology required. Veeam already have the ability to create synthetic fulls, so maybe there can be a 'standalone' service to accomplish this on the remote site?

The copying of the VIBs doesn't really need to be anything Veeam-coded either (although it would be cleaner in the long run); as there are plenty of products out there to manage things like this - a favourite of mine is Super Flexible File Sync.

Don't know what everyone else's thoughts are, and I may be way off base but I thought I'd inject my two-penneth!

Cheers, James