- Posts: 9
- Liked: never
- Joined: Sep 01, 2011 10:03 pm
- Full Name: Brett Mullins
I need a way to have my VM backups be both local AND remote; local for the sake of fast restores and local DR, remote for site-level DR and enterprise archival needs. So how do I do this most efficiently? What uses the least bandwidth and/or takes the least time? And preferably uses the least space? (For the record I'm using ESXi 4.1u1)
Copy to Secondary Storage:
One way suggested in the forums is to simply take the backups needing to be archived, copy them to portable drives, and store those offsite. In another forum topic Gostev suggests this is at least part of the reason v5 introduced a new naming convention, since apparently a number of folks on v4 did precisely this. But this solution relies on user intervention to copy backup files to secondary storage, send them offsite, keep track of them, and bring them back onsite or securely dispose of them once expired, all of which must be done manually. This is only marginally better than legacy tape solutions, and it seems a poor compromise when a modern enterprise-level business has the equipment and bandwidth necessary to move this stuff to a DR site automatically.
Multiple Backup Jobs:
Another way to handle this is with multiple backup jobs. However, Veeam can't write to multiple targets at once, nor can it run multiple jobs at the same time against the same VM. So one must do two consecutive backups, with one pointed offsite at a Linux target (which we must build ourselves; no appliance is provided) if one doesn't want the backup to take forever. This not only creates additional production server overhead (IOPS for two snapshots), but also creates two separate sets of backups. Compared to replication (below), is it faster or slower, better or worse, in people's experience?
A third way to do this is to replicate the backups themselves, but Veeam v5 can't do that natively (which is perplexing, since Veeam can replicate VMs), so I must rely on something else like DFS-R or rsync. I don't mind using either one, and Gostev recommends rsync in the forum FAQ, but each has its problems, as documented in various forum topics. Even using third-gen DFS-R (2008 R2), the staging area must be sized large enough and progress monitoring is almost nonexistent, while rsync has problems with Veeam v5 filenames along with NTFS file permission issues.

Rsync with the "fuzzy" switch (mentioned by someone in the forums) plus a way to monitor progress seems preferable, but the Veeam v5 naming convention is a high hurdle to get over, considering Gostev's comments in another topic about the upcoming v6's reliance on that new convention. DFS-R, on the other hand, can handle the name changes, since it assigns a unique file ID to each replicated file regardless of file name. And while cross-file DFS-R (the rough equivalent of the fuzzy switch) is only available on Enterprise and Datacenter editions, we have Datacenter (as I'd expect of anyone running more than 3 VMs per physical host processor; it's by far the most efficient licensing model at higher densities). Even so, I'm inclined to use rsync if the filename hurdle can be jumped. Suggestions on rsync, anyone? Or another file replication solution?
(For the record - it would be *really* nice if Veeam B&R could do this replication on its own in v6 or beyond, preferably with dedupe across ALL files going across the WAN, much like fuzzy rsync or cross-file DFS-R can already do.)
Backup Job Type:
In addition, what's the best backup method? Best for an offsite target if doing two backups? Best if I'm going to be using replication? My current observations are:
1. Local Incremental: These are supposedly the most efficient in terms of offsite replication due to small increment sizes via compression and dedupe - but then I either have to keep increments forever for archival purposes (not recommended), or I have to do new full backups or create new synthetic backups periodically, both of which I presume would be enormous hogs in terms of WAN bandwidth even using DFS-R or Rsync (though cross-file DFS-R or fuzzy rsync would help some with this).
2. Remote Incremental: Suggested in the manual as being the best way to get backups offsite, but not only do I have the problem of getting the initial backup seed offsite if it's big (no built-in backup staging), I still have the same problems with replication back to the local site, and I need a remote Linux target to efficiently process periodic remote synthetic backup creation. If I have to do periodic full backups this solution becomes very time consuming if not downright unusable since the new files would take a lot of time getting to the remote site and then an equal amount replicating back.
3. Local Reverse-Incremental: The full backup file is presumably merely "modified" by rolling the latest changes into it, rather than being completely recreated from scratch each day, in which case DFS-R or rsync should be able to handle replicating the changes. If I don't care about separate periodic fulls and I'm primarily concerned with local backups first and offsite second, then I presume this is the best way to do it, though initial seed replication will take a while. But I am completely reliant on DFS-R or rsync to efficiently figure out which parts of the full backup file have changed and replicate only those block changes. Plus, reverse incremental is not recommended by the user manual for off-site use, and many others doing off-site replication in the forums seem to be using forward incremental as well, so am I missing something here?
4. Remote Reverse-Incremental: Without a remote Linux target, this would be unworkable, since Veeam would be trying to roll incremental changes into the remote backup file locally. With a remote Linux target, the local Veeam B&R sends only the backup increments to it, and presumably(?) the remote target's Veeam agent then does the work of rolling the increments into the full backup file. If I don't care about periodic fulls and I'm primarily concerned with getting backups offsite fast, then I presume this is the best way to do it, since it combines the space-saving reverse-incremental backup with having changes rolled in at the remote site, though I'm still reliant on DFS-R or rsync to get changes replicated back locally. And the overall backup operation will take longer, since increment info crosses the WAN twice: once to the remote target, then replicated back again to the local site.
Remove Snapshots and other unneeded files:
Another way to reduce WAN backup times (and file sizes) would be to use Gostev's method of removing the pagefile and other temporary/unneeded files from the backup (by putting them on a separate VM disk and then excluding that disk from the backup); are there other ways to reduce size? Any other files folks might recommend excluding/relocating?
- Veeam Software
- Posts: 20326
- Liked: 1919 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
It's hard to say what will work best in your case, or to predict which solution as a whole will be fastest and consume the least network and hardware resources, as it depends on multiple factors: the VMs themselves and the nature of the processed data, the link type, and other parameters of your environment, which you don't describe in your post. We can only say what is generally preferable in a given situation, but your post shows you already know what affects backup speed and in what way. There is no universal/ideal solution; our customers successfully use different approaches (multiple backup jobs vs. replicating backup files with rsync-like tools, for example).
- Posts: 13
- Liked: 3 times
- Joined: Aug 26, 2009 5:52 pm
- Full Name: Avram Woroch
* Backup locally to disk in SiteA, and SiteB/SiteC
* Have SiteA replicate to SiteC for DR purposes and offsite. This handles a loss of SiteA, so long as SiteC is up - which is almost always going to be the case.
* SiteB/SiteC would cross replicate. This handles the loss of either corporate site, and ignores the CoLo as really it is separate anyway.
* Backup replication will do for almost all VMs. However, for some considered key critical, we would want to do Veeam Replication to a remote ESX(i) host so that they're available first and immediately, without vPower - which works great, but for these VMs we'd make sure they're full VM replicas.
I think there are a lot of customers who would basically be looking at the same sort of setup. I agree I was a little surprised to find out that to do replicas and backups I had to run two sequential jobs, and that the replica could not be updated from the backup job, etc. I'm happy to use DFSR (2008 R2), but I suspect there are some caveats, recommendations, and best practices I should know about before replicating 100 GB-2 TB files through my network...
- Posts: 264
- Liked: 30 times
- Joined: Mar 22, 2011 7:43 pm
- Full Name: Ted
1) Multiple jobs to be run on the same VM.
2) Distributed architecture -- so most processing is done locally and only the least amount of necessary information is sent across a WAN.
3) Bandwidth limiting (I believe).
4. Run multiple VMs in 1 job at the same time.
Replication will have a much longer window with bandwidth limiting and the ability to run replications and backups at the same time, not to mention being able to run multiple VMs in 1 job at the same time. Distributed architecture should minimize the amount of data that needs to be pushed across a WAN.
Pagefile on a separate disk -- definitely. I also do that with my transaction logs -- Exchange and SQL. The local backup will back them up and delete them. They are excluded from replication (in a disaster, nobody (at my company) will care about what might be lost by not having transaction logs).
Backup job type --
I personally like incremental with synthetic fulls every week for local backup. I pretty much have the entire weekend available for backups/replications, etc. My daily incrementals are quick, and the synthetic fulls easily finish on Saturday. I like to use replication for the offsite WAN (you can just start them right up; no hassle with restoring or running direct from Veeam - I don't have Storage vMotion). Replication (currently?) only uses reverse incrementals.
- Posts: 16
- Liked: never
- Joined: Feb 26, 2009 10:55 am
We tried rsync for a long time, but it never worked that well for the reasons you have already outlined.
Now we use robust file copy, aka robocopy.
We run the job on a daily basis; then, after the synthetic fulls have run and replicated once per week, we run it with the /mir switch so that it deletes the obsolete backups offsite.
This ensures that if the worst happens, we still have old backups we can refer to offsite, and only when the replication of the synthetic fulls has completed do those old backups get deleted.
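For reference, the two passes described above might look roughly like this as scheduled tasks (paths, share name, and log locations are placeholders; check the switches against your own environment):

```bat
:: Daily pass: copy new/changed backup files offsite, no deletions.
:: /E copies subfolders, /Z is restartable mode for flaky WAN links,
:: /R and /W keep retries sane instead of the huge defaults.
robocopy D:\VeeamBackups \\drsite\VeeamBackups /E /Z /R:3 /W:30 /LOG+:C:\Logs\offsite-daily.log

:: Weekly pass, run only after the new synthetic full has fully replicated:
:: /MIR mirrors the tree, so restore points pruned locally are also
:: deleted offsite.
robocopy D:\VeeamBackups \\drsite\VeeamBackups /MIR /Z /R:3 /W:30 /LOG+:C:\Logs\offsite-weekly.log
```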
We are talking about 1-2 TB of data once deduplicated and fully compressed.
If anyone has a better way I'd love to hear it, otherwise I hope this helps a little.
- Posts: 31
- Liked: 1 time
- Joined: Aug 18, 2011 2:35 pm
- Full Name: Scott Mckenzie
I use Netgear's Replicate functionality on their ReadyNAS devices. I believe it is basically rsync underneath, but thus far I've had no issues at all. I use Local Incremental to an on-site ReadyNAS that runs overnight. Then during the day, I use the Replicate software on the ReadyNAS to replicate the data to an offsite ReadyNAS - it's on a VLAN so it doesn't affect local traffic, and I get very little loss of Internet bandwidth for the replication.
We do still have the issue of how to deal with the inevitable periodic full; for the moment I intend to tweak the schedule once I have a better understanding of the file sizes I'll be dealing with, to maybe get the offsite version to run over a weekend. As it stands, I back up around 2 TB of server data locally, which when compressed, deduped, etc. gives me around 18 GB, easily managed to an offsite location. The 'full' backups will be around 500 GB, so a different thing to consider, but as I said, I'll deal with it when I have a fuller understanding - we're new to this!
It may not be the best method, but the costs were relatively low and the system has worked faultlessly so far.
- Service Provider
- Posts: 31
- Liked: 4 times
- Joined: Dec 09, 2010 3:06 pm
The simplest and most bandwidth-friendly way would seem to be a full followed by incrementals forever... but this is trouble in the long run: not only would you eventually run out of space, but, just as important, the more VIBs you have, the more chance of one of them becoming corrupt, thus ruining the chain.
I had some thoughts about how it would work from an MSP perspective:
- Local Veeam runs a full backup
- Resulting VBK can be sent to remote site via USB drive and copied into a 'customer folder'
- Local Veeam runs an incremental
- Resulting VIB can be sent to the remote site and dropped into the 'customer folder' via some method over the Internet - FTP, SFTP, FTPS or whatever (to be scalable it would need to be a standard Internet protocol, as opposed to copying to a file share, which means a VPN; that isn't very scalable or self-managing given the interoperability issues between different firewall/VPN vendors)
- A 'standalone' Veeam service at the remote site creates a synthetic full out of the original VBK and any subsequent VIBs in the same folder (i.e. every new VIB that turns up in that folder is injected into the VBK that also resides in said folder)
- Locally, you could still run synthetic fulls on a regular basis, as long as the last VIB is copied to the remote site (or a 'transfer/drop folder') before the local synthetic full starts
Seems to me there are no fundamental changes in the technology required... Veeam already has the ability to create synthetic fulls, so maybe there could be a 'standalone' service to accomplish this at the remote site?
The copying of the VIBs doesn't really need to be anything Veeam-coded either (although it would be cleaner in the long run); as there are plenty of products out there to manage things like this - a favourite of mine is Super Flexible File Sync.
Don't know what everyone else's thoughts are, and I may be way off base but I thought I'd inject my two-penneth!