-
- Enthusiast
- Posts: 54
- Liked: 27 times
- Joined: Feb 10, 2012 8:43 pm
- Contact:
Best backup setup for mirroring backup respository
As I posted before, I am trying to set up multiple jobs so that I can have varied retention cycles (son-father-grandfather type). I've been using reversed incremental and have my 3 sets of jobs set up to accomplish this. The plan seems to be working fine, for what it is. I didn't fully understand the reversed incremental process and just noticed that the VBK is changing size each night. I did some more reading here and in the FAQs and now understand that each night it has to pull the replaced blocks out of the VBK and write them to a VRB file, and then inject the new changes back into the VBK. Weird but awesome.
The problem with this, for my scenario, is that the next phase of my plan was to set up some robocopy jobs to mirror my backup folders to an offsite location each night. Since the big VBK changes each day, robocopy is going to re-copy the whole file each night instead of just the incremental data, which will take too long. I see that the FAQ suggests just using Veeam to run the backup to a remote server repository, but I need 2 copies. It also suggests doing a local backup and then running a backup script using rsync. Will rsync copy just the data that changed in those big VBKs, versus my robocopy job, which only looks at the file as a whole and copies the entire thing?
If not, then what? Change the backup mode to forward incremental? I still don't understand all of its sub-options.
What is the best way to have my local copy and a mirrored set elsewhere and keep my tiered jobs setup? I would eventually like to have some decent ESXi hosts at my remote site and do "true" replication with Veeam, but right now that site just has a Windows box with a massive iSCSI JBOD for storage.
So what says you, oh great ones of this forum?
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Best backup setup for mirroring backup respository
So honestly there is no "silver bullet" answer for this one. The option of using rsync is pretty good, assuming your backup files are a reasonable size. As the size of the backup and the amount of changes get larger, rsync becomes less effective. Rsync has to scan the entire file on the source side, and then on the target, before it sends only the differences, which is obviously not ideal.
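To make the rsync trade-off concrete, here is a minimal Python sketch of block-level change detection. It is not rsync's actual algorithm (rsync uses rolling checksums so it can also cope with data that has shifted position); the function names and the 1 MiB default block size are assumptions for illustration only. The point it shows is the one above: both copies have to be read end to end before anything can be skipped.

```python
import hashlib

BLOCK_SIZE = 1024 * 1024  # assumed block size (1 MiB), for illustration only


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash every fixed-size block; note that the whole input has to be read."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Indexes of blocks in `new` that differ from, or do not exist in, `old`."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [
        i for i, h in enumerate(new_h)
        if i >= len(old_h) or old_h[i] != h
    ]


if __name__ == "__main__":
    old = bytes(8 * 4)             # eight 4-byte blocks of zeros
    new = bytearray(old)
    new[9] = 0xFF                  # touch one byte inside block 2
    print(changed_blocks(old, bytes(new), block_size=4))  # prints [2]
```

Only the blocks returned by `changed_blocks` would need to cross the WAN, but the hashing cost grows with the full file size on both ends, which is why this gets less attractive as the backup files grow.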
Here are some other "out of the box" ideas:
1. Set up a virtual machine as a repository (either Windows or Linux) and then use Veeam to replicate this VM to the remote site. The replication will only copy the changed blocks from that night's backup. It's an OK setup, but the biggest issue is that it can be difficult to build a VM that has enough storage space to be efficient at backup, and if it ever happens to be "out of sync" it can take a very long time to recover. Works reasonably well for small environments though.
2. Use a Linux repository and DRBD to replicate offsite at the block level. This is a really nice option if you have the Linux skills. You can set up DRBD on pretty much any Linux distribution and it can do synchronous or asynchronous mirroring of the block device. You can use a distribution like OpenFiler, and set up offsite backups and HA using either virtual or physical servers on each site. This is a really nice, scalable setup, but the con is that you probably need some Linux skills to get this up and running.
3. Use the same DRBD concept with Windows by purchasing a block-level replication product for Windows. I know for example that Starwind iSCSI has an asynchronous replication option. I've never used it though and have no idea how well it will work.
I'm working on a blog entry that details the setup required for DRBD replication. I'm hoping to eventually combine this with OpenDedupe (SDFS) to have a poor man's equivalent of a dedupe appliance with asynchronous offsite mirroring. I'll post links to the blog entries as I get them finished over the next few weeks. I think the combination of DRBD and SDFS could be very powerful.
-
- Enthusiast
- Posts: 54
- Liked: 27 times
- Joined: Feb 10, 2012 8:43 pm
- Contact:
Re: Best backup setup for mirroring backup respository
Thanks for the reply Tom, it sounds like we've thought about some of the same things.
1) Interesting, hadn't thought of that. But I don't think I can really take this approach because my Veeam B&R server is a VM with a 6TB passthrough LUN to my SAN. Because it's larger than 2TB, I don't think CBT will work, will it? I know snapshotting doesn't at least.
2 & 3) Never heard of DRBD but it sounds terrific, I'll have to check it out. It's funny that you bring up SDFS OpenDedupe, I've been playing with it this week as well. I'm not convinced it's really ready for prime time yet, kind of buggy in different areas. It's a cool concept though, to have the dedupe at the file system level. What I don't like about the concept though, unless I'm missing something, is that you have to determine on the front end how much virtual deduped space you think SDFS will be able to handle on top of your actual/physical space (or keep expanding it). That's a bit backwards in thinking from what I've been used to with backup products that use dedupe to just create non-live deduped data that resides on physical storage with pre-known physical space limits. Regardless, I was getting very good deduplication in my testing at least. One big disk eater in our environment is SQL and DB2 backup dump files. We generate about 700GB a night across various servers. Those files dedupe extremely well. As a simple test on my SDFS test setup, I was able to copy 2 SQL bak files from 2 different nights that were about 40GB each, and it deduped like 98% of the data on the 2nd day's copy. Very nice. One interesting side tid-bit about SDFS too- I discovered that the developer of SDFS is a sales engineer for Symantec. It would seem like a conflict of interest to create a free deduplication system when Symantec sells several applications that have deduplication capabilities (though none that are an actual file system).
I am very anxious to see your blog post and what you've come up with combining DRBD. I'll have to check out DRBD and get a Linux box up and going in a VM to test. I'm comfortable with Linux and use it for a variety of uses in our environment. If anything, having a Linux machine at my remote DR site would be nice for testing Rsync since the Windows implementations I've been trying the last 24 hours seem to suck pretty bad. Really the only drawback to a Linux solution is that I'm the only person that has experience with it, so I prefer to implement things the rest of our team is comfortable with. This might be too good of a free setup to pass up though.
-
- Lurker
- Posts: 2
- Liked: never
- Joined: Dec 07, 2011 12:29 pm
- Full Name: Scott Sandstrom
- Contact:
Re: Best backup setup for mirroring backup respository
We do something very similar, but I went "old school unix". We use a NAS with CIFS shares enabled as our backup repository (you could also use a commercial product like an Iomega NAS, or Netgear NAS, etc.). We have a second unix box located in a separate part of the building that we mirror the repository to via a simple unix scp command set up as a cron job.
As I said, old school, but fast, and reliable.
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Mar 06, 2012 4:06 am
- Full Name: Jeff
- Contact:
Re: Best backup setup for mirroring backup respository
I am researching this as well. I am fairly new to the product, but from what I understand, the replication part of Veeam will only replicate a VM and register it on a remote ESXi host, no?
When people are using rsync, are they using a Linux repository for the backups, or using rsync for Windows (which I have personally never used)?
Thanks
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Best backup setup for mirroring backup respository
Rsync is used to copy the backup files created by Veeam to another storage location; in that case the "replication" is of those backup files. If you instead use Replication Jobs in Veeam, rsync and similar tools are not involved.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Best backup setup for mirroring backup respository
monkeybagel wrote: I am researching this as well. I am fairly new to the product, but from what I understand, the replication part of Veeam will only replicate a VM and register it on a remote ESXi host, no?
If you are new to the product, I would recommend reviewing our sticky F.A.Q., which covers all existing job types and contains other useful information:
>>> READ FIRST : [FAQ] FREQUENTLY ASKED QUESTIONS <<<
Hope this helps!
-
- Enthusiast
- Posts: 54
- Liked: 27 times
- Joined: Feb 10, 2012 8:43 pm
- Contact:
Re: Best backup setup for mirroring backup respository
monkeybagel wrote: .......
When people are using rsync, are they using a linux repository for the backups, or using rsync for Windows (which I have personally never used).
First off, awesome username, haha.
I tried both DeltaCopy and Syncrify ("commercial" version of DeltaCopy), both for Windows, and I didn't have good results. They did not seem to recognize only the changed blocks; instead they seemed to just re-copy the entire file over the network. But I suppose I could have been doing something wrong. Maybe I would have better results using one of those 2 apps and sending to a Linux box running a "real" rsync daemon.
I think what I'm going to do is just use the latest Robocopy x64 that comes with Win2k8 R2, which my Veeam server/repository is running on, and use the /MT option (multi-thread). I did some testing with it and it doubled the speed of my network transfers. Also, I am lucky in that I have a 500mbit fiber WAN connection to my DR site. With my previous backup data though, I never achieved better than 15-16MB/s across that link using the old Robocopy that comes with Win2k3. Now, using the new robocopy with /MT (just used the default of 8 threads), it pounds our WAN connection and I get every bit of that 500mbit pipe. So for now, until I come up with some sort of reliable and easy to manage de-duplicating transfer to replicate my Veeam repository, I'm just going to do that and hope I can transfer the nightly changes in my allotted time window.
-
- Enthusiast
- Posts: 54
- Liked: 27 times
- Joined: Feb 10, 2012 8:43 pm
- Contact:
Re: Best backup setup for mirroring backup respository
Just wanted to give a follow-up. Robocopy'ing my repository is a no-go, there is simply too much data changing each night using reversed incrementals to robocopy the new data over to our remote site. It would probably be okay if I switched to forward incremental but I have somehow managed to already fill up my new VNX SAN and have no budget to get more space until 2013 (palm-to-face).
Tom- did you ever put together a blog entry about using DRBD and SDFS? I haven't gotten around to playing with it yet, but if someone were to put together a howto/faq (hint hint), it would probably get me inspired.
I've also been considering trying out Starwind like you mentioned. I could install it in a VM on the same ESX host that my Veeam B&R VM lives on, and would assume the 10Gbit virtual network between the 2 VMs would allow for pretty speedy writes to the Starwind deduped data store. Then I'd have it set to replicate asynchronously to a physical Windows host at my DR site across the WAN.
Is anyone here doing either of these setups? Either DRBD/SDFS or Starwind iSCSI with dedupe+replication?
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Best backup setup for mirroring backup respository
So the problem I ran into with DRBD is that, to do true WAN based async replication, you have to purchase a commercial add-on, specifically DRBD Proxy. Without this you can lag the target, but only by the size of the buffer, which wasn't what I was looking for. Might be fine for MAN style replication but just wasn't interesting enough for the use cases I was trying to address.
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Best backup setup for mirroring backup respository
So things change quickly in the tech universe. OpenDedupe's SDFS now has built in capability to perform post-dedupe, async, folder based replication. Not only that, it can also store deduped chunks in Amazon S3 and Windows Azure Storage and you can configure local cache. They have also added support for file system based snapshots.
Initial testing shows that dedupe can be very impressive, especially with small block sizes (4k), although that of course needs a lot of memory. My current tests are pretty small scale just in my home lab, but it's cool none the less, and replication works quite well. I currently have ~300GB of raw Veeam backups using about 62GB of actual space. The dedupe ratio is about 79%, but I expect that to grow as I add more retention points.
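For anyone checking the arithmetic on ratios like the one above, a small Python sketch of the space-savings calculation follows. The function name is made up for illustration; the inputs are simply the figures quoted in the post (~300GB raw, ~62GB on disk).

```python
def dedupe_savings(raw_gb: float, stored_gb: float) -> float:
    """Fraction of the raw backup data that did not need physical space."""
    if raw_gb <= 0:
        raise ValueError("raw_gb must be positive")
    return 1.0 - stored_gb / raw_gb


# Figures quoted in the post: ~300GB of raw backups using ~62GB on disk.
print(f"{dedupe_savings(300, 62):.0%}")  # prints 79%
```

The same formula applies to any raw/stored pair, so it is an easy way to compare a deduped store against what reverse incremental uses for the same retention.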
Honestly, this is about the same as if I used reverse incremental backups within Veeam. With reverse incremental and approximately the same retention, Veeam uses ~70GB on disk; however, that mode works heavily against offsite replication. With SDFS, I'm using forward incrementals, which means my daily incremental backups are faster, and I have configured the engine to replicate to another appliance, but it only replicates changed blocks. Because of the small block size I'm seeing a typical 67% dedupe ratio even on incremental files, which means much less data to be replicated, and even when running another full, very minimal data must be replicated to the other side.
I'm hoping to find some time to play with storing chunks in S3 sometime soon, but it certainly seems viable based on the architecture. It's amazing how much this software has matured in such a short period of time. Might be worth playing with for those "do-it-yourself" types that need to keep a longer archive of backups online or are looking for an efficient way to replicate backups offsite.
I'm currently running SDFS 1.1.8 on RHEL6 U3, but there are versions for Windows as well as a virtual appliance with a simple web GUI. The virtual appliance is version 1.1.5, but you can update the SDFS version within it to 1.1.8 to get the latest bug fixes. I'll try to do a blog entry in a couple of weeks to outline the complete setup but overall it was very simple if you're at all familiar with Linux.
-
- Veteran
- Posts: 367
- Liked: 41 times
- Joined: May 15, 2012 2:21 pm
- Full Name: Arun
- Contact:
Re: Best backup setup for mirroring backup respository
Just to add a note: I spoke to StarWind support just now. The StarWind iSCSI initiator only supports IP-based connections. If both the primary and secondary sites are connected via fiber, then you need to use a media converter at both ends, as StarWind iSCSI has no support for FC connections.
I was counting on rsync initially. However, after reading Tom's post, if the amount of data to be replicated gets larger, then it is not ideal.
I am not very familiar with the Linux platform either, and having a VM with a large storage space does not work for me.
Hope there is a better way to replicate to offsite storage, or perhaps there will be a way to do it within Veeam itself in future.
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Best backup setup for mirroring backup respository
If you take a closer look at Tom's post, OpenDedupe is also available on Windows, or you can even use their preconfigured virtual appliance if you want, which is much easier than configuring it on Linux if you are not familiar with it.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Enthusiast
- Posts: 45
- Liked: 2 times
- Joined: Apr 10, 2012 5:50 pm
- Contact:
[MERGED] Best way to duplicate backup to offsite servers?
First off, forgive me if there was already a thread that specifically recommended the best way to do this (I've searched and read a dozen but none have fully fit what I'm trying to do, at least from what I've seen). I've read about a bunch of different methods (rsync, FastSCP, WinSCP, etc) and I just wanted to quickly ask again as to the best method to do what I want to do under Veeam B&R 6.5. Here's my scenario:
I have a single Veeam B&R server setup which is backing up a few ESXi 5.1 hosts (10-15 VMs) to local storage (the same server is acting as the VMware backup proxy). The VBK backup file (reversed incremental) is going to be around 500GB and I want to replicate this file to another server in a different building half a kilometer away over a 1gbit connection and in the near future to a 3rd server in a different city over a 10mbit connection. What is my best bet to do this without having to send all the backup files every time? I have tried a FileCopy job scheduled to run after the VMware backup job completes, and although this works much faster than Robocopy, etc (it saturates 100% of the 1gbit connection unlike Robocopy), it appears that it sends all the files in their entirety each time and not just the changes. Since the VBK, etc, filenames are always different I am copying the entire folder and not just a specific file. Would I be better off with rsync or WinSCP or something like that? The offsite servers are regular Windows Server 2008 R2 boxes.
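As a back-of-the-envelope check on the numbers in this scenario, the following Python sketch estimates how long a full re-send of the 500GB VBK would take on each link. It assumes decimal units and, by default, a link running at 100% of line rate with no protocol overhead, so real transfers will be slower; the function and parameter names are illustrative.

```python
def transfer_hours(size_gb: float, link_mbit: float, efficiency: float = 1.0) -> float:
    """Hours needed to move size_gb (decimal GB) over a link_mbit link.

    efficiency is the assumed fraction of the line rate actually achieved
    (1.0 = theoretical best case, no protocol overhead).
    """
    if link_mbit <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbit must be > 0 and efficiency in (0, 1]")
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbit * 1e6 * efficiency)
    return seconds / 3600


for label, mbit in (("1 Gbit", 1000), ("10 Mbit", 10)):
    print(f"500 GB over {label}: {transfer_hours(500, mbit):.1f} h")
```

Under those best-case assumptions the 1gbit link needs a little over an hour, while the 10mbit link needs more than four days for a single full copy, so anything that re-sends the whole VBK nightly cannot work for the third site.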
Please note I am willing to change from Reversed Incremental backups to Incrementals if it would work better in this scenario.
I had always hoped that Veeam B&R would have the option to duplicate backups to additional servers (with the ability to just send the changes, not the entire files) - it is what we were doing in Symantec Backup Exec (and still doing for some physical server backups). I suppose if I change to incremental backups that I could just send the increments each time... hmmm..
Anyways, just looking for some of your expert opinions on this. Thanks in advance!
EDIT: I knew the topic would get merged. Pretty quickly too! lol
-
- Enthusiast
- Posts: 45
- Liked: 2 times
- Joined: Apr 10, 2012 5:50 pm
- Contact:
Re: Best backup setup for mirroring backup respository
I'm going to try switching from reversed incrementals to forever-incrementals w/ "Transform previous full backup chains into rollbacks" every Saturday (I need to back up to tape for offsite safe-storage a few times a month) and robocopying the incremental files over to the offsite hosts... we'll see how that goes. If that fails I may try OpenDedupe or something.
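To illustrate why this plan behaves differently from mirroring reversed incrementals, here is a minimal Python sketch of a file-level mirror: it decides per file, by existence, size and modification time, and always copies whole files. This is only an approximation of what a tool like robocopy does, not its implementation; the function names and the `.vbk`/`.vib` file names in the note below are assumptions for illustration.

```python
import shutil
from pathlib import Path


def files_to_copy(src: Path, dst: Path) -> list[Path]:
    """Relative paths of files that are missing on dst, differ in size, or are newer on src."""
    pending = []
    for f in sorted(p for p in src.rglob("*") if p.is_file()):
        rel = f.relative_to(src)
        target = dst / rel
        if not target.exists():
            pending.append(rel)
            continue
        s, t = f.stat(), target.stat()
        if s.st_size != t.st_size or int(s.st_mtime) > int(t.st_mtime):
            pending.append(rel)
    return pending


def mirror(src: Path, dst: Path) -> list[Path]:
    """Copy every pending file whole (no block-level delta) and return the list."""
    copied = files_to_copy(src, dst)
    for rel in copied:
        (dst / rel).parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src / rel, dst / rel)  # copy2 preserves the timestamp
    return copied
```

With forward incrementals, a nightly run finds only the new increment (for example a new `.vib` file) and the untouched full is skipped. With reversed incrementals the big `.vbk` has a new size or timestamp every night, so it lands in the pending list and is re-sent in full.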
If anyone has any other feedback I'd still appreciate hearing from you.
-
- Enthusiast
- Posts: 88
- Liked: 25 times
- Joined: Sep 25, 2012 7:57 pm
- Contact:
Re: Best backup setup for mirroring backup respository
I am new to Veeam as well, and researched a lot of options for this very situation. I am going the route of creating secondary "offsite" jobs, containing the same VMs as the "normal" jobs, but pointing to a repository at a remote site (a NAS hosted by Windows, using the Veeam agent).
To get over the hump of initial backup, I am (actually running as we speak) pre-seeding the NAS with the first fulls, and will then deliver it to the remote site and map backups to it. In theory this should work fine. Will see how it goes in real life.
I will just fire the "offsite" job when the main one completes. I am using forward incrementals with synthetic fulls over the weekend. The remote job is set for WAN dedupe.
This method seemed the cleanest way for my situation, where WAN bandwidth is an issue.
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Best backup setup for mirroring backup respository
I've spent a lot of time playing with OpenDedupe/SDFS in my lab environment. It's a powerful inline solution that has produced some pretty impressive results. I configured a simple Linux system (CentOS 6.3 64-bit) with a 200GB volume for my backup repository, and configured it as a 1TB virtual volume with a 4K chunk size (small, to get maximum dedupe). The entire backup of my lab, uncompressed, is about 180GB, so it's not like I have a lot of data. I configured Veeam with dedupe-friendly settings (no compression, LAN target, forward incremental, weekly full) and ran my initial full backups. The first backup runs used about 55GB on disk, so a savings of around 70%. Not too bad, similar to the savings I would see from Veeam compression/dedupe.
As the system continued to run incrementals and full backups for 30 days, the amount of raw backups stored on disk grew to around 700GB of VBK/VIB files; however, the space consumed on disk grew to only ~70GB. Pretty impressive: a 90% savings, all while performing inline dedupe. Even better, SureBackup performance was tolerable, although not great (on average each system takes about 50% longer to boot than on a non-deduped volume running on the same physical disk). I was able to configure replication, and the system did a great job minimizing the amount of traffic being sent to the offline replica, sending only the actual unique blocks that were added each day. I have not yet had a chance to play with storing the chunk data to a cloud service, although this is also supported by the solution.
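For anyone wanting to check the savings figures quoted above against their own repository, here is a minimal Python sketch. The function name is made up for illustration; the inputs are the lab numbers from this post (180GB logical to 55GB on disk, then 700GB logical to 70GB on disk).

```python
def savings_percent(logical_gb: float, stored_gb: float) -> float:
    """Percentage of space saved when logical_gb of backup files occupy stored_gb on disk."""
    return (1 - stored_gb / logical_gb) * 100


# Lab figures quoted in the post.
initial = savings_percent(180, 55)
after_30_days = savings_percent(700, 70)

print(f"initial fulls: {initial:.0f}% saved")
print(f"after 30 days: {after_30_days:.0f}% saved")
```

The first figure comes out just under 70%, the second at 90%, which matches the rounded numbers in the post.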
I did have some significant issues with getting SDFS running. Primarily, I had to throw almost 12GB of memory at the Linux system to get the 1TB virtual store to not run out of memory, and I had to increase the Java heap even though the guide indicated that this should not be required for the small store I was using. This memory usage seems far more than what is stated in the user's guide. There also seems to be very limited deployment of this solution in production, and it was difficult to find a lot of information about proper deployment or what exactly to expect. This would cause me significant concern when using it as a location to store my backup files, as reliability has to be a high priority for storing backup data.
That being said, after giving the system the required memory, I performed quite a number of "hard" crashes, including power outages, etc., and SDFS always ran a consistency checker to verify the state of the chunk store before bringing it online. The overall technology behind this seems to be well thought out; it's simply a matter of getting enough deployments and time to prove its reliability, since there's really no one to stand behind it. I'm not sure I could trust my backups to it yet, at least as the "only" location, but it's come a long way in a fairly short amount of time.
-
- Veeam ProPartner
- Posts: 252
- Liked: 26 times
- Joined: Apr 05, 2011 11:44 pm
- Contact:
Re: Best backup setup for mirroring backup respository
But will it work for larger-scale deployments where a single full is in the terabytes? How much RAM would be needed? 100GB?
When you are talking about 700GB for fulls and incrementals, that oftentimes can be pushed over the wire without any kind of WAN optimization or dedupe (even though with OPT and dedupe it can be faster).
-
- VP, Product Management
- Posts: 6035
- Liked: 2860 times
- Joined: Jun 05, 2009 12:57 pm
- Full Name: Tom Sightler
- Contact:
Re: Best backup setup for mirroring backup respository
Completely agree, the small scale testing is not proof that it will scale, however, that wasn't the point of the testing and I pretty much said as much in my post.
However, I was using the smallest chunk size (4K) in an attempt to stress my minimal setup as much as possible. For example, the default chunk size is 128K, which would lower dedupe, but also lower the memory requirements by 32x over my test setup. I'd suspect that a chunk size of 16K would still provide significant dedupe while lowering the memory required by 75%. There's no inline-dedupe solution that doesn't require a LOT of memory. The stated requirement from the Opendedupe website is 5TB of storage per GB of RAM using 128K chunks. That's roughly 160GB per GB of RAM at 4K chunks (5TB divided by 32). Opendedupe/SDFS is designed to scale to a petabyte. It has the ability to support RAIN-style deployment and run multiple distributed DSE engines.
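The chunk-size arithmetic above can be sketched in a few lines of Python. This is only a rough model, not anything taken from the SDFS documentation: it assumes RAM use scales with the number of chunks (so inversely with chunk size), takes the 5TB-per-GB-of-RAM figure at 128K chunks quoted in this post as the baseline, and treats 1TB as 1024GB. The function names are hypothetical.

```python
BASELINE_CHUNK_KB = 128        # default chunk size mentioned in the post
BASELINE_TB_PER_GB_RAM = 5.0   # figure quoted in the post for 128K chunks


def storage_gb_per_gb_ram(chunk_kb: int) -> float:
    """Storage (GB) covered by 1GB of RAM, assuming RAM scales with chunk count."""
    return BASELINE_TB_PER_GB_RAM * 1024 * chunk_kb / BASELINE_CHUNK_KB


def ram_gb_needed(volume_tb: float, chunk_kb: int) -> float:
    """Estimated RAM (GB) for a volume of volume_tb at the given chunk size."""
    return volume_tb * 1024 / storage_gb_per_gb_ram(chunk_kb)


for chunk_kb in (4, 16, 128):
    print(
        f"{chunk_kb:>3}K chunks: "
        f"{storage_gb_per_gb_ram(chunk_kb):.0f}GB per GB of RAM, "
        f"{ram_gb_needed(1, chunk_kb):.1f}GB RAM for a 1TB volume"
    )
```

Under these assumptions, going from 4K to 16K chunks cuts the estimate to a quarter (the 75% reduction mentioned above), and 128K chunks need 1/32 of the 4K figure. The model also predicts well under the almost 12GB that the 1TB test volume actually needed, so treat it as a lower bound at best.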
Once again, I actually probably wouldn't choose to use it, but then again, if I were very cash strapped, and looking for a way to back up my 75 remote sites across T1 connectivity, the actual costs would likely be minimal and the savings would be significant. If my other option was "nothing", or local-only backups, I might consider it.
-
- Enthusiast
- Posts: 45
- Liked: 2 times
- Joined: Apr 10, 2012 5:50 pm
- Contact:
Re: Best backup setup for mirroring backup respository
jpeake wrote: I am new to Veeam as well, and researched a lot of options for this very situation. I am going the route of creating secondary "offsite" jobs, containing the same VM's as the "normal" jobs, but pointing to a repository at a remote site (a NAS hosted by Windows, using the Veeam agent).
I thought about doing what you are trying also, but as it is I already have the servers getting hit once for regular backups, a 2nd time for replication of VMs to a different building ESXi box for disaster recovery (should the main server room burn down, etc, we could get things up and running very quickly again), and so I would then have to have a 3rd job hitting each source server to replicate across the 10mbit WAN to the server in a different city after the first two jobs are completed.
I wonder how much less bandwidth is needed using jpeake's method above vs. just robocopying the incremental files across the WAN (which is what I'm trying now)? So far the incremental .vib files are around 5GB, so across 10mbit it's not too bad, but I haven't converted our accounting system (Dynamics AX, 4 VMs), Exchange server VMs, or main SQL server VMs over from Hyper-V to ESXi yet. Once those are converted I'm betting the incrementals will be closer to 20GB, and robocopy will take a long time to copy that. I am robocopying the main .vbk across the WAN right now, 124GB, and after 6 hours it's only 16% done, so that's going to take 37.5 hours total for 124GB; for 20GB it's going to take approximately 6 hours to robocopy the incrementals over. Not to mention that since I am doing incremental 'transforms' for tape backups on Saturdays, every week robocopy will try to copy the entire new .vbk file over the WAN after the transform, lol. So this isn't going to work (at least not over a 10mbit connection; maybe 100mbit would).
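The extrapolation in the paragraph above is easy to redo for other file sizes. A minimal Python sketch follows, using only the figures from this post (124GB, 16% done after 6 hours). The function names are made up for illustration, and 1GB is taken as 1000 megabytes.

```python
def total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate the total copy time from progress so far, assuming a steady rate."""
    return elapsed_hours / fraction_done


def effective_mbit_per_s(size_gb: float, hours: float) -> float:
    """Average throughput in Mbit/s for size_gb copied in the given number of hours."""
    return size_gb * 8 * 1000 / (hours * 3600)


full_hours = total_hours(6, 0.16)             # whole 124GB .vbk
rate = effective_mbit_per_s(124, full_hours)  # what the link is really delivering
incremental_hours = 20 / 124 * full_hours     # a 20GB incremental at the same rate

print(f"full copy: {full_hours:.1f} h at about {rate:.1f} Mbit/s")
print(f"20GB incremental: about {incremental_hours:.1f} h")
```

This reproduces the 37.5 hours and roughly 6 hours quoted above, and shows the copy is averaging a bit over 7 Mbit/s on the nominal 10mbit link.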
So perhaps I will have to try what jpeake is doing... I don't see any other options really...
Like I said earlier, I really hope one of the feature requests that Veeam is looking into is to be able to replicate backups across a WAN sending only the changes ('delta' changes or whatever the term is). That's what we did/still do with BackupExec and it works well (duplicate backup-sets to additional locations).
Anyways, enough of a rant, feel free to comment
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Best backup setup for mirroring backup respository
One con, however, is that you will have to run two backup jobs to have both local and offsite copies; that's why everyone is trying to use replication solutions "outside" of Veeam. On Windows, I tested Allway Sync thanks to another user on these forums who suggested it. From what I saw, it is much faster than robocopy, it's able to use almost the whole available bandwidth if the bottleneck is not somewhere else (like source or target storage), and it has an easy GUI to configure scheduling and other options.
I configured the Veeam backup to use WAN optimization even though it's saving to local storage, so I have smaller backup files to replicate over the WAN.
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Influencer
- Posts: 21
- Liked: never
- Joined: Feb 01, 2010 7:41 pm
- Full Name: Shawn Barnhart
- Contact:
[MERGED] Backup/copy to a repository from another repository
[I think this is kind of a feature request, but maybe there's a way to do this already that I'm missing..]
I have several sites where I run two separate Veeam jobs to two separate repositories but with the same VMs. Repository "A" has a very short retention (2-3 backups) and repository "B" has a long retention (14 days and longer). Repository A is kept to a short duration so that the repository size stays small enough to put on tape. Repository "B" is what we mostly use for restores so that we don't have to deal with tapes for 99% of restore operations.
This works in many situations, but in some situations it doesn't -- the volume of data and number of VMs involved makes it difficult to effectively run multiple backups of the same VMs -- you end up with waits, overlaps, etc.
What would be nice would be a way to run a single real backup of the source systems, and then run synthetic backups to other repositories using the repository of the "real" backup as the source data. This would take the load off the production VM environment & storage and I would think be faster than a traditional backup, too, especially if the synthetic backup repositories were on storage independent of the source storage.
-
- Enthusiast
- Posts: 88
- Liked: 25 times
- Joined: Sep 25, 2012 7:57 pm
- Contact:
Re: Backup/copy to a repository from another repository?
If you are using normal (forward) incremental, maybe just copy the full backup to the 2nd repo, and use a robocopy command in a post-job action to copy the .vib file each day to the 2nd repo when the main job completes.
Then if you need to restore from the 2nd repo you can import the backups and have your way with them.
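As an illustration of the post-job copy idea above, here is a minimal Python sketch that does roughly what such a robocopy step would do: copy any .vib increment that is missing from the second repository. This is not a Veeam feature or API. The function name is hypothetical, and "already copied" is judged by a crude file-size comparison.

```python
import shutil
from pathlib import Path


def copy_new_increments(source_repo: Path, target_repo: Path, pattern: str = "*.vib") -> list[str]:
    """Copy increment files that are missing (or differ in size) in target_repo.

    Returns the names of the files that were copied.
    """
    target_repo.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(source_repo.glob(pattern)):
        dst = target_repo / src.name
        if dst.exists() and dst.stat().st_size == src.stat().st_size:
            continue  # already present in the second repository
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copied.append(src.name)
    return copied
```

It could be called from a post-job script as `copy_new_increments(Path(r"D:\Backups\Job1"), Path(r"\\offsite\Backups\Job1"))`, where both paths are placeholders. Note that this only works while the existing files never change, which is the case for forward incrementals between fulls but not for reversed incrementals or after a transform.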
-
- Influencer
- Posts: 21
- Liked: never
- Joined: Feb 01, 2010 7:41 pm
- Full Name: Shawn Barnhart
- Contact:
Re: Backup/copy to a repository from another repository?
jpeake wrote: if you are using normal incremental, maybe just copy the full backup to the 2nd repo, and use a robocopy command in a post job action to copy the vib file each day to the 2nd repo when the main job completes.
Then if you need to restore from the 2nd repo you can import the backups and have your way with them.
Thanks for the suggestion, but it's not at all as flexible as I'd envision.
In a perfect world, I'd run a backup of all VMs with the real job and I'd have possibly several different synthetic jobs that pulled from the "real" job repository, each of which would write to its own repository with a different retention period, and most synthetic jobs would have only a subset of the VMs from the "real" job.
That way I could have a "messaging" synthetic & repository with email-related VMs, one for App Servers, one for DB servers, one for AD components, etc, and each would have their own retention period, a to-tape repository for those VMs that need to go to tape with a short retention, and so on.
-
- Enthusiast
- Posts: 39
- Liked: never
- Joined: Jun 26, 2010 2:27 pm
- Full Name: chris h
- Contact:
Re: Best backup setup for mirroring backup respository
VEEAM really needs to build multiple targets into their proxy and/or their data repository. I use Ahsay OBS (online backup server) and RPS (replication server). They do this with VMware VMs, Hyper-V VMs, and at the application level... You first back up to the OBS server and then the OBS server sends it to the RPS server. Of course VEEAM blows Ahsay away in regards to VM backups... But... Ahsay has it where their solution sends it to one or multiple offsite targets... This is the only reason I even have Ahsay in my environment. I don't see why we would use a third-party application/solution to send our backups offsite. It's silly to have two jobs to accomplish this. It should be done via one interface and one job... Or... a sub-job of a parent job... We need to hammer VEEAM with feature requests like this...
Another HUGE feature request is to avoid needing a VPN tunnel to send jobs offsite. We should be able to send our backup jobs offsite via HTTPS like Ahsay does. This is especially important for service providers. It's just not practical in many cases to build a VPN tunnel to send your backups offsite... Sure, this makes sense for replicas, where you want to power up your hot spare site and go live... But... in many instances (like 99% of mine) we simply want to send backups offsite without VPNs and have one job save to multiple targets without additional vendors...
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Best backup setup for mirroring backup respository
Hey Chris, you forgot white labeling to create custom Veeam deployments and offer it for Backup services, "like Ahsay does"
Kidding obviously, don't get offended
Having more than one target per job is a request many have made, and I have always been a big supporter of this feature, so thanks for your additional vote!
Luca.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Lurker
- Posts: 2
- Liked: never
- Joined: Sep 19, 2012 11:26 am
- Full Name: Bjørn-Ove Kiil
- Contact:
[MERGED] Best way to do backup when replicating to an offsit
Hi,
I'm looking into how best to implement Veeam B & R and replicate the backup to an offsite location with 10 Mbit/s of bandwidth. Today I'm running reversed incremental backups, and I've done the registry hack so the VBK doesn't change name. Then I've created a script that starts when the backup job is finished. The replication and scripts work, but there are too many changes to the VBK files. This means that the replication job doesn't manage to finish within the next 24 hours... and then I've got a problem!
So, I'm seeking advice on how best to do the backup and replicate it to the offsite location. Is it possible to do a forward incremental without doing a synthetic full or active full?
Thanks!
/BOK
-
- VP, Product Management
- Posts: 27377
- Liked: 2800 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Best backup setup for mirroring backup respository
Hi BOK,
Actually, you can run synthetic fulls and reversed incremental backups without any traffic penalty over the WAN link. Please review this topic for available options; if you still have any questions, please let us know.
Thanks!
-
- Influencer
- Posts: 23
- Liked: 1 time
- Joined: Jan 14, 2013 3:50 pm
- Contact:
[MERGED] How to have a local and an offsite backup?
I have three offices: Oxford, Jackson, and Gulfport. Each office has an MS 2008R2 Hyper-V Server with a local SAN and a handful of Server 2008R2 VMs for domain controllers, file shares, etc. I have Veeam Backup and Replication Manager installed in Oxford. Each office backs up its local servers to the local SAN in that office. However, I would like Oxford to become a disaster recovery site, so that Jackson and Gulfport back up to Oxford as well, and Oxford backs up to Jackson. How can I accomplish this? It appears that I cannot run 2 backup jobs on the same server simultaneously. The offices are connected by T-1 lines, and each office has about a terabyte of data, so an initial backup of the data over the T-1 would take several weeks.
-
- Veeam ProPartner
- Posts: 252
- Liked: 26 times
- Joined: Apr 05, 2011 11:44 pm
- Contact:
Re: Best backup setup for mirroring backup respository
1) You will need to seed the backups (make a backup, copy it to portable media, bring it to the other location, and point the backup job to the seeded data).
2) You will need backup proxies at all locations.
3) 1TB of data at each site over a T1 MAY be OK, depending on how much data is changing at each office. If a lot of data is changing, you will run into trouble with backups not having enough time to complete before the next job is scheduled to run. T1 is very slow for this application, and you may be better off signing up for physical off-siting of backup tapes/drives.
4) You may benefit from replication jobs instead of backups, as backups also require time to transform the previous files (making a synthetic full or a new reversed incremental), whereas replication just copies the changed data into a new snapshot.
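To put rough numbers on points 1 and 3 above, here is a small Python sketch of the T1 arithmetic. It assumes the nominal T1 line rate of 1.544 Mbit/s, ignores protocol overhead, compression and dedupe, and uses decimal units (1GB = 1000 megabytes). The function names and the `efficiency` parameter are made up for illustration.

```python
T1_MBIT_PER_S = 1.544  # nominal T1 line rate


def transfer_days(size_gb: float, link_mbit_per_s: float = T1_MBIT_PER_S,
                  efficiency: float = 1.0) -> float:
    """Days needed to push size_gb over the link at the given efficiency (0 to 1)."""
    seconds = size_gb * 8 * 1000 / (link_mbit_per_s * efficiency)
    return seconds / 86400


def max_daily_change_gb(window_hours: float, link_mbit_per_s: float = T1_MBIT_PER_S,
                        efficiency: float = 1.0) -> float:
    """Largest amount of changed data (GB) that fits through the link in the window."""
    return link_mbit_per_s * efficiency * window_hours * 3600 / (8 * 1000)


print(f"1TB initial full over T1: about {transfer_days(1000):.0f} days")
print(f"fits in a 24 h window: about {max_daily_change_gb(24):.1f}GB")
print(f"fits in a 12 h window: about {max_daily_change_gb(12):.1f}GB")
```

At line rate the initial 1TB full would take about two months, which is why seeding is listed first, and the daily change set has to stay under roughly 16GB (less in practice) for the job to finish before the next run.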