Shogan

Best method to access NAS storage for backup performance?

Post by Shogan »

Hi guys,

Forgive me if this has been covered before, but I have been having trouble (404 errors etc) whilst trying to search the forum lately.

Basically, I have a Veeam Backup server (VM) running in one ESX cluster in vCenter, which is backing up VMs from this cluster across a 1GBit fibre network link between data centres to a QNAP TS-859U-RP+ Turbo NAS (RAID 6 / 8 SATA disks) running in the other data centre. This other data centre contains another cluster of ESX hosts.

So my options for attaching the storage have been:

1. Mount NFS share to all ESX hosts in the secondary data centre cluster and back up to this NFS datastore via the ESX I/O stack (I have the entire vCenter instance added in the Veeam B&R VM, so I can see both clusters of ESX hosts and their storage/VMs from within this VM).

2. I have added a second vNIC to the Veeam Backup VM server and set this vNIC's network to be the storage layer network VLAN that the NAS sits on, giving it an IP on this range. Therefore, I can also access the storage of this NAS via CIFS share direct from the Veeam VM. i.e. \\x.x.x.x\VeeamBackup01.

3. Linux Server method (Add Server in Veeam B&R)

I have tried method 1 and 2 above, and get the following results:

Method 1 (via ESX IO stack):

11 of 11 VMs processed (0 failed, 0 warnings)

Total size of VMs to backup: 126.50 GB
Processed size: 126.50 GB
Processing rate: 80 MB/s
Start time: 21/08/2011 19:01:14
End time: 21/08/2011 19:28:14
Duration: 0:26:59

Method 2 (via CIFS share direct from Veeam B&R VM):

11 of 11 VMs processed (0 failed, 0 warnings)

Total size of VMs to backup: 126.50 GB
Processed size: 126.50 GB
Processing rate: 116 MB/s
Start time: 23/08/2011 13:43:14
End time: 23/08/2011 14:01:55
Duration: 0:18:40

As you can see method 2 seems to be quicker. (These tests were both done by first running a Full Backup, then running an incremental straight afterwards). My questions are:

1. Should I expect the CIFS share method to be slightly quicker than going via the ESX IO stack?
2. What is the recommendation in my above situation to access the NFS storage? I am not sure about the "Linux Server" method and what benefits this may have. In my situation, would it be worth trying this method?
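
As a sanity check on the two job summaries above, the reported processing rates can be reproduced from the processed size and the duration alone. The sketch below is a minimal illustration in Python; the function names are hypothetical, and it assumes the job report uses binary units (1 GB = 1024 MB), which is what makes the figures line up.

```python
def duration_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def processing_rate_mb_s(processed_gb: float, hms: str) -> float:
    """Average rate in MB/s, assuming 1 GB = 1024 MB (an assumption)."""
    return processed_gb * 1024 / duration_seconds(hms)


# Figures taken from the two job summaries in the post.
method_1 = processing_rate_mb_s(126.50, "0:26:59")  # via ESX I/O stack (NFS)
method_2 = processing_rate_mb_s(126.50, "0:18:40")  # via CIFS share

print(round(method_1), round(method_2))  # 80 116
```

Both values match the reported 80 MB/s and 116 MB/s, which suggests the rate is measured against the size of the source VMs rather than the bytes actually written to the NAS after compression.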
Vitaliy S.
VP, Product Management
Posts: 27377
Liked: 2800 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Best method to access NAS storage for backup performance

Post by Vitaliy S. »

Shogan wrote:1. Should I expect the CIFS share method to be slightly quicker than going via the ESX IO stack?
It depends on the hardware configuration of your ESX host and the NAS device itself.
Shogan wrote:2. What is the recommendation in my above situation to access the NFS storage? I am not sure about the "Linux Server" method and what benefits this may have. In my situation, would it be worth trying this method?
I would not recommend using an ESX(i) host as the destination for your backups. First, it adds overhead on your production host; second, recovering VM data might be complicated if this ESX(i) host goes down.

I would prefer to go with the 3rd variant, as in case of a Linux server being used (make sure you have a modern CPU or at least 2-4 vCPUs for this server) we will deploy a tiny agent which will be responsible for working with target backup storage files (VBK, VRB, VIB).

Also please be aware that forward incremental backup mode should be faster than reversed incremental with network attached storage, because of the 3x less load in terms of I/O operations on the target storage.
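
The "3x" figure above is commonly explained by counting I/O operations per changed block on the target: forward incremental only writes the new block to the VIB file, while reversed incremental reads the old block from the VBK, writes it to the VRB, and then writes the new block into the VBK. The sketch below is a simplified model of that accounting, not Veeam's actual implementation; the dictionaries standing in for backup files and the function names are assumptions made for illustration.

```python
def forward_incremental(changed_blocks: dict) -> tuple[dict, int]:
    """Write each changed block into a new VIB: one write per block."""
    vib, io_ops = {}, 0
    for block_id, data in changed_blocks.items():
        vib[block_id] = data
        io_ops += 1  # write new block to the VIB
    return vib, io_ops


def reversed_incremental(vbk: dict, changed_blocks: dict) -> tuple[dict, int]:
    """Inject changed blocks into the VBK, saving the old ones in a VRB."""
    vrb, io_ops = {}, 0
    for block_id, data in changed_blocks.items():
        if block_id in vbk:
            old = vbk[block_id]
            io_ops += 1  # read the block being replaced from the VBK
            vrb[block_id] = old
            io_ops += 1  # write the old block to the VRB (rollback file)
        vbk[block_id] = data
        io_ops += 1  # write the new block into the VBK
    return vrb, io_ops


full_backup = {0: "a", 1: "b", 2: "c"}  # hypothetical VBK contents
changes = {1: "B", 2: "C"}              # two blocks changed since the full

_, forward_ops = forward_incremental(changes)
_, reversed_ops = reversed_incremental(dict(full_backup), changes)
print(forward_ops, reversed_ops)  # 2 6
```

On a NAS reached over the network, each of those extra operations is a round trip to the target, which is why the difference tends to matter more there than on local disk.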
Shogan

Re: Best method to access NAS storage for backup performance

Post by Shogan »

Hi Vitaliy, thanks for your reply.

For the third variant (Linux Server), are there any recommendations or familiar choices you would have in terms of the distribution to use? I am familiar with Ubuntu Linux myself (the GUI of course) but I have a little bit of experience using the shell too.

Also, should this Linux Server be running in the same data center as the NAS storage? Then the Veeam Backup server just connects to it via the 1Gbit fibre link between data centres. Are there any requirements I need to look at for the linux server VM?

Lastly, about the incremental mode, am I right in thinking that the reverse incremental mode is more safe in terms of being able to restore data if one of those reverse incremental backup files got corrupted/lost as compared to losing a normal incremental job file?

i.e. losing a reverse incremental job file means I can still restore the server to its latest backup state, and going back further until I hit the reverse incremental job file that was lost or corrupted. But if I wanted to restore a VM using the normal incremental job files, if one of those were lost, I could only restore the oldest version of the VM (when full backup ran, then going forward getting to latest version forward until I hit the corrupted / lost incremental job file) ?

From what I understand, reverse incremental helps in this situation - therefore it is up to us to decide whether the better performance is worth this risk in terms of losing an incremental job file.

Cheers!
Vitaliy S.

Re: Best method to access NAS storage for backup performance

Post by Vitaliy S. »

Shogan wrote:For the third variant (Linux Server), are there any recommendations or familiar choices you would have in terms of the distribution to use? I am familiar with Ubuntu Linux myself (the GUI of course) but I have a little bit of experience using the shell too.
You can use any distribution that has Perl pre-installed; Ubuntu Linux should be just fine.
Shogan wrote:Also, should this Linux Server be running in the same data center as the NAS storage? Then the Veeam Backup server just connects to it via the 1Gbit fibre link between data centres. Are there any requirements I need to look at for the linux server VM?
There are no special requirements for the Linux box. Please take a look at our Release Notes document for more info: http://www.veeam.com/files/release_note ... _notes.pdf
Shogan wrote:Lastly, about the incremental mode, am I right in thinking that the reverse incremental mode is more safe in terms of being able to restore data if one of those reverse incremental backup files got corrupted/lost as compared to losing a normal incremental job file?
With reversed incremental in order to restore to the latest state you do not need to have full backup chain, while with forward incremental you will need all the files (VBK + VIB), meaning that if you lose one of the restore points (incremental runs) restore to the latest VM state will not be possible.
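
The dependency difference described above can be sketched as a small model. This is a simplification under stated assumptions, not Veeam's restore logic: restore points are numbered from 0 (oldest) to n-1 (latest), each point corresponds to one file, and the function names are hypothetical. In forward incremental, file 0 is the VBK and the rest are VIBs; in reversed incremental, file n-1 is the VBK and the older files are VRBs.

```python
def restorable_forward(n_points: int, lost: set[int]) -> list[int]:
    """Forward incremental: restore point k needs the VBK (file 0)
    plus every VIB up to and including k, i.e. files 0..k."""
    return [k for k in range(n_points) if not lost & set(range(0, k + 1))]


def restorable_reversed(n_points: int, lost: set[int]) -> list[int]:
    """Reversed incremental: restore point k needs the VBK (file n-1)
    plus every VRB from k up to the VBK, i.e. files k..n-1."""
    return [k for k in range(n_points) if not lost & set(range(k, n_points))]


lost_file = {2}  # hypothetical: the file created by the third run is lost
print(restorable_forward(5, lost_file))   # [0, 1]
print(restorable_reversed(5, lost_file))  # [3, 4]
```

With the same file lost, forward incremental keeps only the oldest points while reversed incremental keeps the newest ones, including the latest state. In both models, losing the VBK itself leaves nothing restorable.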

Here is an existing topic that compares both modes: incremental backups or reverse incremental??

Thanks!
Shogan

Re: Best method to access NAS storage for backup performance

Post by Shogan »

Thanks again Vitaliy,

I will check the release notes. I will also be setting up a Linux VM to test with. I have told management about the less stressful incremental mode and asked them to consider the pros and cons of each mode so we will revisit this point at a later stage.

PS if anyone else has a recommendation of which Linux distro to use, or can tell me what they use for this purpose, that would be great. Otherwise I may end up just using Ubuntu as I am familiar with that and I am sure I can use Synaptic to install Perl modules if it doesn't already have them.
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Best method to access NAS storage for backup performance

Post by tsightler »

I always used CentOS for my Veeam linux targets. It's a clone of RHEL and thus has a very long support cycle (i.e. once you get it working, it will work for years with only updates). That being said, there's no reason Ubuntu shouldn't work, and it's a good distro, but you might want to pick the Ubuntu Server LTS (Long Term Support) Edition, currently 10.04, as it's a solid version with long term updates.
Gostev
Chief Product Officer
Posts: 31814
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Best method to access NAS storage for backup performance

Post by Gostev »

If the NAS is on the LAN, then I would not expect any improvement from going to it via Linux compared to using a CIFS share. Where the Linux approach really shines is when the NAS is located over a WAN.

But I must also note that with certain NAS storage (particularly DataDomain) I have indeed seen a significant increase in performance when using NFS instead of CIFS. However, I guess this comes down to the respective implementation quality of the NFS and CIFS servers on the particular device.
Shogan

Re: Best method to access NAS storage for backup performance

Post by Shogan »

Hi Anton,

Thanks for your insight :) I think I will give the Linux server test a skip then, as we are doing backup over a 1Gbit connection between data centres. So far in my testing, the CIFS and NFS methods on the QNAP NAS we are using as one of the backup targets seem to offer roughly the same performance. In some cases I have seen CIFS (backing up directly to it from the Veeam VM) going slightly faster than via the ESX hosts to the NFS datastore.
tlphipps
Novice
Posts: 3
Liked: never
Joined: Dec 04, 2010 1:42 pm
Full Name: Travis Phipps
Contact:

Re: Best method to access NAS storage for backup performance

Post by tlphipps »

I just wanted to add an extra datapoint to this discussion. We have several clients using QNAP NAS devices as Veeam backup targets and we see incredible speed using direct CIFS access to these devices. One client in particular routinely sees 85MB/s of write performance over a single 1Gbps link. So that's VERY impressive for any device with the CIFS protocol. Because of this, we normally stick with CIFS for the QNAP devices since you can't really get much faster than that in normal scenarios.
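
To put the 85MB/s figure in context, it can be compared with the raw line rate of a 1Gbps link. The sketch below is plain arithmetic with hypothetical function names; it assumes decimal units (1 MB = 1,000,000 bytes) and ignores Ethernet, IP, TCP, and CIFS protocol overhead, all of which push the usable payload rate below the raw figure.

```python
LINK_BITS_PER_SECOND = 1_000_000_000  # 1 Gbps


def line_rate_mb_s(bits_per_second: int) -> float:
    """Raw line rate in decimal MB/s, before any protocol overhead."""
    return bits_per_second / 8 / 1_000_000


def utilisation(observed_mb_s: float, bits_per_second: int = LINK_BITS_PER_SECOND) -> float:
    """Fraction of the raw line rate that the observed throughput uses."""
    return observed_mb_s / line_rate_mb_s(bits_per_second)


print(line_rate_mb_s(LINK_BITS_PER_SECOND))  # 125.0
print(f"{utilisation(85):.0%}")              # 68%
```

So a sustained 85MB/s of writes is roughly two thirds of the theoretical ceiling, and closer still to what remains once protocol overhead is taken off.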
Shogan

Which backup method would best in this situation?

Post by Shogan »

[merged]

Hi guys,

I have been testing Veeam Backup in different scenarios in our environment and trying to figure out which method would be best to use. Basically we have 2 x Data Centres - they are connected via 1GBit network link. Each Data Centre has 1 or more ESX clusters running at it, with their own FC SAN storage.

For true data protection, we need offsite backups, therefore we need to use Veeam to backup from source (Datacentre 1) to target (Datacentre 2) and vice versa depending on where the live VMs are running. So VMs running live in Datacentre 1 get backed up over night to Datacentre 2 and the other way around.

Now because there is a 1GBit network connection between data centres, this seems to rule out using SAN mode backups (am I right?), so the only options for us to use are: Network mode and Virtual Appliance mode. I have done tests using both modes to backup a 42GB VM from one data centre to the other. My testing was the following: For each test scenario, run 1 x Full Backup, then two incrementals. The first incremental with no change, then before the second incremental run, I make a copy of a 2GB folder inside the Guest OS, so therefore creating 2GB of changes. Before the next scenario run, I would delete the 2GB files, then do a new full backup using the next mode etc... on a brand new Veeam job. Here are the results:

Scenarios tested:
1. Backup to target storage via ESX host IO stack (i.e. NFS datastores added to hosts, and selected for backup job in Veeam) - Test Network mode and VA mode
2. Backup to target storage direct via CIFS share, accessible because of 2nd vNIC added to Veeam B&R VM that connects it to the storage network VLAN. - Test Network mode and VA mode

Specs of Veeam B&R VM:

4 x vCPUs (host ESX server is a new blade with 2 x 3.1GHz Xeon Nehalem processors, 6 physical cores each)
4GB RAM
2 x vNICs, one to the VM network and one connecting it with an IP address directly to the storage network, where the target storage resides.

Specs of Storage:

QNAP NAS TS-859U-RP+ (8 x SATA disks in RAID6) on 1GBit network local to target backup data centre
Note: I have also done tests to EMC SAN storage at this data centre as backup target with similar results.

[Image: test results]

Could anyone recommend based on the above results and scenario which backup method would be best to use? As far as I can tell from these results, there is no solid difference in performance/speed between Veeam Network mode and Virtual Appliance mode. Although Network backup mode seems to be slightly faster in more cases than the VA mode. Also, I have used Reversed Incremental mode in all tests, and Target storage set as "LAN". Optimal compression (Default) selected too.

I look forward to discussion / comments :)
chrisdearden
Veteran
Posts: 1531
Liked: 226 times
Joined: Jul 21, 2010 9:47 am
Full Name: Chris Dearden
Contact:

Re: Which backup method would best in this situation?

Post by chrisdearden »

Could you not back up locally using SAN mode, then replicate the backups using either SAN replication / rsync / robocopy?

That way you would have a local copy of the backup available to get the best from things like instant restore / SureBackup / U-AIR. You would also have an offsite copy that you could import to a Veeam server on the remote site in case of a complete site failure.

Note you can only use SAN mode if your B&R server is physical, with an HBA zoned to your ESX storage.
Shogan

Re: Which backup method would best in this situation?

Post by Shogan »

chrisdearden wrote:Could you not backup locally using SAN mode , then replicate the backups using either SAN replication / rsync / robocopy ?

that way you would have a local copy of the backup available to get the best from things like instant restore / surebackup /u-air - you would also have an offsite copy that you could import to a veeam server on the remote site in case of a complete site failure.

Note you can only use SAN mode if your B&R server is physical, with an HBA zoned to your ESX storage.
Chris, that is a good idea and something I briefly thought about once before, just totally forgot about this option. What I also didn't quite understand was how it would help using rsync/robocopy for example (I am specifically thinking about speeds here) - would replicating to remote site be a lot faster in theory than the total time all of the Veeam Backups would take to local site? As in there is no compression, snapshots, backup job starts etc to worry about, it would just be the Veeam .VRB files that need to sync across to remote site each day over the 1GBit network. How about the Full Backup files, would they need a lot of replication too each day? As far as I can tell, they also change a bit when using Reverse Incremental mode backups.

In a way, this should be better actually, as we have faster access to backups at the local site, and then if all else fails, we still have a remote site copy that has replicated from the local site backups. SAN mode backup can also be used with a physical server and HBA connecting it directly to our SAN via FC.

PS I am still interested in ideas about my original post above, as this will still probably be used in the interim until a better method (such as SAN mode with remote replication) can be set up.
chrisdearden

Re: Which backup method would best in this situation?

Post by chrisdearden »

If you are copying off site with robocopy, I would possibly be tempted to use forward incrementals. Then you only have to copy the incremental changes across your WAN rather than the changes in the VBK file as well. If you are able to use something like DFS replication, which can occur at a lower level, you would be able to use reverse incrementals, but as you say you will be replicating the changes made to the VBK file as well as the newly created VRB file. The only downside is that you will of course need a little more storage for those backups, as you will have the local and remote copies.
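
A rough estimate shows why the backup mode matters for a file-level copy tool. Robocopy recopies a changed file in full, so with reversed incremental the whole VBK crosses the link every day, whereas with forward incremental only the new VIB does. The sketch below borrows the 42GB VM and 2GB daily change from the test described earlier in this thread and the 85MB/s figure quoted above as a hypothetical sustained throughput; real backup files would be smaller after compression, so the absolute numbers are illustrative only.

```python
def transfer_minutes(size_gb: float, throughput_mb_s: float) -> float:
    """Time to copy size_gb at a sustained rate, assuming 1 GB = 1024 MB."""
    return size_gb * 1024 / throughput_mb_s / 60


FULL_GB = 42.0           # size of the full backup (assumed equal to the VM size)
DAILY_CHANGE_GB = 2.0    # size of one day's incremental or rollback file
THROUGHPUT_MB_S = 85.0   # hypothetical sustained rate over the 1Gbit link

# Forward incremental: only the new VIB is copied.
forward_daily = transfer_minutes(DAILY_CHANGE_GB, THROUGHPUT_MB_S)

# Reversed incremental with whole-file copy: modified VBK plus the new VRB.
reversed_daily = transfer_minutes(FULL_GB + DAILY_CHANGE_GB, THROUGHPUT_MB_S)

print(f"forward:  {forward_daily:.1f} min per day")
print(f"reversed: {reversed_daily:.1f} min per day")
print(f"ratio:    {reversed_daily / forward_daily:.0f}x")
```

A tool that transfers only the changed parts of a file, such as rsync with its delta algorithm or block-level replication, narrows this gap, which is the point of the "lower level" remark above.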
Shogan

Re: Which backup method would best in this situation?

Post by Shogan »

chrisdearden wrote:If you are copying off site with robocopy, I would possibly be tempted to use forward incrementals. Then you are only having to copy the incremental changes across your WAN rather than the changes in the VBK file as well. If you are able to use something like DFS replication which can occur at a lower level you would be able to use reverse incrementals, but as you say you will be replicating the change out of the VBK file and the creation of the VRB file. The only downside is that you will of course need a little bit more storage for those backups as you will have the local and remote copies.
Thanks Chris.

Interesting stuff! As always, there are pros and cons to everything. Reverse incrementals seem a safer bet to me if one of the job files were to become corrupt somehow - allowing recovery from the latest version going back down to the oldest, rather than losing a forward incremental job file low down in the chain and losing the ability to restore the latest version! But this in my opinion would be a rare circumstance, so it would just be part of the decision on using offsite replication from a local SAN mode backup copy at the end of the day. I would be keen to use the Forward incremental mode as it just makes sense to not have to replicate changes to the huge .VBK full backup files over the network link too. By replication at a lower level, do you mean the SAN replication mode - i.e. at a block level? I guess this would then only see changed blocks that need to replicate instead of files - is that correct?

Regarding performance hits on the SAN from using offsite replication, this shouldn't be an issue if we used specific LUNs (Datastores) for Backups only, so only these LUNs would be affected by the copy process if it were to replicate during the day after a full night's worth of backups I guess. Does this sound correct?

Otherwise everyone else - please shout if you have any comments/ideas on the original post and results / decision as to what to temporarily use for now (Network or VA mode)! :)
Vitaliy S.

Re: Which backup method would best in this situation?

Post by Vitaliy S. »

Shogan wrote:Now because there is a 1GBit network connection between data centres, this seems to rule out using SAN mode backups (am I right?), so the only options for us to use are: Network mode and Virtual Appliance mode.
No, this is not right. Direct SAN access only refers to source data retrieval, so even though you have a 1 GBit network connection between sites I would still recommend to go with physical backup servers and direct SAN backup mode.
Shogan wrote:Scenarios tested:
1. Backup to target storage via ESX host IO stack (i.e. NFS datastores added to hosts, and selected for backup job in Veeam) - Test Network mode and VA mode
Storing backup files on VMFS/NFS volumes connected through an ESX host is not considered best practice. Imagine that your target host goes down: how would you get access to the backup files? That wouldn't be an easy task. I would recommend choosing a CIFS share as the destination target in your scenario.
Shogan wrote:What I also didn't quite understand was how it would help using rsync/robocopy for example (I am specifically thinking about speeds here) - would replicating to remote site be a lot faster in theory than the total time all of the Veeam Backups would take to local site?
That's true. But in this case you would have two backup copies at both sites.
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Best method to access NAS storage for backup performance

Post by foggy »

Also, in case you choose to go with the rsync scenario using the incremental mode, review this topic as it might be very useful.
By default, incremental mode creates synthetic fulls with different names each time, and the solution described there can be used to avoid full VBK replication over WAN.
Shogan

Re: Which backup method would best in this situation?

Post by Shogan »

Vitaliy S. wrote: No, this is not right. Direct SAN access only refers to source data retrieval, so even though you have a 1 GBit network connection between sites I would still recommend to go with physical backup servers and direct SAN backup mode. Storing backup files on VMFS/NFS volumes connected through ESX host is not considered best practice. Imagine that your target host goes down, how would you get access to the backup files? That wouldn't be an easy task to do. I would recommend choosing CIFS share as a destination target in your scenario. That's true. But in this case you would have two backup copies at both sites.
Thank you for your reply Vitaliy S. :)

So with regard to point 1 - am I right that this would mean we still need a Physical Veeam Server with HBA(s) connected to our local SAN at that site. Then backup to the storage at this same site, then use the replication method (rsync/robocopy)? Or am I misunderstanding and I would be able to use SAN mode to access the VMs on the local site source (source data retrieval), and then target these VM backup jobs to remote storage at the other site over the 1GBit network link on a NAS or another SAN for example? - if this was the case and it worked, would I just target the remote storage for this SAN mode backup through the vCenter node added in the Veeam Backup Software and it would all work?

@foggy, thank you for that link - I actually read that end of last week and that is what had me thinking about the SAN mode backup and replication of backup files to remote site initially! If this ends up being the best solution then I will revisit this topic and look at implementing a solution like that.
Vitaliy S.

Re: Best method to access NAS storage for backup performance

Post by Vitaliy S. »

Shogan wrote:So with regard to point 1 - am I right that this would mean we still need a Physical Veeam Server with HBA(s) connected to our local SAN at that site. Then backup to the storage at this same site, then use the replication method (rsync/robocopy)?
This is one of the approaches you can use.
Shogan wrote:Or am I misunderstanding and I would be able to use SAN mode to access the VMs on the local site source (source data retrieval), and then target these VM backup jobs to remote storage at the other site over the 1GBit network link on a NAS or another SAN for example?
Yes, retrieving VMs through the FC connection on the source site and pointing your backups to another site over the 1 GBit network link is another option.
Shogan

Re: Best method to access NAS storage for backup performance

Post by Shogan »

Vitaliy S. wrote: This is one of the approaches you can use. Yes, retrieving VMs through FC connection on the source site and pointing your backups to another site over 1 GBit network link is another way out.
Awesome. Thank you for confirming those two questions!

I may be back with some specifics about how exactly the incrementals work (mainly reverse) - there isn't any documentation on how they work in terms of the exact procedure I can refer to is there? I need this to work out how rsync/robocopy would work best and in what scenarios.

Also, we had some interesting ideas about Forward Incremental backup jobs and how to create a rolled Full Backup file using the original Full Backup job file and the delta incrementals so far - combining those to create a full backup file instead of having to run the full backup again (for example every few days) to keep the risk of using incrementals (and losing one in the chain) low. Don't know what your guys' thoughts are on that or if it would be possible to do with Veeam?
Gostev

Re: Best method to access NAS storage for backup performance

Post by Gostev »

There is a great blog post on how reversed incremental works linked in the sticky FAQ topic.

No, combining backup files is currently not possible, but we are planning to provide something like this down the road, as we have had a few similar requests by now.
Shogan

Re: Best method to access NAS storage for backup performance

Post by Shogan »

Gostev wrote:There is a great blog post on how reversed incremental works linked in the sticky FAQ topic.

No, currently this is not possible to perform backup files combining, but we are planning to provide something like this down the road, as we had a few similar requests by now.
Thanks again Gostev. For those interested here is the blog post direct link: http://www.veeam.com/blog/veeam-synthet ... ained.html

Anyway, so what I was hoping for was a bit more in depth detail - this post is great and helps me to understand the way it works, but I was hoping to find out a little bit more, like where each bit of processing takes place.

As an example (not saying this is correct, but just to illustrate the specific bit of information about reverse incremental backups I am looking for): The full backup is created on the first job run. On the next backup run a rollback is created - the changed blocks are determined from the source VM by the ESX host, then these are transferred across to the Veeam server. The Veeam server works out what needs to be put aside for the VRB file, then writes that back to the datastore/target storage into the VRB file, and finally the latest changes are merged/injected back into the .VBK file on the target storage by the Veeam server...

Any chance I can get something like that if someone has the time there from the Veeam team? By knowing this, we can make an informed decision as to what mode of backup we should be using based on our environment type and features.

Finally, is there any timescale on that planned feature (backup file combining) for example, next release of Veeam B&R?
foggy

Re: Best method to access NAS storage for backup performance

Post by foggy »

Shogan wrote:Anyway, so what I was hoping for was a bit more in depth detail - this post is great and helps me to understand the way it works, but I was hoping to find out a little bit more, like where each bit of processing takes place.

As an example (not saying this is correct but just to illustrate the specific bit of information about reverse incremental backups I am looking for): The full backup starts off on the first job run. The next backup that happens a rollback is created - the changed blocks are determined from the source VM by the ESX host, then these are transferred across to the Veeam server. The Veeam server works out what needs to be put aside for the VRB file, then writes that back to the datastore/target storage into the VRB file, finally the latest changes are merged/inject back into the .VBK file on the target storage by the Veeam server...
Please look at the description given on p.12 of our user guide - is it what you are searching for?
Shogan

Re: Best method to access NAS storage for backup performance

Post by Shogan »

foggy wrote: Please look at the description given on p.12 of our user guide - is it what you are searching for?
Thanks foggy. This seems to go into more detail - I'll pass this over to management, who were looking for more specific detail, to see if it suits their needs. I have read it myself and it makes things a lot clearer for me now, so hopefully that will do the job!

On the question I had about merging backup files together (forward incrementals), the section about synthetic fulls being scheduled in amongst normal incremental backup jobs (like every Thursday for example) seems to do this - merging the last full backup file together with all the VIBs to create a synthetic full - so maybe there was some confusion with regard to my question where I asked:
Also, we had some interesting ideas about Forward Incremental backup jobs and how to create a rolled Full Backup file using the original Full Backup job file and the delta incrementals so far - combining those to create a full backup file instead of having to run the full backup again (for example every few days) to keep the risk of using incrementals (and losing one in the chain) low. Don't know what your guys' thoughts are on that or if it would be possible to do with Veeam?
Does this (Synthetic full backup) sound the same as what I was asking? (See quote above).

If it is the same, then I guess that answers my question then.

Sorry for all these questions - I am just trying to make 100% sense of everything and ensure I help plan out our backup strategy correctly. It also helps being linked directly to the answers as you guys are doing for me instead of me having to wade through tons of 100+ page documents! So thank you very much for all the help so far to you guys :)
foggy

Re: Best method to access NAS storage for backup performance

Post by foggy »

Yes, synthetic fulls are just what you are talking about. However, keep in mind that if you still want backups in both locations (source site and target site) you will have to create two jobs to perform this: one to backup locally and another to perform incrementals over WAN to have them injected into full on target. This will require taking the VM snapshot two times and backing up the same data twice.