Best Practice for Optimal Restore Speed

Post by **geriksen** » Nov 08, 2009 1:10 pm

Hi,

I've been searching this forum for answers, but can't quite find what I'm looking for. We've been using Veeam Backup for a while now and are planning to move up to the new V4, which looks great! However, we want to reinstall and NOT upgrade, because of issues we had earlier with a corrupt DB etc. that left vrb files without cleanup. But that's a different story. What I want is a best-practice setup, and we want the best performance for restore, since that is what actually counts when we need it! Here's a short list of what we have:

- 4 ESX hosts on v3.5 U4, which will be upgraded to vSphere as soon as the first major update is released (within a couple of weeks, I've been told).
- 2 IBM DS3400 SANs: one for VMFS in RAID 5 with 1 LUN, the other for backup with 1 LUN.
- 1 VCBProxy server with a QLogic 4Gb dual-channel HBA (one channel to each SAN through zoning in the SAN switch) and VCB installed. It has a quad-core Xeon CPU and 2 GB of RAM, with ordinary local SATA disks in RAID 1.

Here are a couple of questions regarding the new setup:

1. Backup performance seems slow today with v3.1.1, even though the SAN config, block clusters, VCB test etc. seem good. I only have 1 LUN for VMFS and 1 LUN for backup on the other SAN, which I have read could be an issue. What is recommended regarding the number of LUNs, cluster size etc.?
2. We've done some restores, but how does this work with Veeam? It seems that all restores are done only through the network, and it's slow, even though all ESX hosts and the VCBProxy server are connected to the same Cisco 2960 Gbit switch.

I find it strange that vendors never make an effort to publish a best-practice/setup guide for their product that explains how to get the most out of it. It shouldn't be so hard to write one covering different scenarios. I'm always hearing about different setups and arguments on forums etc., so we never actually know what the best way is. It keeps changing the more info I find and read...

Post by **Gostev** » Nov 08, 2009 6:53 pm

Hello. The thing is, every VMware deployment is different. If there were a "universal recipe" that worked for every infrastructure, we would not have this many processing options. That is why we have this forum, where everyone can exchange information on what is working well for their specific deployments, backup windows, SLAs and so on.

Unfortunately, restores can only be done over the network at this time, and you are right that restores are slower than backups because of this. The recommendation is to specify service console connection settings for the ESX host you are restoring to (to do this, right-click the ESX host in the Veeam Backup Servers tree).

Post by **geriksen** » Nov 08, 2009 6:55 pm

I see, but that answer didn't help me at all... How about giving me something to work with?

Post by **geriksen** » Nov 08, 2009 7:03 pm

Sorry, Gostev. Seems like I answered before you had answered completely...

What about other network tuning like jumbo frames, NIC teaming etc.? Would this help?

Post by **Gostev** » Nov 08, 2009 7:12 pm

I assume you have FC storage, since you mentioned a QLogic 4G HBA. I was under the impression that jumbo frames and NIC teaming only relate to tuning iSCSI storage performance and are not applicable to FC storage. But if you are talking about increasing restore performance with these, then I do not think they will help much. Network restore speed to a "fat" ESX host is typically determined by how fast your service console can write to VMFS. The speed will vary depending on I/O controllers and other ESX hardware (I have heard that battery-backed cache really helps). I can say that on my test ESX server with fast local storage, I see a 50 MB/s restore speed with the service console connection enabled.

As for your question #1, hopefully someone else from the community who has experience with shared storage and VMware will be able to give you a recommendation on how best to set up your storage. I know too little about SANs to give advice (I have never had to administer them). But I am sure things like block sizes and the number of LUNs affect tons of other things besides backups, so my uneducated advice may actually hurt instead of helping.
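To put the 50 MB/s figure in perspective, here is a minimal Python sketch that compares restore time at that rate with the theoretical line rate of a 1 Gbit link. The 200 GB VM size is a hypothetical value chosen for illustration, decimal units are used throughout (1 GB = 1000 MB), and protocol overhead is ignored, so treat the results as rough estimates only.

```python
# Rough restore-time estimate; all inputs are illustrative assumptions.

GBIT_LINE_RATE_MB_S = 1000 / 8  # 1 Gbit/s = 125 MB/s, ignoring protocol overhead


def restore_hours(vm_size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to write vm_size_gb at a sustained throughput_mb_s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    size_mb = vm_size_gb * 1000  # decimal units: 1 GB = 1000 MB
    return size_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    vm_size_gb = 200  # hypothetical VM size, not taken from the thread
    for label, rate in [
        ("1 Gbit line rate", GBIT_LINE_RATE_MB_S),
        ("50 MB/s via service console", 50.0),
    ]:
        print(f"{label}: {restore_hours(vm_size_gb, rate):.2f} h")
```

At 50 MB/s the link is running at well under half of its nominal capacity, which is consistent with the network not being the limiting factor on that test host.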

Post by **donikatz** » Dec 28, 2009 5:49 pm

Sorry to restart an old thread, but figured since it's the same topic...

We're in the process of establishing new, tighter SLAs and I'm trying to decide how best to optimize restores. In folks' experience, is the LAN the biggest bottleneck, or the ESX --> storage I/O? Right now we only have a single gigabit pNIC for consoles, so I'm thinking the easiest way to get a performance bump might be to add a second pNIC in a 2 Gb etherchannel bond. Our VBU storage array already has a 2 Gb etherchannel bond and our SAN is [sadly only] 2 Gb FC, so that would give us a theoretical 2 Gb path all the way through (obviously in the real world there will be a lot of other overhead, and this would only hold with no competing I/O). The question is: will we see any real-world improvement, or is ESX --> SAN the real bottleneck?

I'm going to deploy this in test and report back on findings, but would appreciate any other advice or recommendations. What is everyone else doing for restore optimization? Thanks!

Post by **donikatz** » Dec 28, 2009 6:30 pm

Of course I just realized etherchannel won't help, since restore would be a 1-1 IP connection. Which is why we'd never set it up before. Other ideas? Thanks
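The reason a single 1-1 IP connection gains nothing from an etherchannel bond is that the switch picks one member link per flow from a hash of the addresses. The sketch below illustrates the idea in Python with a simplified XOR-of-addresses hash; the actual hash depends on the switch's configured load-balancing method, and `pick_link` and the IP addresses are hypothetical names and values for illustration only.

```python
# Simplified model of per-flow link selection in a link aggregation group.
# The XOR hash is an assumption for illustration, not a vendor's exact algorithm.
import ipaddress


def pick_link(src_ip: str, dst_ip: str, num_links: int) -> int:
    """Map a source/destination IP pair to one member link index."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % num_links


if __name__ == "__main__":
    backup_server = "10.0.0.5"  # hypothetical address
    esx_console = "10.0.0.20"   # hypothetical address
    # Every packet of the restore stream shares the same address pair,
    # so it always lands on the same physical link.
    links_used = {pick_link(backup_server, esx_console, 2) for _ in range(1000)}
    print("links used by one restore stream:", links_used)
    # A different destination may hash to the other link.
    print("other host uses link:", pick_link(backup_server, "10.0.0.21", 2))
```

Under this model the bond only raises aggregate throughput across many address pairs; one restore stream stays capped at the speed of a single member link.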

Post by **Gostev** » Dec 28, 2009 9:31 pm

Hello Doni. Thanks for restarting the old thread instead of creating a new one!

Based on what I have been seeing, LAN speed is never the bottleneck (assuming you have a 1 Gbit LAN). After researching a lot of feedback on this subject, my understanding is that it all comes down to VMFS performance and how it handles writes.

I have seen significantly different speeds reported from one host to another (all with decent storage) on a 1 Gbit LAN, which makes me think that the ESX I/O controller (and its settings) is the most critical part. Also, from reading some feedback on VMware, I understand the general consensus to be that installing battery-backed write cache in the ESX host is pretty much the only way to improve and guarantee good VMFS upload speed.
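The argument here is that end-to-end restore speed is capped by the slowest stage in the path. A minimal Python sketch of that reasoning follows, using nominal figures as assumptions: 125 MB/s for a 1 Gbit LAN, roughly 200 MB/s for 2 Gb FC, and the 50 MB/s service console write speed quoted earlier in the thread from a single test host. The stage names are hypothetical labels, and real numbers will differ per environment.

```python
# Bottleneck model: a serial data path runs no faster than its slowest stage.
# All throughput figures are nominal or anecdotal assumptions, in MB/s.


def bottleneck(stages: dict[str, float]) -> tuple[str, float]:
    """Return the (name, MB/s) of the slowest stage; it caps the whole path."""
    name = min(stages, key=lambda stage: stages[stage])
    return name, stages[name]


path = {
    "lan_1gbit": 125.0,                  # nominal 1 Gbit Ethernet
    "fc_2gbit": 200.0,                   # nominal 2 Gb Fibre Channel
    "service_console_vmfs_write": 50.0,  # figure quoted from one test host
}

# Doubling LAN capacity leaves the limiting stage unchanged.
upgraded = dict(path, lan_1gbit=250.0)

if __name__ == "__main__":
    print("current path: ", bottleneck(path))
    print("with 2x LAN:  ", bottleneck(upgraded))
```

With these assumed figures, the limiting stage is the VMFS write in both cases, so measuring each stage separately before buying network hardware is likely the more useful first step.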

Post by **donikatz** » Dec 28, 2009 10:37 pm

Thanks Anton. Of course our SANs have battery-backed write caching enabled, so that's no problem. Interesting that the bottleneck would be the ESX's storage controller, since it certainly has enough juice to run live VMs. I guess the difference is between typical server usage and the sustained writes of a restore?

So if ESX storage I/O is the primary bottleneck, sounds like for restore optimization, you'd want to make sure to restore through an ESX host that does not have much ongoing storage I/O. Perhaps it even makes sense to VMotion running VMs off before restoring, when possible. (Obviously restoring to a LUN on spindles with low contention is another key.)

Are those routines part of folks' regular recovery procedures? Thanks
