itfnb
Enthusiast
Posts: 58
Liked: 9 times
Joined: Mar 12, 2012 8:18 pm

New Production Gear = Upgrade for Backups. Config Thoughts?

Post by itfnb »

Good times all around when you get to upgrade your environment (especially when Veeam B&R is in the mix, too!). I've got some new hardware en route, and I've been researching and debating whether I should reconfigure our Veeam I/O path during this process. It's a good problem to have, for sure. Here are the details:

Current Setup:
*ESXi Host1: Dell PE R710 (16 core, 96GB mem) (Production1)
*ESXi Host2: Dell PE 2950 III (8 core, 64GB mem) (Production2)
*ESXi Host3: Dell PE 2950 (8 core, 32GB mem) (Dev)
*HP StoreVirtual P4300, x6 (Shared by Production servers)
*Veeam + vCenter: Dell PE2950 (8 core, 16GB mem) (Physical server)
***eSATA RAID enclosure for local Backup Jobs
***NAS for local Backup Copy Jobs
***Backup jobs use Direct SAN Access with Storage Snapshots enabled

New Setup:
*ESXi Host1: Dell PE R720 (40 core, 128GB mem) (Production1)
*ESXi Host2: Dell PE R720 (40 core, 128GB mem) (Production2)
*Same HP SAN
*Consolidate Dev server???
*Dell PE R710 as new Veeam B&R server...either physical or virtual
***Internal RAID for faster local Backup Jobs
***MD1200 w/ 12x 4TB HDDs for local Backup Copy Jobs

(CPU core counts include both physical and virtual/hyperthreaded cores.)

As you can see, the R710 is slated to replace the current Veeam B&R server in one way or another. Given the significant increase in CPU performance and memory capacity, I'm wondering if I should consolidate my old dev host onto it as well while breaking out vCenter and Veeam B&R into separate VMs. I have no doubt it could handle all that for me, but I don't want to end up regressing in Veeam on the data access side of the equation while I'm stepping up my performance on the storage side.

This has left me with some lingering questions:

A. Should I keep Veeam physical...or transition it to virtual?
B. Or I suppose I could keep the server with 32GB mem and reconfigure it as a physical proxy for the SAN?
C. Can Storage Snapshots be used with HotAdd?
D. Due to the way HP StoreVirtual handles iSCSI, I can't use HP's DSM for MPIO in Windows for Veeam since it can cause locks on the LUNs in vSphere. Because of that, my Direct SAN Access is limited to a single 1Gbit NIC in my current physical setup. On the other hand (and correct me if I'm wrong), if I were to virtualize Veeam, give its VM a 10Gbit virtual NIC on the SAN network, and ride the existing MPIO iSCSI setup on the vSphere side (the four 1Gbit NICs each host has connected to the SAN), could I see higher throughput? I've sketched the rough math below.
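Here's the back-of-the-envelope math behind question D (Python just for the arithmetic). The per-link numbers and the repository figure are my own rough guesses, and I'm assuming the job can only run as fast as its slowest stage and that a virtual Veeam's traffic would really spread across all four host uplinks, which is exactly the part I'm not sure about:

```python
# Rough usable throughput per link in MB/s (guesses, not measurements)
GBIT = 118                 # one 1Gbit link after protocol overhead
NODE_BOND = 2 * GBIT       # each P4300 node has a bonded 2Gbit pair

def effective(source_mbs, path_mbs, target_mbs):
    """A backup job only moves data as fast as its slowest stage."""
    return min(source_mbs, path_mbs, target_mbs)

source = 3 * NODE_BOND     # e.g. three nodes feeding parallel tasks
target = 400               # guess for the new internal RAID repository

# Today: physical Veeam, Direct SAN over a single 1Gbit NIC, no DSM
print("physical, 1x1Gbit:", effective(source, 1 * GBIT, target), "MB/s")

# Virtual Veeam: 10Gbit vNIC, host pushing iSCSI over 4x1Gbit Round Robin
print("virtual,  4x1Gbit:", effective(source, 4 * GBIT, target), "MB/s")
```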

P.S. I've already planned to move Veeam to a new install of Server 2012 R2 either way to take advantage of the dedupe features for my Backup Copy Jobs.
P.P.S. There are also additional Backup Copy Jobs and Replication Jobs for off-site protection, but I figured I'd dumped enough on you already. :P

All thoughts are welcome! Thanks!
dellock6
VeeaMVP
Posts: 6139
Liked: 1932 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy

Re: New Production Gear = Upgrade for Backups. Config Thoughts?

Post by dellock6 »

A: Both. I mean, run the Veeam server as a VM so you can better protect it (I know there's the configuration backup, but it's always better to be able to restore the whole VM in minutes), and configure the physical machine as a proxy and repository.
B: If you ask me, I would drop that machine; it's a few generations old and, for example, its power consumption is fairly high.
C: No, Backup from Storage Snapshots only works in Direct SAN mode, so the physical proxy suggested in point A also takes care of this point.
D: First of all, have you actually seen a saturated link? The 4300 is the all-SATA model, so I'm not sure even a six-node cluster can saturate one.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
veremin
Product Manager
Posts: 20284
Liked: 2258 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: New Production Gear = Upgrade for Backups. Config Thoughts?

Post by veremin »

Additionally, a few points regarding off-site replication that might be useful for you:

1) It might be worth deploying backup proxy servers at both ends (one close to production and one at the colo site). Having proxies at both ends allows replication data to cross the existing link in a highly compressed and deduplicated state (there's a rough example of what that buys you after these points).

2) In the case of remote replication, it's recommended to have two instances of VB&R: one deployed at the production site responsible for local backup jobs, and the other at the DR site for replication jobs. Such a scenario guarantees that if a disaster happens (the primary site goes down, for instance), the required operations (failover, failback) can easily be started from the additional backup server without any issues.
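To give a feel for point 1, here is a quick back-of-the-envelope calculation. The change rate, data-reduction ratio, and link speed below are placeholder assumptions; substitute the numbers from your own job statistics:

```python
def wan_transfer_hours(changed_gb, reduction_ratio, link_mbit, efficiency=0.8):
    """Hours to push one replication cycle across the WAN.

    changed_gb      -- changed data read from production per cycle (GB)
    reduction_ratio -- compression + dedup between the proxies (2.0 = 2x)
    link_mbit       -- WAN link speed in Mbit/s
    efficiency      -- fraction of the link you realistically get
    """
    to_send_gb = changed_gb / reduction_ratio
    gb_per_hour = link_mbit * efficiency / 8 / 1024 * 3600  # Mbit/s -> GB/h
    return to_send_gb / gb_per_hour

# Placeholder numbers: 50 GB of daily changes, 2x reduction, 20 Mbit/s link
print("%.1f hours" % wan_transfer_hours(50, 2.0, 20))
```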

Thanks.

Re: New Production Gear = Upgrade for Backups. Config Thoughts?

Post by itfnb »

dellock6 wrote:A: Both. I mean, run the Veeam server as a VM so you can better protect it (I know there's the configuration backup, but it's always better to be able to restore the whole VM in minutes), and configure the physical machine as a proxy and repository.
B: If you ask me, I would drop that machine; it's a few generations old and, for example, its power consumption is fairly high.
So the PHYSICAL proxy should also be the REPOSITORY for the Backup Job for best performance... that makes sense. I guess I could keep the best 2950 for the physical server and dedicate it as the Veeam physical proxy? (That one has a pair of Xeon E5440s; the other two 2950s have pairs of E5335s.)
dellock6 wrote:D: First of all, have you actually seen a saturated link? The 4300 is the all-SATA model, so I'm not sure even a six-node cluster can saturate one.
Our P4300s and P4300 G2s have 15k 3.5" SAS drives in them and all of the volumes are networked RAID 10. Each node has a pair of 1Gbit NICs that are bonded. In testing on the vSphere side, I've seen over 300MB/s in throughput on the SAN using MPIO during vMotion between LUNs.

But... during the part of the main Backup Job that's copying a fair amount of changed/new data off a SAN snapshot (Exchange, for instance; Parallel Processing has made this more apparent with multiple VMs going at once), the bottleneck is reported as Source, with the Veeam server's 1Gbit NIC saturated for that stretch. In reality, my current NAS is the bottleneck for the overall job since it's Reversed Incremental. Still, I'd like to improve both sides. 8)
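To put rough numbers on why Reversed Incremental is hard on the NAS: as I understand it, each changed block costs the repository one read plus two writes (new block into the .vbk, old block out to the .vrb), versus a single write for forward incremental. The change-rate figure below is just an example:

```python
def repo_io_gb(changed_gb, mode="reversed"):
    """Very rough repository I/O generated by one incremental run."""
    if mode == "reversed":
        # write new block into .vbk + read old block + write old block to .vrb
        return changed_gb * 3
    return changed_gb  # forward incremental: one write into the .vib

changed = 100  # GB of changed data in the job (example figure)
print("reversed:", repo_io_gb(changed, "reversed"), "GB of repository I/O")
print("forward: ", repo_io_gb(changed, "forward"), "GB of repository I/O")
```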
v.Eremin wrote:Additionally, a few points regarding off-site replication that might be useful for you:
The Replication Jobs actually go to our DR site, and I have a VM over there that is a proxy for that site. Our off-site Backup Copy Jobs go to a different off-site location, and I have a small server that's a proxy at that site. :) However, I do think I'll "upgrade" the DR proxy to a full-blown Veeam install.

Re: New Production Gear = Upgrade for Backups. Config Thoughts?

Post by dellock6 »

So the PHYSICAL proxy should also be the REPOSITORY for the Backup Job for best performance... that makes sense. I guess I could keep the best 2950 for the physical server and dedicate it as the Veeam physical proxy? (That one has a pair of Xeon E5440s; the other two 2950s have pairs of E5335s.)
Yes, it's usually a domino effect: a new machine goes into production, and the first one coming out can become a physical Veeam server (if the budget doesn't allow for a new Veeam server too).
Our P4300s and P4300 G2s have 15k 3.5" SAS drives in them and all of the volumes are networked RAID 10. Each node has a pair of 1Gbit NICs that are bonded. In testing on the vSphere side, I've seen over 300MB/s in throughput on the SAN using MPIO during vMotion between LUNs.
Wait, on LeftHand the throughput comes from multiple nodes at the same time, but that doesn't mean a single node puts out that value. Also, I'm not getting the vMotion part, unless that was a typo and you meant Storage vMotion. Anyway, the I/O pattern of Storage vMotion is not guaranteed to be the same as a Direct SAN backup.
But... during the part of the main Backup Job that's copying a fair amount of changed/new data off a SAN snapshot (Exchange, for instance; Parallel Processing has made this more apparent with multiple VMs going at once), the bottleneck is reported as Source, with the Veeam server's 1Gbit NIC saturated for that stretch. In reality, my current NAS is the bottleneck for the overall job since it's Reversed Incremental. Still, I'd like to improve both sides. 8)
Well, any component of the data pipe can be improved ;) I would suggest changing only one component at a time and measuring the improvement; otherwise you can't tell what a given change was worth. As for the Direct SAN backup, after you've exhausted any viable improvement on the 1G network, the next step is probably adding a 10G connection to the Veeam server, even before adding 10G to the LeftHand nodes, provided they are connected to the same switch where you're going to connect Veeam's 10G NIC.
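If you want a quick and dirty way to baseline a component before and after a change, even something as simple as timing a sequential read of a large backup file works. Just a minimal sketch; the path is hypothetical and the OS file cache will flatter the numbers on a repeat run:

```python
import time

def read_throughput(path, block_size=4 * 1024 * 1024, max_bytes=2 * 1024**3):
    """Sequentially read up to max_bytes from path and return MB/s."""
    read = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        while read < max_bytes:
            chunk = f.read(block_size)
            if not chunk:          # end of file
                break
            read += len(chunk)
    elapsed = time.monotonic() - start
    return read / (1024 ** 2) / elapsed

# Hypothetical path: point it at a recent full backup on the repository
print("%.1f MB/s" % read_throughput(r"D:\Backups\Job1\Job1.vbk"))
```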

Luca.

Re: New Production Gear = Upgrade for Backups. Config Thoughts?

Post by itfnb »

Yes, sorry, that was Storage vMotion... I was just trying to say that my SAN fabric can spit out enough iSCSI traffic to swamp my Veeam server's 1Gbit NIC. In LeftHand's Centralized Management Console (CMC) you can monitor Throughput Reads and Writes for both individual nodes and the cluster, and the individual nodes are definitely capable of saturating their bonded 2Gbit connections.

In LeftHand, every volume has a node that acts as its gateway for iSCSI connections. I currently have six volumes, so LeftHand auto-load-balances one volume "gateway" onto each node. In ESXi, each of my hosts has four 1Gbit ports for SAN connectivity and thus four iSCSI connections to each VMFS volume (technically two would probably be fine since there are two NICs on each SAN node, but I had the extra ports...). This is accomplished via vSphere's Round Robin, not HP's MPIO. Under ESXi and LeftHand, when a host connects to a volume it is still talking to it through a single node, regardless of the number of iSCSI connections. I have verified this in the CMC by looking at the IPs associated with each iSCSI connection, and with HP tech support.

In Windows without HP's DSM for MPIO, you are stuck with your Veeam server's one NIC for all of your connections and one iSCSI connection per volume. In my case (six volumes and six nodes), I at least get the same benefit of having each iSCSI connection terminate on its own node. Under Parallel Processing, where it can potentially be pulling from multiple nodes (with 2Gbit bonds each) at one time, I can most definitely saturate the Veeam server's NIC.

Only in Windows with HP's DSM for MPIO do you get true iSCSI multi-pathing out of LeftHand, which is a shame to me... that power should be available to ESXi hosts, too. :cry:
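For my own sanity, here's the session layout as a little model. The point is that without the DSM every session still shares the Veeam server's single NIC, even though each session terminates on a different gateway node (per-link numbers are the same rough guesses as in my earlier post):

```python
GBIT = 118  # rough usable MB/s per 1Gbit link

# One gateway node per volume, the way LeftHand balances them today
volume_gateway = {f"vol{i}": f"node{i}" for i in range(1, 7)}

def read_ceiling_mbs(active_volumes, veeam_nics=1, node_bond_links=2):
    """Ceiling for concurrent reads: the nodes' bonded links on one side,
    the Veeam server's NIC(s) on the other; whichever is smaller wins."""
    nodes_in_use = {volume_gateway[v] for v in active_volumes}
    source_side = len(nodes_in_use) * node_bond_links * GBIT
    veeam_side = veeam_nics * GBIT
    return min(source_side, veeam_side)

# Parallel Processing pulling from three volumes at once:
print(read_ceiling_mbs(["vol1", "vol2", "vol3"]))                # one NIC, no DSM -> 118
print(read_ceiling_mbs(["vol1", "vol2", "vol3"], veeam_nics=2))  # what the DSM could unlock -> 236
```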

I did start planning for 10Gbit: I got a pair of Intel 10GBASE-T NICs for my new hosts, and our switches (stacked Dell PowerConnect 6248s for the LAN and a stacked pair of 6224s for the SAN) can support 10Gbit via expansion modules.

One can only spend so much money at a time before the CFO raises an eyebrow! :-D

Re: New Production Gear = Upgrade for Backups. Config Thoughts?

Post by dellock6 »

Sounds like a good plan indeed.
Yes, the lack of a dedicated PSP for ESXi is a shame; ESXi Round Robin is not that bad, but it could be "even" better with a PSP like other iSCSI arrays have. Be careful, however, with the new v11 and Adaptive Optimization: they added a feature called "preferred pages" that "locks" an iSCSI session even more to a given node. It makes sense if you have SSDs and don't want to waste time reading data over the network.

I agree with you that the first move should be adding 10G to the Veeam proxies.

Luca.
