-
- Novice
- Posts: 3
- Liked: never
- Joined: Jul 11, 2022 3:09 pm
- Full Name: Mike Brennan
- Contact:
Veeam Direct SAN Backup Performance
Hi All -
I am trying to set up Direct SAN backup jobs with the infrastructure below and was wondering what the performance numbers should look like. As of now I'm only getting ~600 Mbps throughput. I have talked with both Veeam and Pure, and tested the networking internally to rule out any lag on our internal infrastructure.
Source - 12x VMware hosts with dual 100 Gb iSCSI connections per host to a Pure FlashArray X50 (10 LUNs @ 10 TB each) where the VMs are stored, connecting via a 100 Gb Mellanox switch dedicated to this connection.
Veeam proxy - Dell x740 bare-metal Veeam server with two separate 100 Gb Mellanox NICs (one connected to the above Mellanox switch to access the VMware LUNs, one 100 Gb that passes through a top-of-rack 100 Gb switch and a 100 Gb firewall to the destination array).
Destination - Pure FlashBlade 200 TB LUN connected to the above-mentioned 100 Gb firewall.
Both source and destination are all-flash, with 40GB controller throughput. I can provide any additional info if needed; any input is helpful. Thanks.
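To put the ~600 Mbps figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the 10x 10 TB LUNs are full and that throughput stays constant; the 40 Gbps row is just one reading of the "40GB controller throughput" figure above.

TB = 10**12
data_bytes = 10 * 10 * TB                       # 10 LUNs @ 10 TB each (from the post above)

def hours_at(mbps):
    """Full-read time in hours at a given throughput in megabits per second."""
    bytes_per_sec = mbps * 1_000_000 / 8
    return data_bytes / bytes_per_sec / 3600

for label, rate_mbps in [
    ("observed ~600 Mbps", 600),
    ("~10 Gbps (roughly one FlashBlade node)", 10_000),
    ("40 Gbps (controller ceiling, if read as gigabits)", 40_000),
]:
    print(f"{label:52s} -> {hours_at(rate_mbps):6.1f} h for a 100 TB full pass")

At the observed rate a 100 TB full pass would take roughly two weeks, so the 100 Gb links themselves are nowhere near the limiting factor.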
-
- VP, Product Management
- Posts: 7077
- Liked: 1510 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Veeam Direct SAN Backup Performance
What does the job bottleneck analysis say in the job statistics?
-
- Novice
- Posts: 3
- Liked: never
- Joined: Jul 11, 2022 3:09 pm
- Full Name: Mike Brennan
- Contact:
Re: Veeam Direct SAN Backup Performance
It says the target, which makes no sense to me, as it's an extremely high-performance endpoint that isn't being used by many other applications, if any. A Veeam tech sent this to me, and I am awaiting Pure support to try to make sense of it all.
[26.05.2022 09:37:39] <61> Info [AP] (34a0) output: --pex:71;91618803712;48513417216;0;48513417216;41024497802;71;83;77;54;4;98;132980458597140000
[26.05.2022 09:37:49] <62> Info [AP] (34a0) output: --pex:73;94282711040;49615994880;0;49615994880;41811068134;71;83;77;54;4;98;132980458699150000
[26.05.2022 09:37:59] <81> Info [AP] (34a0) output: --pex:74;95926353920;50874286080;0;50874286080;42644450978;71;83;77;54;4;98;132980458799130000
[26.05.2022 09:38:10] <57> Info [AP] (34a0) output: --pex:76;98762227712;51975290880;0;51975290880;43516771170;71;83;77;54;4;98;132980458907240000
I'm wondering if there is something I'm missing from the Veeam standpoint, other than the few tweaks I followed from the documentation for writing to a dedupe appliance, or if anyone else with this type of setup has had similar issues.
Thanks!
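For reference, a small Python sketch that turns consecutive --pex samples like the ones above into rates. The --pex field layout is not publicly documented, so treating the second and third numbers as cumulative byte counters is an assumption (it matches how they grow between samples), and "job.log" is a placeholder file name.

import re
from datetime import datetime

LINE = re.compile(r"\[(?P<ts>[\d.]+ [\d:]+)\].*--pex:(?P<fields>[\d;]+)")

def parse(line):
    m = LINE.search(line)
    if not m:
        return None
    ts = datetime.strptime(m.group("ts"), "%d.%m.%Y %H:%M:%S")
    return ts, [int(x) for x in m.group("fields").split(";")]

def rates(lines):
    samples = [p for p in map(parse, lines) if p]
    for (t0, f0), (t1, f1) in zip(samples, samples[1:]):
        dt = (t1 - t0).total_seconds()
        for idx in (1, 2):                      # assumed cumulative byte counters
            mbits = (f1[idx] - f0[idx]) * 8 / dt / 1e6
            print(f"{t0:%H:%M:%S} -> {t1:%H:%M:%S}  field[{idx}]: {mbits:8.1f} Mbit/s")

with open("job.log") as f:                      # placeholder: path to the task log
    rates(f)

Run over the four samples above, field[1] works out to roughly 1.3-2.1 Gbit/s per interval, which is a useful cross-check against the throughput shown in the job statistics.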
-
- VP, Product Management
- Posts: 7077
- Liked: 1510 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Veeam Direct SAN Backup Performance
Someone should check your configuration.
I guess you write to the SMB part of the FlashBlade.
By design it should look like the following:
Scale-Out Backup Repository 1
- Extent 1 with Gateway Server 1 writing into the NFS part of the FlashBlade, to Folder 1
- Extent 2 with Gateway Server 2 writing into the NFS part of the FlashBlade, to Folder 2
- Extent 3 with Gateway Server 3 writing into the NFS part of the FlashBlade, to Folder 3
- Extent 4 with Gateway Server 4 writing into the NFS part of the FlashBlade, to Folder 4
Repeat until you have as many extents and gateway servers in use as you have FlashBlade nodes.
Overall, as we write through gateway servers, you have to check this setting.
You also have to check how many task slots you allow for operation (on the proxy and repository side).
Why the above configuration? The FlashBlade gives our gateway server (through DNS load balancing) one IP address to work with, which is basically one node.
If you have only one gateway server, the whole traffic is handled by that node. By spreading the processing out to multiple gateway servers (VMs, maybe), you can make sure that multiple nodes are used for backup traffic.
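To see the DNS behaviour described above from a gateway's point of view, here is a quick sketch; "flashblade-data.example.local" is a placeholder for your FlashBlade data-VIP hostname.

import socket
from collections import Counter

HOST = "flashblade-data.example.local"          # placeholder data-VIP hostname

hits = Counter()
for _ in range(20):
    infos = socket.getaddrinfo(HOST, 2049, proto=socket.IPPROTO_TCP)
    hits[infos[0][4][0]] += 1                   # the address a single NFS mount would end up using

for ip, count in hits.most_common():
    print(f"{ip}: returned first {count}/20 times")

Depending on resolver caching you may well see a single address every time, which is exactly the point: one gateway server ends up talking to one node, so spreading extents across several gateway servers is what brings additional blades into play.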
-
- VP, Product Management
- Posts: 7077
- Liked: 1510 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Veeam Direct SAN Backup Performance
For v12 I would look into Direct Backup to Object Storage with the FlashBlade, as it will balance the load based on normal S3 processing.
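Not how Veeam implements it internally, but as an illustration of why object access balances more naturally: an S3 client splits a large upload into many parallel part uploads, which the FlashBlade can serve from multiple nodes without any SOBR/gateway layout work. Endpoint, credentials, bucket and file names below are placeholders (boto3 against an S3-compatible endpoint).

import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client(
    "s3",
    endpoint_url="https://flashblade-s3.example.local",    # placeholder FlashBlade S3 endpoint
    aws_access_key_id="ACCESS_KEY",                        # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)

cfg = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # switch to multipart above 64 MiB
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=16,                     # 16 part uploads in flight at once
)

s3.upload_file("backup-test.bin", "veeam-bucket", "backup-test.bin", Config=cfg)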
-
- Veeam Software
- Posts: 688
- Liked: 150 times
- Joined: Jan 22, 2015 2:39 pm
- Full Name: Stefan Renner
- Location: Germany
- Contact:
Re: Veeam Direct SAN Backup Performance
@mbrennan99, can you please clarify whether the FlashBlade is connected via NAS or, as you wrote above, via LUN (I guess iSCSI, given the 100G Ethernet)?
If the environment looks like what you describe, the numbers should be good.
There are a couple of things you can test. The first thing I would do is test with some Veeam hot-add proxies running in the same VMware environment, sending data to the repo servers over the 100G.
Essentially this is performance-related, which can cover a lot of configuration (task slots in Veeam, network settings on the LAN, etc.).
So the more details you can provide, the more ideas you may get here.
Thanks
Stefan Renner
Veeam PMA
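A minimal single-stream TCP probe (dependency-free Python; port number and duration are arbitrary) is another way to rule the raw proxy-to-repository path in or out, independent of Veeam and of the storage on either end:

import socket, sys, time

BUF = bytearray(4 * 1024 * 1024)                # 4 MiB send buffer

def server(port):
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        total, start = 0, time.time()
        while chunk := conn.recv(1 << 20):      # read until the sender closes
            total += len(chunk)
        secs = time.time() - start
        print(f"received {total / 1e9:.1f} GB in {secs:.1f} s = {total * 8 / secs / 1e9:.2f} Gbit/s")

def client(host, port, seconds=10):
    with socket.create_connection((host, port)) as conn:
        deadline = time.time() + seconds
        while time.time() < deadline:
            conn.sendall(BUF)

# Usage: "python probe.py server 5201" on the repository side,
#        "python probe.py client <repo-ip> 5201" on the proxy.
if sys.argv[1] == "server":
    server(int(sys.argv[2]))
else:
    client(sys.argv[2], int(sys.argv[3]))

Bear in mind that a single TCP stream rarely fills a 100 Gb link; several parallel streams (much like several Veeam task slots) are usually needed, which is another reason the task-slot settings matter.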
-
- Novice
- Posts: 3
- Liked: never
- Joined: Jul 11, 2022 3:09 pm
- Full Name: Mike Brennan
- Contact:
Re: Veeam Direct SAN Backup Performance
Thanks for the suggestions; I'm going to implement a few of them and get back to you. The destination Pure FlashBlade volume is connected via NFS, and the source Pure FlashArray is connected via iSCSI (both 100 Gb Ethernet).
Thanks!
-
- VP, Product Management
- Posts: 7077
- Liked: 1510 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Veeam Direct SAN Backup Performance
Are you saying that you have created an NFS-based datastore under VMware on the FlashBlade and write to a VM disk that is placed on that VMware datastore?
Can you please share some details on how you created the "volume" on the NFS share and how you chose to add it in Veeam?
-
- Novice
- Posts: 3
- Liked: 2 times
- Joined: Jan 06, 2020 7:41 pm
- Full Name: Hunter Kaemmerling
- Contact:
Re: Veeam Direct SAN Backup Performance
Some clarification on the above would be helpful. I have the same setup, but with 32 Gb Fibre Channel and UCS.
2x physical 2019 proxies, each with 40 cores.
Each proxy has dual 40 Gb connections and dual 32 Gb FC connections.
X70R2 array.
15x 52 TB FlashBlade with 4x 40 Gb connections.
4x 250 TB volumes presented to 4x gateways, 2x physical and 2x virtual. It had to be 2 virtual for the time being, until the additional physical servers come in.
We also oversubscribed our proxies 2:1. I know Veeam says don't do this, so don't do what I'm doing by any means, but we run 80 tasks per physical proxy. We mostly did this to get more performance; the CPUs were barely being hit, so we added more tasks until we started seeing some decent utilization.
We can push 40-60 Gbps sustained during our backups after some tweaking. Our bottleneck is actually the array itself. During backups we push the load to 100% sustained for a couple of hours to back up the ~1k or so VMs across about 15 different backup jobs.
We've seen spikes of 90+ Gbps when we do our Oracle and SAP HANA backups during the same window. The FlashBlade eats it like it's nothing. Kind of awesome, IMO.
Originally we could only push 5-10 Gbps. This is not a Pure/Veeam issue but essentially an LACP load-balancing "issue". I say "issue" because it's doing exactly what it is supposed to do.
LACP load-balances on port/IP/MAC hashes. If you have a single VIP on the FlashBlade and a single proxy/gateway server sending data, you'll only have roughly a single blade's worth of bandwidth, ~10 Gbps minus overhead. Adding additional proxies/gateways and additional VIPs on the FlashBlade creates more hashes for the network to load-balance across more paths, in turn allowing more bandwidth.
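A toy model of the per-flow hashing described above: real switches use vendor-specific hash inputs and the addresses here are made up, but any deterministic per-flow hash pins a fixed source/destination pair to a single member link.

from ipaddress import ip_address
from itertools import product

MEMBERS = 4                                     # links in the port channel / LAG

def member_for(src, dst, sport=2049, dport=2049):
    key = (int(ip_address(src)), int(ip_address(dst)), sport, dport)
    return hash(key) % MEMBERS                  # stand-in for the switch's hash function

gateways = [f"10.0.0.{i}" for i in range(11, 15)]   # made-up gateway addresses
vips     = [f"10.0.1.{i}" for i in range(21, 25)]   # made-up FlashBlade VIPs

# One gateway talking to one VIP: every packet of that flow rides one member link.
print("single flow uses member link:", member_for(gateways[0], vips[0]))

# Several gateways and several VIPs: the flows spread across the LAG.
used = {member_for(g, v) for g, v in product(gateways, vips)}
print("member links in use:", sorted(used))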
-
- Influencer
- Posts: 13
- Liked: 6 times
- Joined: Oct 17, 2016 2:32 pm
- Full Name: Gary Nalley
- Contact:
Re: Veeam Direct SAN Backup Performance
I have to admit to being a little confused by this thread. I believe the terms FlashArray and FlashBlade are being used interchangeably.
If you are using a FlashBlade, why would you not look strongly at using the object interface into the unit? I know there are currently opportunities for improvement with the object functionality in the current versions of Veeam, but based on what I have seen with v12, object is more attractive.
IMO, "traditional" file-system access to backup storage allows for all kinds of risks and issues that are better handled with object technology. For example, current ransomware typically attacks files on file systems (mounted or networked) and has yet to make the jump to API connections like those used with object storage.
Just something to think about... or maybe your responses will give me something to think about...
Thanks
Gary
-
- VP, Product Management
- Posts: 7077
- Liked: 1510 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
- Contact:
Re: Veeam Direct SAN Backup Performance
What I do not understand from the OP's statements is that the FlashBlade is used to present a volume to the backup server. I wonder how this is done, as the system is mainly a filer/object storage, so that I can guide the OP on fixing the performance issues.