-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Dec 12, 2014 9:43 pm
- Full Name: Jeremy Wright
- Contact:
Backup Performance Issues and Inconsistency
First a little about my setup
VMware ESXi 5.5 U2 on Cisco UCS servers.
SAN - we have two EMC XtremIO arrays (capable of 200,000 IOPS and huge bandwidth). They are connected to UCS through Fibre Channel at the Fabric Extender. They are also attached directly to our physical backup server (and proxy) by 10 Gb iSCSI. So VMs operate over FC and backups happen Direct SAN over iSCSI.
My backup server is a Dell R810 (Server 2012 R2) with 16 cores, 256 GB RAM, and three MD3200 SAS arrays, each individually attached to its own SAS HBA. I am using the built-in Windows iSCSI initiator with two dual-port Intel X520 NICs. iSCSI is redundantly connected to each XtremIO, and I can see all of my volumes properly with MPIO. The only hiccup has been that I had to manually add them to the proxy for Direct SAN to work.
I am in the testing phase to see which configurations will yield the quickest backups. I have a few disks carved up on my MD3200 using different segment sizes (512 and 256) and am matching those to the Local, LAN, and WAN target settings (so LAN goes to the 512-segment-size drive, and so on).
On my first few backups I can see that the bottleneck is reported as Source at 99%. This was a surprise because those all-flash arrays are crazy fast. The backups start, and as I watch statistics on Veeam, the XtremIO, and my MD3200, what I see is that at times I am pulling close to 500 MB/s from the array and pushing close to 400 MB/s onto the MD3200 (which can handle around 600 MB/s).
The job graph looks like a series of mountains and steep cliffs, as if the job is starting and stopping. This happens right in the middle of single volumes.
Am I missing something? I've read that a few people were getting well over 1,000 MB/s on their jobs. The fastest overall job I have completed averaged 202 MB/s, and I know my SAN can handle far more than that. I have tried tuning the Intel NICs and disabling TCP auto-tuning, but I've had no luck.
These are initial full copies, and I know that incrementals will be way faster, but I was expecting more performance.
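For anyone reproducing the auto-tuning test mentioned above: on Server 2012 R2 the TCP receive-window autotuning toggle is a netsh global setting (run from an elevated prompt; shown here as a command sketch, and note that disabling it can hurt as often as it helps on 10 Gb links):

```
rem Disable TCP receive-window autotuning, then verify the current settings
netsh int tcp set global autotuninglevel=disabled
netsh int tcp show global

rem To restore the default behavior afterwards
netsh int tcp set global autotuninglevel=normal
```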
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Backup Performance Issues and Inconsistency
Hi Jeremy,
the Source bottleneck indeed sounds strange at first sight, given your setup.
Just as a test, can you disable compression and deduplication in the Veeam jobs to see if the numbers change? I ran some tests with a similar machine (NetApp EF-550, all-flash) and a single proxy was able to run at around 700 MB/s, but in that case the bottleneck was the proxy itself.
Oh, and obviously always run full backups, at least for these tests, so the numbers can be compared.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- VP, Product Management
- Posts: 27377
- Liked: 2802 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- Contact:
Re: Backup Performance Issues and Inconsistency
In addition to this, I would recommend looking through the job stats to make sure there is no failover to network mode. Also, try disabling MPIO and see whether it makes any difference.
-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Dec 12, 2014 9:43 pm
- Full Name: Jeremy Wright
- Contact:
Re: Backup Performance Issues and Inconsistency
I tested with compression and dedupe turned off and I still see the same low performance I was getting previously.
It topped out at 400 MB/s with starts and stops; the overall job rate was just 200 MB/s, and Source is still listed as the bottleneck. I am going to try disabling MPIO and see what happens. There is no failover to NBD; it is running SAN mode the whole time.
-
- Enthusiast
- Posts: 57
- Liked: 3 times
- Joined: Jul 02, 2013 4:17 am
- Full Name: NIck
- Contact:
Re: Backup Performance Issues and Inconsistency
Hi,
I have the same problem; I am investigating with support, but nothing so far.
Having an EMC VNXe auto-tiered with SSD drives, I would expect much better performance, but I'm stuck at 120 MB/s if I'm lucky.
Hot-add or network mode doesn't make any difference.
I can pull data out at 400 MB/s, full bandwidth, if I Storage vMotion within the same storage (vMotion between datastores inside the same array).
The target is largely underutilized.
Please let me know if you make any progress. I have also disabled compression; that changed nothing.
-
- Product Manager
- Posts: 20415
- Liked: 2305 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: Backup Performance Issues and Inconsistency
What is your major bottleneck according to the job session log? Is it source, as well? Speaking about speeds, do you see 120 MB/s during full or incremental cycle? Thanks.
-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Dec 12, 2014 9:43 pm
- Full Name: Jeremy Wright
- Contact:
Re: Backup Performance Issues and Inconsistency
I have done a lot more testing using different configurations.
One problem I know I was having, with the backup seemingly starting and stopping, was directly related to MPIO. I have tested with EMC PowerPath, and it solved my issue of iSCSI dropping the connection to the volume (I found iScsiPrt errors in Event Viewer while using MPIO). With unlicensed PowerPath I am relegated to using only one link at a time.
On that single link I am averaging 320 MB/s with peaks up to 500 MB/s.
Once I get a proper license for PowerPath I am hoping to see a doubling of that number, but I won't know for a day or two.
I also set TcpNoDelay for my iSCSI NICs.
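For reference, iSCSI tuning guides usually set this per network interface in the registry (the `{interface-GUID}` below is a placeholder for the actual iSCSI NIC's interface GUID, and `TcpAckFrequency` is often set alongside it to fully disable Nagle-style delays; the exact keys vary between guides, so verify against your vendor documentation before applying):

```
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{interface-GUID}" /v TcpNoDelay /t REG_DWORD /d 1 /f
reg add "HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces\{interface-GUID}" /v TcpAckFrequency /t REG_DWORD /d 1 /f
```

A reboot is typically required for the values to take effect.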
I am also going to switch from Intel X520 to Broadcom 57810 NICs. The Broadcoms have iSCSI offload, and I found a great Dell whitepaper that compared the two cards.
Are there any other performance counters I can look at in Windows that might give me a clue as to how to squeeze out more performance? Should I get EMC involved?
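As a starting point (these are standard Windows counters; the 5-second sampling interval is arbitrary), watching disk queue depth and per-NIC throughput side by side with typeperf can show whether the stalls line up with the target arrays or the iSCSI links:

```
typeperf "\PhysicalDisk(*)\Avg. Disk Queue Length" "\PhysicalDisk(*)\Disk Write Bytes/sec" "\Network Interface(*)\Bytes Received/sec" -si 5
```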
-
- Novice
- Posts: 8
- Liked: 1 time
- Joined: Dec 12, 2014 9:43 pm
- Full Name: Jeremy Wright
- Contact:
Re: Backup Performance Issues and Inconsistency
After more testing, and now running a much larger backup (8 TB / 40 VMs), it seems that maybe I wasn't pushing Veeam hard enough.
I am now seeing the Network as the bottleneck. This is a physical server; shouldn't the backups go straight from SAN to storage? I understand it uses an agent and creates a TCP connection, but why is it going out the management IP of the server? That is only a 1 Gb link, and the backup is now saturating it. Shouldn't it hit the loopback? I found another thread that mentioned a registry edit to make this happen. Is that true, and can I get it if it exists?
-
- VeeaMVP
- Posts: 6166
- Liked: 1971 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Backup Performance Issues and Inconsistency
Hi Jeremy,
in a scenario where the proxy and repository are on the same machine, the network bottleneck is the TCP communication between them, which happens internally to the same server. It's not using the external interface; otherwise you would not be able to see the 320-500 MB/s you measured over a single 1 Gb link (which tops out around 125 MB/s...).
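The arithmetic behind that 1 Gb ceiling is simple (a quick sketch; the function name is mine, and protocol overhead means real throughput lands a bit below these line-rate numbers):

```python
# Convert a link speed in Gbit/s to a rough MB/s ceiling.
# 1 Gbit/s = 1000 Mbit/s, and there are 8 bits per byte, so the
# theoretical maximum ignores TCP/IP and Ethernet framing overhead.
def link_ceiling_mb_s(gbit_per_s: float) -> float:
    return gbit_per_s * 1000 / 8

print(link_ceiling_mb_s(1))   # 1 GbE: 125.0 MB/s - far below the observed 320-500 MB/s
print(link_ceiling_mb_s(10))  # 10 GbE: 1250.0 MB/s
```

So a sustained 320-500 MB/s simply cannot be flowing over the 1 Gb management NIC; it has to be internal.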
Yes, there is a registry key to enable faster internal communication, but as Anton explained here (http://forums.veeam.com/veeam-backup-re ... ck#p122157), it is still in the early stages of testing. If you would like to test it in your scenario (and someone else would too...), send me a private message on the forums and I can give you the registry key. Be aware that it's not fully supported yet.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1