aarick
Influencer
Posts: 11
Liked: 3 times
Joined: Apr 02, 2015 6:40 am
Full Name: Eric Yew
Contact:

Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

We recently implemented a new NetApp FAS and migrated from iSCSI to NFS. NFS runs on a 10G network with jumbo frames.
We have not been able to get backups to work properly with storage integration.
A backup will sometimes run at up to 120 MB/s. Most of the time, however, the speed drops to a crawl, below 100 KB/s, so the backup either fails or we stop it because the job would take days to complete.
Turning off storage integration stops the issue, but then we lose the benefit of using storage snapshots instead of VMware snapshots, and the backup runs at a lower speed.
We have been trying to figure out what the issue might be but have not been able to pinpoint it.
Does anyone have any suggestions, or has anyone experienced this issue?

Thanks,
Eric
Vitaliy S.
VP, Product Management
Posts: 27112
Liked: 2719 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Netapp Storage Integration Backup Slows to a crawl

Post by Vitaliy S. »

Hi Eric,

Can you please give us a few more details about your infrastructure? When does the performance drop? Does it happen for all VMs? What is the bottleneck stat? What does the job session log show as the slowest operation?

Thanks!
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

Vitaliy S. wrote:
Hi Eric,

Can you please give us a few more details about your infrastructure? When does the performance drop? Does it happen for all VMs? What is the bottleneck stat? What does the job session log show as the slowest operation?

Thanks!

Hi Vitaliy,

I currently have a case open (Case # 00868756). Here is a quick overview of my environment.
Infrastructure:
2 sites: Production and Disaster Recovery

Production:
Physical Veeam backup server on Windows Server 2012 R2. Backup repository is on a set of local SATA drives.
2 x 10 Gbps NICs: 1 connected to the NFS network (jumbo frames enabled), 1 connected to the LAN. Backup server acts as proxy.
VMware 5.5 on IBM blades, with a NetApp FAS connected via NFS. NFS network has jumbo frames enabled.

Disaster recovery:
Physical Veeam backup server on IBM blades, running Windows Server 2008 R2. Backup repository is on NetApp CIFS.
2 x 10 Gbps NICs: 1 connected to the NFS network (jumbo frames enabled), 1 connected to the LAN. Backup server acts as proxy.

The 2 sites are connected via a 2 x 10 Gbps managed fibre link with <1 ms latency.

I think the performance drop only happens when the 2 backup servers communicate with each other while running a replication, but I cannot be sure. Once the performance drops, it stays below 100 KB/s indefinitely. A reboot sometimes seems to fix the issue temporarily. It affects all VMs. The bottleneck is always source. It is the hard disk transfer that slows down.
When I disable "Use storage snapshots", the job runs at an acceptable (but slower) speed, between 30 and 70 MB/s.
I currently have a backup job running fine in Prod at 120-180 MB/s with the backup server in DR switched off. Once the backup job completes, I will turn on the DR backup server and run a replication job. I will then test running a backup in Prod again; I expect both the replication and the backup job to drop below 100 KB/s again, as has happened over the last few days of troubleshooting.

Thanks for having a look. Hope you can point me in the right direction.
Gostev
Chief Product Officer
Posts: 31524
Liked: 6700 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Netapp Storage Integration Backup Slows to a crawl

Post by Gostev »

Hi, Eric.

We would like the devs to take a look at this - what is the support case ID for this?
This is certainly not a known issue; other users are not experiencing anything like that.

Thanks!
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

Hi Gostev,

It's Case # 00868756.

Thanks,
Eric
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

So I ran another 3 backup jobs as active full backups.
The 1st backup was of 2 Exchange servers and ran successfully. With parallel processing, the speed was between 100 and 190 MB/s.
Then I turned on the DR backup server and started another Prod backup job with 30 VMs. The processing rate was 256 MB/s, but 7 VMs failed with ChannelError: ConnectionReset.
I then retried the job, and this time the processing rate was only 6 MB/s (parallel processing). I stopped the job as it was going to take too long.
I ran a 3rd job with 1 VM. It started great at up to 128 MB/s for 5 minutes, then it dropped and is now sitting at 0 KB/s.
It is seriously baffling me why this is happening.
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

Reviewing the logs, I see the following errors:

Code:

[07.04.2015 11:16:02] <  8744> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| Reseting NFS session on try: 5
[07.04.2015 11:16:31] <  4236> cli| Binding to port range. Min: 300, Max: 999
[07.04.2015 11:16:31] <  4236> cli| Client connection bound to port: 963
[07.04.2015 11:16:31] <  4236> cli| Binding to port range. Min: 964, Max: 999
[07.04.2015 11:16:31] <  4236> cli| Client connection bound to port: 965
[07.04.2015 11:16:31] <  4236> cli| Binding to port range. Min: 966, Max: 999
[07.04.2015 11:16:31] <  4236> cli| Client connection bound to port: 969
[07.04.2015 11:16:31] <  4236> nfs| Connected to NFS server: 172.16.112.22, port 2049. FileHandle: 0000084080000040000f4a3af6784080000040000f4a3af67400000000
[07.04.2015 11:16:31] <  9400> cli|     - 5%, workload src: 99/10/0, ntf: 93/10/1
[07.04.2015 11:16:58] <  8744> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:17:00] <  7592> cli| Number of sessions: 3. Interval: 900 sec. 
[07.04.2015 11:17:25] <  7188> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:17:25] <  7188> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:17:25] <  7188> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:17:25] <  7188> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:17:25] <  7188> nfs| Reseting NFS session on try: 5
[07.04.2015 11:17:25] <  7188> cli| Binding to port range. Min: 300, Max: 999
[07.04.2015 11:17:25] <  7188> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:17:25] <  7188> cli| Binding to port range. Min: 300, Max: 999
[07.04.2015 11:17:25] <  7188> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:17:25] <  7188> cli| Binding to port range. Min: 300, Max: 999

NetApp has reviewed their logs and can't see any issues on their side. Any idea what would cause a disk read error? The network team has also run a trace on the network and cannot see any errors or dropped frames on the path.
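
Editor's note: when comparing runs, it can help to tally the warnings in a log excerpt like the one above per thread. The following is a minimal sketch, assuming every line follows the `[timestamp] <thread> source| message` layout seen in the excerpt. The regular expression and the function name `summarize` are illustrative and not part of any Veeam tooling.

```python
import re
from collections import Counter

# Assumed line layout, inferred only from the excerpt in this thread:
# [07.04.2015 11:16:31] <  4236> nfs| WARN|Error while executing command: NfsDiskRead.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+<\s*(?P<thread>\d+)>\s+(?P<src>\w+)\|\s*(?P<msg>.*)$"
)

def summarize(log_text):
    """Count NfsDiskRead warnings and NFS session resets per thread ID."""
    read_errors = Counter()
    resets = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not match the assumed layout
        thread, msg = m.group("thread"), m.group("msg")
        if "Error while executing command: NfsDiskRead" in msg:
            read_errors[thread] += 1
        elif msg.startswith("Reseting NFS session"):  # spelling as it appears in the log
            resets[thread] += 1
    return read_errors, resets

# A few lines copied from the excerpt above.
SAMPLE = """\
[07.04.2015 11:16:02] <  8744> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| WARN|Error while executing command: NfsDiskRead.
[07.04.2015 11:16:31] <  4236> nfs| Reseting NFS session on try: 5
[07.04.2015 11:16:31] <  4236> cli| Binding to port range. Min: 300, Max: 999
"""

if __name__ == "__main__":
    errors, resets = summarize(SAMPLE)
    for thread, count in errors.most_common():
        print(f"thread {thread}: {count} NfsDiskRead warnings, {resets[thread]} session resets")
```

A per-thread count only shows where the retries cluster; it does not by itself say whether the cause is on the storage, network, or proxy side.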
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

NetApp says that this is an issue with Veeam and storage integration. How can we get this looked at and fixed? They brought up this forum post, but we have already applied the new Veeam agent:
http://forums.veeam.com/veeam-backup-re ... 25377.html
Gostev

Re: Netapp Storage Integration Backup Slows to a crawl

Post by Gostev »

Eric, please continue to work with our support (you have not been responding for over a week). I recommend that you ask for the case to be escalated to a higher support tier. Generally speaking, there are no known issues matching your description, and we have many customers using NetApp integration with NFS successfully, which makes an environment-specific issue very likely to be the cause here. Thanks!
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

We were waiting on NetApp to investigate, and I responded to Veeam support once they got back to us, which was not a week ago but before I posted on this forum. I believe you mentioned that you were going to get the devs to look into this, and I have not heard anything back since. That is why I posted the latest update on this forum, to reach out for help as we are stuck. I did not realise Veeam's view was that this is our issue to sort out. We have investigated our environment end to end to try to pinpoint the issue, with no luck. If you look at the forum link I mentioned above, users were having a similar issue, and it was only resolved by disabling storage integration. So I'm not sure how you can say that there are no known issues. I've now asked support to escalate to a higher tier. Thanks anyway.
bteichner
Enthusiast
Posts: 30
Liked: 2 times
Joined: Apr 30, 2012 5:54 pm
Full Name: Brian Teichner
Contact:

Re: Netapp Storage Integration Backup Slows to a crawl

Post by bteichner »

We have just started to use Veeam with NetApp storage snapshots, and I'm seeing very similar results using NetApp c-mode. I just opened a case with Veeam today to review our setup. Are you using 7-mode or c-mode when you see this issue?
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

bteichner wrote:
We have just started to use Veeam with NetApp storage snapshots, and I'm seeing very similar results using NetApp c-mode. I just opened a case with Veeam today to review our setup. Are you using 7-mode or c-mode when you see this issue?

We are using c-mode. Judging by the other forum post I linked above, I think this issue is much more widespread than Veeam is ready to admit. We were hoping that moving to storage integration would solve all the Veeam issues we have had over the years, but it has now made things worse.
Are you using iSCSI or NFS? A 10 Gbps or 1 Gbps network? Or are you on FC?
bteichner

Re: Netapp Storage Integration Backup Slows to a crawl

Post by bteichner »

We are using 10Gbps NFS. We ended up setting up cluster-management LIFs for each of the NetApp nodes, but after entering the first IP within Veeam, it discovers all the volumes within the SVM. I think this may be what is causing the slowness with some of the backups. If the volume is on the node that Veeam is connecting to (through the LIF IP), then the backup seems to run as expected. If the VM is on a volume on the second node, traffic will need to go through the NetApp cluster switches. I should have a call with Veeam support tomorrow and will continue to review this with them.
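
Editor's note: the locality theory above can be checked on paper before the support call. The following is a minimal sketch, assuming you already know which node owns each volume and which node hosts each LIF. All volume, LIF, and node names are hypothetical, and the function `classify_paths` is illustrative rather than part of any NetApp or Veeam API.

```python
def classify_paths(volume_home, lif_home, backup_lif):
    """Return {volume: "direct" | "indirect"} for reads issued through backup_lif.

    volume_home: volume name -> node that owns the volume
    lif_home:    LIF name -> node currently hosting that LIF
    """
    lif_node = lif_home[backup_lif]
    return {
        vol: "direct" if node == lif_node else "indirect"
        for vol, node in volume_home.items()
    }

# Hypothetical inventory, not taken from the thread.
volume_home = {"vmware_nfs_01": "node-a", "vmware_nfs_02": "node-b"}
lif_home = {"lif_nfs_a": "node-a", "lif_nfs_b": "node-b"}

if __name__ == "__main__":
    for vol, path in classify_paths(volume_home, lif_home, "lif_nfs_a").items():
        print(f"{vol}: {path}")
```

Volumes flagged "indirect" are the ones whose reads would cross the cluster interconnect. If the slow backups line up with that list, it supports the theory; if they do not, the cause is probably elsewhere.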
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

Veeam provided us with a hotfix that replaces the NetApp DLL and allows us to set the priority of the IP to use with NetApp. So we assigned the IP of the LIF that hosts all our VMware NFS volumes.
We are still testing this, and speeds seem more consistent now. However, more testing is required, and we now randomly get jobs failing with a connection error:
ChannelError: ConnectionReset

Just a quick update. Still working with Veeam and testing further to confirm whether this fixes the issue.
bteichner

Re: Netapp Storage Integration Backup Slows to a crawl

Post by bteichner »

How has your additional testing been going?
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

bteichner wrote:
How has your additional testing been going?

Backup and replication are still inconsistent with storage integration turned on. We no longer get the speed drop, but we randomly get the following errors:
· ChannelError: ConnectionReset
· An item with the same key has already been added.
· [SanSnapshotMount] Failed to mount NFS share to proxy.

The issue does not seem to be related to the LIF as we have disabled the 2nd interface and still get the errors above.
lchichiarelli
Novice
Posts: 6
Liked: never
Joined: Jun 24, 2014 1:30 pm
Full Name: Luca Chichiarelli
Contact:

Re: Netapp Storage Integration Backup Slows to a crawl

Post by lchichiarelli »

Same problem here: NetApp 2552, Data ONTAP 8.3 and 10 Gbit.

Using NFS for VMs, VMware ESXi 5.5 U2 and Veeam 8 Update 2a.

With storage integration enabled, backups run slow as hell... Has anyone sorted this out?
foggy
Veeam Software
Posts: 21070
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Netapp Storage Integration Backup Slows to a crawl

Post by foggy »

Luca, have you contacted technical support already?
lchichiarelli

Re: Netapp Storage Integration Backup Slows to a crawl

Post by lchichiarelli »

@foggy Not yet. I was just wondering whether there is an official fix for this or whether I have to contact tech support to obtain one.

Thanks
foggy

Re: Netapp Storage Integration Backup Slows to a crawl

Post by foggy »

No official fix, since most of the issues are environment-specific. Each case should be investigated separately.
joshuacollins
Lurker
Posts: 1
Liked: never
Joined: Jun 24, 2015 4:27 am
Full Name: Joshua Collins
Contact:

Re: Netapp Storage Integration Backup Slows to a crawl

Post by joshuacollins »

Chiming in here after finding this post online. We have the exact same issue in our environment: VMware 5.5 U2 (the environment is brand new, not more than a couple of weeks old), with a couple of FAS2240-4 NetApps. We are also running VMs on NFS storage and getting the exact same error:

"Error: [SanSnapshotMount] Failed to mount NFS share to proxy."

Also, enabling the failover setting doesn't work. I will try disabling storage integration entirely, but this is disheartening because we just spent a ton of money on Enterprise Plus edition licensing to get this feature (and it doesn't work...).

:?
foggy

Re: Netapp Storage Integration Backup Slows to a crawl

Post by foggy »

Joshua, please contact technical support so that our technical guys can take a look at your environment.
aarick

Re: Netapp Storage Integration Backup Slows to a crawl

Post by aarick »

Our issue seems to have stabilised now. I had to ensure that only the VMware volumes are selected to be scanned periodically.
The issue seems to be that we had a lot of CIFS shares, and when Veeam scans the FAS, it uses up all the ports, causing the backup to fail. We have limited the scan to VMware volumes only and have not had the issue for a few weeks. Hope that helps others.
Storage Infrastructure > NetApp > <FAS_Name>...right click...Choose volumes...Only these volumes...
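
Editor's note: the job log earlier in this thread shows the NFS client binding source ports in the 300-999 range. One way to sanity-check the port-exhaustion diagnosis on a proxy is to count how many of those ports are connected to the filer. The following is a minimal sketch, assuming `netstat -an` style output with columns for protocol, local address, remote address, and state. The local address 172.16.112.10 is hypothetical, and whether the storage rescan draws from this same range is the poster's diagnosis, not something this sketch confirms.

```python
PORT_MIN, PORT_MAX = 300, 999  # range reported in the job log in this thread

def count_reserved_ports(netstat_text, server_ip, server_port=2049):
    """Return the sorted local ports in PORT_MIN..PORT_MAX connected to the NFS server."""
    used = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed columns: proto, local address, remote address, state
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, remote = fields[1], fields[2]
        if remote != f"{server_ip}:{server_port}":
            continue
        try:
            port = int(local.rsplit(":", 1)[1])
        except (IndexError, ValueError):
            continue  # address without a parsable port
        if PORT_MIN <= port <= PORT_MAX:
            used.add(port)
    return sorted(used)

# Hypothetical sample; only the server address 172.16.112.22 appears in the thread.
SAMPLE = """\
  TCP    172.16.112.10:963      172.16.112.22:2049     ESTABLISHED
  TCP    172.16.112.10:965      172.16.112.22:2049     ESTABLISHED
  TCP    172.16.112.10:49152    172.16.112.22:2049     ESTABLISHED
  TCP    172.16.112.10:969      172.16.112.50:445      ESTABLISHED
"""

if __name__ == "__main__":
    used = count_reserved_ports(SAMPLE, "172.16.112.22")
    total = PORT_MAX - PORT_MIN + 1
    print(f"{len(used)} of {total} ports in range {PORT_MIN}-{PORT_MAX} in use: {used}")
```

To use it on a real proxy, paste in the captured `netstat -an` text in place of `SAMPLE`. A count near the size of the range during a rescan would be consistent with the explanation above.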
AMS
Expert
Posts: 145
Liked: 33 times
Joined: Mar 06, 2012 6:32 pm
Full Name: Ari Saperstein
Contact:

Re: Netapp Storage Integration Backup Slows to a crawl

Post by AMS »

For those inside Veeam, there is another case open for this matter
Storage Snapshot LIF Selection - Case #0097081
AMS

Re: Netapp Storage Integration Backup Slows to a crawl

Post by AMS »

Sorry. Above case is 00970819