Using object storage as a backup target
wparish
Novice
Posts: 5
Liked: 2 times
Joined: Jan 03, 2019 3:52 pm
Full Name: William Parish

High latency for entire network, only during SOBR offload

Post by wparish »

Greetings,

I first noticed this issue when I tried SOBR offload to S3 with 9.5, and I just gave up on the effort.

I'm trying again now with v10, and I see the same result. With working from home for months I've had to use the internet pretty much daily for extended periods, and I know it is normally nowhere near as bad as during the offload. I put a Nagios monitor in place this time. Normally my local router latency might intermittently peak at ~10ms, and latency to work averages ~12ms. Since my last offload started it's spiking as high as 5000ms.

I knew something was wrong last night as it was pretty much unusable when the job kicked in, so I applied throttling. I saw it using about 3MBs, so I dropped that down to 1500Kbs. It doesn't seem to make any difference. I'm wondering if it's not so much the volume of packets but the frequency of packets. And along those lines is there something I can do to throttle the 'flooding' behavior of this type of thing (I mean, if that's even what's happening, how can I check)?

I'm fairly confident after about 9 hours if I kill that job my internet will go back to normal pretty quickly. I'm thinking it's probably something to do with Cox trying to throttle my usage (in spite of charging me an extra $50 every month for unlimited usage).

Be that as it may, I searched the forums and found only one other reference to SOBR latency caused by a network issue. The resolution wasn't included in the follow-up; the thread only asked how to manually run a SOBR offload to test that it was fixed. I presume it was fixed, because the conversation stopped there.

The job itself never errors. As far as I can tell, Veeam is a happy clam. It's just that my performance for everything else going to the internet is really bad. I'm also concerned that if I'm impacting the upstream routers in some way, sooner or later Cox will take more drastic measures to curb my usage, and I don't want that.

There's an on-site latency monitor that triggers throttling for SAN storage. If something like that doesn't exist for SOBR offload, maybe it should? I normally never see my latency go up to 2-5 seconds, and I certainly never see that condition persist for 6 hours. It has to be directly related to the SOBR offload.

In the graphs the SOBR offload started around 9:30pm yesterday, and the graph runs through 6am today. You can see there's a definite change in latency (and even packet loss). The sandiego.ar01 host is the datacenter that hosts the S3 repository, and Cox 'hop 4' is just an intermediate hop within the Cox network. Localhost is the Nagios VM, and router.asus.com is my internet router. Packet loss is only observed outside the LAN, and internal latency remains acceptable for me.

[Image: Nagios latency and packet loss graphs]

Thank you,
- William

Gostev
SVP, Product Management
Posts: 27173
Liked: 4455 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: High latency for entire network, only during SOBR offload

Post by Gostev »

Could it be that your router is simply not keeping up with the number of connections? I assume you're using some consumer/SOHO unit, since I don't believe Asus builds enterprise-grade networking equipment? I might be wrong about that, but in any case the router would be my first suspect, before your ISP. So I think the best course of action for you here would be to open a case with Asus.

Alternatively, you can try reducing the number of connections to object storage that the offload process uses with the S3ConcurrentTaskLimit registry value, however this is not recommended from a performance perspective. The default and recommended value is 64, which is relatively few connections anyway. So even if you reduce it, that same connection count will still be too easy to reach and exceed with a few different applications working concurrently down the road. This approach is all about fighting symptoms instead of the root cause.

Thanks!

Re: High latency for entire network, only during SOBR offload

Post by wparish »

It's possible. I also noticed I'm not on the latest firmware. My max connections are set to 300000. I'll move over another job and start the SOBR offload, then check the TCP/UDP connections and whether any connections are dropped during the high latency period. I'll also check the CPU load, etc. While I did see latency increase on the Asus during my last test, it was nowhere near as significant as on the upstream routers. If it turns out the router is the issue I'll try connecting the ESX hardware directly to the cable modem.

However, my question was more about how to throttle it, and it looks like S3ConcurrentTaskLimit is a solid lead. There are some good Google hits on that value.

In my day job we deal with a lot of small businesses, and it's not that uncommon to find that they don't have IT staff to put in high-end routers. If the SOBR offload really has that kind of dependency, is there an HCL someplace I can refer to, so I can make sure presales is aware of this restriction? I'll still go through the extra troubleshooting steps, as I would like to know for sure where the bottleneck is.

selva
Enthusiast
Posts: 41
Liked: 3 times
Joined: Apr 07, 2017 5:30 pm

Re: High latency for entire network, only during SOBR offload

Post by selva »

High latency when the uplink is saturated has all the signs of excessive buffering somewhere upstream (maybe at the ISP). Google "bufferbloat". Your only remedy may be to throttle the upload. I have had good results using the Network traffic rules from the main menu and setting the throttling to about 50 to 60% of max upload bandwidth during office hours.
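
As a rough illustration of the 50 to 60% rule of thumb above, here is a minimal shell sketch. The function name `throttle_range` and the 10 Mbps example figure are hypothetical and not from this thread; it assumes you have measured your own maximum upload speed with the link otherwise idle, and the resulting value still has to be entered in the network traffic rule by hand.

```shell
#!/bin/sh
# Hypothetical helper: print 50% and 60% of a measured upload speed.
# $1 = measured maximum upload bandwidth in Mbps (your own measurement).
throttle_range() {
    awk -v up="$1" 'BEGIN { printf "%.1f-%.1f Mbps\n", up * 0.5, up * 0.6 }'
}

# Example: a 10 Mbps uplink (an assumed figure, not from this thread).
throttle_range 10
```

Starting at the low end of the range and raising it while watching ping times is probably the safer way to find a value that keeps the uplink queue from filling.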

Re: High latency for entire network, only during SOBR offload

Post by wparish » 1 person likes this post

I'm thinking it's the ISP. When it triggered there was no load on the router, and I only had about 147 connections out of 300000, on the latest firmware. Traceroute shows the very first Cox hop has a very high response time. It's not even moving that much data, and I already have it restricted to 1MB/s when it was previously averaging 3MB/s. I haven't tried the registry key yet, but I did set the concurrent tasks down. I guess if the registry key doesn't help then I just won't use the S3 offload.

RT-AC87R:/tmp/home/root# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
300000
RT-AC87R:/tmp/home/root# grep -c ^udp /proc/net/ip_conntrack
141
RT-AC87R:/tmp/home/root# grep -c ^tcp /proc/net/ip_conntrack
147
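
To put those counts in proportion, a small sketch like the following turns them into a percentage of the connection-tracking limit. The function name `conntrack_pct` is hypothetical, and it assumes an `awk` is available (BusyBox builds usually include one, but that is not confirmed for this router).

```shell
#!/bin/sh
# Hypothetical helper: connection-tracking usage as a percentage of the limit.
# $1 = tracked connections, $2 = table limit (the ip_conntrack_max value).
conntrack_pct() {
    awk -v used="$1" -v max="$2" 'BEGIN { printf "%.2f\n", used * 100 / max }'
}

# Figures from the router output above: 141 UDP + 147 TCP entries, limit 300000.
conntrack_pct $((141 + 147)) 300000
```

With the numbers above the table is well under 1% full, which is consistent with the router not being the bottleneck.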

I'll read up on bufferbloat in the meantime. Thanks for the tip.

Re: High latency for entire network, only during SOBR offload

Post by wparish » 1 person likes this post

So the registry key is the winner. I set it to 4, and in 1310 samples I had 0 packet loss, and an average ping of 92ms with a max of 373 (and a low of 8). I can live with that. I don't need great performance, it's just a long term offload. Thanks.
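
For anyone who wants to collect the same kind of numbers without Nagios, here is a minimal sketch that summarizes ping round-trip times. The function name `rtt_summary` is hypothetical, it assumes a ping that prints `time=<ms>` for each reply (as the common Linux ping does), and it does not count lost packets.

```shell
#!/bin/sh
# Hypothetical helper: summarize round-trip times read from stdin,
# one value in milliseconds per line.
rtt_summary() {
    awk '
        NR == 1 { min = $1 + 0; max = $1 + 0 }
        {
            v = $1 + 0
            sum += v
            if (v < min) min = v
            if (v > max) max = v
        }
        END {
            if (NR > 0)
                printf "samples=%d min=%.1f avg=%.1f max=%.1f\n", NR, min, sum / NR, max
        }
    '
}

# Usage sketch; <host> is a placeholder for the hop you want to watch:
#   ping -c 100 <host> | sed -n "s/.*time=\([0-9.]*\).*/\1/p" | rtt_summary
```

Running it once with the offload idle and once while it is active gives a before/after comparison for any change to the task limit or throttling.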

Re: High latency for entire network, only during SOBR offload

Post by wparish »

I also found a great article on techsegun.com by Daniel Segun on validating and resolving bufferbloat. By following his suggestions and implementing a QoS rule for the Veeam VM, I further reduced my latency from an 87ms average down to a 10ms average.

In retrospect I could probably have skipped the reg key had I done the QoS rule first. Still, that's good for me to know, as not everyone I work with will have a router capable of QoS.

Now I have a couple ways to resolve the issue.

Thank you for the various suggestions, I've learned something new, so the effort was worthwhile.
