16

I have a network that has been experiencing slow internet speeds. After a great deal of troubleshooting, I have determined that any streaming content/downloads will cause the latency of WAN traffic to explode.

For example, under no load, I ping 8.8.8.8 at about 30ms. If I start streaming YouTube on the same computer, latency jumps to around 500ms, with a variance of around 400ms. If I turn off the video, the latency returns to 30ms. But if another user on the same LAN starts streaming Pandora, the problem returns.

My network runs off a single 10/100 switch. The switch is connected directly to the DSL router. I typically have a 6Mb connection.

In troubleshooting, I have completed the following:

  • Scanned with Wireshark from several workstations looking for erroneous packets. (I would include the captures, but they contain confidential info.) Nothing even remotely out of the ordinary.
  • Replaced router with an upgraded model, then upgraded firmware.
  • Had ISP increase speed which measured correctly on speedtest.net (10 down, 1.5 up). Problem was exactly the same.
  • Had the ISP swap out cards on their end, just in case they had bad hardware/port.
  • Tested at another office with the exact same ISP/package. Had multiple computers streaming YouTube @ 1080p and Pandora without impacting latency.
  • Shut down every computer but one and ran at night when no users were there.
  • Monitored LAN traffic, which never experiences a latency problem.

I am aware that, if I am reaching a bandwidth limit or the speed is bottlenecking at some hardware, it will cause this problem. However, it doesn't seem that way at all. Almost any traffic over the WAN will shoot up latency. The problem was the same even when I almost doubled the connection speed. When I get two users on Pandora and a couple surfing, the internet goes to nothing (dropped packets, pages won't load). I have half the connection at home and our simultaneous Netflix/YouTube/Pandora streaming doesn't even touch my 5 Mb.

Question: What would cause high latency anytime traffic is going over the WAN?

Matt
Blackjack00
  • This question covers a wide area. What you are describing is general network troubleshooting to find a problem; questions should be more specific. This, by the way, has nothing to do with Wireshark (as your tagging suggests). That said, welcome to networkengineering ;) – Bulki May 22 '13 at 05:16
  • Did any answer help you? If so, you should accept the answer so that the question doesn't keep popping up forever, looking for an answer. Alternatively, you can post and accept your own answer. – Ron Maupin Jan 03 '21 at 02:02

12 Answers

10

This sounds like some form of "bufferbloat", probably on the part of the DSLAM/LNS that's performing the 6Mb rate limiting.

It might be your CPE box, but that's a little less likely.
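
As a rough back-of-the-envelope illustration of why an oversized buffer could produce numbers like these: the extra delay added by a full FIFO queue is its size divided by the rate at which the link drains it. The buffer sizes below are assumed purely for illustration and are not measured from the poster's equipment; only the 6 Mb/s rate comes from the question.

```python
def queue_delay_ms(buffered_bytes: int, link_bits_per_sec: float) -> float:
    """Time for a full FIFO buffer to drain at the given link rate, in ms."""
    return buffered_bytes * 8 / link_bits_per_sec * 1000


# Hypothetical buffer sizes; 6 Mb/s is the downstream rate from the question.
for kib in (64, 256, 512):
    delay = queue_delay_ms(kib * 1024, 6_000_000)
    print(f"{kib:4d} KiB buffer at 6 Mb/s -> {delay:6.1f} ms of added latency")
```

Under these assumptions, a buffer of a few hundred KiB is already enough to add several hundred milliseconds once a single stream keeps it full.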

LapTop006
  • +1 It could be some poorly configured rate limiting or shaping on the ISP's part, but it could also be a poor quality (or malfunctioning) CPE. I have seen CPEs rated at 40Mbps start to topple over at 10Mbps because they can't handle a high pps rate, for example. A high pps rate of small packets really strains them. – jwbensley May 22 '13 at 20:57
  • Oh, I hadn't seen that he had replaced the CPE. I missed that bullet point! – jwbensley May 22 '13 at 21:06
9

I would verify where the latency is occurring. Use a tool such as MTR which checks the latency at each hop. MTR combines ping statistics for each hop with a trace route, and can greatly help narrow down this type of problem.

On a Linux box the command would be mtr 8.8.8.8; there is also a Windows version of this tool.

The output will show you where the latency starts. If it is on the ISP network, you could forward the output to the ISP and help them use it to troubleshoot their network problem.

If the latency starts inside your network, you'll be able to narrow down the problem yourself as well.
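
To make "where the latency starts" concrete, here is a minimal sketch that scans per-hop averages and reports the first hop where latency jumps. It does not call MTR itself; the hop values are hypothetical numbers you would copy by hand from a report, and the 100 ms threshold and the name first_slow_hop are assumptions for illustration.

```python
from typing import Optional


def first_slow_hop(avg_ms: list[float], jump_ms: float = 100.0) -> Optional[int]:
    """Return the 1-based hop number where average latency first rises by
    more than jump_ms over the previous hop, or None if it never does."""
    previous = 0.0
    for hop, latency in enumerate(avg_ms, start=1):
        if latency - previous > jump_ms:
            return hop
        previous = latency
    return None


# Hypothetical per-hop averages (ms), one list per test run.
idle = [1.2, 24.0, 26.5, 29.8, 30.1]
loaded = [1.3, 480.0, 495.2, 510.7, 503.4]

print("idle:", first_slow_hop(idle))
print("loaded:", first_slow_hop(loaded))
```

A jump only matters if it persists through to the final hop; some routers answer probes slowly while still forwarding traffic quickly, so a single slow intermediate hop on its own is not conclusive.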

Brett Lykins
  • Is there an MTR version available for Cisco IOS devices at all? I know it can be run from the Junos CLI. – DrBru Oct 23 '13 at 10:14
5

Check the DSL line statistics. (interleaved vs. fastpath, error counters, etc.)

The test at a different location tested a different line, maybe on a different DSLAM. This suggests the ISP infrastructure isn't to blame. It strongly suggests your DSL line is at fault. Possibly the DSLAM itself is congested, but it's highly unlikely for you to be the one to be pushing it over the line predictably and repeatedly.

If ATM cells (the transport for most DSL) are being corrupted, you'd see significant slowdowns like this, as the entire frame has to be resent.
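
A short sketch of why cell corruption hurts so much: a packet carried over AAL5 is split into 48-byte cell payloads, and losing any one cell loses the whole frame. The calculation below assumes independent cell errors and ignores encapsulation overhead such as PPPoE or LLC headers, so treat the numbers as illustrative; the cell error rate is a made-up input, not a measurement.

```python
import math

ATM_CELL_PAYLOAD = 48  # payload bytes carried in each 53-byte ATM cell
AAL5_TRAILER = 8       # AAL5 trailer bytes added before padding to a cell boundary


def cells_per_frame(packet_bytes: int) -> int:
    """Number of ATM cells needed to carry one packet over AAL5."""
    return math.ceil((packet_bytes + AAL5_TRAILER) / ATM_CELL_PAYLOAD)


def frame_loss_probability(packet_bytes: int, cell_error_rate: float) -> float:
    """Chance that at least one cell of the frame is corrupted,
    assuming cell errors are independent."""
    return 1 - (1 - cell_error_rate) ** cells_per_frame(packet_bytes)


# Hypothetical 1% cell error rate on a full-size 1500-byte packet.
print(cells_per_frame(1500))
print(f"{frame_loss_probability(1500, 0.01):.1%}")
```

With these assumptions a 1500-byte packet needs 32 cells, so a 1% cell error rate turns into roughly a 27% chance of losing each full-size packet.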

Ricky
3

Anytime I have cases where a customer is experiencing network latency, the first thing to do is check each individual connection in the network. Usually there is one device where a bottleneck is occurring.

If it's a low-use network, I would completely disable QoS on everything except the internet-connected device (as QoS will slow down traffic in a switching environment).

In your packet captures, I would do I/O analysis and see if you are getting plateaus anywhere. This can indicate bursty traffic, which causes queuing that delays packets or drops them entirely.

I would also check the CPU of each device when you have the issue. If you see the CPU jumping up then that is probably your problem device. Check the logs as well to see if there are any errors.

Also, I would be sure all connections are negotiating at full speed (speed 100 full duplex).

Also try disabling any firewall or security services.

Trent
2

Another thing to look at would be the connection between your switch and the DSL modem. The symptoms you are describing almost sound like there is a duplex mismatch between the two.

Another way to rule out the switch is to remove the switch entirely and test the connection with one machine attached directly to the DSL modem.

user204
2

High latency/bad throughput when traffic is high is sometimes indicative of a L1 problem (duplex mismatch / bad cable / dirty fibre). Did you check that this is not the case?

0

If you are connecting over a 10/100 switch and have autonegotiation enabled on only part of the path, you may have a duplex mismatch. This will cause frequent collisions when there is load on the network that won't show up when things are relatively quiet. The collisions will cause resends as well as forcing communications to back off, and can cause a seemingly unreasonable slowdown.

0

Sorry to revive an old thread. The OP wrote:

... Almost any traffic over the WAN will shoot up latency...

These are the exact symptoms of Bufferbloat. The router is likely queueing too much traffic, and starving small flows (which are necessary to provide responsiveness.)

Your router needs a way to mitigate the problem of "latency under load". You could farble around with QoS, but this requires a lot of configuration and continual adjustment.

The state of the art has advanced since the OP, so look up Bufferbloat, AQM, CoDel, fq_codel, Cake, PIE, or other techniques.
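
As a toy illustration of why flow queueing helps small flows, the sketch below compares how long a newly arrived ping waits behind bulk traffic in a single FIFO versus an idealised per-flow scheduler. This is a simplification for intuition only, not an implementation of fq_codel or Cake, and the queue depth and packet size are assumed values.

```python
def ping_wait_ms(bulk_packets_queued: int, packet_bytes: int,
                 link_bits_per_sec: float, flow_queueing: bool) -> float:
    """Delay before a newly arrived ping packet is transmitted.

    FIFO: the ping waits behind every bulk packet already queued.
    Flow queueing (idealised): each flow has its own queue served in turn,
    so the ping waits for at most one bulk packet ahead of it.
    """
    per_packet_ms = packet_bytes * 8 / link_bits_per_sec * 1000
    ahead = min(bulk_packets_queued, 1) if flow_queueing else bulk_packets_queued
    return ahead * per_packet_ms


# Hypothetical: 250 full-size packets queued on the 6 Mb/s link from the question.
print(ping_wait_ms(250, 1500, 6_000_000, flow_queueing=False))
print(ping_wait_ms(250, 1500, 6_000_000, flow_queueing=True))
```

In this model the FIFO case lands at about half a second, which is in the same range the OP reported, while the per-flow case stays at a couple of milliseconds.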

0

Could this be a bottleneck upstream? Not sure where you are in the world, but perhaps the ISP has terrible international bandwidth. Speedtest.net would default to the closest server.

rick
0

A simple method I used was to run traceroute, look for high response times in the traces, and check the system at that hop for failing hardware, DoS attacks, improper QoS classifications, and such. Of course, you need access to all the equipment in the path. That was easy for me at the time, since I worked for a telecom.

bwindle66
0

What operating system are you testing this on? If it's Windows, by default there's a "QoS Packet Scheduler" service installed and bound to the network interface. It will kick in depending on the underlying settings of the network stack and proactively delay any traffic that is not classified as "multimedia".

Try to delete it from the interface and recheck your results.

Or better yet, reconfigure it properly: http://www.dslreports.com/faq/3688

Łukasz Bromirski
0

I would add from my experience that some ISPs treat ICMP packets with the lowest priority. It happened to me once: every time I started YouTube, I would even get "request timed out".

Post WinMTR output from before starting the video and from while the video is playing. Then start a second stream, and let's see how this impacts both the ICMP packets and the first video.
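
To compare the runs side by side, a small helper like the one below can summarise loss and latency from the round-trip times you record. The sample values are hypothetical, None stands for a probe that timed out, and the function name summarize is just an illustration.

```python
import statistics
from typing import Optional


def summarize(rtts_ms: list[Optional[float]]) -> dict:
    """Summarise RTT samples; None marks a probe that timed out."""
    if not rtts_ms:
        raise ValueError("need at least one sample")
    replies = [r for r in rtts_ms if r is not None]
    return {
        "loss_pct": 100 * (len(rtts_ms) - len(replies)) / len(rtts_ms),
        "mean_ms": statistics.mean(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }


# Hypothetical samples: before the video, and while it is playing.
idle = [30.0, 31.0, 29.0, 30.0]
loaded = [480.0, None, 520.0, 500.0]

print("idle:", summarize(idle))
print("loaded:", summarize(loaded))
```

If loss and latency rise only for ICMP while the video itself plays smoothly, that would point toward ICMP deprioritisation rather than a real capacity problem.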

laf