
ndt-users - Re: new server and slow off-lan server-to-client speeds


Re: new server and slow off-lan server-to-client speeds


  • From: Richard Carlson <>
  • To: Dale Blount <>
  • Cc:
  • Subject: Re: new server and slow off-lan server-to-client speeds
  • Date: Fri, 17 Feb 2006 20:11:04 -0500

Hi Dale;

Answers in-line
At 11:23 AM 2/17/2006, Dale Blount wrote:
On Thu, 2006-02-16 at 17:02 -0500, Richard Carlson wrote:
> Hi Dale;
>
> At 02:07 PM 2/16/2006, Dale Blount wrote:
> >On Wed, 2006-02-15 at 10:40 -0500, Richard Carlson wrote:
> > > Hi Dale;
> > >
> > > I don't recall if I replied to this earlier but here it is again.
> > >
> > > I'm seeing this problem more and more and I am currently trying to find a
> > > real fix for this problem. Here's what's happening.
> > >
> > > The NDT server starts sending data to the remote client. It enters a
> > > simple send loop and pumps data into the network for 10 seconds. It then
> > > exits the loop and sends the final results over to the client.
> > >
> > > The problem is that, while in this loop, it is possible for the OS to
> > > transfer data to the TCP stack faster than the TCP stack can pump it out
> > > into the network. This results in a large standing queue (visible with
> > > the netstat -nat command). So at the end of 10 seconds the code stops
> > > pumping more data, but the client keeps reading until the queue empties.
> > > Note the Web100 Duration variable has a value of 34,893,925 microseconds,
> > > or almost 35 seconds.
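The standing queue described above can be spotted from the shell while a test is running. A minimal sketch, assuming a Linux host with the net-tools netstat; in its output the third column is Send-Q, the number of bytes queued in the stack but not yet acknowledged:

```shell
# Keep the two netstat header lines plus any TCP socket whose Send-Q
# (third column) is non-zero -- a large value here is the standing queue.
netstat -nat | awk 'NR <= 2 || $3 > 0'
```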
> > >
> > > One temporary step is to limit the max buffer space (tcp_wmax) to
> > > something in the 512 KB to 1 MB range. This will keep the queue from
> > > building up too much, but it's really just a band-aid until I can figure
> > > out how to monitor the queue length to prevent such large queues in the
> > > first place.
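That band-aid can be applied with sysctl. A sketch, assuming root on a Linux 2.6 host; the 1048576-byte cap is one choice from the suggested 512 KB to 1 MB range:

```shell
# Cap the per-socket TCP send buffer at 1 MB (values are min, default, max).
sysctl -w net.ipv4.tcp_wmem="4096 16384 1048576"

# Make the setting persistent across reboots:
echo 'net.ipv4.tcp_wmem = 4096 16384 1048576' >> /etc/sysctl.conf
```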
> > >
> >
> >Rich,
> >
> >I can't find a tcp_wmax setting, but here is what I have set:
>
> Sorry that was a typo on my part. It should have been tcp_wmem not _wmax..
>
> >net.core.wmem_max = 131072
> >net.core.rmem_max = 131072
> >net.ipv4.tcp_wmem = 4096 16384 131072
> >net.ipv4.tcp_rmem = 4096 16384 131072
> >


so my "net.ipv4.tcp_wmem = 4096 16384 131072" is ok to limit the queue
length?

Yes; however, this will limit you to about 128 KB, so depending on the path length you might start seeing reduced performance. For example, a round-trip time of 11 msec would limit the maximum speed to ~95 Mbps: (128 KB / 11 msec) * 8 b/B = 95.33 Mbps. This is OK for a campus network, so it would only be a problem if the clients are outside this network and can run at 100 Mbps all the way to the server.
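The arithmetic above is just the bandwidth-delay product limit; the same calculation as a one-liner:

```shell
# Max throughput for a 128 KB (131072-byte) send buffer over an 11 ms RTT:
# buffer * 8 bits/byte / RTT seconds, reported in Mbps.
awk 'BEGIN { printf "%.1f Mbps\n", 131072 * 8 / 0.011 / 1e6 }'
# prints: 95.3 Mbps
```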

> >Upload always works OK, but on anything but the lan, download is right
> >around 75k. It doesn't really matter if it's set to 128kb/512kb/2Mb,
> >it's always 70-80kb (both on a 5Mbps upload cable modem and a 768kbps
> >upload dsl link, both 3 hops from the ndt server).
> >
> >The old server that this is replacing is still around, and speedtests to
> >it work just fine. Could the newer hardware alone be causing this whole
> >problem? I've tried the sysctl settings from the old server with the
> >same results.
>
> It could be a hardware issue or an OS issue. Did you change/upgrade the OS
> level too? I noticed a problem with my server when I went from Linux
> 2.6.12 to 2.6.13.

I went from 2.6.12.2-web100 to 2.6.15.3-web100. Distro is the same
version.

I started having problems when I moved from 2.6.12 to 2.6.13; the .14 and .15 kernels also failed. I finally replaced the e100.c file in the .15 distribution with the one from the .12 tree, and my problems went away.

> I finally tracked it down to a change in the Intel
> FastEthernet (e100) NIC driver that came with the new OS. I replaced the
> new e100.c file with the one from the .12 kernel and everything started
> working again. I also have a report of a problem with a built in NIC, and
> the problem was resolved when a PCI bus based NIC was installed. Perhaps
> this is a bigger problem than I realize.
>

I also moved from a Dlink PCI card to an onboard TG3 chipset.

What NIC driver does this chipset use? One option is to try what I did and use the old drivers/net/xxx.c file. Simply rename the file in the .15 tree and then copy in the file from the .12 tree. Then run make modules; make modules_install and reboot.
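The driver swap described above amounts to something like the following sketch. The source-tree paths under /usr/src are assumptions, and the real driver file name must be substituted for xxx.c:

```shell
# Assumes both web100 kernel trees are unpacked under /usr/src.
cd /usr/src/linux-2.6.15.3

# Set the broken driver aside and bring in the known-good one from .12:
mv drivers/net/xxx.c drivers/net/xxx.c.broken
cp /usr/src/linux-2.6.12.2/drivers/net/xxx.c drivers/net/

# Rebuild and install the modules, then reboot into the fixed driver:
make modules && make modules_install
reboot
```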



> > > If anyone has any suggestions on how to do this, please let me know.
> > >
> >
> >Couldn't the client be adjusted to stop reading after 10 seconds? It
> >could then report the data transferred so far.
>
> There is a timer that runs in the client to clean things up, but either
> there's a bug in my code or something else is wrong and the timer isn't
> working. I am currently testing a patch on my server at
> http://web100.internet2.edu:7123. It would help if you could try this
> server and let me know if the tests run long or what happens.
>

LAN: Duration: 12510326
running 10s outbound test (client to server) . . . 8.30Mb/s
running 10s inbound test (server to client) . . . 16.56Mb/s


CABLE: Duration: 17116116
running 10s outbound test (client to server) . . . . . 1.40Mb/s
running 10s inbound test (server to client) . . . . . . 1.80Mb/s

So, if I read this right, you get much higher speeds testing to my server with my new code than you do to your local server. I'll try to get a patch built and release a new version early next week. Thanks for the feedback.

Rich

Dale

------------------------------------



Richard A. Carlson e-mail:

Network Engineer phone: (734) 352-7043
Internet2 fax: (734) 913-4255
1000 Oakbrook Dr; Suite 300
Ann Arbor, MI 48104


