ndt-users - RE: new server and slow off-lan server-to-client speeds
- From: "Rick Tyrell" <>
- To: "Dale Blount" <>
- Cc: <>
- Subject: RE: new server and slow off-lan server-to-client speeds
- Date: Mon, 20 Feb 2006 08:05:40 -0600
- Importance: normal
- Priority: normal
Title: Re: new server and slow off-lan server-to-client speeds
Sent: Mon 2/20/2006 7:40 AM
To: Richard Carlson
Cc:
Subject: Re: new server and slow off-lan server-to-client speeds
I just noticed I accidentally replied only to Rich; resending to the
list now:

On Fri, 2006-02-17 at 20:11 -0500, Richard Carlson wrote:
> Hi Dale;
>
> Answers in-line
> At 11:23 AM 2/17/2006, Dale Blount wrote:
> >On Thu, 2006-02-16 at 17:02 -0500, Richard Carlson wrote:
> > > Hi Dale;
> > >
> > > At 02:07 PM 2/16/2006, Dale Blount wrote:
> > > >On Wed, 2006-02-15 at 10:40 -0500, Richard Carlson wrote:
> > > > > Hi Dale;
> > > > >
> > > > > I don't recall if I replied to this earlier but here it is again.
> > > > >
> > > > > I'm seeing this problem more and more and I am currently trying to
> > > > > find a real fix for this problem.  Here's what's happening.
> > > > >
> > > > > The NDT server starts sending data to the remote client.  It enters
> > > > > a simple send loop and pumps data into the network for 10 seconds.
> > > > > It then exits the loop and sends the final results over to the
> > > > > client.
> > > > >
> > > > > The problem is that when in this loop it is possible for the OS to
> > > > > transfer data to the TCP stack faster than the TCP stack can pump
> > > > > it out into the network.  This results in a large standing queue
> > > > > (visible with the netstat -nat command).  So at the end of 10
> > > > > seconds the code stops pumping more data but the client keeps
> > > > > reading until the queue empties.  Note the Web100 Duration variable
> > > > > has a value of 34,893,925 microseconds, or almost 35 seconds.
> > > > >
> > > > > One temporary step is to limit the max buffer space (tcp_wmax) to
> > > > > something in the 512 KB to 1 MB range.  This will keep the queue
> > > > > from building up too much, but it's really just a band-aid until I
> > > > > can figure out how to monitor the queue length to prevent such
> > > > > large queues in the first place.
> > > > >
> > > >
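One plausible way to monitor that queue on Linux is the SIOCOUTQ ioctl,
which reports how many bytes are still sitting in a socket's send buffer
(written by the application but not yet acknowledged).  A minimal sketch
of a self-throttling send loop, assuming the data socket is fd and an
arbitrary QUEUE_CAP; this is not NDT's actual code:

  #include <string.h>
  #include <time.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/sockios.h>       /* SIOCOUTQ */

  #define QUEUE_CAP (256 * 1024)   /* assumed cap, not an NDT constant */

  /* Pump data for 10 seconds, but pause whenever the kernel is still
   * holding more than QUEUE_CAP bytes in the socket send queue. */
  void send_loop(int fd)
  {
      char buf[8192];
      int queued;
      time_t stop = time(NULL) + 10;

      memset(buf, 'A', sizeof(buf));
      while (time(NULL) < stop) {
          if (ioctl(fd, SIOCOUTQ, &queued) == 0 && queued > QUEUE_CAP) {
              usleep(1000);        /* queue too deep; let TCP drain it */
              continue;
          }
          if (write(fd, buf, sizeof(buf)) < 0)
              break;
      }
  }

With a check like this the sender itself, rather than the tcp_wmem cap,
bounds the standing queue, so the test still ends close to 10 seconds.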
> > > >Rich,
> > > >
> > > >I can't find a tcp_wmax setting, but here is what I have set:
> > >
> > > Sorry, that was a typo on my part.  It should have been tcp_wmem, not
> > > _wmax.
> > >
> > > >net.core.wmem_max = 131072
> > > >net.core.rmem_max = 131072
> > > >net.ipv4.tcp_wmem = 4096 16384 131072
> > > >net.ipv4.tcp_rmem = 4096 16384 131072
> > > >
> >
> >so my "net.ipv4.tcp_wmem = 4096 16384 131072" is ok to limit the queue
> >length?
>
> Yes, however this will limit you to about 128 KB, so depending on the
> path length you might start seeing reduced performance.  For example, a
> round trip time of 11 msec would limit the maximum speed to ~95 Mbps:
> (128 KB / 11 msec) * 8 b/B = 95.33 Mbps.  This is OK for a campus
> network, so this would only be a problem if the clients are outside this
> network and can run at 100 Mbps all the way to the server.
>
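The same cap can also be applied per-socket instead of system-wide.  A
minimal sketch, assuming the server can set it on its data socket before
the test starts (cap_send_buffer is a hypothetical helper, not part of
NDT):

  #include <sys/socket.h>

  /* Clamp the send buffer on one socket instead of via net.ipv4.tcp_wmem.
   * A 128 KB window over an 11 msec round trip caps throughput at about
   * 128 KB / 11 msec * 8 b/B = ~95 Mbps, matching the figure above.  Note
   * that the kernel doubles the value passed in, and that setting
   * SO_SNDBUF disables send-buffer autotuning for this socket. */
  int cap_send_buffer(int fd)
  {
      int sndbuf = 128 * 1024;     /* assumed cap, mirrors tcp_wmem max */

      return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));
  }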

I've just set it at 128k in an attempt to get this problem figured out.
Setting it to 512k still seems to have no effect.

> > > >Upload always works OK, but on anything but the lan, download is
> > > >right around 75k.  It doesn't really matter if it's set to
> > > >128kb/512kb/2Mb, it's always 70-80kb (both on a 5Mbps upload cable
> > > >modem and a 768kbps upload dsl link, both 3 hops from the ndt
> > > >server).
> > > >
> > > >The old server that this is replacing is still around, and speedtests
> > > >to it work just fine.  Could the newer hardware alone be causing this
> > > >whole problem?  I've tried the sysctl settings from the old server
> > > >with the same results.
> > >
> > > It could be a hardware issue or an OS issue.  Did you change/upgrade
> > > the OS level too?  I noticed a problem with my server when I went from
> > > Linux 2.6.12 to 2.6.13.
> >
> >I went from 2.6.12.2-web100 to 2.6.15.3-web100.  Distro is the same
> >version.
>
> I started having problems when I moved from 2.6.12 to 2.6.13; the .14 &
> .15 kernels also failed.  I finally replaced the e100.c file in the .15
> distribution with the one from the .12 tree and my problems went away.
>
> > > I finally tracked it down to a change in the Intel
> > > FastEthernet (e100) NIC driver that came with the new OS.  I replaced
> > > the new e100.c file with the one from the .12 kernel and everything
> > > started working again.  I also have a report of a problem with a
> > > built-in NIC, and the problem was resolved when a PCI bus based NIC
> > > was installed.  Perhaps this is a bigger problem than I realize.
> >
> >I also moved from a Dlink PCI card to an onboard TG3 chipset.
>
> What NIC driver does this chipset use?  One option is to try what I did
> and use the old driver/net/xxx.c file.  Simply rename the file in the
> .15 tree and then copy in the file from the .12 tree.  Then run make
> modules; make modules_install and reboot.
>

It's a TG3 chipset; I just installed the driver from 2.6.12.2 into the
2.6.15 kernel.  Same results.

> > > > > If anyone has any suggestions on how to do this, please let me
> > > > > know.
> > > >
> > > >Couldn't the client be adjusted to stop reading after 10 seconds?  It
> > > >could then report the data transferred so far.
> > >
> > > There is a timer that runs in the client to clean things up, but
> > > either there's a bug in my code or something else is wrong and the
> > > timer isn't working.  I am currently testing a patch on my server at
> > > http://web100.internet2.edu:7123
> > > It would help if you could try this server and let me know if the
> > > tests run long or what happens.
> > >
> >
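For reference, the client-side change suggested above might look roughly
like this: a read loop that gives up 10 seconds after the inbound test
starts, however long the server's queue keeps draining.  A hypothetical
sketch, not the actual NDT client code:

  #include <poll.h>
  #include <time.h>
  #include <unistd.h>

  /* Read for at most 10 seconds, then report whatever arrived so far. */
  long timed_read(int fd)
  {
      char buf[8192];
      long total = 0;
      struct pollfd pfd = { .fd = fd, .events = POLLIN };
      time_t stop = time(NULL) + 10;

      while (time(NULL) < stop) {
          if (poll(&pfd, 1, 250) <= 0)   /* wait up to 250 ms for data */
              continue;
          ssize_t n = read(fd, buf, sizeof(buf));
          if (n <= 0)
              break;
          total += n;                    /* bytes transferred so far */
      }
      return total;
  }
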
> >LAN: Duration: 12510326
> >running 10s outbound test (client to server) . . . 8.30Mb/s
> >running 10s inbound test (server to client) . . . 16.56Mb/s
> >
> >CABLE: Duration: 17116116
> >running 10s outbound test (client to server) . . . . . 1.40Mb/s
> >running 10s inbound test (server to client) . . . . . . 1.80Mb/s
>
> So, if I read this right you get much higher speeds testing to my server
> with my new code than you do to your local server.  I'll try and get a
> patch built and release a new version early next week.  Thanks for the
> feedback.
>

My old local server is fine... it's just this new one with problems.
Here's an interesting point, though.  A linux client connected to the
same 7mb/768kb dsl connection as the box I was doing testing from
earlier reports the correct speeds.  The linux client continues to
report correctly using the driver from 2.6.12.2; the windows XP clients
continue to fail.

Dale
- new server and slow off-lan server-to-client speeds, Dale Blount, 02/10/2006
- Re: new server and slow off-lan server-to-client speeds, Richard Carlson, 02/15/2006
- Re: new server and slow off-lan server-to-client speeds, Clayton Keller, 02/15/2006
- Re: new server and slow off-lan server-to-client speeds, Dale Blount, 02/16/2006
- Re: new server and slow off-lan server-to-client speeds, Richard Carlson, 02/16/2006
- Re: new server and slow off-lan server-to-client speeds, Dale Blount, 02/17/2006
- Re: new server and slow off-lan server-to-client speeds, Richard Carlson, 02/17/2006
- Re: new server and slow off-lan server-to-client speeds, Dale Blount, 02/20/2006
- Re: new server and slow off-lan server-to-client speeds, Richard Carlson, 02/17/2006
- Re: new server and slow off-lan server-to-client speeds, Dale Blount, 02/17/2006
- Re: new server and slow off-lan server-to-client speeds, Richard Carlson, 02/16/2006
- <Possible follow-up(s)>
- RE: new server and slow off-lan server-to-client speeds, Rick Tyrell, 02/20/2006
- RE: new server and slow off-lan server-to-client speeds, Rick Tyrell, 02/21/2006
- Re: new server and slow off-lan server-to-client speeds, Richard Carlson, 02/15/2006