
ndt-users - Re: web100 chokes network activity under 2.6.15

  • From: Richard Carlson
  • To: Maurice Volaski
  • Subject: Re: web100 chokes network activity under 2.6.15
  • Date: Fri, 17 Mar 2006 09:35:32 -0500

Hi Maurice;

What version is the Gentoo kernel? Do you get the same poor performance with an unpatched vanilla 2.6.15 kernel? What type of NIC are you using?

I've had a lot of problems with the Linux NIC drivers over the past few months. For me it started when I installed the 2.6.13 kernel. I noticed very poor performance on my FastE-connected server, so I started digging into it. A packet trace (captured with tcpdump and analyzed with tcptrace) showed very strange behavior. I'd see the NDT server dump 40+ packets into the network, get a single ACK back from the client, and dump 900+ more packets into the net.
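The burst pattern described above (many data packets sent between consecutive ACKs) can be measured from a trace once it has been reduced to a sequence of events. The following is a minimal sketch only: the `(direction, kind)` event tuples and the function name `bursts_between_acks` are hypothetical and are not a tcpdump or tcptrace output format.

```python
from typing import Iterable, List, Tuple

# Hypothetical event format (an assumption, not tcptrace output):
#   direction: "out" = server -> client, "in" = client -> server
#   kind:      "data" or "ack"
Event = Tuple[str, str]


def bursts_between_acks(events: Iterable[Event]) -> List[int]:
    """Count outgoing data packets sent between consecutive incoming ACKs."""
    bursts: List[int] = []
    run = 0
    for direction, kind in events:
        if direction == "out" and kind == "data":
            run += 1
        elif direction == "in" and kind == "ack":
            # An ACK closes the current burst; skip empty bursts.
            if run:
                bursts.append(run)
            run = 0
    if run:
        # Data still outstanding at the end of the trace.
        bursts.append(run)
    return bursts


if __name__ == "__main__":
    trace = [("out", "data")] * 40 + [("in", "ack")] + [("out", "data")] * 900
    print(bursts_between_acks(trace))
```

On a healthy connection the burst sizes would be expected to stay small and fairly regular; runs of hundreds of packets against a single ACK are the symptom reported here.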

Everything worked fine under 2.6.12, but failed under 2.6.13. I reported this to the Linux community, but never got a response. Someone told me that things got better with the 2.6.14 and 2.6.15 kernels, so I upgraded to 2.6.15 and saw the same problem. I got mad and copied the e100.c file from the 2.6.12 tree into the drivers/net directory, replacing the 2.6.15 version, and rebuilt the modules. This fixed my problem.

Looking at the e100.c NIC driver source files, I noticed that the .15 version was much larger than the .12 version. I think they did something with the NAPI polling functions, and going back to the old version seems to help. I've had other reports of problems, and in at least one case they were using the Intel NIC (e100.c) and replacing the source file with an old version solved the problem.

So I'd suggest that you replace the NIC driver source file with the version from the 2.6.12 kernel tree and rebuild the module and/or kernel. Hopefully this will help.

Rich

At 04:13 PM 3/16/2006, Maurice Volaski wrote:
Getting back to you on this... I realized after I switched JDKs that I was running the Gentoo kernel and not the vanilla 2.6.15 with the web100 patches.

When I loaded the patched kernel (2.6.15.6 and web100 2.5.8), I immediately began having problems. Processes using the network are not behaving properly. We use drbd, which is network RAID, and it stalls. When I tried to sync the portage tree in Gentoo, which uses rsync, it also stalled. Interestingly, ssh is not affected.

This occurs even if I unload the iptable modules, which aren't being used, anyway.

So it appears that web100 selectively chokes network activity and is presently unusable!


Hi Maurice;

At 10:59 AM 3/8/2006, Maurice Volaski wrote:
Hi Maurice;

OK, I looked at the attached files and I think there is a bug in my build system somewhere. For some reason gnu is creating an Admin/Admin directory and putting a copy of the Admin.class file there. It also creates a copy in the parent directory and that is installed in the /usr/local/ndt directory during the make install process. This same bug may be causing gnu to complain about the ./Tcpbw100 file (or maybe it thinks it's a directory).

In any case everything seems to be in order and the make install process completed despite the errors. And just to be sure you did create the customized tcpbw100.html file by running the "conf/create-html.sh" command from the ndt-3.1.4b directory, right?

Assuming that is done, then I think the question is, why doesn't the applet load? One possible answer is a version difference between the SDK (1.5.0_6 in your case) and the plugin JRE version. I'm hearing and seeing cases where the 1.4.2 plugin wouldn't run the 1.5.0 applet. There are 2 solutions:
1) recompile the applet with a 1.4.2 SDK
2) update the browser plugin to a 1.5.0 JRE
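
One way to confirm an SDK/JRE mismatch of this kind is to read the version stamped into a compiled .class file: the first 8 bytes hold the magic number 0xCAFEBABE followed by the minor and major version, and major 48 corresponds to Java 1.4 while 49 corresponds to 1.5. The following is a minimal sketch; the function names are hypothetical, and it assumes the .class file has already been extracted from the applet's .jar.

```python
import struct

# Class-file major version -> Java release that introduced it.
JAVA_MAJOR = {45: "1.1", 46: "1.2", 47: "1.3", 48: "1.4", 49: "1.5", 50: "1.6"}


def class_file_version(data: bytes):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    if len(data) < 8:
        raise ValueError("too short to be a class file")
    # Header layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


def required_java(data: bytes) -> str:
    """Map the class-file major version to a Java release name."""
    major, _minor = class_file_version(data)
    return JAVA_MAJOR.get(major, "unknown")


if __name__ == "__main__":
    import sys

    # Usage (path is whatever .class file you extracted from the .jar):
    #   python classver.py SomeApplet.class
    with open(sys.argv[1], "rb") as fh:
        print(required_java(fh.read(8)))
```

If this reports 1.5 for the applet classes while the browser plugin is a 1.4.2 JRE, the plugin would be expected to refuse to load them, which matches the two solutions listed above.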

I can try recompiling it, but shouldn't there actually be a file called Tcpbw100 somewhere?

No, there is no file called Tcpbw100. If the fakewww log file shows the .jar file was sent, then I'd really suspect the SDK/JRE version mismatch. My laptop is running a 1.5.0 JRE so if you send me a URL I can test this out for you.

--

Maurice Volaski,

Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University

------------------------------------



Richard A. Carlson
Network Engineer        phone: (734) 352-7043
Internet2               fax: (734) 913-4255
1000 Oakbrook Dr; Suite 300
Ann Arbor, MI 48104


