
ndt-users - Re: failed middlebox testing

Subject: ndt-users list created


Re: failed middlebox testing


  • From: Richard Carlson <>
  • To: Clayton Keller <>
  • Cc:
  • Subject: Re: failed middlebox testing
  • Date: Wed, 31 Aug 2005 13:46:51 -0400

Hi Clayton;

At 09:53 AM 8/31/2005, Clayton Keller wrote:
Richard Carlson wrote:
Hi Clayton;
At 11:10 AM 8/30/2005, wrote:

I have previously configured and have the application operational. Upon setting up Web100 and NDT on another system, I am having what appears to be issues with the web100srv.

I am running kernel 2.6.12.5 with web100-2.5.4 web100_userland-1.5.4 and NDT-3.1.4a. Current version of java is 1.4.2_09, and I have tried with both the libpcap files that are provided by Fedora Core 4 and also compiling libpcap-0.9.3.

All of this sounds normal.

I have used the following options when running the web100srv client.

./web100srv -a -m -l /var/log/web100/web100srv.log

OK, the -a says generate the admin view, the -m says let multiple clients run simultaneously, and the -l specifies the log file.

When running with -d I see the following:

# ./web100srv -d -m -l /var/log/web100/web100srv.log
Reading config file /etc/ndt.conf to obtain options
ANL/Internet2 NDT ver 3.1.4
Variables file = /usr/local/ndt/web100_variables
log file = /var/log/web100/web100srv.log
Debug level set to 1
server ready on port 3001
web100_init() read 69 variables from file

Upon starting a test I see the following:

Signal 17 received from process 6956

Signal 17 indicates that the child process 6956 was stopped or terminated.

successfully locked '/tmp/view.string' for updating
sending '0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,' to tmp file
listening for Inet connection on sock2, fd=3
server ports 32778 32779
listening for Inet connection on sock3, fd=5
Middlebox test, Port 32779 waiting for incoming connection
Set MSS to 536, Window size set to 16777216KB

At this point the server should have ports 32778 and 32779 in a listen state. Is that true? Try running a "netstat -nat" command on the server. The ports should be in some state (WAITING, LISTEN, or something).
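As a complement to "netstat -nat" on the server, a connect attempt from the client side shows whether a port is actually reachable through any firewall in between. The following is a minimal sketch, not part of NDT; the function name is_listening and the example host name are made up for illustration. Note that a probe connection may itself be accepted by the server and disturb a test in progress, so it is best used only for diagnosis.

```python
import socket

def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper for diagnosis; it is not part of NDT.
    A refused, filtered, or timed-out connection returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Example host name and the ports from the debug output above;
    # substitute the real server address.
    for port in (32778, 32779):
        print(port, is_listening("ndt.example.org", port))
```

If netstat on the server shows LISTEN but this check fails from the client, a packet filter between the two is a likely cause.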

I then receive an error that the server failed middlebox testing.

On the working system, I see much more output when the test begins...

Any help would be appreciated on this issue, and if more information is needed, I can work on providing that as well.

What happens if you run without the "-m" flag? Does it work then?
What type of port security did you enable? Using the "-m" flag means that the NDT server will use ephemeral ports for the client connections. If you have "iptables" enabled, then the client may not be able to connect to the server.
I just tried using the "-m" on one of my test systems and it ran properly, so I'd suspect an iptables problem.
Regards;
Rich Carlson

When running without "-m", I receive the following output:

Checking for Middleboxes . . . . . . . . . . . . . . . . . . Done
running 10s outbound test (client to server) . . . . . Server failed: 'Go' flag not received

So a connection is opened and closed on port 3003 and the client moves on to the next test. The NDT server and client communicate with each other over port 3001. Since there are multiple tests being run, I created a simple message passing protocol that allows the server to control the client's actions. The original client ran on a timer, meaning it started each new test at a specific time. I changed that behavior so that the client enters a wait state at the end of each test. The server sends a message to the client on port 3001 to move the client out of this wait state and onto the next test. The client also starts a timer to avoid hanging forever if the server dies.
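The wait-state-plus-timer behavior described above can be illustrated with a short sketch. This is not the NDT client code, and the real wire format of the 'Go' message is not given in this thread; the b"GO" payload, the function name wait_for_go, and the timeout value are all assumptions made for illustration.

```python
import socket

def wait_for_go(ctrl_sock, timeout_s=5.0, go_message=b"GO"):
    """Block until the server sends the go message or the timer expires.

    ctrl_sock is the already-connected control socket (port 3001 in
    the description above). The b"GO" payload is a placeholder, not
    the actual NDT message format.
    Returns True if the go message arrived, False on timeout or if
    the server closed the connection.
    """
    ctrl_sock.settimeout(timeout_s)
    try:
        data = ctrl_sock.recv(64)
    except socket.timeout:
        # Timer expired: the server never released the client.
        return False
    # An empty read means the peer closed the connection.
    return data.startswith(go_message)
```

A client built this way would report something like "Server failed: 'Go' flag not received" whenever wait_for_go returns False, which covers both a timeout and a server that has gone away.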

What is happening here is that the server is dying and the client is timing out. This is what the "Server failed: ..." message means.

When the test is run and this error is returned I see a flood of Signal 11 received from process XXXX.

Signal 11 is an invalid memory reference. Are you running the server as root? There might be a bug in the code that causes it to crash if it isn't root. It needs root access to run the packet-pair bottleneck link detection algorithm (it does a raw read on the network interface).
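For reference, the signal numbers in the debug output can be mapped to names and the root check can be scripted. The sketch below is an illustration only; signal_name and running_as_root are made-up helper names. Signal numbering varies by platform: 11 is SIGSEGV on Linux, and 17 is SIGCHLD on Linux x86 but may be a different signal elsewhere.

```python
import os
import signal

def signal_name(number):
    """Return the symbolic name for a signal number on this platform,
    or "unknown" if the number has no named signal here."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "unknown"

def running_as_root():
    """True if the effective user ID is 0 (POSIX systems only)."""
    return os.geteuid() == 0

if __name__ == "__main__":
    for number in (11, 17):
        print(number, signal_name(number))
    print("root:", running_as_root())
```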


Also, I do see it listening for connections on the ports indicated in the debug output when running with "-m":

tcp 0 0 0.0.0.0:32775 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:32776 0.0.0.0:* LISTEN

So, this very well could be an IPTABLES issue. Is there a range of ports that the server listens on when running with "-m"? I guess the ESTABLISHED and RELATED IPTABLES rule is not covering these connections.

No, I don't have a port range when running in -m mode. The server will request a pair of open ports from the kernel and it will simply use what it gets back.
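Asking the kernel for whatever open ports it has is commonly done by binding to port 0 and then reading back the assigned number. The sketch below shows that general technique; it is an assumption that web100srv does something equivalent, and the function names are made up for illustration.

```python
import socket

def open_listener_on_any_port(host="0.0.0.0"):
    """Bind a TCP socket to port 0 so the kernel picks a free port.

    Returns (socket, port). The caller is responsible for closing
    the socket.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))
    s.listen(1)
    return s, s.getsockname()[1]

def open_port_pair(host="0.0.0.0"):
    """Open two listeners on kernel-assigned ports, as a stand-in
    for the pair of test ports mentioned above."""
    first = open_listener_on_any_port(host)
    second = open_listener_on_any_port(host)
    return first, second

if __name__ == "__main__":
    (s1, p1), (s2, p2) = open_port_pair()
    print("server ports", p1, p2)
    s1.close()
    s2.close()
```

Because the port numbers are only known at run time, a firewall rule that opens a fixed port cannot anticipate them, which is consistent with the iptables suspicion earlier in the thread.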

I turned IPTABLES off momentarily for testing and, using the "-m" option, I still receive the error:

running 10s outbound test (client to server) . . . . . Server failed: 'Go' flag not received

So this seems to indicate that the original problem really is an IPTABLES issue. You get past the middlebox test and the server crashes when it tries to start the client->server speed test (the same as when you run without the -m option).

Again, there are a number of "Signal 11 received from process XXXX" messages, which continue to flood the output with debug on until I kill web100srv.

This is also a bug in my code. I should handle signal 11's as a permanent error and terminate the process. I'll fix this soon.

I've been trying to look for any information pertaining to the 'Go' flag, but again, input and information would be greatly appreciated.

See the previous email I sent to this list. The 'Go' flag is part of the client/server communications. It allows the server to control the client's state.

As I noted above, this may be a process ownership problem. Try running the server as root and with the -m flag turned off. What happens then?

Rich

Clay




------------------------------------



Richard A. Carlson e-mail:

Network Engineer phone: (734) 352-7043
Internet2 fax: (734) 913-4255
1000 Oakbrook Dr; Suite 300
Ann Arbor, MI 48104


