ndt-users - Re: failed middlebox testing
- From: Richard Carlson
- To: Clayton Keller
- Subject: Re: failed middlebox testing
- Date: Thu, 01 Sep 2005 13:30:04 -0400
Hi Clay;
Based on the info below, I'm beginning to suspect that something is wrong with the packet filter code. Both traces below indicate that the web100srv process is trying to monitor the eth0 interface. What type of NIC are you using? Apparently you can use tcpdump on this interface, so it isn't clear why the NDT server can't get access. What happens if you tell the NDT server to look at the loopback interface (run a test with the "-i lo" option)?
I ran some tests on a local machine, and I get these signal 11 faults if I specify an invalid interface (I tried monitoring eth2 while it was down).
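One way to rule out the "invalid interface" case before starting the server is to check that the interface exists and is up. The following is only a sketch and is not part of NDT; the function name iface_state is hypothetical, and it assumes a Linux or BSD system that provides getifaddrs().

```c
#define _DEFAULT_SOURCE
#include <ifaddrs.h>
#include <net/if.h>
#include <string.h>

/* Hypothetical helper, not part of NDT.
 * Returns  1 if the named interface exists and is up,
 *          0 if it exists but is down,
 *         -1 if it was not found or the interface list could not be read. */
static int iface_state(const char *name)
{
    struct ifaddrs *list, *p;
    int state = -1;

    if (getifaddrs(&list) != 0)
        return -1;

    for (p = list; p != NULL; p = p->ifa_next) {
        if (strcmp(p->ifa_name, name) != 0)
            continue;
        /* IFF_UP is set when the interface is administratively up. */
        state = (p->ifa_flags & IFF_UP) ? 1 : 0;
        break;
    }

    freeifaddrs(list);
    return state;
}
```

Calling iface_state("eth0") before passing "-ieth0" to web100srv would show whether the interface is missing or down.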
Also, you can get rid of most of the signal 11 messages by handling SIGSEGV. In the web100srv.c file, around line 450, you will see the following code:
    case SIGINT:
    case SIGTERM:
        exit(0);
Change it to this:
    case SIGINT:
    case SIGTERM:
    case SIGSEGV:
        exit(0);
and rerun make.
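For context, here is a minimal standalone sketch of the same idea. It is not the actual web100srv.c code; the names should_exit_on, cleanup, and install_handlers are made up for illustration. It uses _exit() rather than exit() because _exit() is safe to call from a signal handler. Note that exiting on SIGSEGV only silences the messages; it does not fix whatever is causing the fault.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: 1 for signals that should end the process. */
static int should_exit_on(int signo)
{
    switch (signo) {
    case SIGINT:
    case SIGTERM:
    case SIGSEGV:   /* signal 11: added so the process exits instead of looping */
        return 1;
    default:
        return 0;
    }
}

/* Signal handler: terminate quietly on the signals listed above. */
static void cleanup(int signo)
{
    if (should_exit_on(signo))
        _exit(0);
}

/* Registers cleanup() for the three signals; returns 0 on success, -1 on error. */
static int install_handlers(void)
{
    static const int sigs[] = { SIGINT, SIGTERM, SIGSEGV };
    struct sigaction sa;
    size_t i;

    sa.sa_handler = cleanup;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    for (i = 0; i < sizeof sigs / sizeof sigs[0]; i++) {
        if (sigaction(sigs[i], &sa, NULL) != 0)
            return -1;
    }
    return 0;
}
```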
Please keep sending me feedback, I really want to find and fix this bug.
Rich
At 11:40 AM 9/1/2005, Clayton Keller wrote:
Richard Carlson wrote:
Hi Clay;
[snip snip snip]
Here are the current processes after I have restarted them:
root 8370 1 96 14:20 pts/1 00:00:01 /usr/local/sbin/web100srv -a -l /var/log/web100/web100srv.log
root 8377 1 0 14:20 pts/1 00:00:00 /usr/local/sbin/fakewww -l /var/log/web100/fakewww.log
I have configured iptables to allow connections on tcp - dest. ports 3001, 3002, 3003, and 7123.
When configuring web100 in the kernel (2.6.12.5 - kernel.org ), I have the following configured:
--- IP: Web100 networking enhancements
[*] Web100: Extended TCP statistics
(384) Web100: Default file permissions
(0) Web100: Default gid
[*] Web100: Net100 extensions
[*] Web100: Netlink event notification service
GID 0 is root.
File permissions for /usr/local/sbin/web100srv rwxr-xr-x root.root
All files in /usr/local/ndt are root.root with the exception of tcpbw100.html which is root.users. All files in this folder are rw-r--r--
The files in /usr/local/lib are all root.root as well, including the libpcap.a file that was compiled prior to installation of ndt-3.1.4a.
All of this sounds and looks correct. And just to be clear, the client sees the middlebox test end and starts the "client to server" test.
This test fails with the "Server failed: ..." message.
While running the test I went ahead and did a packet capture as well. The following information is being passed on the connection to port 3003:
ip.web.100.server;ip.client.doing.test;1456;-1;-1;
This is the result of the middlebox test: the IP addresses, the MSS value, and the window scale values. Port 3003 is then closed and re-opened.
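To make the layout of that string concrete, here is a small parsing sketch. It is not taken from the NDT source. The struct and function names are hypothetical, and because the message does not say which window scale value is which, the two fields are simply labelled a and b.

```c
#include <stdio.h>

/* Hypothetical container for the semicolon-separated middlebox result. */
struct mbox_result {
    char server[64];   /* first field: server address */
    char client[64];   /* second field: client address */
    int  mss;          /* third field: MSS value */
    int  win_scale_a;  /* fourth field: a window scale value */
    int  win_scale_b;  /* fifth field: a window scale value */
};

/* Parses "server;client;mss;ws;ws;". Returns 0 on success, -1 otherwise. */
static int parse_mbox_result(const char *line, struct mbox_result *out)
{
    int n = sscanf(line, "%63[^;];%63[^;];%d;%d;%d;",
                   out->server, out->client,
                   &out->mss, &out->win_scale_a, &out->win_scale_b);
    return n == 5 ? 0 : -1;
}
```

Feeding it the captured string above would give an MSS of 1456 and -1 for both window scale fields.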
I also see SYN, ACK, and ACK FIN traffic passing on 3001 and 3002.
I still am seeing the 'Go' flag error. I thank you for all the help thus far, and am curious what ideas you have as far as proceeding further with this.
No problem, thanks for putting up with my buggy code. .-)
OK, what is the last thing you see on port 3001? This is the control channel, and it should be sending a message back to the client after the middlebox test ends. Right now I'm simply reusing an old buffer, so the string should be the port numbers "3002 3003". You should see this message twice: once before the middlebox test and again after. If you only see it once, then the server is probably failing to initialize the Ethernet interface (the libpcap stuff). You can try forcing libpcap to use a specific interface with the -i flag. So if you have one network interface, the option "-ieth0" should be added to the command line.
It may also help to see where the debug messages stop. Turn on a couple of layers (at least 2) and let me know what the last message before the sig 11 is. You might need to redirect the stderr output to a file, or the messages may scroll off the screen. I'm looking for what comes after the "C2S test calling init_pkttrace() with pd=" message.
Rich
Clay
------------------------------------
Rich,
You are correct: the Middlebox test completes, and the 'Go' failure appears when the client-to-server test starts:
TCP/Web100 Network Diagnostic Tool v5.3.3e
click START to begin
Checking for Middleboxes . . . . . . . . . . . . . . . . . . Done
running 10s outbound test (client to server) . . . . . Server failed: 'Go' flag not received
I've gathered output both with and without specifying the interface for libpcap. Both runs return similar results.
Here is the output with the following command up to the SIGNAL 11's:
# ./web100srv -dd -a -ieth0 -l /var/log/web100/web100srv.log &> output.txt
# cat output.txt | more
ANL/Internet2 NDT ver 3.1.4
Variables file = /usr/local/ndt/web100_variables
log file = /var/log/web100/web100srv.log
Debug level set to 2
server ready on port 3001
web100_init() read 69 variables from file
Signal 17 received from process 9709
successfully locked '/tmp/view.string' for updating
sending '0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,' to tmp file
socket info src=ip.web.100.srvr:3001, dst=ip.pc.running.test:1683
listening for Inet connection on sock2, fd=3
server ports 3002 3003
listening for Inet connection on sock3, fd=5
Middlebox test, Port 3003 waiting for incoming connection
Set MSS to 536, Window size set to 16777216KB
socket info src=ip.web.100.srvr:3003, dst=ip.pc.running.test:1684
Sent 'GO' signal, waiting for incoming connection on sock2
C2S test calling init_pkttrace() with pd=0x95060002
Opening network interface 'eth0' for packet-pair timing
pcap_open_live() returned pointer 0x808c620
installing pkt filter for 'host ip.pc.running.test and port 1685'
Initial pkt src data = 0
Signal 11 received from process 9715
And here is the output when not specifying the Ethernet device, along with its command, up to the SIGNAL 11's:
# ./web100srv -dd -a -l /var/log/web100/web100srv.log &> output2.txt
# cat output2.txt | more
ANL/Internet2 NDT ver 3.1.4
Variables file = /usr/local/ndt/web100_variables
log file = /var/log/web100/web100srv.log
Debug level set to 2
server ready on port 3001
web100_init() read 69 variables from file
Signal 17 received from process 9720
successfully locked '/tmp/view.string' for updating
sending '0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,' to tmp file
socket info src=ip.web.100.srvr:3001, dst=ip.pc.running.test:1689
listening for Inet connection on sock2, fd=3
server ports 3002 3003
listening for Inet connection on sock3, fd=5
Middlebox test, Port 3003 waiting for incoming connection
Set MSS to 536, Window size set to 16777216KB
socket info src=ip.web.100.srvr:3003, dst=ip.pc.running.test:1690
Sent 'GO' signal, waiting for incoming connection on sock2
C2S test calling init_pkttrace() with pd=0x9b060002
Opening network interface 'eth0' for packet-pair timing
pcap_open_live() returned pointer 0x808c828
installing pkt filter for 'host ip.pc.running.test and port 1691'
Initial pkt src data = 0
Signal 11 received from process 9724
During a packet capture, I see another packet on 3003 that looks to be the port reopening, as you said, after the Middlebox test is run. This information is then passed from port 3001 before and immediately after the Middlebox test:
03002 30033002 3003
This is as you described. After the Middlebox test, and after this is passed on 3001, there appears to be a connection attempt on 3002. Judging by the port information, that is the point where we see the "installing pkt filter" message in the debug output...
At this point the signal 11's hit, and I manually shut down web100srv with a Ctrl-C.
Let me know if you need me to gather any more information on this for you. I hope this information is helping you as well as we continue to look into this.
Clay
------------------------------------
Richard A. Carlson e-mail:
Network Engineer phone: (734) 352-7043
Internet2 fax: (734) 913-4255
1000 Oakbrook Dr; Suite 300
Ann Arbor, MI 48104