Skip to Content.
Sympa Menu

ndt-users - Re: FW: Protocol error - missed text message!

Subject: ndt-users list created

List archive

Re: FW: Protocol error - missed text message!


Chronological Thread 
  • From: Rich Carlson <>
  • To: "" <>
  • Subject: Re: FW: Protocol error - missed text message!
  • Date: Fri, 11 Feb 2011 09:32:14 -0500

Matt;

In-line.

On 2/11/2011 2:16 AM,

wrote:
Hi Rich and others ,

I have try to test with my ndt server without port 7123 , the same error
occur :

# sudo web100clt -4 -l -n [NDT_Server_IP] -dddddd

These message indicate that the client contacted the server on TCP port 3001 and the server responded, the test will begin.
Testing network path
for configuration and performance problems -- Send buffer initialized to
16384, Receive buffer initialized to 87380 Using IPv4 address Requesting
test suite:

Middlebox test

Simple firewall test

C2S throughput test

S2C throughput test

New Client, impliments queuing feedback Server version: 3.6.4 Received
tests sequence: ' 1 8 2 4'

<-- Middlebox test -->

-- port: 3003

connecting to [NDT_Server_IP]:3003

Send buffer initialized to 16384, Receive buffer initialized to 87380
Checking for Middleboxes ****************************** CWND limited
speed = 32274.20 kbps Done <-------------------->

The server was able to communicate with the client on TCP port 3003.

<-- Simple firewall test -->

checking for firewalls . . . . . . . . . . . . . . . . . . .

-- port: 47955

-- time: 1

Send buffer initialized to 16384, Receive buffer initialized to 87380

-- oport: 54847

Send buffer initialized to 16384, Receive buffer initialized to 87380
Simple firewall test: no connection for 1 seconds SIGALRM was caught
Done <-------------------------->


The previous messages indicate that there is a firewall blocking connections on ephemeral ports. As Jason noted, this firewall might also be blocking communications on TCP port 3002. (see below)

<-- C2S throughput test -->

-- port: 3002

Send buffer initialized to 16384, Receive buffer initialized to 87380
running 10s outbound test (client to server) . . . . . Protocol error -
missed text message!

C2S throughput test FAILED!

The C2S test failed. From this point on, the server and client are out of sync and the S2C fail as well.

<-- S2C throughput test -->

Protocol error - missed prepare message!

S2C throughput test FAILED!

Protocol error - expected results! got ' 1', msgtype=2

# web100srv -4 -s -l /var/log/ndt/web100srv.log -dddddd

ANL/Internet2 NDT ver 3.6.4

Variables file = /usr/local/ndt/web100_variables

log file = /var/log/ndt/web100srv.log

Syslog facility = default (128)

Debug level set to 6

Send buffer initialized to 16384, Receive buffer initialized to 87380
server ready on port 3001

web100_init() read 69 variables from file NDT server (v3.6.4) proces
[24836] started at Feb 10 22:03:40 Generated iflist with device=eth0 and
if_speed=7 nothing in queue Timer not running, waiting for new
connection New connection received, waiting for accept() to complete

Select() found 1 clients ready, highest fd=3 New connection received
from 0x1c22ce0 [192.168.1.153] sockfd=7.

Parent process spawned child = 24839

Parent thinks pipe() returned fd0=8, fd1=9 Child thinks pipe() returned
fd0=8, fd1=9 for pid=0creating new child - semaphore locked

Free'd ndtq semaphore lock - 1

initialize_tests returned old/new client = 1, test_suite = 1 8 2 4
Incrementing waiting variable now = 1 Line just before ChldRdy: label,
did't get here via the goto statement!

___-------______ client process is ready for testing!

Telling client 24839 testing will begin now sending 'GO' signal to
client msg='go 31 1 8 2 4'

Queue pointer=24839, testing=1, waiting=1, mclients=0, zombie_check=0
Waiting for new connection, timer running Child 24839 received 'go 31 1
8 2 4' from parent Got 'go' signal from parent, ready to start testing
24839 extracting test_suite ' 1 8 2 4' and t_opts '1f' from buff 'go 31
1 8 2 4'

Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running setting master alarm() to 120 seconds, tests
must start (complete?) before this timer expires Valid test sequence
requested, run test for client=24839 Child process 24839 started

run_test() routine, asking for test_suite = 1 8 2 4 Starting test suite:

Middlebox test

Simple firewall test

C2S throughput test

S2C throughput test

The client has connected and the server has agreed to run these 4 tests.
Starting middlebox test

<-- 24839 - Middlebox test -->

Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running

Send buffer initialized to 16384, Receive buffer initialized to 87380

-- port: 3003

Looking for Web100 data on socketid 8

Server: matt_NDT_Server;3003 Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running

Client: 192.168.1.153;53993 1444;7;7;

Sending 1444 Byte packets over the network Setting Cwnd Limit to 2888
octets Signal 17 received by process 24839 Timer expired while waiting
for a new connection Queue pointer=24839, testing=1, waiting=1,
mclients=0, zombie_check=0 Waiting for new connection, timer running
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Finished with web100_middlebox() routine
snap-0x1c22af0, sndbuff=1c26c200x CWND limited throughput = 31696 kbps
(31696)

<--------- 24839 ----------->

The preceding messages indicate the client opened a test connection on TCP port 3003 and completed the middlebox test. Everything is good so far.

Starting simple firewall test

<-- 24839 - Simple firewall test -->

Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running

Send buffer initialized to 16384, Receive buffer initialized to 87380

-- port: 52206

-- SFW time: 1

Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running

-- oport: 50789

Send buffer initialized to 16384, Receive buffer initialized to 87380
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running SFW - SIGALRM was caught SFW - SIGALRM was
caught Unable to create connect socket.

<-------------------------->
The server has completed the simple firewall tests. It found that both the server and the client are blocked from opening TCP connections of ephemeral ports. There must be some kind of firewall in the path.


Starting c2s throughput test

<-- 24839 - C2S throughput test -->

Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running

Send buffer initialized to 16384, Receive buffer initialized to 87380

-- port: 3002

listening for Inet connection on testOptions->c2ssockfd, fd=3 Sending
'GO' signal, to tell client 24839 to head for the next test child 24839
- sent c2s prepare to client

accept() for 24839 completed

child 24839 - c2s ready for test with fd=9 C2S test Child 24839 thinks
pipe() returned fd0=10, fd1=11 Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running Unknown address family=17 found

IPv4 interface found address=[NDT_Server_IP]

IPv4 address match, setting device to 'eth0'

Opening network interface 'eth0' for packet-pair timing

pcap_open_live() returned pointer 0x1c288f0 installing pkt filter for
'host 192.168.1.153 and port 42660'

Initial pkt src data = 628f64

C2S test Parent thinks pipe() returned fd0=10, fd1=11

The server has started the c2s test. the client has opened a new connection to the server on port 3002, which the server has received. A child process has been created to perform the pkt-pair timing task. And testing should begin.
Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0

Waiting for new connection, timer
running Timer expired while waiting for a new connection Queue
pointer=24839, testing=1, waiting=1, mclients=0, zombie_check=0

Waiting
for new connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0

Waiting for new connection, timer running New IPv4 packet
trace started -- initializing counters Timer expired while waiting for a
new connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0

Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0

Waiting for new connection, timer
running Timer expired while waiting for a new connection Queue
pointer=24839, testing=1, waiting=1, mclients=0, zombie_check=0

The preceding messages indicate that there is a client in the queue. They go off every 3 seconds. You should see 3-4 messages and then there should be messages showing that the test ended, and the child process will send data back to the parent. (the error below indicates this failed, because the parent process has been killed by a timeout.

Thus, the question is, why did this test not complete?

1) is there something wrong with the compile process. Do you have the libpcap development package installed? If not, install, rebuild, reinstall the NDT server.

2) did the test start? Turn on tcp tracing (--tcpdump). This will create a TCP dump file for each test. If this file is empty, then the test did not start. If the file has data, then the test did not finish.

3) capture the traffic on the control channel, run tcpdump on port 3001. This will also tell you if the test started and/or completed.

This should get you more data about what is happening.


The messaged below indicate that the server is cleaning up after a timeout occurred. That is all normal.

Fault: Something in queue, but child 24839 (fd=7) has exceeded wait time
pid=24839, client='192.168.1.153', stime=1297346632, qtime=1297346632
now=1297346704 pipe-fd=9, running=1, ctlsockfd=7, client-type=1,
tests='1 8 2 4'

writen() Error! write(7) failed with err='Bad file descriptor(9) pic=24836'

Processing SIGCHLD signal for active web100srv process [-1], sig17=0
Stuck child at head of queue, set pid=24839 and remove it from queue
checking for pktpair timing children, skip them

Looking for 24839, curent queue Child 24839, host: 192.168.1.153
[192.168.1.153], next=0x0 Main test process 24839 terminated, remove
from queue Attempting to clean up child 24839, head pid = 24839 Walkingqueue

Child 24839, host: 192.168.1.153 [192.168.1.153], next=0x0 Child process
24839 causing head pointer modification, semaphore locked modifying
queue tmp1=0x1c23170, head_ptr=0x1c23170 free tmp1=0x1c23170 Removing
Child from head, decremented waiting/mclients 0/0 Free'd ndtq semaphore
lock - 3 Processing SIGCHLD signal for active web100srv process [24839],
sig17=0

wait4() returned 32766 for PID=0

wtermsig = 126

child_sig() routine, wait4() non-zero status (32766) returned nothing in
queueSignal 15 received by process 24839

Timer not running, waiting for new connection Signal 17 received by
process 24836 Signal 17 (SIGCHLD) received - completed tests = 1

Any help greatly appreciated.

Matt




Archive powered by MHonArc 2.6.16.

Top of Page