ndt-users - Re: FW: Protocol error - missed text message!
Subject: ndt-users list created
List archive
- From: Rich Carlson <>
- To: "" <>
- Subject: Re: FW: Protocol error - missed text message!
- Date: Fri, 11 Feb 2011 09:32:14 -0500
Matt;
In-line.
On 2/11/2011 2:16 AM,
wrote:
Hi Rich and others ,
I have try to test with my ndt server without port 7123 , the same error
occur :
# sudo web100clt -4 -l -n [NDT_Server_IP] -dddddd
These message indicate that the client contacted the server on TCP port 3001 and the server responded, the test will begin.
Testing network path
for configuration and performance problems -- Send buffer initialized to
16384, Receive buffer initialized to 87380 Using IPv4 address Requesting
test suite:
Middlebox test
Simple firewall test
C2S throughput test
S2C throughput test
New Client, impliments queuing feedback Server version: 3.6.4 Receivedtests sequence: ' 1 8 2 4'
<-- Middlebox test -->
-- port: 3003
connecting to [NDT_Server_IP]:3003
Send buffer initialized to 16384, Receive buffer initialized to 87380
Checking for Middleboxes ****************************** CWND limited
speed = 32274.20 kbps Done <-------------------->
The server was able to communicate with the client on TCP port 3003.
<-- Simple firewall test -->
checking for firewalls . . . . . . . . . . . . . . . . . . .
-- port: 47955
-- time: 1
Send buffer initialized to 16384, Receive buffer initialized to 87380
-- oport: 54847
Send buffer initialized to 16384, Receive buffer initialized to 87380
Simple firewall test: no connection for 1 seconds SIGALRM was caught
Done <-------------------------->
The previous messages indicate that there is a firewall blocking connections on ephemeral ports. As Jason noted, this firewall might also be blocking communications on TCP port 3002. (see below)
<-- C2S throughput test -->
-- port: 3002
Send buffer initialized to 16384, Receive buffer initialized to 87380
running 10s outbound test (client to server) . . . . . Protocol error -
missed text message!
C2S throughput test FAILED!
The C2S test failed. From this point on, the server and client are out of sync and the S2C fail as well.
<-- S2C throughput test -->
Protocol error - missed prepare message!
S2C throughput test FAILED!
Protocol error - expected results! got ' 1', msgtype=2
# web100srv -4 -s -l /var/log/ndt/web100srv.log -dddddd
ANL/Internet2 NDT ver 3.6.4
Variables file = /usr/local/ndt/web100_variables
log file = /var/log/ndt/web100srv.log
Syslog facility = default (128)
Debug level set to 6
Send buffer initialized to 16384, Receive buffer initialized to 87380
server ready on port 3001
web100_init() read 69 variables from file NDT server (v3.6.4) proces
[24836] started at Feb 10 22:03:40 Generated iflist with device=eth0 and
if_speed=7 nothing in queue Timer not running, waiting for new
connection New connection received, waiting for accept() to complete
Select() found 1 clients ready, highest fd=3 New connection received
from 0x1c22ce0 [192.168.1.153] sockfd=7.
Parent process spawned child = 24839
Parent thinks pipe() returned fd0=8, fd1=9 Child thinks pipe() returned
fd0=8, fd1=9 for pid=0creating new child - semaphore locked
Free'd ndtq semaphore lock - 1
initialize_tests returned old/new client = 1, test_suite = 1 8 2 4
Incrementing waiting variable now = 1 Line just before ChldRdy: label,
did't get here via the goto statement!
___-------______ client process is ready for testing!
Telling client 24839 testing will begin now sending 'GO' signal to
client msg='go 31 1 8 2 4'
Queue pointer=24839, testing=1, waiting=1, mclients=0, zombie_check=0
Waiting for new connection, timer running Child 24839 received 'go 31 1
8 2 4' from parent Got 'go' signal from parent, ready to start testing
24839 extracting test_suite ' 1 8 2 4' and t_opts '1f' from buff 'go 31
1 8 2 4'
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running setting master alarm() to 120 seconds, tests
must start (complete?) before this timer expires Valid test sequence
requested, run test for client=24839 Child process 24839 started
run_test() routine, asking for test_suite = 1 8 2 4 Starting test suite:
Middlebox test
Simple firewall test
C2S throughput test
S2C throughput test
The client has connected and the server has agreed to run these 4 tests.
Starting middlebox test
<-- 24839 - Middlebox test -->
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running
Send buffer initialized to 16384, Receive buffer initialized to 87380
-- port: 3003
Looking for Web100 data on socketid 8
Server: matt_NDT_Server;3003 Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running
Client: 192.168.1.153;53993 1444;7;7;
Sending 1444 Byte packets over the network Setting Cwnd Limit to 2888
octets Signal 17 received by process 24839 Timer expired while waiting
for a new connection Queue pointer=24839, testing=1, waiting=1,
mclients=0, zombie_check=0 Waiting for new connection, timer running
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Finished with web100_middlebox() routine
snap-0x1c22af0, sndbuff=1c26c200x CWND limited throughput = 31696 kbps
(31696)
<--------- 24839 ----------->
The preceding messages indicate the client opened a test connection on TCP port 3003 and completed the middlebox test. Everything is good so far.
Starting simple firewall testThe server has completed the simple firewall tests. It found that both the server and the client are blocked from opening TCP connections of ephemeral ports. There must be some kind of firewall in the path.
<-- 24839 - Simple firewall test -->
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running
Send buffer initialized to 16384, Receive buffer initialized to 87380
-- port: 52206
-- SFW time: 1
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running
-- oport: 50789
Send buffer initialized to 16384, Receive buffer initialized to 87380
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running SFW - SIGALRM was caught SFW - SIGALRM was
caught Unable to create connect socket.
<-------------------------->
Starting c2s throughput test
<-- 24839 - C2S throughput test -->
Timer expired while waiting for a new connection Queue pointer=24839,
testing=1, waiting=1, mclients=0, zombie_check=0 Waiting for new
connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running
Send buffer initialized to 16384, Receive buffer initialized to 87380
-- port: 3002
listening for Inet connection on testOptions->c2ssockfd, fd=3 Sending
'GO' signal, to tell client 24839 to head for the next test child 24839
- sent c2s prepare to client
accept() for 24839 completed
child 24839 - c2s ready for test with fd=9 C2S test Child 24839 thinks
pipe() returned fd0=10, fd1=11 Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0 Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0 Waiting for new connection, timer
running Unknown address family=17 found
IPv4 interface found address=[NDT_Server_IP]
IPv4 address match, setting device to 'eth0'
Opening network interface 'eth0' for packet-pair timing
pcap_open_live() returned pointer 0x1c288f0 installing pkt filter for
'host 192.168.1.153 and port 42660'
Initial pkt src data = 628f64
C2S test Parent thinks pipe() returned fd0=10, fd1=11
The server has started the c2s test. the client has opened a new connection to the server on port 3002, which the server has received. A child process has been created to perform the pkt-pair timing task. And testing should begin.
Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0
Waiting for new connection, timer
running Timer expired while waiting for a new connection Queue
pointer=24839, testing=1, waiting=1, mclients=0, zombie_check=0
Waiting
for new connection, timer running Timer expired while waiting for a new
connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0
Waiting for new connection, timer running New IPv4 packet
trace started -- initializing counters Timer expired while waiting for a
new connection Queue pointer=24839, testing=1, waiting=1, mclients=0,
zombie_check=0
Waiting for new connection, timer running Timer expired
while waiting for a new connection Queue pointer=24839, testing=1,
waiting=1, mclients=0, zombie_check=0
Waiting for new connection, timer
running Timer expired while waiting for a new connection Queue
pointer=24839, testing=1, waiting=1, mclients=0, zombie_check=0
The preceding messages indicate that there is a client in the queue. They go off every 3 seconds. You should see 3-4 messages and then there should be messages showing that the test ended, and the child process will send data back to the parent. (the error below indicates this failed, because the parent process has been killed by a timeout.
Thus, the question is, why did this test not complete?
1) is there something wrong with the compile process. Do you have the libpcap development package installed? If not, install, rebuild, reinstall the NDT server.
2) did the test start? Turn on tcp tracing (--tcpdump). This will create a TCP dump file for each test. If this file is empty, then the test did not start. If the file has data, then the test did not finish.
3) capture the traffic on the control channel, run tcpdump on port 3001. This will also tell you if the test started and/or completed.
This should get you more data about what is happening.
The messaged below indicate that the server is cleaning up after a timeout occurred. That is all normal.
Fault: Something in queue, but child 24839 (fd=7) has exceeded wait time
pid=24839, client='192.168.1.153', stime=1297346632, qtime=1297346632
now=1297346704 pipe-fd=9, running=1, ctlsockfd=7, client-type=1,
tests='1 8 2 4'
writen() Error! write(7) failed with err='Bad file descriptor(9) pic=24836'
Processing SIGCHLD signal for active web100srv process [-1], sig17=0
Stuck child at head of queue, set pid=24839 and remove it from queue
checking for pktpair timing children, skip them
Looking for 24839, curent queue Child 24839, host: 192.168.1.153
[192.168.1.153], next=0x0 Main test process 24839 terminated, remove
from queue Attempting to clean up child 24839, head pid = 24839 Walkingqueue
Child 24839, host: 192.168.1.153 [192.168.1.153], next=0x0 Child process
24839 causing head pointer modification, semaphore locked modifying
queue tmp1=0x1c23170, head_ptr=0x1c23170 free tmp1=0x1c23170 Removing
Child from head, decremented waiting/mclients 0/0 Free'd ndtq semaphore
lock - 3 Processing SIGCHLD signal for active web100srv process [24839],
sig17=0
wait4() returned 32766 for PID=0
wtermsig = 126
child_sig() routine, wait4() non-zero status (32766) returned nothing in
queueSignal 15 received by process 24839
Timer not running, waiting for new connection Signal 17 received by
process 24836 Signal 17 (SIGCHLD) received - completed tests = 1
Any help greatly appreciated.
Matt
- Protocol error - missed text message!, matt.crawfoord, 02/09/2011
- Re: Protocol error - missed text message!, Jason Zurawski, 02/09/2011
- Message not available
- Re: Protocol error - missed text message!, Jason Zurawski, 02/10/2011
- Message not available
- Re: Protocol error - missed text message!, Jason Zurawski, 02/10/2011
- RE: Protocol error - missed text message!, matt.crawfoord, 02/10/2011
- Re: Protocol error - missed text message!, Jason Zurawski, 02/10/2011
- RE: Protocol error - missed text message!, matt.crawfoord, 02/10/2011
- RE: Protocol error - missed text message!, matt.crawfoord, 02/10/2011
- Re: Protocol error - missed text message!, Jason Zurawski, 02/10/2011
- Message not available
- Re: Protocol error - missed text message!, Jason Zurawski, 02/10/2011
- Message not available
- Re: Protocol error - missed text message!, Jason Zurawski, 02/09/2011
- <Possible follow-up(s)>
- Re: FW: Protocol error - missed text message!, Rich Carlson, 02/11/2011
Archive powered by MHonArc 2.6.16.