ndt-users - RE: Error running web100 3.5.0

RE: Error running web100 3.5.0


  • From: Galuschka Christoph <>
  • To: Richard Carlson <>
  • Cc: "" <>
  • Subject: RE: Error running web100 3.5.0
  • Date: Sat, 29 Aug 2009 20:11:03 +0200

Hello Richard,
 
I'm currently running the tests with kernel 2.6.30.5 and IE6. Java, if relevant, is 1.6.11.
The funny thing is that I do get results in the web browser; the client just doesn't finish correctly.
 
I will fix the wait() comment on Monday (sorry, I'm not the best at C).
 
Regards
Christoph
 
-----------------------------------------
Ing. Christoph Galuschka

TIWAG-Tiroler Wasserkraft AG
Bereich IT/Betrieb und Services
Eduard-Wallnöfer-Platz 2
6010 Innsbruck
T: +43 (0)50607 21832
F: +43 (0)50607 41832
www.tiroler-wasserkraft.at
-----------------------------------------
Firmenbuchgericht Innsbruck, FN 44133b
Sitz der Gesellschaft: Innsbruck
DVR: 0164089


From: Richard Carlson
Sent: Sat 8/29/2009 15:28
To: Galuschka Christoph
Cc:
Subject: Re: Error running web100 3.5.0

Hi Chris;

What browser are you using?  What kernel are you using on the server?
I'll try and duplicate this in my lab.  

The problem is that the server thinks that the server-to-client test failed, even though it completed successfully. 
More in-line

On Aug 29, 2009, at 2:16 AM, Galuschka Christoph wrote:

Hello Richard,
 
thanks for posting the new releases.
I've installed 3.5.6 and I still get some errors resulting in an incomplete measurement. Here is the output from web100srv:
>>
ANL/Internet2 NDT ver 3.5.6
        Variables file = /usr/local/ndt/web100_variables
        log file = /usr/local/ndt/web100srv.log
        Debug level set to 5
[snip snip snip]
Everything was normal up to this point.
fwd.saddr = dd70b0a:3003, rev.saddr = f006c0a:3461
01:02:56.724367   10.11.215.13:3003 --> 10.108.0.15:3461 Collected pkt-pair data max = 18667
01:02:56.724367   10.108.0.15:3461 --> 10.11.215.13:3003 Collected pkt-pair data max = 65475
Read '  1 0 0 0 4 661 18667 6971 5501 5721 0 5377 976.37 0 0 0 1 0 7' from monitor pipe
Read '  0 0 0 1 367 9334 40681 26321 35413 65475 39990 34285 663.83 39864 40036 171967 0 39990 7' from monitor pipe
550764 kbps inbound
This is the measured s2c speed.

libweb100: warning: accessing depricated variable AckPktsIn
Variable 0 (AckPktsIn): web100_snap_read(): invalid arguments
libweb100: warning: accessing depricated variable AckPktsOut
[snip snip snip]
The server walks through the list of variables twice, once for the 'read' group and once for the 'tuning' group.  You can ignore these errors - they are non-events.
>>> send_msg: type=5, len=18

[snip snip snip]
The data was successfully sent back to the client.
Signal 11 received by process 3746
Signal 17 received by process 3741
The child process received the terminate signal and exited.

Protocol error!
>>> send_msg: type=7, len=61
S2C throughput test FAILED!
This says the s2c test failed, and the server sent that message to the client. However, as noted above, the test succeeded.

Finished testing C2S = 690.88 Mbps, S2C = -0.00 Mbps
Client --> Server data detects link = OC-12
Client <-- Server Ack's detect link = Gigabit Ethernet
Server --> Client data detects link = OC-12
Server <-- Client Ack's detect link = OC-12
CWND limited test = 43453.26 while unlimited = -0.71
Better throughput when CWND is limited, may be duplex mismatch
>>> send_msg: type=8, len=42
>>> send_msg: type=8, len=76
>>> send_msg: type=8, len=89
>>> send_msg: type=8, len=77
>>> send_msg: type=8, len=82
>>> send_msg: type=8, len=53
>>> send_msg: type=9, len=0
Opened '/usr/local/ndt/serverdata/2009/08/29/20090829T07:02:36.826169000Z_10.108.0.15:3444.meta' metadata log file
Successfully returned from run_test() routine
Signal 17 received by process 3740
now = 1251529386 Process started at 1251529356, run time = 30
Select exited with rc = -1
Queue pointer = 3741, testing = 1, waiting = 1, zombie_check = 0
Received SIGCHLD signal for active web100srv process [3740]
wait3() returned 0 for PID=3741
wexitstatus = '0'
Attempting to clean up child 3741, head pid = 3741
Child process 3741 causing head pointer modification
Removing Child from head, decrementing waiting now = 0
Timer not running, waiting for new connection
And everything exits properly.

>>
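The `fwd.saddr` and `rev.saddr` values in the server log above look like the IPv4 addresses from the pkt-pair lines that follow them, printed as raw 32-bit hex on a little-endian host. That reading is an assumption based only on the numbers lining up; the thread does not say how the server formats these fields. A minimal sketch of the decoding, using the hypothetical helpers `saddr_octet` and `saddr_to_dotted` (not NDT functions):

```c
#include <stdint.h>
#include <stdio.h>

/* Return octet `i` (0 = first octet of the dotted quad) of a logged saddr.
 * Assumption: the value was printed with the first address octet in the
 * lowest-order byte, as happens when a network-order address is read as a
 * 32-bit integer on a little-endian machine. */
unsigned saddr_octet(uint32_t saddr, int i)
{
    return (unsigned)((saddr >> (8 * i)) & 0xffu);
}

/* Write the dotted-quad form of `saddr` into `buf`.
 * Returns the number of characters snprintf() would have written. */
int saddr_to_dotted(uint32_t saddr, char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u.%u",
                    saddr_octet(saddr, 0), saddr_octet(saddr, 1),
                    saddr_octet(saddr, 2), saddr_octet(saddr, 3));
}
```

Under that assumption, `dd70b0a` decodes to 10.11.215.13 and `f006c0a` to 10.108.0.15, which matches the two endpoints in the pkt-pair lines.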
This is the result from the browser:
>>
Connecting to '10.11.215.13' [/10.11.215.13] to run test
Connected to: 10.11.215.13  --  Using IPv4 address
Checking for Middleboxes . . . . . . . . . . . . . . . . . .  Done
checking for firewalls . . . . . . . . . . . . . . . . . . .  Done
running 10s outbound test (client-to-server [C2S]) . . . . . 690.87Mb/s
running 10s inbound test (server-to-client [S2C]) . . . . . . 550.79Mb/s
S2C throughput test: Received wrong type of the message
ERROR MSG: Server (S2C throughput test): Invalid S2C throughput received
S2C throughput test FAILED!
The slowest link in the end-to-end path is a a 622 Mbps OC-12 subnet
>>

So the question is, why did the server mistakenly report an error?  
 
After running the test I re-read your email and checked src/testoptions.c line 730 about the comment wait(NULL). I see the /* */ are still there so I removed them and recompiled everything. This did not help very much - here is output from web100srv with the /* */ removed:
>>

[snip snip snip]

Sorry for not being clear.  The 3.5.6 code uses the waitpid() call instead of the wait() call.  They are functionally equivalent here, so when you removed the comment markers, the server now waits twice after the c2s test completes.  This causes the server to time out, and no s2c test is run.  Just remove or comment out one of the two wait()/waitpid() lines and rebuild/install to get back to one wait call.
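
To illustrate why one reap per child is enough, here is a small standalone sketch. It is not the NDT code, and `second_wait_errno` is a hypothetical name chosen for this example. It forks one child, collects it with waitpid(), and then calls wait() a second time. With no children left, the second call fails right away with ECHILD. In a server that still has another child running, the second call would instead block until that child exits, which would be consistent with the timeout described above.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork one short-lived child, reap it with waitpid(), then call wait() again.
 * Returns the errno set by the second call (ECHILD when no children remain),
 * 0 if the second call unexpectedly reaped something, or -1 on setup failure.
 * Assumes the calling process has no other children and default SIGCHLD
 * handling. */
int second_wait_errno(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);                          /* child: exit immediately */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)   /* first reap collects the child */
        return -1;

    errno = 0;
    if (wait(NULL) == -1)                  /* second reap: nothing left */
        return errno;
    return 0;
}
```

Calling `second_wait_errno()` from a process with no other children should return ECHILD, showing that the first call already did all the work.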

Regards;
Rich

 
I hope the debugging output helps...
 
thanks and best regards
Christoph Galuschka


From: Richard Carlson []
Sent: Fri 8/28/2009 14:25
To: Galuschka Christoph
Cc:
Subject: Re: Error running web100 3.5.0

Hi Chris;

Sorry about that.  There is a bug in the 3.5.0 release.  You can 
download the latest version (3.5.6, which will be posted soon) or you 
can easily patch the 3.5.0 release.  Just edit the src/testoptions.c 
file and go to line 730.  You should find the line /* wait(NULL); */, 
which of course makes this a comment.  Remove the "/*" and "*/" 
characters (so it's not a comment) and rebuild/reinstall the package.  
This should clear the fault.

Rich

On Aug 27, 2009, at 4:47 AM, wrote:

> Hello,
>
> I've just finished installing ndt-3.5.0 on a fresh SuSE 11.1 system 
> (incl. all prerequisites: patch for kernel 2.6.27, 
> web100_userland-1.7). The server runs fine and I do get bandwidth results.
>
> However, web100srv produces an error that prevents the test from 
> completing successfully. This is the output from -ddd:
> ANL/Internet2 NDT ver 3.5.0
>        Variables file = /usr/local/ndt/web100_variables
>        log file = /usr/local/ndt/web100srv.log
>        Debug level set to 1
> server ready on port 3001
> web100_init() read 69 variables from file
> Starting test suite:
>> Middlebox test
>> Simple firewall test
>> C2S throughput test
>> S2C throughput test
> <-- Middlebox test -->
>  -- port: 3003
> Sending 1456 Byte packets over the network
> Signal 17 received by process 22352
> <-------------------->
> <-- Simple firewall test -->
>  -- port: 42133
>  -- time: 1
>  -- oport: 2571
> <-------------------------->
> <-- C2S throughput test -->
>  -- port: 3002
> listening for Inet connection on testOptions->c2ssockfd, fd=3
> Sending 'GO' signal, to tell client to head for the next test
> Opening network interface 'eth2' for packet-pair timing
> installing pkt filter for 'host 10.110.109.104 and port 2574'
> Initial pkt src data = "8068484> New packet trace started -- initializing counters
> 365314 kbps outbound
> Signal USR1(10) sent to child [22355]
> Signal 10 received by process 22355
> 03:16:15.649224   03:16:15.649224   128 bytes read '  0 0 84 694 
> 7815 18212 77975 16876 70005 144937 1 1558 232.14 0 0 0 1 0' from 
> monitor pipe
> 128 bytes read '  1 0 0 99 558 1644 45429 40869 16975 3745 1 1 
> 274.82 86 14 109221 0 0' from monitor pipe
> <------------------------->
> <-- S2C throughput test -->
>  -- port: 3003
> waiting for data on testOptions->s2csockfd
> Signal 11 received by process 22355
> Signal 17 received by process 22352
> Opening network interface 'eth2' for packet-pair timing
> installing pkt filter for 'host 10.110.109.104 and port 2580'
> Initial pkt src data = "8068484> Signal 17 received by process 22352
> New packet trace started -- initializing counters
> sent 716955648 bytes to client in 10.00 seconds
> Buffer control counters Total = 87519, new data = "0," Draining Queue 
> = 0
> Signal USR2(12) sent to child [22357]
> Signal 12 received by process 22357
> 03:16:26.019890   03:16:26.019890   Read '  0 0 1 1 5 6 9612 1476 
> 1235 1247 0 4235 454.46 0 0 0 1 0' from monitor pipe
> Read '  0 0 0 3 1179 9323 46700 9154 13887 148207 4687 19061 273.91 
> 10056 242043 102 0 4687' from monitor pipe
> 573470 kbps inbound
> libweb100: warning: accessing depricated variable AckPktsIn
> libweb100: warning: accessing depricated variable AckPktsOut
> Variable 13 (CwndRestores) not found in KIS
> Variable 22 (MaxCaCwnd) not found in KIS
> Variable 30 (MaxSaCwnd) not found in KIS
> Variable 13 (CwndRestores) not found in KIS
> Variable 22 (MaxCaCwnd) not found in KIS
> Variable 30 (MaxSaCwnd) not found in KIS
> Signal 11 received by process 22357
> Signal 17 received by process 22352
> Protocol error!
> S2C throughput test FAILED!
> Client --> Server data detects link = 10 Gigabit Enet
> Client <-- Server Ack's detect link = OC-12
> Server --> Client data detects link = OC-12
> Server <-- Client Ack's detect link = 10 Gigabit Enet
>
> If I'm not mistaken, the 2 lines:
>>>
> libweb100: warning: accessing depricated variable AckPktsIn
> libweb100: warning: accessing depricated variable AckPktsOut
>>>
> probably are the source of the problem.
>
> Any ideas what I might have missed?
>
> thanks and best regards
> Christoph

Richard Carlson

1000 Oakbrook Dr
Ann Arbor, MI  48104

P: 734-352-7043
C: 630-251-4572

