perfsonar-user - Re: [perfsonar-user] throughput host missing test graphs
- From: Trey Dockendorf <>
- To: perfsonar-user <>
- Subject: Re: [perfsonar-user] throughput host missing test graphs
- Date: Tue, 27 Jan 2015 19:50:45 -0600
Thought I'd mention this issue is now resolved. Once I correctly set the MTU on the switch for these hosts, all the test data started showing up.
Thanks,
- Trey
=============================
Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email:
Jabber:
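For readers who hit the same symptom, here is a minimal sketch of one way to check whether large frames pass unfragmented between two hosts. It assumes IPv4 and the Linux iputils `ping` (`-M do` forbids fragmentation, `-s` sets the ICMP payload size, `-c` sets the packet count). The thread does not say which MTU values were involved, so the 9000-byte jumbo-frame figure mentioned below is only an assumption, and the function names are hypothetical.

```python
import subprocess

IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def ping_command(host, mtu, count=3):
    """Build a Linux iputils ping command with the don't-fragment bit set."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for_mtu(mtu)), host]


def path_passes_mtu(host, mtu):
    """Return True if full-size, unfragmented pings to host succeed."""
    result = subprocess.run(ping_command(host, mtu),
                            capture_output=True, text=True)
    return result.returncode == 0
```

Calling `path_passes_mtu("ps.tacc.utexas.edu", 1500)` and then again with the locally configured jumbo MTU (for example 9000) shows whether the larger size is dropped somewhere on the path. A host that answers at 1500 but not at the larger size suggests an MTU mismatch on an intermediate device, although ICMP filtering can produce the same result.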
On Tue, Jan 20, 2015 at 4:28 PM, Trey Dockendorf <> wrote:
Jason,

Thanks, so far the majority of hosts give "Invalid protocol message received" with iperf3.

Trying iperf as the tool used I get "local tool did not complete in allocated time frame and was killed" [1]. I tried increasing test interval to 30 but that doesn't seem like the right tuning knob to change the allocated time frame.

The nuttcp seems to throw error "Bad file descriptor" and produces no measurements [2].

So far most of the hosts I'm testing against are throwing these errors. Unsure if this issue is on my end or the remote end.

Exceptions to the errors above were hosts that are clearly down and ps1-hardy-hstn.tx-learn.net which failed in the same ways except using nuttcp and only when using the "-s" flag.

[1]:
$ bwctl -T iperf -t 20 -i 1 -f m -x -vv -c ps.tacc.utexas.edu
Messages being sent to syslog(user,err)
bwctl[555]: FILE=bwctl.c, LINE=2961, Using 165.91.55.6 as the address for local sender
bwctl[555]: FILE=bwctl.c, LINE=2961, Using ps.tacc.utexas.edu as the address for remote receiver
bwctl[555]: FILE=bwctl.c, LINE=3008, Available in-common: iperf nuttcp iperf3
bwctl[555]: FILE=bwctl.c, LINE=3055, Using tool: iperf
bwctl[555]: FILE=bwctl.c, LINE=3150, Requested Time: 1421780927.581861
bwctl[555]: FILE=bwctl.c, LINE=3152, Latest Acceptable Time: 1421781527.581861
bwctl[555]: FILE=bwctl.c, LINE=3332, Reservation(ps.tacc.utexas.edu): 1421780933.303169
bwctl[555]: FILE=bwctl.c, LINE=3189, Server 'ps.tacc.utexas.edu' accepted test request at time 1421780933.303169
bwctl[555]: FILE=bwctl.c, LINE=3332, Reservation(localhost): 1421780933.303169
bwctl[555]: FILE=bwctl.c, LINE=3218, Client 'localhost' accepted test request at time 1421780933.303169
bwctl[555]: FILE=bwctl.c, LINE=3450, 27 seconds until test results available
RECEIVER START
bwctl: start_endpoint: 3630769727.276532
bwctl: run_endpoint: receiver: 129.114.0.189
bwctl: run_endpoint: sender: 165.91.55.6
bwctl: exec_line: iperf -B 129.114.0.189 -s -f m -m -p 5085 -t 20 -i 1.000000
bwctl: run_tool: tester: iperf
bwctl: run_tool: receiver: 129.114.0.189
bwctl: run_tool: sender: 165.91.55.6
bwctl: start_tool: 3630769730.794569
------------------------------------------------------------
Server listening on TCP port 5085
Binding to local address 129.114.0.189
TCP window size: 0.08 MByte (default)
------------------------------------------------------------
[ 15] local 129.114.0.189 port 5085 connected with 165.91.55.6 port 53366
Waiting for server threads to complete. Interrupt again to force quit.
bwctl: local tool did not complete in allocated time frame and was killed
bwctl: stop_tool: 3630769755.812094
bwctl: stop_endpoint: 3630769756.833291
RECEIVER END
SENDER START
bwctl: start_endpoint: 3630769727.281346
bwctl: run_endpoint: receiver: 129.114.0.189
bwctl: run_endpoint: sender: 165.91.55.6
bwctl: exec_line: iperf -c 129.114.0.189 -B 165.91.55.6 -f m -m -p 5085 -t 20 -i 1.000000
bwctl: run_tool: tester: iperf
bwctl: run_tool: receiver: 129.114.0.189
bwctl: run_tool: sender: 165.91.55.6
bwctl: start_tool: 3630769733.303298
------------------------------------------------------------
Client connecting to 129.114.0.189, TCP port 5085
Binding to local address 165.91.55.6
TCP window size: 0.09 MByte (default)
------------------------------------------------------------
[ 8] local 165.91.55.6 port 53366 connected with 129.114.0.189 port 5085
bwctl: local tool did not complete in allocated time frame and was killed
bwctl: stop_tool: 3630769755.825698
bwctl: stop_endpoint: 3630769756.827477
SENDER END

[2]:
$ bwctl -T nuttcp -t 20 -i 1 -f m -x -vv -c ps.tacc.utexas.edu
Messages being sent to syslog(user,err)
bwctl[5045]: FILE=bwctl.c, LINE=2961, Using 165.91.55.6 as the address for local sender
bwctl[5045]: FILE=bwctl.c, LINE=2961, Using ps.tacc.utexas.edu as the address for remote receiver
bwctl[5045]: FILE=bwctl.c, LINE=3008, Available in-common: iperf nuttcp iperf3
bwctl[5045]: FILE=bwctl.c, LINE=3055, Using tool: nuttcp
bwctl[5045]: FILE=bwctl.c, LINE=3150, Requested Time: 1421781241.364543
bwctl[5045]: FILE=bwctl.c, LINE=3152, Latest Acceptable Time: 1421781841.364543
bwctl[5045]: FILE=bwctl.c, LINE=3332, Reservation(ps.tacc.utexas.edu): 1421781247.724141
bwctl[5045]: FILE=bwctl.c, LINE=3189, Server 'ps.tacc.utexas.edu' accepted test request at time 1421781247.724141
bwctl[5045]: FILE=bwctl.c, LINE=3332, Reservation(localhost): 1421781247.724141
bwctl[5045]: FILE=bwctl.c, LINE=3218, Client 'localhost' accepted test request at time 1421781247.724141
bwctl[5045]: FILE=bwctl.c, LINE=3450, 27 seconds until test results available
RECEIVER START
bwctl: start_endpoint: 3630770041.058892
bwctl: run_endpoint: receiver: 129.114.0.189
bwctl: run_endpoint: sender: 165.91.55.6
bwctl: exec_line: nuttcp -vv -p 5530 -P 5000 -i 1.000000 -T 20 --nofork -1
bwctl: run_tool: tester: nuttcp
bwctl: run_tool: receiver: 129.114.0.189
bwctl: run_tool: sender: 165.91.55.6
bwctl: start_tool: 3630770044.896131
nuttcp_mread: Bad file descriptor
bwctl: stop_tool: 3630770062.822761
bwctl: stop_endpoint: 3630770065.833646
RECEIVER END
SENDER START
bwctl: start_endpoint: 3630770041.063726
bwctl: run_endpoint: receiver: 129.114.0.189
bwctl: run_endpoint: sender: 165.91.55.6
bwctl: exec_line: nuttcp -vv -p 5530 -P 5000 -i 1.000000 -T 20 -t 129.114.0.189
bwctl: run_tool: tester: nuttcp
bwctl: run_tool: receiver: 129.114.0.189
bwctl: run_tool: sender: 165.91.55.6
bwctl: start_tool: 3630770047.724292
bwctl: stop_tool: 3630770065.827634
bwctl: stop_endpoint: 3630770065.827928
SENDER END

On Tue, Jan 20, 2015 at 1:05 PM, Jason Zurawski <> wrote:
Hey Trey;
Apologies, I fat fingered the instructions - try a lowercase ’s’ for that command to TACC (or just view the -help output from BWCTL to see the command line options).
With regard to the error, perhaps try using ‘-T iperf’ or ‘-T nuttcp’ to see if changing the tool makes any difference. The default testing tool is now iperf3, but seeing what the others say would be good. As I noted in the previous mail, there are still some funny behaviors the developers are sorting out with iperf3 and BWCTL.
Thanks;
-jason
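The tool-by-tool comparison suggested above could be scripted roughly as follows. This is only a sketch: the flags and error strings are copied from the commands and logs in this thread, while the function names and the short labels are hypothetical. An empty result means none of these known strings appeared, which is not proof that the test succeeded.

```python
import subprocess

# Flags copied from the commands in this thread; with "-c HOST" the
# remote HOST acts as the receiver.
COMMON_FLAGS = ["-t", "20", "-i", "1", "-f", "m", "-x", "-vv"]

# Error strings quoted in this thread, mapped to short (made-up) labels.
KNOWN_ERRORS = {
    "Invalid protocol message received": "protocol error",
    "local tool did not complete in allocated time frame": "tool killed",
    "Bad file descriptor": "nuttcp read failure",
    "Timed out waiting for results": "no results",
}


def bwctl_command(tool, host):
    """Build the bwctl command line used in this thread for one tool."""
    return ["bwctl", "-T", tool] + COMMON_FLAGS + ["-c", host]


def classify(output):
    """Return sorted labels for every known error string found in output."""
    return sorted({label for needle, label in KNOWN_ERRORS.items()
                   if needle in output})


def try_tools(host, tools=("iperf3", "iperf", "nuttcp")):
    """Run bwctl once per tool and collect the error labels seen for each."""
    results = {}
    for tool in tools:
        proc = subprocess.run(bwctl_command(tool, host),
                              capture_output=True, text=True)
        results[tool] = classify(proc.stdout + proc.stderr)
    return results
```

Running `try_tools("ps.tacc.utexas.edu")` on the measurement host returns a dictionary keyed by tool name, which makes it easier to see whether a failure follows one tool or affects all of them.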
> On Jan 20, 2015, at 12:57 PM, Trey Dockendorf <> wrote:
>
> Jason,
>
> Thanks for the response.
>
> For #1 the second command doesn't work
>
> bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -S ps.tacc.utexas.edu
>
> Gives me:
>
> bwctl: Invalid value for TOS. (-S)
>
> Unsure what a valid value would be.
>
> When I tried the first command against a remote host I got the following:
>
> # bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -c ps.tacc.utexas.edu
> Messages being sent to syslog(user,err)
> bwctl[30659]: FILE=bwctl.c, LINE=2961, Using 165.91.55.6 as the address for local sender
> bwctl[30659]: FILE=bwctl.c, LINE=2961, Using ps.tacc.utexas.edu as the address for remote receiver
> bwctl[30659]: FILE=bwctl.c, LINE=3008, Available in-common: iperf nuttcp iperf3
> bwctl[30659]: FILE=bwctl.c, LINE=3055, Using tool: iperf3
> bwctl[30659]: FILE=bwctl.c, LINE=3150, Requested Time: 1421776247.770821
> bwctl[30659]: FILE=bwctl.c, LINE=3152, Latest Acceptable Time: 1421776847.770821
> bwctl[30659]: FILE=bwctl.c, LINE=3332, Reservation(ps.tacc.utexas.edu): 1421776254.700447
> bwctl[30659]: FILE=bwctl.c, LINE=3189, Server 'ps.tacc.utexas.edu' accepted test request at time 1421776254.700447
> bwctl[30659]: FILE=bwctl.c, LINE=3332, Reservation(localhost): 1421776254.700447
> bwctl[30659]: FILE=bwctl.c, LINE=3218, Client 'localhost' accepted test request at time 1421776254.700447
> bwctl[30659]: FILE=bwctl.c, LINE=3450, 28 seconds until test results available
> bwctl[30659]: FILE=protocol.c, LINE=506, BWLReadRequestType: Read interrupted by signal.
> bwctl[30659]: FILE=capi.c, LINE=1166, BWLEndSession: Invalid protocol message received...
> bwctl[30659]: FILE=bwctl.c, LINE=3546, Timed out waiting for results
>
> This particular endpoint is of interest at this time as another group on our campus has asked me to test against the remote site to investigate recent performance issues from their cluster to the remote site.
>
> I'll continue testing each host whose results are missing and see if the errors are non-obvious.
>
> Thanks,
> - Trey
>
>
> On Tue, Jan 20, 2015 at 10:14 AM, Jason Zurawski <> wrote:
> Hey Trey;
>
> Your mail is timely, we are trying to re-write some documentation on this. This will be sort of scattered, but may be enough to get you started:
>
> 1) Take a look at the hosts you have configured vs. the ones that are succeeding. See if you can do a test by hand to one of the failures (e.g. on the cmd line: bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -c HOST and bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -S HOST). If the host is dead, doesn’t support the requested test type, or denies your test in some way it is most likely a problem on the other end. Maybe send an email to that admin to see what is up.
>
> 2) If the test from the first part succeeded, it's time to look in the database. In your case, open your favorite browser (preferably one with a JSON prettifier in place) to your esmond database:
>
> http://psonar-bwctl.brazos.tamu.edu/esmond/perfsonar/archive/?format=json
>
> Each test should have a corresponding record, both the ones that succeed and the ones that fail. Picking one that is failing for you (to ps1-akard-dlls.tx-learn.net - the LEARN Dallas node), we can pull up the failure event type:
>
> http://psonar-bwctl.brazos.tamu.edu/esmond/perfsonar/archive/7f8d5be4115b4927b7ce6a5dfbfa8e44/failures/base?format=json
>
> In this case we get a (useful?) error message:
>
> > Problem parsing output: malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)") at /opt/perfsonar_ps/regular_testing/bin/../lib/perfSONAR_PS/RegularTesting/Parsers/Iperf3.pm line 54.
>
>
> This is a known problem right now between certain versions of BWCTL and iperf3 that the developers are working on. Another common error:
>
> > interrupt - the client has terminated
>
>
> Typically means there was either a firewall or NTP issue.
>
> 3) Check your server side logs (/var/log/perfsonar) to see if anything is showing up in the bwctl_owamp log. Common errors could be permission denied due to limits violations, the lack of being able to get a testing slot (common for busy perfSONAR nodes) or the aforementioned firewall issues.
>
> In general if your node is working for ‘some’ things, it may be on the healthy side. Things that you are trying to test against may just not be reachable, or could be in need of an upgrade (another common issue we are seeing - until we can get a majority of the instances to 3.4.x).
>
> Hope this helps;
>
> -jason
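The database check in step 2 can also be done from a script instead of a browser. The sketch below builds the failure URL using the pattern shown in the message above (archive URL, metadata key, then `failures/base?format=json`). The shape of the returned datapoints, a list of objects with `ts` and a `val` that holds an `error` string, is an assumption about esmond's output and should be checked against a real response. Function names are hypothetical.

```python
import json
from urllib.request import urlopen

# Archive URL taken from the thread.
ARCHIVE = "http://psonar-bwctl.brazos.tamu.edu/esmond/perfsonar/archive/"


def failures_url(archive_url, metadata_key):
    """Build the failure event-type URL for one test record."""
    return "{}/{}/failures/base?format=json".format(
        archive_url.rstrip("/"), metadata_key)


def fetch_json(url):
    """Download and decode a JSON document."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def failure_messages(datapoints):
    """Pull error text out of failure datapoints.

    Assumes each datapoint looks like {"ts": ..., "val": {"error": "..."}};
    any other non-empty value is returned as a plain string.
    """
    messages = []
    for point in datapoints:
        val = point.get("val")
        if isinstance(val, dict) and "error" in val:
            messages.append(val["error"])
        elif val is not None:
            messages.append(str(val))
    return messages
```

For example, `failure_messages(fetch_json(failures_url(ARCHIVE, "7f8d5be4115b4927b7ce6a5dfbfa8e44")))` would list the recorded errors for the LEARN Dallas test mentioned above, provided the archive is reachable and the response has the assumed shape.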
>
> On Jan 20, 2015, at 10:54 AM, Trey Dockendorf <> wrote:
>
> > I have about 14 test members added for throughput tests and only 3 of them show up in the Throughput/Latency Graphs page. I am unsure how to begin debugging this to find the cause of the missing data.
> >
> > The host is http://psonar-bwctl.brazos.tamu.edu/. I have most of these sites also added on a separate perfsonar instance that does latency tests and that host is not missing data.
> >
> > Thanks,
> > - Trey
> >