perfsonar-user - Re: [perfsonar-user] throughput host missing test graphs

Subject: perfSONAR User Q&A and Other Discussion

  • From: Trey Dockendorf <>
  • To: perfsonar-user <>
  • Subject: Re: [perfsonar-user] throughput host missing test graphs
  • Date: Tue, 20 Jan 2015 11:57:59 -0600

Jason,

Thanks for the response.

For #1, the second command doesn't work:

bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -S ps.tacc.utexas.edu

Gives me:

bwctl: Invalid value for TOS. (-S)

I'm unsure what a valid value would be.

When I tried the first command against a remote host I got the following:

# bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -c ps.tacc.utexas.edu
Messages being sent to syslog(user,err)
bwctl[30659]: FILE=bwctl.c, LINE=2961, Using 165.91.55.6 as the address for local sender
bwctl[30659]: FILE=bwctl.c, LINE=2961, Using ps.tacc.utexas.edu as the address for remote receiver
bwctl[30659]: FILE=bwctl.c, LINE=3008, Available in-common: iperf nuttcp iperf3
bwctl[30659]: FILE=bwctl.c, LINE=3055, Using tool: iperf3
bwctl[30659]: FILE=bwctl.c, LINE=3150, Requested Time: 1421776247.770821
bwctl[30659]: FILE=bwctl.c, LINE=3152, Latest Acceptable Time: 1421776847.770821
bwctl[30659]: FILE=bwctl.c, LINE=3332, Reservation(ps.tacc.utexas.edu): 1421776254.700447
bwctl[30659]: FILE=bwctl.c, LINE=3189, Server 'ps.tacc.utexas.edu' accepted test request at time 1421776254.700447
bwctl[30659]: FILE=bwctl.c, LINE=3332, Reservation(localhost): 1421776254.700447
bwctl[30659]: FILE=bwctl.c, LINE=3218, Client 'localhost' accepted test request at time 1421776254.700447
bwctl[30659]: FILE=bwctl.c, LINE=3450, 28 seconds until test results available
bwctl[30659]: FILE=protocol.c, LINE=506, BWLReadRequestType: Read interrupted by signal.
bwctl[30659]: FILE=capi.c, LINE=1166, BWLEndSession: Invalid protocol message received...
bwctl[30659]: FILE=bwctl.c, LINE=3546, Timed out waiting for results
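Verbose output like the above can be easier to read once the epoch timestamps are converted and the final outcome is pulled out. The following is a minimal sketch, not part of bwctl: it assumes every line follows the "bwctl[PID]: FILE=..., LINE=..., message" layout shown here, and the function names (parse_line, epoch_to_utc, summarize) are hypothetical.

```python
import re
from datetime import datetime, timezone

# Layout assumed from the output pasted above:
#   bwctl[PID]: FILE=<file>, LINE=<n>, <message>
LINE_RE = re.compile(
    r"^bwctl\[(?P<pid>\d+)\]: FILE=(?P<file>[^,]+), LINE=(?P<line>\d+), (?P<msg>.*)$"
)
RESERVATION_RE = re.compile(r"Reservation\((?P<host>[^)]+)\): (?P<ts>\d+\.\d+)")


def parse_line(text):
    """Split one verbose bwctl line into its parts, or return None."""
    match = LINE_RE.match(text.strip())
    if match is None:
        return None
    return {
        "pid": int(match["pid"]),
        "file": match["file"],
        "line": int(match["line"]),
        "msg": match["msg"],
    }


def epoch_to_utc(value):
    """Convert an epoch timestamp string to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(float(value), tz=timezone.utc)


def summarize(lines):
    """Collect reservation times per host and note whether the test timed out."""
    reservations = {}
    timed_out = False
    for raw in lines:
        record = parse_line(raw)
        if record is None:
            continue  # e.g. the shell prompt line or the syslog notice
        found = RESERVATION_RE.match(record["msg"])
        if found:
            reservations[found["host"]] = epoch_to_utc(found["ts"])
        if "Timed out waiting for results" in record["msg"]:
            timed_out = True
    return {"reservations": reservations, "timed_out": timed_out}


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python summarize_bwctl.py < bwctl_output.txt
    result = summarize(sys.stdin)
    for host, when in result["reservations"].items():
        print(f"{host}: reserved for {when.isoformat()}")
    print("timed out" if result["timed_out"] else "no timeout seen")
```

Run against the output above, this would show that both ends accepted the same reservation time before the client gave up waiting for results.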

This particular endpoint is of interest at this time as another group on our campus has asked me to test against the remote site to investigate recent performance issues from their cluster to the remote site.

I'll continue testing each host whose results are missing and see whether the errors are non-obvious.

Thanks,
- Trey

=============================

Trey Dockendorf 
Systems Analyst I 
Texas A&M University 
Academy for Advanced Telecommunications and Learning Technologies 
Phone: (979)458-2396 
Email:  
Jabber:

On Tue, Jan 20, 2015 at 10:14 AM, Jason Zurawski <> wrote:
Hey Trey;

Your mail is timely; we are trying to rewrite some documentation on this.  This will be sort of scattered, but may be enough to get you started:

1) Take a look at the hosts you have configured vs. the ones that are succeeding.  See if you can do a test by hand to one of the failures (e.g. on the command line: bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -c HOST and bwctl -T iperf3 -t 20 -i 1 -f m -x -vv -S HOST).  If the host is dead, doesn’t support the requested test type, or denies your test in some way, it is most likely a problem on the other end.  Maybe send an email to that admin to see what is up.
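With a dozen or more test members, it can help to generate the by-hand commands for every host instead of typing them. The sketch below only builds and prints command lines; it does not run anything. The shared options are copied from the commands above. One assumption to flag loudly: the reverse-direction command here uses lowercase -s for the remote sender, because bwctl itself reports uppercase -S as the TOS option; check the bwctl man page on your install before relying on that. The function names are hypothetical.

```python
import shlex

# Options shared by both commands, copied from the text above.
COMMON_OPTS = ["-T", "iperf3", "-t", "20", "-i", "1", "-f", "m", "-x", "-vv"]


def bwctl_commands(host):
    """Return argv lists for a test in each direction against one host."""
    return {
        # Local host sends, remote host receives.
        "to_remote": ["bwctl", *COMMON_OPTS, "-c", host],
        # Remote host sends, local host receives.
        # ASSUMPTION: lowercase -s selects the sender; uppercase -S is TOS.
        "from_remote": ["bwctl", *COMMON_OPTS, "-s", host],
    }


def render(hosts):
    """Render a copy-and-paste list of commands, two per host."""
    lines = []
    for host in hosts:
        for direction, argv in bwctl_commands(host).items():
            lines.append(f"# {host} ({direction})")
            lines.append(shlex.join(argv))
    return "\n".join(lines)


if __name__ == "__main__":
    # Host names taken from this thread; substitute your own test members.
    print(render(["ps.tacc.utexas.edu", "ps1-akard-dlls.tx-learn.net"]))
```

Printing rather than executing keeps the sketch safe to run anywhere; the output can be pasted into a shell one host at a time so each failure is seen in context.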

2) If the test from the first part succeeded, it's time to look in the database.  In your case, open your favorite browser (preferably one with a JSON prettifier in place) to your esmond database:

http://psonar-bwctl.brazos.tamu.edu/esmond/perfsonar/archive/?format=json

Each test should have a corresponding record, both the ones that succeed and the ones that fail.  Picking one that is failing for you (to ps1-akard-dlls.tx-learn.net - the LEARN Dallas node), we can pull up the failure event type:

http://psonar-bwctl.brazos.tamu.edu/esmond/perfsonar/archive/7f8d5be4115b4927b7ce6a5dfbfa8e44/failures/base?format=json

In this case we get a (useful?) error message:

> Problem parsing output: malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "(end of string)") at /opt/perfsonar_ps/regular_testing/bin/../lib/perfSONAR_PS/RegularTesting/Parsers/Iperf3.pm line 54.


This is a known problem right now between certain versions of BWCTL and iperf3 that the developers are working on.  Another common error:

> interrupt - the client has terminated


This typically means there was either a firewall or NTP issue.
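The same lookup can be scripted instead of done in a browser. This is a minimal sketch, not an official esmond client: the archive URL and metadata key are the ones quoted above, the URL pattern is copied from the failure link above, and the record shape (a list of objects with a "ts" field and a "val" object holding an "error" string) is an assumption about what the failures endpoint returns. The function names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen

# Archive root taken from the URLs quoted above.
ARCHIVE = "http://psonar-bwctl.brazos.tamu.edu/esmond/perfsonar/archive"


def failures_url(metadata_key, archive=ARCHIVE):
    """Build the failures URL for one test, following the link shown above."""
    return f"{archive}/{metadata_key}/failures/base?format=json"


def count_errors(records):
    """Count how often each error message appears.

    ASSUMPTION: each record looks like {"ts": ..., "val": {"error": "..."}}.
    If "val" is not a dict, its value is used as the message directly.
    """
    counts = Counter()
    for record in records:
        val = record.get("val")
        message = val.get("error") if isinstance(val, dict) else val
        counts[str(message)] += 1
    return counts


def fetch_failures(metadata_key):
    """Download and decode the failure records for one test."""
    with urlopen(failures_url(metadata_key), timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    key = "7f8d5be4115b4927b7ce6a5dfbfa8e44"  # the LEARN Dallas test above
    for message, seen in count_errors(fetch_failures(key)).most_common():
        print(f"{seen:4d}  {message}")
```

Grouping by message makes it quick to tell whether a test is failing for one repeated reason (such as the JSON parsing error) or for several different ones.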

3) Check your server side logs (/var/log/perfsonar) to see if anything is showing up in the bwctl_owamp log.  Common errors could be permission denied due to limits violations, failure to get a testing slot (common for busy perfSONAR nodes), or the aforementioned firewall issues.

In general if your node is working for ‘some’ things, it may be on the healthy side.  Things that you are trying to test against may just not be reachable, or could be in need of an upgrade (another common issue we are seeing - until we can get a majority of the instances to 3.4.x).

Hope this helps;

-jason

On Jan 20, 2015, at 10:54 AM, Trey Dockendorf <> wrote:

> I have about 14 test members added for throughput tests and only 3 of them show up in the Throughput/Latency Graphs page.  I am unsure how to begin debugging this to find the cause of the missing data.
>
> The host is http://psonar-bwctl.brazos.tamu.edu/.  I have most of these sites also added on a separate perfsonar instance that does latency tests and that host is not missing data.
>
> Thanks,
> - Trey
>
> =============================
>
> Trey Dockendorf
> Systems Analyst I
> Texas A&M University
> Academy for Advanced Telecommunications and Learning Technologies
> Phone: (979)458-2396
> Email:
> Jabber:
