
perfsonar-user - Re: [perfsonar-user] pscheduler throughput test fails to complete after pS software upgrade

Subject: perfSONAR User Q&A and Other Discussion

  • From: "Uhl, George D. (GSFC-423.0)[SGT INC]" <>
  • To: Mark Feit <>, "" <>
  • Subject: Re: [perfsonar-user] pscheduler throughput test fails to complete after pS software upgrade
  • Date: Fri, 5 Apr 2019 14:55:26 +0000

Hi Mark,

 

I’m resurrecting this issue for additional insight.  The problem has been cropping up on other iperf3 throughput tests in my mesh when the iperf3 streams are sourced from my esdis-ps2-10g.eos.nasa.gov node.  I contacted the administrator at USGS responsible for the edclxw41.cr.usgs.gov node.  My node is running iperf 3.6 and the USGS node is running iperf 3.5.

 

[uhl@enpl-pt2-10g ~]$ iperf3 -v
iperf 3.6 (cJSON 1.5.2)
Linux enpl-pt2-10g.eos.nasa.gov 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64
Optional features available: CPU affinity setting, IPv6 flow label, TCP congestion algorithm setting, sendfile / zerocopy, socket pacing, authentication

 

[rech@edclxw41 ~]$ iperf3 -v
iperf 3.5 (cJSON 1.5.2)
Linux edclxw41 2.6.32-754.6.3.el6.x86_64 #1 SMP Tue Oct 9 17:27:49 UTC 2018 x86_64
Optional features available: CPU affinity setting, IPv6 flow label, TCP congestion algorithm setting, sendfile / zerocopy, authentication
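For anyone comparing builds across several mesh nodes, the two "Optional features available" lines above can be diffed mechanically. The following is a minimal sketch, assuming the `iperf3 -v` output keeps the layout shown here; the function name `parse_features` and the variable names are made up for illustration. A difference in the feature list is only a data point, not a diagnosis of the failure.

```python
def parse_features(version_output: str) -> set:
    """Return the optional features listed in `iperf3 -v` output.

    Assumes the 'Optional features available:' line format shown in
    this thread; other builds may print it differently.
    """
    prefix = "Optional features available:"
    for line in version_output.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            items = line[len(prefix):].split(",")
            return {item.strip() for item in items if item.strip()}
    return set()


# Version output copied from the two nodes in this message.
NASA_NODE = """iperf 3.6 (cJSON 1.5.2)
Optional features available: CPU affinity setting, IPv6 flow label, TCP congestion algorithm setting, sendfile / zerocopy, socket pacing, authentication"""

USGS_NODE = """iperf 3.5 (cJSON 1.5.2)
Optional features available: CPU affinity setting, IPv6 flow label, TCP congestion algorithm setting, sendfile / zerocopy, authentication"""

only_on_nasa = parse_features(NASA_NODE) - parse_features(USGS_NODE)
only_on_usgs = parse_features(USGS_NODE) - parse_features(NASA_NODE)
print("Only on NASA node:", sorted(only_on_nasa))
print("Only on USGS node:", sorted(only_on_usgs))
```

With the output pasted above, the only feature present on one side and not the other is socket pacing, which the 3.6 build reports and the 3.5 build does not.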

 

We’re still getting iperf3 test completion errors when running tests via pscheduler.  However, when I run standalone iperf3 tests from my node, they complete without reporting an issue.

 

[uhl@enpl-pt2-10g ~]$ iperf3 -c edclxw41.cr.usgs.gov
Connecting to host edclxw41.cr.usgs.gov, port 5201
[  5] local 169.154.197.28 port 48914 connected to 152.61.6.5 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  87.2 MBytes   731 Mbits/sec  679   3.30 MBytes
[  5]   1.00-2.00   sec   105 MBytes   881 Mbits/sec    0   3.35 MBytes
[  5]   2.00-3.00   sec   104 MBytes   870 Mbits/sec    0   3.44 MBytes
[  5]   3.00-4.00   sec   109 MBytes   912 Mbits/sec    0   3.68 MBytes
[  5]   4.00-5.00   sec   119 MBytes   996 Mbits/sec    0   4.06 MBytes
[  5]   5.00-6.00   sec   135 MBytes  1.13 Gbits/sec    0   4.60 MBytes
[  5]   6.00-7.00   sec   154 MBytes  1.29 Gbits/sec    0   5.28 MBytes
[  5]   7.00-8.00   sec   175 MBytes  1.47 Gbits/sec    0   6.13 MBytes
[  5]   8.00-9.00   sec   204 MBytes  1.71 Gbits/sec    0   7.14 MBytes
[  5]   9.00-10.00  sec   239 MBytes  2.00 Gbits/sec    0   8.33 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.40 GBytes  1.20 Gbits/sec  679             sender
[  5]   0.00-10.03  sec  1.38 GBytes  1.18 Gbits/sec                  receiver

iperf Done.

[uhl@enpl-pt2-10g ~]$ 
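To repeat the standalone check against several destinations without reading tables by eye, iperf3 can emit JSON with its `-J` flag. The sketch below is one way to run the client and pull out the end-of-test totals. The helper names `run_iperf3` and `summarize` are made up here, and `SAMPLE` is a hand-built stand-in that reuses the sender/receiver totals from the run above, not captured tool output.

```python
import json
import subprocess


def run_iperf3(host: str, seconds: int = 10) -> str:
    """Run a standalone iperf3 client with JSON output and return stdout."""
    proc = subprocess.run(
        ["iperf3", "-c", host, "-t", str(seconds), "-J"],
        capture_output=True,
        text=True,
    )
    return proc.stdout


def summarize(json_text: str) -> dict:
    """Reduce an iperf3 JSON document to pass/fail plus the TCP totals."""
    doc = json.loads(json_text)
    # iperf3 reports failures under a top-level "error" key.
    if "error" in doc:
        return {"ok": False, "error": doc["error"]}
    end = doc["end"]
    return {
        "ok": True,
        "sent_gbps": end["sum_sent"]["bits_per_second"] / 1e9,
        "received_gbps": end["sum_received"]["bits_per_second"] / 1e9,
        "retransmits": end["sum_sent"].get("retransmits"),
    }


# Hand-built sample using the totals from the run above (illustrative only).
SAMPLE = json.dumps({
    "end": {
        "sum_sent": {"bits_per_second": 1.20e9, "retransmits": 679},
        "sum_received": {"bits_per_second": 1.18e9},
    }
})

print(summarize(SAMPLE))
print(summarize(json.dumps({"error": "unable to connect to server"})))
```

On a node with iperf3 installed, `summarize(run_iperf3("edclxw41.cr.usgs.gov"))` would perform the same check as the manual run above.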

 

The USGS test node will be replaced with a more current release of perfSONAR/CentOS, but in the meantime I’m seeing these same failures with iperf3 tests to other (I assume older) nodes.  I also gave pscheduler/iperf2 a try and that also failed to complete.  I’m unable to test standalone iperf2 since there is no daemon running at USGS.

 

[uhl@enpl-pt2-10g ~]$ pscheduler task --tool iperf2 throughput --source enpl-pt2-10g.eos.nasa.gov --dest edclxw41.cr.usgs.gov
Submitting task...
Task URL:
https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/d0886c96-9328-434d-af6e-6a669d78b261
Running with tool 'iperf2'
Fetching first run...

Next scheduled run:
https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/d0886c96-9328-434d-af6e-6a669d78b261/runs/a8f06a52-be3d-4033-83d9-c330e50defc8
Starts 2019-04-05T10:21:39-04:00 (~50 seconds)
Ends   2019-04-05T10:21:54-04:00 (~14 seconds)
Waiting for result...

Run did not complete: Failed

Diagnostics from enpl-pt2-10g.eos.nasa.gov:
  /usr/bin/iperf -p 5001 -c edclxw41.cr.usgs.gov -t 10 -m

  ------------------------------------------------------------
  Client connecting to edclxw41.cr.usgs.gov, TCP port 5001
  TCP window size:  325 KByte (default)
  ------------------------------------------------------------
  [  3] local 169.154.197.28 port 50002 connected with 152.61.6.5 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  2.66 GBytes  2.28 Gbits/sec
  [  3] MSS size 1448 bytes (MTU 1500 bytes, ethernet)

Error from enpl-pt2-10g.eos.nasa.gov:
  No error.

Diagnostics from edclxw41.cr.usgs.gov:
  No result was produced

Error from edclxw41.cr.usgs.gov:
  No result was produced

No further runs scheduled.

[uhl@enpl-pt2-10g ~]$
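When the same failure shows up across many destinations, it helps to see at a glance which participant reported the problem. The following is a rough sketch that splits the text printed by `pscheduler task` into per-host "Diagnostics" and "Error" sections. It assumes the layout seen in the output above (an unindented "Diagnostics from HOST:" or "Error from HOST:" header followed by indented body lines); other pScheduler versions may format this differently. `split_sections` is a made-up name, and `SAMPLE` is an abridged copy of the output above.

```python
import re
from collections import defaultdict

# Header lines such as "Error from edclxw41.cr.usgs.gov:"
SECTION = re.compile(r"^(Diagnostics|Error) from (\S+):\s*$")


def split_sections(cli_output: str) -> dict:
    """Map host -> {'diagnostics': text, 'error': text} from CLI output."""
    collected = defaultdict(dict)
    current = None  # (host, kind) of the section being read, if any
    for raw in cli_output.splitlines():
        match = SECTION.match(raw.rstrip())
        if match:
            kind, host = match.groups()
            current = (host, kind.lower())
            collected[host][kind.lower()] = []
            continue
        if not raw.strip():
            continue  # blank lines do not end a section
        if not raw[0].isspace():
            current = None  # unindented text closes the section
            continue
        if current:
            host, kind = current
            collected[host][kind].append(raw.strip())
    return {
        host: {kind: "\n".join(lines) for kind, lines in kinds.items()}
        for host, kinds in collected.items()
    }


SAMPLE = """Run did not complete: Failed

Diagnostics from enpl-pt2-10g.eos.nasa.gov:
  /usr/bin/iperf -p 5001 -c edclxw41.cr.usgs.gov -t 10 -m
  [  3]  0.0-10.0 sec  2.66 GBytes  2.28 Gbits/sec

Error from enpl-pt2-10g.eos.nasa.gov:
  No error.

Diagnostics from edclxw41.cr.usgs.gov:
  No result was produced

Error from edclxw41.cr.usgs.gov:
  No result was produced

No further runs scheduled.
"""

sections = split_sections(SAMPLE)
failing = [host for host, s in sections.items() if s.get("error") != "No error."]
print("Participants reporting a problem:", failing)
```

For the run above, this singles out the USGS side: the source node produced a throughput figure and no error, while the destination produced no result at all.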

 

Any insight/suggestions appreciated.

 

Thanks,

George

 

From: Mark Feit <>
Date: Tuesday, March 5, 2019 at 8:04 PM
To: "George.D.Uhl" <>, "" <>
Subject: Re: [perfsonar-user] pscheduler throughput test fails to complete after pS software upgrade

 

"Uhl, George D. (GSFC-423.0)[SGT INC]" writes:

 

One of my managed test nodes underwent a perfsonar software upgrade on Saturday morning.  Ever since the upgrade, outbound iperf3 throughput tests fail to complete.  The destination is a no-agent test node which I think might be running a perfsonar 4.0.x release.  

 

Diagnostics from edclxw41.cr.usgs.gov:

  No diagnostics.

 

Error from edclxw41.cr.usgs.gov:

  iperf3 returned an error: exiting

 

It looks like iperf3 failed at the far end.  Earlier versions of the iperf3 plugin don’t collect sufficient diagnostic information when the tool fails, so getting to the bottom of this will require some cooperation from USGS.  (Looking at the current sources, we may need to re-think some of how that’s done.)  The failure should have left traces in the logs and, if not, turning debugging on for a few minutes will get good information.

 

Your end appears to have produced a usable result, which would seem to indicate that USGS worked well enough to produce it and died on the way out.  Iperf3 can sometimes be cagey about why it failed.  Bruce can correct me if I’m wrong, but I seem to recall that parts of the error end up on different output streams.
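If a tool's failure message can be split across output streams, a wrapper that records the exit status, standard output, and standard error separately avoids losing part of it. This is a generic sketch using Python's `subprocess` module, not the way the pScheduler iperf3 plugin is implemented. The helper name `run_capture` is made up, and the child process here is a small Python stand-in that writes to both streams, not iperf3 itself.

```python
import subprocess
import sys


def run_capture(argv, timeout=60):
    """Run a command; return (exit code, stdout text, stderr text)."""
    proc = subprocess.run(
        argv,
        capture_output=True,  # keep stdout and stderr as separate buffers
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Stand-in child: prints a partial result, then an error, then exits non-zero.
child = [
    sys.executable,
    "-c",
    "import sys; print('partial result'); "
    "print('fatal: exiting', file=sys.stderr); sys.exit(1)",
]

code, out, err = run_capture(child)
print("exit code:", code)
print("stdout:", out.strip())
print("stderr:", err.strip())
```

Logging all three values when the exit code is non-zero keeps whatever the tool said, whichever stream it used.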

 

I can’t reach pScheduler on the USGS machine (but I can reach the BWCTL and iperf3 servers) and was able to detect that the system is running iperf3 3.6.  That is the current version, released last June, so they’re not that far behind.

 

--Mark

 



