perfsonar-user - Re: [perfsonar-user] pscheduler throughput test fails to complete after pS software upgrade
Subject: perfSONAR User Q&A and Other Discussion
- From: "Uhl, George D. (GSFC-423.0)[SGT INC]" <>
- To: Mark Feit <>, "" <>
- Subject: Re: [perfsonar-user] pscheduler throughput test fails to complete after pS software upgrade
- Date: Wed, 1 May 2019 15:23:19 +0000
Thanks Mark, it was a case of User X firing up an iperf3 daemon without knowledge of what the impact would be. He’s been tracked down and educated. 😊
From: Mark Feit <>
Uhl, George D. (GSFC-423.0)[SGT INC] writes:
I finally got back to looking at this again and figured out what was going on. The remote non-agent host has an iperf3 daemon running on TCP/5201 as a separate process. When pscheduler spawns the test, the throughput test is run against the existing iperf3 server rather than spawning a new daemon. Once the throughput test is complete, pscheduler is unable to terminate the independently running iperf3 server and pscheduler exits with an iperf3 error.
Would you consider this a bug or a feature?
Normally, I’d consider pleading the Fifth, but in this case I’m going to have to call it a feature. :-)
What’s failing is the receiving-end pScheduler’s attempt to start an iperf server because port 5201 is already in use by the persistent one. The two ends of the test operate independently, so the sending end has no idea the server it talked to wasn’t started by pScheduler. Only after the scheduled test time ends are the results from both ends collected and turned into the final result that pScheduler produces. pScheduler doesn’t consider a test successful unless all of the participants say they were, so even if the sending side came up with a result, there’s room to wonder whether or not it’s trustworthy. On that front, pScheduler errs on the side of caution.
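The port conflict described above can be reproduced outside pScheduler. The following is a minimal sketch in which plain TCP sockets stand in for the iperf servers; the helper name `try_listen` is hypothetical and this is not how pScheduler itself starts a server. A second attempt to listen on a port that already has a listener fails with an address-in-use error.

```python
import errno
import socket


def try_listen(port, host="127.0.0.1"):
    """Try to bind and listen on host:port.

    Returns the listening socket, or None if the port is already in use.
    (Hypothetical helper for illustration only.)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
        return sock
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None
        raise


if __name__ == "__main__":
    # Port 0 asks the OS for any free port; this socket plays the role
    # of the persistent server that is already holding the port.
    persistent = try_listen(0)
    port = persistent.getsockname()[1]

    # A second listener on the same port, standing in for the server the
    # receiving end tries to start, cannot bind.
    scheduled = try_listen(port)
    print("second listener started:", scheduled is not None)
    persistent.close()
```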
Running a persistent iperf server on a system with pScheduler isn’t a good idea because other hosts with no knowledge of the schedule can easily throw a wrench into the works. For example, say there are two nodes, A and B, both with pScheduler, and an extra iperf server on B. If pScheduler has arranged an A-to-B test at 10:00 and a third party starts a 30-second test at 9:59:50, the service won’t be available as scheduled and the test will fail. Similarly, if there’s a B-to-A test scheduled and someone starts running a test to the persistent iperf server, neither will have any idea that the results are being distorted by the other. This is why throughput gets the system to itself.
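The timing collision in the A-to-B example is simple interval arithmetic. Here is a small sketch of it. The calendar date is arbitrary and the helper name `busy_at` is hypothetical; only the 10:00 start, the 9:59:50 start and the 30-second duration come from the example.

```python
from datetime import datetime, timedelta


def busy_at(instant, start, duration):
    """True if a run beginning at `start` and lasting `duration` is still
    in progress at `instant`. (Hypothetical helper for illustration.)"""
    return start <= instant < start + duration


# Arbitrary date; only the times of day matter here.
scheduled_start = datetime(2019, 5, 1, 10, 0, 0)
third_party_start = datetime(2019, 5, 1, 9, 59, 50)
third_party_duration = timedelta(seconds=30)

# The third-party test occupies the server from 9:59:50 until 10:00:20,
# so the server is still tied up when the scheduled test is due to start.
server_busy_at_start = busy_at(
    scheduled_start, third_party_start, third_party_duration
)
print("server busy at scheduled start:", server_busy_at_start)
```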
There is a “single-ended” switch for throughput that will run a test against a persistent iperf server (if it’s run with a tool that supports it), but that will result in failure if the server is already tied up with something else at the scheduled start time.
One thing we’ve had on the to-do list from the very beginning is the concept of resource pools, which would allow us to pick the ports that get used for a test out of a port pool. This would simplify some of the ACLs, where we could just say that 443 and ports 10000-10050 (or whatever range) have to be open and that’s the end of it. We’d be starting the iperf server on some random port in that range, and a persistent iperf server on 5201 would be completely independent, though that would still run the risk of both testing at the same time and getting a wrong result.
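Resource pools are described here as a to-do item, so there is no pScheduler interface to show. Purely as an illustration of the idea, picking a random free port out of a pool might look like the sketch below. The function name `pick_port_from_pool` is hypothetical, and the 10000-10050 range is just the example range mentioned above.

```python
import random
import socket


def pick_port_from_pool(low=10000, high=10050, host="127.0.0.1"):
    """Bind a listening socket to a random free port in [low, high].

    Returns (socket, port). Raises RuntimeError if every port in the
    pool is taken. (Hypothetical sketch, not pScheduler code.)
    """
    candidates = list(range(low, high + 1))
    random.shuffle(candidates)
    for port in candidates:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Port is in use or otherwise unavailable; try the next one.
            sock.close()
            continue
        sock.listen(1)
        return sock, port
    raise RuntimeError("no free port in pool")


if __name__ == "__main__":
    server, port = pick_port_from_pool()
    print("test server would listen on port", port)
    server.close()
```

A listener on 5201 never collides with a port chosen this way, but as noted above, nothing stops the two from carrying traffic at the same time.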
--Mark
- Re: [perfsonar-user] pscheduler throughput test fails to complete after pS software upgrade, Mark Feit, 05/01/2019
- Re: [perfsonar-user] pscheduler throughput test fails to complete after pS software upgrade, Uhl, George D. (GSFC-423.0)[SGT INC], 05/01/2019