perfsonar-user - Re: [perfsonar-user] Problems with Debian pscheduler
Subject: perfSONAR User Q&A and Other Discussion
- From: "Uhl, George D. (GSFC-423.0)[SGT INC]" <>
- To: Mark Feit <>, Alex Hsia <>
- Cc: "" <>
- Subject: Re: [perfsonar-user] Problems with Debian pscheduler
- Date: Thu, 30 May 2019 14:17:58 +0000
Hi,
I’m one of the users testing against the NOAA perfSONAR test node. I’ve been specifying the “--ip-version 4” switch on the throughput tasks, and I’ve just tried it with a “troubleshoot” task as well. In both cases the tests fail even though TCP/443 communication works fine. The NOAA test node is a no-agent host in one of my test meshes, which includes a number of no-agent test hosts, and it’s the only one experiencing this problem.
Thanks, George Uhl
$ pscheduler troubleshoot --ip-version 4 nettest.boulder.noaa.gov
Performing basic troubleshooting of localhost and nettest.boulder.noaa.gov.
localhost:
  Checking path MTU... 65535 (Local)
  Checking for pScheduler... OK.
  Checking clock... OK.
  Idle test.... 13 seconds.... Checking archiving... OK.
nettest.boulder.noaa.gov:
  Checking path MTU... 1500+
  Checking for pScheduler... OK.
  Checking clock... OK.
  Idle test.... 13 seconds.... Checking archiving... OK.
localhost and nettest.boulder.noaa.gov:
  Checking path MTU... 1500+
  Checking timekeeping... OK.
  Simple stream test.... 13 seconds.... Failed.
Task failed to run properly.
2019-05-30T10:09:01-04:00 on localhost and nettest.boulder.noaa.gov with simplestreamer:
simplestream --dest nettest.boulder.noaa.gov --ip-version 4
Run did not complete: Failed
Diagnostics from localhost:
  Try 1 failed: Failed to connect: [Errno 111] Connection refused
  Try 2 failed: Failed to connect: [Errno 111] Connection refused
  Try 3 failed: Failed to connect: [Errno 111] Connection refused
  Try 4 failed: Failed to connect: [Errno 111] Connection refused
  Try 5 failed: Failed to connect: [Errno 111] Connection refused
  Try 6 failed: Failed to connect: [Errno 111] Connection refused
  Try 7 failed: Failed to connect: [Errno 111] Connection refused
  Try 8 failed: Failed to connect: [Errno 111] Connection refused
  Try 9 failed: Failed to connect: [Errno 111] Connection refused
  Try 10 failed: Failed to connect: [Errno 111] Connection refused
Error from localhost:
  Failed to connect: [Errno 111] Connection refused
Diagnostics from nettest.boulder.noaa.gov:
  Nothing to see at the receiving end.
Error from nettest.boulder.noaa.gov:
  Timed out
$ pscheduler task --debug throughput --source enpl-pt2-10g.eos.nasa.gov --dest nettest.boulder.noaa.gov --ip-version 4
2019-05-30T14:15:18 Debug started
2019-05-30T14:15:18 Assistance is from localhost
2019-05-30T14:15:18 Forcing default slip of PT5M
2019-05-30T14:15:18 Converting to spec via https://localhost/pscheduler/tests/throughput/spec
Submitting task...
2019-05-30T14:15:18 Fetching participant list
2019-05-30T14:15:18 Spec is: {"dest": "nettest.boulder.noaa.gov", "source": "enpl-pt2-10g.eos.nasa.gov", "ip-version": 4, "schema": 1}
2019-05-30T14:15:18 Params are: {'spec': '{"dest": "nettest.boulder.noaa.gov", "source": "enpl-pt2-10g.eos.nasa.gov", "ip-version": 4, "schema": 1}'}
2019-05-30T14:15:18 Got participants: {u'participants': [u'enpl-pt2-10g.eos.nasa.gov', u'nettest.boulder.noaa.gov']}
2019-05-30T14:15:18 Lead is enpl-pt2-10g.eos.nasa.gov
2019-05-30T14:15:18 Pinging https://enpl-pt2-10g.eos.nasa.gov/pscheduler/
2019-05-30T14:15:18 enpl-pt2-10g.eos.nasa.gov is up
2019-05-30T14:15:18 Posting task to https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks
2019-05-30T14:15:18 Data is {"test": {"type": "throughput", "spec": {"dest": "nettest.boulder.noaa.gov", "source": "enpl-pt2-10g.eos.nasa.gov", "ip-version": 4, "schema": 1}}, "schema": 1, "schedule": {"slip": "PT5M"}}
Task URL: https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:35 Posted https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:35 Submission diagnostics:
2019-05-30T14:15:35 Hints:
2019-05-30T14:15:35 requester: 169.154.197.28
2019-05-30T14:15:35 server: 169.154.197.28
2019-05-30T14:15:35 Identified as everybody, local-interfaces
2019-05-30T14:15:35 Classified as default, friendlies
2019-05-30T14:15:35 Application: Hosts we trust to do everything
2019-05-30T14:15:35 Group 1: Limit 'always' passed
2019-05-30T14:15:35 Group 1: Want all, 1/1 passed, 0/1 failed: PASS
2019-05-30T14:15:35 Application PASSES
2019-05-30T14:15:35 Application: Defaults applied to non-friendly hosts
2019-05-30T14:15:35 Group 1: Limit 'innocuous-tests' failed: Passed but inverted
2019-05-30T14:15:35 Group 1: Limit 'throughput-default-time' passed
2019-05-30T14:15:35 Group 1: Limit 'idleex-default' failed: Test is not 'idleex'
2019-05-30T14:15:35 Group 1: Want any, 1/3 passed, 2/3 failed: PASS
2019-05-30T14:15:35 Application PASSES
2019-05-30T14:15:35 Proposal meets limits
Running with tool 'iperf3'
Fetching first run...
2019-05-30T14:15:35 Fetching https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7/runs/first
2019-05-30T14:15:36 Handing off: pscheduler watch --first --format text/plain --debug https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:36 Debug started
2019-05-30T14:15:36 Fetching https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:36 Fetching next run from https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7/runs/first
Next scheduled run: https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7/runs/745d9145-65f3-4b4d-804f-2bf1fe1b12b3
Starts 2019-05-30T10:16:05-04:00 (~28 seconds)
Ends 2019-05-30T10:16:24-04:00 (~18 seconds)
Waiting for result...
Run did not complete: Failed
Diagnostics from enpl-pt2-10g.eos.nasa.gov: No diagnostics.
Error from enpl-pt2-10g.eos.nasa.gov: iperf3 returned an error: error - unable to connect to server: Connection refused
Diagnostics from nettest.boulder.noaa.gov: No diagnostics.
Error from nettest.boulder.noaa.gov: iperf3 returned an error: Process took too long to run.
2019-05-30T14:16:24 Fetching next run from https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7/runs/next
No further runs scheduled.
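For readers following the debug output above, here is a minimal Python sketch that rebuilds the task document shown on the "Data is" line. The function name throughput_task is hypothetical and not part of pScheduler; only the JSON field names and values are taken from the log. It makes it easier to see that the "--ip-version 4" switch did reach the test spec, so the failure is not caused by the option being dropped on submission.

```python
import json


def throughput_task(source, dest, ip_version=None, slip="PT5M"):
    """Build a task document shaped like the "Data is" line in the log.

    Hypothetical helper for illustration; the layout mirrors the logged
    JSON and is not taken from pScheduler's own code.
    """
    spec = {"schema": 1, "source": source, "dest": dest}
    if ip_version is not None:
        # Corresponds to the --ip-version switch on the command line.
        spec["ip-version"] = ip_version
    return {
        "schema": 1,
        "test": {"type": "throughput", "spec": spec},
        "schedule": {"slip": slip},
    }


task = throughput_task(
    "enpl-pt2-10g.eos.nasa.gov", "nettest.boulder.noaa.gov", ip_version=4
)
print(json.dumps(task, indent=2, sort_keys=True))
```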
From: <> on behalf of Mark Feit <>
Alex Hsia writes:
There shouldn't be any firewall blocking access to simplestream, and the host is using the default perfSONAR host-level firewall rules, i.e.:
I did some additional poking around and found the cause: the hosts involved have mixed IP stacks.
Nettest is single-stack, so its FQDN has an A record only; sdmz-perfsonar-40g is dual-stack and has both A and AAAA records. Because most flavors of Linux will prefer IPv6 if available, sdmz’s FQDN will resolve to its AAAA record first and the listening socket opened will be IPv6. Nettest, being IPv4-only, won’t be able to reach it. You can force your way around this by adding the “--ip-version 4” switch when running the troubleshooter or by specifying IPv4 addresses explicitly.
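To illustrate the mismatch described above, here is a rough Python sketch that compares the address families two host names resolve to and suggests which IP version both ends can use. The function names families_for and common_ip_version are made up for this example and are not part of pScheduler; the preference for IPv6 when both ends have it is an assumption that mirrors the resolver behavior described above.

```python
import socket


def families_for(host, port=443, resolver=socket.getaddrinfo):
    """Return the set of address families a host name resolves to.

    `resolver` defaults to the standard library's getaddrinfo; it is a
    parameter only so the lookup can be replaced when testing offline.
    """
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return {info[0] for info in infos}


def common_ip_version(families_a, families_b):
    """Pick an IP version both hosts support, or None if there is none.

    Assumption: prefer IPv6 when both sides have it, as most Linux
    resolvers do by default.
    """
    shared = families_a & families_b
    if socket.AF_INET6 in shared:
        return 6
    if socket.AF_INET in shared:
        return 4
    return None


# A single-stack (IPv4-only) host paired with a dual-stack host.
single_stack = {socket.AF_INET}
dual_stack = {socket.AF_INET, socket.AF_INET6}
print(common_ip_version(single_stack, dual_stack))
```

With real host names, families_for(name) would supply the two sets. A result of 4 for a pair like the one in this thread suggests passing “--ip-version 4” explicitly, and None means the two hosts share no address family at all.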
The simplestream test and simplestreamer tool were originally written for use during development, but it’s become clear the diagnostics they produce aren’t as helpful as they could be to end users. While I’m at it, I’ll probably make some adjustments to the troubleshooter to spot this situation and either warn about it or take steps to avoid it. I’ve opened a couple of tickets on these issues and should have enhancements out with 4.2.0: https://github.com/perfsonar/pscheduler/issues/850 and https://github.com/perfsonar/pscheduler/issues/851.
--Mark