perfsonar-user - Re: [perfsonar-user] Problems with Debian pscheduler
Subject: perfSONAR User Q&A and Other Discussion
- From: Alex Hsia <>
- To: "Uhl, George D. (GSFC-423.0)[SGT INC]" <>
- Cc: Mark Feit <>, "" <>
- Subject: Re: [perfsonar-user] Problems with Debian pscheduler
- Date: Thu, 30 May 2019 08:32:30 -0600
NOAA/OAR Phone: (303)497-6351
Mailstop R/ESRL GVoice: (303)536-5430
325 Broadway e-mail:
Boulder, CO 80305 PGP keyid: 8A482A90
========================================================================
Hi,
I’m one of the users testing to the NOAA perfSONAR test node. I’ve been specifying the “--ip-version 4” switch on the throughput tasks and I’ve just included it with a “troubleshoot” task. In both cases the tests fail even though TCP/443 communication works fine. The NOAA test node is a no-agent host in one of my test meshes, which includes a number of no-agent test hosts, and it’s the only one experiencing this problem.
Thanks,
George Uhl
$ pscheduler troubleshoot --ip-version 4 nettest.boulder.noaa.gov
Performing basic troubleshooting of localhost and nettest.boulder.noaa.gov.
localhost:
Checking path MTU... 65535 (Local)
Checking for pScheduler... OK.
Checking clock... OK.
Idle test.... 13 seconds.... Checking archiving... OK.
Checking path MTU... 1500+
Checking for pScheduler... OK.
Checking clock... OK.
Idle test.... 13 seconds.... Checking archiving... OK.
localhost and nettest.boulder.noaa.gov:
Checking path MTU... 1500+
Checking timekeeping... OK.
Simple stream test.... 13 seconds.... Failed.
Task failed to run properly.
2019-05-30T10:09:01-04:00 on localhost and nettest.boulder.noaa.gov with simplestreamer:
simplestream --dest nettest.boulder.noaa.gov --ip-version 4
Run did not complete: Failed
Diagnostics from localhost:
Try 1 failed: Failed to connect: [Errno 111] Connection refused
Try 2 failed: Failed to connect: [Errno 111] Connection refused
Try 3 failed: Failed to connect: [Errno 111] Connection refused
Try 4 failed: Failed to connect: [Errno 111] Connection refused
Try 5 failed: Failed to connect: [Errno 111] Connection refused
Try 6 failed: Failed to connect: [Errno 111] Connection refused
Try 7 failed: Failed to connect: [Errno 111] Connection refused
Try 8 failed: Failed to connect: [Errno 111] Connection refused
Try 9 failed: Failed to connect: [Errno 111] Connection refused
Try 10 failed: Failed to connect: [Errno 111] Connection refused
Error from localhost:
Failed to connect: [Errno 111] Connection refused
Diagnostics from nettest.boulder.noaa.gov:
Nothing to see at the receiving end.
Error from nettest.boulder.noaa.gov:
Timed out
[uhl@enpl-pt2-10g ~]$
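For reference, "[Errno 111] Connection refused" is ECONNREFUSED on Linux: the connection attempt was actively rejected, which usually means nothing was listening on that port at that moment or something in the path rejected it. A minimal Python sketch for reproducing a single IPv4-only connection attempt outside pScheduler follows. The function name is hypothetical, and the port to probe is left as a parameter because the output above does not say which port simplestreamer used.

```python
import errno
import socket


def try_ipv4_connect(host, port, timeout=5.0):
    """Attempt one TCP connection over IPv4 only.

    Returns (True, None) on success, or (False, reason) on failure.
    """
    try:
        # Restrict resolution to IPv4 so an AAAA record is never used.
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return False, "resolution failed: %s" % exc

    family, socktype, proto, _canon, sockaddr = infos[0]
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    try:
        sock.connect(sockaddr)
        return True, None
    except socket.timeout:
        return False, "timed out"
    except OSError as exc:
        # Map the numeric errno (e.g. 111) to its symbolic name (ECONNREFUSED).
        return False, errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()
```

Calling `try_ipv4_connect("nettest.boulder.noaa.gov", port)` with the port of interest distinguishes a refusal from a timeout, which point at different causes (no listener versus silently dropped traffic).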
$ pscheduler task --debug throughput --source enpl-pt2-10g.eos.nasa.gov --dest nettest.boulder.noaa.gov --ip-version 4
2019-05-30T14:15:18 Debug started
2019-05-30T14:15:18 Assistance is from localhost
2019-05-30T14:15:18 Forcing default slip of PT5M
2019-05-30T14:15:18 Converting to spec via https://localhost/pscheduler/tests/throughput/spec
Submitting task...
2019-05-30T14:15:18 Fetching participant list
2019-05-30T14:15:18 Spec is: {"dest": "nettest.boulder.noaa.gov", "source": "enpl-pt2-10g.eos.nasa.gov", "ip-version": 4, "schema": 1}
2019-05-30T14:15:18 Params are: {'spec': '{"dest": "nettest.boulder.noaa.gov", "source": "enpl-pt2-10g.eos.nasa.gov", "ip-version": 4, "schema": 1}'}
2019-05-30T14:15:18 Got participants: {u'participants': [u'enpl-pt2-10g.eos.nasa.gov', u'nettest.boulder.noaa.gov']}
2019-05-30T14:15:18 Lead is enpl-pt2-10g.eos.nasa.gov
2019-05-30T14:15:18 Pinging https://enpl-pt2-10g.eos.nasa.gov/pscheduler/
2019-05-30T14:15:18 enpl-pt2-10g.eos.nasa.gov is up
2019-05-30T14:15:18 Posting task to https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks
2019-05-30T14:15:18 Data is {"test": {"type": "throughput", "spec": {"dest": "nettest.boulder.noaa.gov", "source": "enpl-pt2-10g.eos.nasa.gov", "ip-version": 4, "schema": 1}}, "schema": 1, "schedule": {"slip": "PT5M"}}
Task URL:
https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:35 Posted https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:35 Submission diagnostics:
2019-05-30T14:15:35 Hints:
2019-05-30T14:15:35 requester: 169.154.197.28
2019-05-30T14:15:35 server: 169.154.197.28
2019-05-30T14:15:35 Identified as everybody, local-interfaces
2019-05-30T14:15:35 Classified as default, friendlies
2019-05-30T14:15:35 Application: Hosts we trust to do everything
2019-05-30T14:15:35 Group 1: Limit 'always' passed
2019-05-30T14:15:35 Group 1: Want all, 1/1 passed, 0/1 failed: PASS
2019-05-30T14:15:35 Application PASSES
2019-05-30T14:15:35 Application: Defaults applied to non-friendly hosts
2019-05-30T14:15:35 Group 1: Limit 'innocuous-tests' failed: Passed but inverted
2019-05-30T14:15:35 Group 1: Limit 'throughput-default-time' passed
2019-05-30T14:15:35 Group 1: Limit 'idleex-default' failed: Test is not 'idleex'
2019-05-30T14:15:35 Group 1: Want any, 1/3 passed, 2/3 failed: PASS
2019-05-30T14:15:35 Application PASSES
2019-05-30T14:15:35 Proposal meets limits
Running with tool 'iperf3'
Fetching first run...
2019-05-30T14:15:35 Fetching https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7/runs/first
2019-05-30T14:15:36 Handing off: pscheduler watch --first --format text/plain --debug https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:36 Debug started
2019-05-30T14:15:36 Fetching https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7
2019-05-30T14:15:36 Fetching next run from https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7/runs/first
Next scheduled run:
Starts 2019-05-30T10:16:05-04:00 (~28 seconds)
Ends 2019-05-30T10:16:24-04:00 (~18 seconds)
Waiting for result...
Run did not complete: Failed
Diagnostics from enpl-pt2-10g.eos.nasa.gov:
No diagnostics.
Error from enpl-pt2-10g.eos.nasa.gov:
iperf3 returned an error: error - unable to connect to server: Connection refused
Diagnostics from nettest.boulder.noaa.gov:
No diagnostics.
Error from nettest.boulder.noaa.gov:
iperf3 returned an error:
Process took too long to run.
2019-05-30T14:16:24 Fetching next run from https://enpl-pt2-10g.eos.nasa.gov/pscheduler/tasks/2207091b-b376-4c16-8e29-6d5c9a3e36c7/runs/next
No further runs scheduled.
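The "Data is" line in the debug output above shows the JSON document that was posted to the lead participant's /pscheduler/tasks URL. A small Python sketch that rebuilds the same document can be handy when comparing task submissions between hosts. The helper name is hypothetical and the field layout is copied from the log line only, not from the pScheduler API documentation; it builds the document and does not submit it.

```python
import json


def build_throughput_task(source, dest, ip_version=4, slip="PT5M"):
    """Build a task document shaped like the one in the debug log."""
    return {
        "schema": 1,
        "test": {
            "type": "throughput",
            "spec": {
                "schema": 1,
                "source": source,
                "dest": dest,
                # Pins the test to one IP version, as --ip-version does on the CLI.
                "ip-version": ip_version,
            },
        },
        # ISO 8601 duration; the log shows the CLI forcing a default of PT5M.
        "schedule": {"slip": slip},
    }


if __name__ == "__main__":
    task = build_throughput_task(
        "enpl-pt2-10g.eos.nasa.gov", "nettest.boulder.noaa.gov"
    )
    print(json.dumps(task, indent=2))
```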
From: <> on behalf of Mark Feit <>
Reply-To: Mark Feit <>
Date: Wednesday, May 29, 2019 at 10:59 AM
To: Alex Hsia <>
Cc: "" <>
Subject: Re: [perfsonar-user] Problems with Debian pscheduler
Alex Hsia writes:
There shouldn't be any firewall blocking access to simplestream, and the host is using the default perfSONAR host-level firewall rules, i.e.:
I did some additional poking around and found the cause: the hosts involved have mixed IP stacks.
Nettest is single-stack, so its FQDN has an A record only; sdmz-perfsonar-40g is dual-stack and has both A and AAAA records. Because most flavors of Linux will prefer IPv6 if available, resolving sdmz’s FQDN returns the AAAA record first and the listening socket opened will be IPv6. Nettest, having no IPv6, won’t be able to reach it. You can force your way around this by adding the “--ip-version 4” switch when running the troubleshooter or by specifying IPv4 addresses explicitly.
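One way to spot this kind of mixed-stack pairing ahead of time is to compare what each FQDN resolves to. The following is a minimal Python sketch and not part of pScheduler; the function names are made up for illustration, and the `resolver` parameter exists only so the logic can be exercised without DNS. The order of results from `getaddrinfo` reflects the address-selection policy of the host running the check, so the "preferred" version is only meaningful when the check is run on the participant in question.

```python
import socket

_VERSIONS = {socket.AF_INET: 4, socket.AF_INET6: 6}


def address_families(host, resolver=socket.getaddrinfo):
    """Return the set of IP versions (4 and/or 6) that `host` resolves to."""
    return {
        _VERSIONS[info[0]]
        for info in resolver(host, None)
        if info[0] in _VERSIONS
    }


def preferred_ip_version(host, resolver=socket.getaddrinfo):
    """Return the IP version of the first address the resolver hands back."""
    for info in resolver(host, None):
        if info[0] in _VERSIONS:
            return _VERSIONS[info[0]]
    return None


def common_ip_versions(host_a, host_b, resolver=socket.getaddrinfo):
    """IP versions for which both hosts have an address."""
    return address_families(host_a, resolver) & address_families(host_b, resolver)
```

If `preferred_ip_version()` gives 6 for one participant while `common_ip_versions()` for the pair is only `{4}`, the pair matches the situation described above and the test should be pinned to IPv4.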
The simplestream test and simplestreamer tool were originally written for use during development, but it’s become clear that the diagnostics they produce aren’t as helpful as they could be to end users. While I’m at it, I’ll probably make some adjustments to the troubleshooter to spot this situation and either warn about it or take steps to avoid it. I’ve opened a couple of tickets on these issues and should have enhancements out with 4.2.0: https://github.com/perfsonar/pscheduler/issues/850 and https://github.com/perfsonar/pscheduler/issues/851.
--Mark