perfsonar-user - Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}
Subject: perfSONAR User Q&A and Other Discussion
List archive
Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}
Chronological Thread
- From: Brian Candler <>
- To: "" <>
- Subject: Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}
- Date: Fri, 6 Sep 2019 11:40:45 +0100
- Domainkey-signature: a=rsa-sha1; c=nofws; d=pobox.com; h=from:subject:to :references:message-id:date:mime-version:in-reply-to :content-type:content-transfer-encoding; q=dns; s=sasl; b=FiaxuX zqoRcxVzOLGfPHaeKKsaQOhN6+ITQVYaCUpGPJjpOR/4uUDXnHnzMaBG/w0y9rS/ KEBgtZOnSzDEmL8uBs/D6cb2LKjw2YdAIgyccrBX+aGOI+zHS3KOGVKbH5S44cSE pwJwbdrSOiQwMuw8zVT1PdYH0JkziFR6FSx3k=
I am making slight progress, although I'm having to do far more reverse-engineering than I'd hope. To summarise, the current symptoms are:
- throughput tasks submitted at the command line (pscheduler task ...) display results for both outbound and inbound tests, in the CLI response.
- but scheduled tasks (created in the GUI) are storing results for outbound, but not inbound. Specifically, the toolkit front page list of tests shows "n/a" for the inbound throughput; and the graph page shows an outbound line but no inbound line.
Here's what I believe I have discovered.
1. for inbound tests (i.e. --source <remotenode> --dest <localnode>), the "lead participant" is the remote node
2. the lead node is the only one which stores the result_merged value in the database. This means that
pscheduler result https://localhost/pscheduler/tasks/<tuuid>/runs/<ruuid>
complains that result-merged is missing, but
pscheduler result https://<remotenode>/pscheduler/tasks/<tuuid>/runs/<ruuid>
shows the full result
That explains my 'pscheduler result' problem: I should have been asking the lead node.
3. This implies to me that the archiver, which writes the results to esmond, runs on the lead node - which in this case is the remote node
This in turn implies that either esmond on the local node must be fully open to writes from the whole Internet (ouch!); or that some sort of access token is given to the remote node to grant it permission to write to the local node.
Fortunately, I have full control of both the local node (Ubuntu 18.04 / perfsonar-toolkit 4.2.0) and the remote node (Ubuntu 16.04 / perfsonar-toolkit 4.1.6) so was able to turn on debugging at the far end.
Debugging on the remote node says archiving was successful:
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG Debug started
Sep 6 10:49:53 ns1 archiver-esmond/archive message repeated 2 times: [ DEBUG Debug started]
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG Archiver received: {u'last-attempt': None, u'attempts': 0, u'result':...
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG Found metadata key d7bd3b9b8f28474b97a730a49364ccb4 in cache for https://localhost/pscheduler/tasks/b28773ff-3c0e-4b5d-99e5-cf9a1fc143f8@https://[A:A:A:A::45]/esmond/perfsonar/archive/, so running in fast mode
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG Found metadata key d7bd3b9b8f28474b97a730a49364ccb4 in cache for https://localhost/pscheduler/tasks/b28773ff-3c0e-4b5d-99e5-cf9a1fc143f8@https://[A:A:A:A::45]/esmond/perfsonar/archive/, so running in fast mode
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG fast_mode is True
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG fast_mode is True
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG Putting data to https://[A:A:A:A::45]/esmond/perfsonar/archive/d7bd3b9b8f28474b97a730a49364ccb4/: {'data': [{'ts': 1567763364, 'val': [{'val': 206006572.70301065, 'event-type': 'throughput'}, {'val': 637, 'event-type': 'packet-retransmits'}, ...snip... {'val': {'href': 'https://localhost/pscheduler/tasks/b28773ff-3c0e-4b5d-99e5-cf9a1fc143f8/runs/94b5a7a6-5c85-4c0c-b369-f6325f153f2d'}, 'event-type': 'pscheduler-run-href'}]}]}
Sep 6 10:49:53 ns1 archiver-esmond/archive DEBUG Putting data to https://[A:A:A:A::45]/esmond/perfsonar/archive/d7bd3b9b8f28474b97a730a49364ccb4/: {'data': ...snip...
Sep 6 10:49:53 ns1 archiver DEBUG 174540: Returned JSON from archiver: {u'succeeded': True}
Sep 6 10:49:53 ns1 archiver DEBUG 174540: Succeeded: 94b5a7a6-5c85-4c0c-b369-f6325f153f2d to esmond
Sep 6 10:49:53 ns1 archiver DEBUG 174540: Thread finished
Sep 6 10:49:53 ns1 archiver DEBUG Notifications: archiving_change
Sep 6 10:49:53 ns1 archiver DEBUG Nothing to archive; finding time until next archiving.
Sep 6 10:49:53 ns1 archiver DEBUG Nothing in the future.
Sep 6 10:49:53 ns1 archiver DEBUG Waiting 15.0 for change or notification
(Aside: it's weird that those lines are logged twice, apart from "Archiver received". Those messages come from /usr/lib/pscheduler/classes/archiver/esmond/archive)
But the specified archive on the local node appears to be empty: event-types is empty list.
root@perf1:~# curl -sk https://localhost/esmond/perfsonar/archive/d7bd3b9b8f28474b97a730a49364ccb4/
{"url":"https://localhost/esmond/perfsonar/archive/d7bd3b9b8f28474b97a730a49364ccb4/","metadata-key":"d7bd3b9b8f28474b97a730a49364ccb4","subject-type":"point-to-point","event-types":[],"uri":"/esmond/perfsonar/archive/d7bd3b9b8f28474b97a730a49364ccb4/"}
On the local node, /var/log/apache2/access.log shows success (status code 201) for the write:
B:B:B:B:feff:ff:fe00:f16 - - [06/Sep/2019:09:49:53 +0000] "PUT /esmond/perfsonar/archive/d7bd3b9b8f28474b97a730a49364ccb4/ HTTP/1.1" 201 1558 "-" "python-requests/2.9.1"
At this point I'm pretty stumped; I don't know how to go about debugging esmond internals. /var/log/esmond/esmond.log is zero bytes.
There *is* one other thing that might be related. The local node has no reverse DNS for its IPv6 address; I haven't gotten round to getting it delegated yet. Hence you can see the URL generated with https://[ipaddress]/. But that URL is valid, and in any case, if it couldn't be contacted, the archiver would have reported an error. And of course, I see the successful access log.
To try to rule that out, I've just added an /etc/hosts entry on the remote host.
Regards,
Brian Candler.
- [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Brian Candler, 09/05/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Brian Candler, 09/05/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Brian Candler, 09/06/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Mark Feit, 09/06/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Brian Candler, 09/06/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Brian Candler, 09/06/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Mark Feit, 09/06/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Brian Candler, 09/06/2019
- Re: [perfsonar-user] "pscheduler result" raises exception {"result-merged": null}, Brian Candler, 09/05/2019
Archive powered by MHonArc 2.6.19.