
perfsonar-user - RE: [perfsonar-user] Test results unavailable on dashboard

Subject: perfSONAR User Q&A and Other Discussion


  • From: Trond Endrestøl <>
  • To: "" <>
  • Subject: RE: [perfsonar-user] Test results unavailable on dashboard
  • Date: Mon, 9 Oct 2017 13:51:10 +0200 (CEST)
  • Openpgp: url=http://fig.ol.no/~trond/trond.key
  • Organization: Fagskolen Innlandet

On Fri, 6 Oct 2017 14:50-0400, Andrew Lake wrote:

> The error message in the web UI is likely a direct result of cassandra
> working correctly or not. That’s the section of the page that grabs the
> test results from esmond, which is the component talking to cassandra. The
> “Internal server” error you get when you are in IPv6 only mode is because
> cassandra is not working and it can’t determine if you have any results or
> not. When you add the IPv4 address, the “no test results” message implies
> that cassandra is working, but esmond is not finding any test results…which
> isn't entirely unexpected because your archive had been going in and out.
>
> For the issue with cassandra not working with IPv6-only I have created an
> issue since obviously we need to tinker with some settings to get this to
> work right: https://github.com/esnet/esmond/issues/67.

esmond isn't the only issue.

lsregistrationdaemon is not able to contact ps1.es.net. I wonder why
it neither tries nor logs any IPv6 addresses.

ts=2017-10-09T08:19:33.517337Z
event=org.perfSONAR.LSCacheDaemon.LSCacheHandler.handle.start
guid=943BCEB0-ACCA-11E7-AF70-1AF7915E0D98
ts=2017-10-09T08:19:33.517502Z
event=org.perfSONAR.LSCacheDaemon.LSCacheHandler.cond_get.start
url=http://www.perfsonar.net/ls.cache.hints
guid=943BCEB0-ACCA-11E7-AF70-1AF7915E0D98
ts=2017-10-09T08:19:34.463191Z
event=org.perfSONAR.LSCacheDaemon.LSCacheHandler.cond_get.end
http_response_code=500 url=http://www.perfsonar.net/ls.cache.hints
guid=943BCEB0-ACCA-11E7-AF70-1AF7915E0D98
ts=2017-10-09T08:19:34.463512Z
event=org.perfSONAR.LSCacheDaemon.LSCacheHandler.handle.end msg=No URLs
obtained from hints file status=-1 next_update=1507565974
guid=943BCEB0-ACCA-11E7-AF70-1AF7915E0D98
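
The next_update value in the last record looks like a Unix timestamp
(that is an assumption, the log does not say so). A minimal Python
sketch to convert it, with epoch_to_utc being a hypothetical helper
name:

```python
from datetime import datetime, timezone

def epoch_to_utc(seconds):
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# Value copied from the handle.end record above.
print(epoch_to_utc(1507565974).isoformat())  # 2017-10-09T16:19:34+00:00
```

If that reading is right, the next attempt to fetch the hints file is
scheduled about eight hours after the failed one at 08:19:34Z.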

Attempting http://www.perfsonar.net/ls.cache.hints in a web browser
redirects me to https://www.perfsonar.net/en/ls.cache.hints/ and gives
me a "404: Page not found."

2017/10/09 12:20:34 (1699) INFO> lsregistrationdaemon.pl:257 main:: - Initial
LS URL set to
2017/10/09 12:20:34 (1699) ERROR> lsregistrationdaemon.pl:301 main:: - Unable
to determine ls_instance so not performing any operations
2017/10/09 13:20:35 (1699) ERROR> LookupService.pm:110
perfSONAR_PS::Utils::LookupService::discover_lookup_services - Problem
retrieving http://ps1.es.net:8096/lookup/activehosts.json: 500 Can't connect
to 198.128.151.11:8096 (Network is unreachable)
2017/10/09 13:20:35 (1699) WARN> lsregistrationdaemon.pl:144 main::__ANON__ -
Warned: Use of uninitialized value $current_ls_instance in concatenation (.)
or string at /usr/lib/perfsonar/bin/lsregistrationdaemon.pl line 257.

The names do resolve:

[root@perfsonar ~]# host www.perfsonar.net
www.perfsonar.net is an alias for www.internet2.edu.
www.internet2.edu is an alias for webprod2.internet2.edu.
webprod2.internet2.edu has address 207.75.164.248
webprod2.internet2.edu has IPv6 address 2001:48a8:68fe::248

[root@perfsonar ~]# host ps1.es.net
ps1.es.net is an alias for ps-west.es.net.
ps-west.es.net has address 198.128.151.11
ps-west.es.net has IPv6 address 2001:400:210:151::b
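
To see from the client side which addresses the resolver offers and
what happens when each one is tried, something like the following
Python sketch could be used. It is not how lsregistrationdaemon (a
Perl program) works internally; the function names are made up for
illustration, and port 8096 is taken from the error message above.

```python
import socket

def family_label(family):
    """Map a socket address family to a short human-readable label."""
    return {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}.get(family, str(family))

def candidate_addresses(host, port):
    """Return (label, address) pairs for every address the resolver offers."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(family_label(family), sockaddr[0]) for family, _, _, _, sockaddr in infos]

def try_connect(host, port, timeout=5.0):
    """Try a TCP connection to each resolved address in turn.

    Returns a list of (label, address, error) tuples, where error is
    None on success and the OS error text otherwise.
    """
    results = []
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            error = None
        except OSError as exc:
            error = str(exc)
        results.append((family_label(family), sockaddr[0], error))
    return results
```

Running try_connect("ps1.es.net", 8096) on the affected host and
printing the tuples would show whether the IPv6 address is attempted
at all and which error each address family produces.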

Here's a traceroute and a ping from one of my perfSONAR instances to
ps1.es.net (some values are obfuscated to make it harder to be
processed automatically by certain tools):

[root@perfsonar ~]# traceroute -6 ps1.es.net
traceroute to ps1.es.net (2001:400:210:151::b), 30 hops max, 80 byte packets
1 core-sw.FQDN (2001:db8:x:y::1) 3.015 ms 3.008 ms 2.895 ms
2 gjovik-gw1[.]uninett[.]no (2001:db8:0:z::1) 4.608 ms 4.556 ms 4.575 ms
3 hamar-gw2[.]uninett[.]no (2001:db8:0:xx::1) 3.889 ms 4.079 ms 4.173 ms
4 stolav-gw2[.]uninett[.]no (2001:db8:0:yy::1) 6.809 ms 6.805 ms 6.796 ms
5 dk-ore[.]nordu[.]net (2001:db8:3:c::2) 14.081 ms 14.131 ms 14.151 ms
6 dk-uni[.]nordu[.]net (2001:db8:1:29::3) 14.695 ms 14.196 ms 14.399 ms
7 uk-hex[.]nordu[.]net (2001:db8:1:26::3) 33.930 ms 32.906 ms 32.876 ms
8 gw[.]es[.]net (2001:db8:2:1e::3) 108.962 ms 108.772 ms 108.869 ms
9 aofacr5-ip-a-londcr5[.]es[.]net (2001:db8:0:82::3) 2283.868 ms !N
2281.685 ms !N 2282.278 ms !N

[root@perfsonar ~]# ping -6 -c 4 ps1.es.net
PING ps1.es.net(ps-west.es.net (2001:400:210:151::b)) 56 data bytes
From aofacr5-ip-a-londcr5[.]es[.]net (2001:db8:0:82::3) icmp_seq=1
Destination unreachable: No route
From aofacr5-ip-a-londcr5[.]es[.]net (2001:db8:0:82::3) icmp_seq=2
Destination unreachable: No route
From aofacr5-ip-a-londcr5[.]es[.]net (2001:db8:0:82::3) icmp_seq=3
Destination unreachable: No route
From aofacr5-ip-a-londcr5[.]es[.]net (2001:db8:0:82::3) icmp_seq=4
Destination unreachable: No route

--- ps1.es.net ping statistics ---
4 packets transmitted, 0 received, +4 errors, 100% packet loss, time 3003ms

Maybe ps1.es.net is down, or it's not configured to respond to ping,
or the router is clueless.

> For the issue where you are not getting test results when you do the IPv4
> trick, it's quite possible you just need to wait and the results will
> fill-in.

I have waited all weekend for the results of simple ping tests
scheduled to run every 5 minutes. No cigar.

> You can start by looking in /var/log/pscheduler/pscheduler.log for
> errors.

Oct 9 12:19:34 perfsonar journal: scheduler INFO Started
Oct 9 12:19:34 perfsonar journal: runner INFO Started
Oct 9 12:19:35 perfsonar journal: archiver INFO Started
Oct 9 12:19:36 perfsonar journal: pscheduler-api INFO Started
Oct 9 12:19:37 perfsonar journal: pscheduler-api INFO Limits loaded from
/etc/pscheduler/limits.conf
Oct 9 12:19:47 perfsonar journal: safe_run/archiver ERROR Program threw
an exception after 2:00:12.705722
Oct 9 12:19:47 perfsonar journal: ticker WARNING Queue maintainer got
exception server closed the connection unexpectedly#012#011This probably
means the server terminated abnormally#012#011before or while processing the
request.
Oct 9 12:19:47 perfsonar journal: safe_run/archiver ERROR Exception:
DatabaseError: server closed the connection unexpectedly#012#011This probably
means the server terminated abnormally#012#011before or while processing the
request.#012#012Traceback (most recent call last):#012 File
"/usr/lib/python2.7/site-packages/pscheduler/saferun.py", line 41, in
safe_run#012 function()#012 File
"/usr/libexec/pscheduler/daemons/archiver", line 522, in <lambda>#012
pscheduler.safe_run(lambda: main_program())#012 File
"/usr/libexec/pscheduler/daemons/archiver", line 478, in main_program#012
[options.max_parallel])#012 File
"/usr/lib/python2.7/site-packages/pscheduler/db.py", line 187, in query#012
cursor.execute(query, args)#012DatabaseError: server closed the connection
unexpectedly#012#011This probably means the server terminated
abnormally#012#011before or while processing the request.
Oct 9 12:19:47 perfsonar journal: safe_run/archiver ERROR Waiting 0.25
seconds before restarting
Oct 9 12:19:47 perfsonar journal: safe_run/runner ERROR Program threw an
exception after 2:00:12.711637
Oct 9 12:19:47 perfsonar journal: safe_run/scheduler ERROR Program threw
an exception after 2:00:12.707907
Oct 9 12:19:47 perfsonar journal: safe_run/runner ERROR Exception:
DatabaseError: server closed the connection unexpectedly#012#011This probably
means the server terminated abnormally#012#011before or while processing the
request.#012#012Traceback (most recent call last):#012 File
"/usr/lib/python2.7/site-packages/pscheduler/saferun.py", line 41, in
safe_run#012 function()#012 File
"/usr/libexec/pscheduler/daemons/runner", line 907, in <lambda>#012
pscheduler.safe_run(lambda: main_program())#012 File
"/usr/libexec/pscheduler/daemons/runner", line 817, in main_program#012
""", [refresh]);#012DatabaseError: server closed the connection
unexpectedly#012#011This probably means the server terminated
abnormally#012#011before or while processing the request.
Oct 9 12:19:47 perfsonar journal: safe_run/runner ERROR Waiting 0.25
seconds before restarting
Oct 9 12:19:47 perfsonar journal: safe_run/scheduler ERROR Exception:
DatabaseError: server closed the connection unexpectedly#012#011This probably
means the server terminated abnormally#012#011before or while processing the
request.#012#012Traceback (most recent call last):#012 File
"/usr/lib/python2.7/site-packages/pscheduler/saferun.py", line 41, in
safe_run#012 function()#012 File
"/usr/libexec/pscheduler/daemons/scheduler", line 793, in <lambda>#012
pscheduler.safe_run(lambda: main_program())#012 File
"/usr/libexec/pscheduler/daemons/scheduler", line 749, in main_program#012
cursor.execute(query, args)#012DatabaseError: server closed the connection
unexpectedly#012#011This probably means the server terminated
abnormally#012#011before or while processing the request.
Oct 9 12:19:47 perfsonar journal: safe_run/scheduler ERROR Waiting 0.25
seconds before restarting
Oct 9 12:19:47 perfsonar journal: safe_run/scheduler ERROR Restarting
Oct 9 12:19:47 perfsonar journal: safe_run/archiver ERROR Restarting
Oct 9 12:19:47 perfsonar journal: safe_run/runner ERROR Restarting
Oct 9 12:20:01 perfsonar journal: safe_run/ticker ERROR Program threw an
exception after 2:00:25.984952
Oct 9 12:20:01 perfsonar journal: safe_run/ticker ERROR Exception:
OperationalError: terminating connection due to administrator
command#012server closed the connection unexpectedly#012#011This probably
means the server terminated abnormally#012#011before or while processing the
request.#012#012Traceback (most recent call last):#012 File
"/usr/lib/python2.7/site-packages/pscheduler/saferun.py", line 41, in
safe_run#012 function()#012 File
"/usr/libexec/pscheduler/daemons/ticker", line 153, in <lambda>#012
pscheduler.safe_run(lambda: main_program())#012 File
"/usr/libexec/pscheduler/daemons/ticker", line 133, in main_program#012
cursor.execute("SELECT ticker()")#012OperationalError: terminating connection
due to administrator command#012server closed the connection
unexpectedly#012#011This probably means the server terminated
abnormally#012#011before or while processing the request.
Oct 9 12:20:01 perfsonar journal: safe_run/ticker ERROR Waiting 0.25
seconds before restarting
Oct 9 12:20:01 perfsonar journal: safe_run/ticker ERROR Restarting
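
The #012 and #011 sequences in these entries are octal escapes for
newline and tab, which the syslog daemon typically substitutes for
control characters. A small Python sketch to turn a logged line back
into a readable traceback (unescape_syslog is a hypothetical helper,
not part of pscheduler):

```python
import re

# Matches '#' followed by exactly three octal digits, e.g. #012 or #011.
_OCTAL_ESCAPE = re.compile(r"#([0-7]{3})")

def unescape_syslog(text):
    """Replace #NNN octal escapes with the characters they stand for."""
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), text)

sample = "DatabaseError: server closed the connection unexpectedly#012#011This probably"
print(unescape_syslog(sample))
```

Note that a literal "#" followed by three octal digits in the original
message would be converted as well, so this is only meant for reading
the log, not for round-tripping it.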

> You can also do some basic tests at the command-line like
> “pscheduler task rtt --dest <remote-ip>” to do a simple ping test and/or
> “pscheduler task throughput --dest <remote-ip>” to see if you get results
> from iperf. Those *should* work but if they throw an error it may be
> indicative of another problem.

No errors while running this task:

[root@perfsonar ~]# pscheduler task rtt --dest 2001:db8:x:y::1
Submitting task...
Task URL:
https://perfsonar.FQDN/pscheduler/tasks/046b3539-c8ae-49ff-8642-1d77eba1af4c
Running with tool 'ping'
Fetching first run...

Next scheduled run:
https://perfsonar.FQDN/pscheduler/tasks/046b3539-c8ae-49ff-8642-1d77eba1af4c/runs/0d757c05-c0ae-474f-99e5-16880789699d
Starts 2017-10-09T10:53:37Z (~7 seconds)
Ends 2017-10-09T10:53:48Z (~10 seconds)
Waiting for result...

1 somehost.FQDN (2001:db8:x:y::1) 64 Bytes TTL 64 RTT 7.2000 ms
2 somehost.FQDN (2001:db8:x:y::1) 64 Bytes TTL 64 RTT 0.9870 ms
3 somehost.FQDN (2001:db8:x:y::1) 64 Bytes TTL 64 RTT 3.8000 ms
4 somehost.FQDN (2001:db8:x:y::1) 64 Bytes TTL 64 RTT 0.9590 ms
5 somehost.FQDN (2001:db8:x:y::1) 64 Bytes TTL 64 RTT 0.9930 ms

0% Packet Loss RTT Min/Mean/Max/StdDev = 0.959000/2.788000/7.202000/2.462000 ms

No further runs scheduled.

No results are present in the web UI.

This is the only task to be logged:

Oct 9 12:30:12 perfsonar journal: runner INFO 1: Running
https://perfsonar.FQDN/pscheduler/tasks/9cbb7861-6eec-410a-a61a-bd8e79025b90/runs/06cfdc66-c174-4a6d-93f1-241e82c74f70
Oct 9 12:30:12 perfsonar journal: runner INFO 1: With ping: rtt --dest
2001:db8:x:y::1
Oct 9 12:30:16 perfsonar journal: runner INFO 1: Run succeeded.

--
Trond.

