
perfsonar-user - Re: [perfsonar-user] Can't access pS test results from homepage

Subject: perfSONAR User Q&A and Other Discussion

  • From: Michael Johnson
  • To: Andrew Lake
  • Cc: Kathy Benninger
  • Subject: Re: [perfsonar-user] Can't access pS test results from homepage
  • Date: Thu, 24 Jan 2019 13:19:30 -0500

Hi Kathy,

It does appear that your measurement archive is working; however, the CGI
that retrieves test result data to build the test listing appears to be
timing out, e.g.:

http://192.231.244.54/perfsonar-graphs/cgi-bin/graphData.cgi?action=test_list&timeperiod=604800,86400&url=http%3A%2F%2F192.231.244.54%2Fesmond%2Fperfsonar%2Farchive%2F
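
To reproduce the timeout outside the browser, the same request can be built and timed from a script. This is only a rough sketch, assuming Python 3 and the standard library; the helper names `build_test_list_url` and `time_fetch` are made up for illustration, and the 60-second timeout is an arbitrary choice, not a perfSONAR setting.

```python
import time
import urllib.request
from urllib.parse import urlencode


def build_test_list_url(host, archive_url, periods=(604800, 86400)):
    """Build the graphData.cgi test_list URL quoted above (hypothetical helper)."""
    query = urlencode(
        {
            "action": "test_list",
            "timeperiod": ",".join(str(p) for p in periods),
            "url": archive_url,
        },
        safe=",",  # keep the comma in timeperiod literal, as in the URL above
    )
    return f"http://{host}/perfsonar-graphs/cgi-bin/graphData.cgi?{query}"


def time_fetch(url, timeout=60, opener=urllib.request.urlopen):
    """Return (seconds_elapsed, outcome) for a single GET of url."""
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read()
            outcome = f"HTTP {resp.status}"
    except Exception as exc:  # timeouts, refused connections, HTTP errors
        outcome = f"failed: {exc}"
    return time.monotonic() - start, outcome
```

Calling `time_fetch(build_test_list_url("192.231.244.54", "http://192.231.244.54/esmond/perfsonar/archive/"))` on or near the host should show whether the CGI answers at all and roughly how long it takes.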

Do you see any errors in /var/log/httpd/ssl_error_log relating to
graphData.cgi? That could shed some light on what's going on.
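
If the log is large, a small filter can pull out just the graphData.cgi entries. This is a minimal sketch, assuming the Apache 2.4 error-log layout seen in the error_log excerpt quoted further down the thread; `cgi_problems` is a hypothetical name, not part of any perfSONAR tool.

```python
import re

# Leading "[timestamp] [module:level]" fields of an Apache 2.4 error-log line.
LINE_RE = re.compile(r"^\[(?P<when>[^\]]+)\] \[(?P<module>[^:\]]*):(?P<level>\w+)\]")


def cgi_problems(lines, script="graphData.cgi"):
    """Return (timestamp, level, full_line) for log lines that mention script."""
    hits = []
    for line in lines:
        if script not in line:
            continue  # unrelated entry
        match = LINE_RE.match(line)
        if match:
            hits.append((match.group("when"), match.group("level"), line.rstrip()))
    return hits
```

It takes any iterable of lines, so an open file handle on /var/log/httpd/ssl_error_log (or error_log) can be passed in directly.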

Thanks,
Michael

On Thu, Jan 24, 2019 at 09:12:06AM -0800, Andrew Lake wrote:
Hi Kathy,

Sorry, I realized I lost both user list and Michael from the CC. My
apologies, must have been going too fast. Both are added back now.

Thanks,
Andy


On January 24, 2019 at 4:51:30 AM, Kathy Benninger wrote:

Hi Andy,

I haven't heard from Michael and was wondering if he had recommendations on
how to debug the apparent time-out problem I'm seeing. His email address
didn't show up in your reply so I can't bug him directly :-) .

Thanks,
Kathy


On 1/18/2019 3:49 PM, Andrew Lake wrote:

Hi Kathy,

Thanks for the addresses, cassandra appears to be working fine. I can get
data out of your archive. Examples:

http://192.231.244.54/esmond/perfsonar/archive/?time-range=86400
http://192.231.244.54/esmond/perfsonar/archive/013df91b72784f1ebfc16746cbfbf706/packet-loss-rate-bidir/base?time-range=86400
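
The same kind of query can be scripted for a quick archive check. This is a sketch only, assuming Python 3; `archive_url` and `fetch_json` are illustrative names, and it assumes the archive replies with JSON, with the URL layout copied from the two examples above rather than from the esmond documentation.

```python
import json
import urllib.request
from urllib.parse import urlencode


def archive_url(base, metadata_key=None, event_type=None, time_range=86400):
    """Build an esmond query URL shaped like the two examples above."""
    path = base.rstrip("/") + "/"
    if metadata_key:
        path += f"{metadata_key}/"
        if event_type:
            path += f"{event_type}/base"
    return f"{path}?{urlencode({'time-range': time_range})}"


def fetch_json(url, timeout=30):
    """GET url and decode the body as JSON (assumes the archive replies with JSON)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

If `fetch_json(archive_url("http://192.231.244.54/esmond/perfsonar/archive/"))` returns promptly while the homepage listing still times out, that points away from the archive and toward the code that builds the listing.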

For some reason the javascript is timing out when creating that listing.
Michael, CC’ed, may have some insight since he is our javascript expert.

Thanks,
Andy



On January 18, 2019 at 3:23:46 PM, Kathy Benninger wrote:

Hi Andy,

I tried the restart, but still get the same error on the homepage:
Error loading test listing; measurement archive unreachable:
http://192.231.244.54/esmond/perfsonar/archive/

It looks like cassandra is running after the restart:

[root@ps-test log]# pkill -9 -f java
[root@ps-test log]# systemctl restart cassandra


[root@ps-test log]# systemctl status cassandra
● cassandra.service - SYSV: Starts and stops Cassandra
Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
Active: active (exited) since Fri 2019-01-18 14:49:57 EST; 4s ago
Docs: man:systemd-sysv-generator(8)
Process: 10129 ExecStop=/etc/rc.d/init.d/cassandra stop (code=exited, status=1/FAILURE)
Process: 10185 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited, status=0/SUCCESS)

Jan 18 14:49:57 ps-test.psc.edu cassandra[10129]: Jan 18 14:49:52 ps-test.psc.edu su[10139]: (to cassandra) root on none
Jan 18 14:49:57 ps-test.psc.edu cassandra[10129]: Jan 18 14:49:52 ps-test.psc.edu cassandra[10129]: Shutdown Cassandra: bash: line 0: kill: (6169) - No such process
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: cassandra.service: control process exited, code=exited status=1
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Stopped SYSV: Starts and stops Cassandra.
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Unit cassandra.service entered failed state.
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: cassandra.service failed.
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Starting SYSV: Starts and stops Cassandra...
Jan 18 14:49:57 ps-test.psc.edu su[10194]: (to cassandra) root on none
Jan 18 14:49:57 ps-test.psc.edu cassandra[10185]: Starting Cassandra: OK
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Started SYSV: Starts and stops Cassandra.
[root@ps-test log]#
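
A caveat when reading output like this: for a SysV-wrapped unit, "active (exited)" generally means only that the init script returned successfully, not that the Cassandra JVM is still alive afterwards. The sketch below pulls the relevant fields out of pasted `systemctl status` text; `summarize_status` is a hypothetical helper, and the regular expressions are tuned to the output shown above, so other systemd versions may need adjustments.

```python
import re


def summarize_status(text):
    """Extract the Active state and any non-successful Exec* steps."""
    flat = " ".join(text.split())  # undo terminal/mail line wrapping
    active = re.search(r"Active: (\S+) \((\w+)\)", flat)
    procs = re.findall(
        r"Process: (\d+) (Exec\w+)=.*?\(code=(\w+), status=(\d+)/(\w+)\)", flat
    )
    return {
        "active": active.groups() if active else None,
        "failed": [
            (pid, step)
            for pid, step, _code, _status, result in procs
            if result != "SUCCESS"
        ],
    }
```

On the output above this reports the ExecStop step (PID 10129) as failed. That fits the "No such process" message from the stop script, which would be expected if the earlier pkill had already removed the JVM.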

Do you have some other ideas about what to try? The fact that the problem
persists after a Toolkit reinstall really has me baffled.

Thanks,
Kathy


On 1/18/2019 2:11 PM, Andrew Lake wrote:
Hi Kathy,

Have you tried restarting cassandra? Unfortunately, it’s not uncommon for it
to get stuck or stop running with little indication. Usually the following
is a pretty sure-fire way to make sure cassandra is stopped and then
restart it:

pkill -9 -f java
systemctl restart cassandra
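
After a restart it can take a while before Cassandra accepts connections, so it may help to wait for the port before reloading the homepage. This is a minimal sketch, assuming Python 3; port 9160 is taken from the esmond traceback in the original report (localhost:9160), the function names are illustrative, and the 120-second deadline is a guess rather than a documented startup time.

```python
import socket
import time


def port_open(host="localhost", port=9160, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def wait_for_port(host="localhost", port=9160, deadline=120.0, interval=5.0):
    """Poll until the port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline
    while True:
        if port_open(host, port):
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval)
```

If `wait_for_port()` returns False well after the restart, the service manager may report the unit as started even though nothing is listening for esmond to connect to.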

Thanks,
Andy


On January 18, 2019 at 11:15:17 AM, Kathy Benninger wrote:

Attached are the cassandra/cassandra.log and cassandra/system.log files, as
it's likely I could be staring at the problem and not recognizing it.

Thanks,
Kathy


On 1/17/2019 1:33 PM, Kathy Benninger wrote:
> Hi Szymon,
>
> cassandra/cassandra.log and cassandra/system.log look similar to log files
> on other hosts in the mesh. There's nothing that jumps out as clearly
> indicating a problem.
>
> There are 11 hosts in the mesh. Each runs both throughput and latency tests
> on the same interface.
>
> Kathy
>
>
> On 1/17/2019 9:34 AM, Szymon Trocha wrote:
>> On 17.01.2019 at 15:11, Kathy Benninger wrote:
>>> A perfSONAR built with NetInstall Toolkit v4.1.5-1.el7 from late December
>>> is reporting the following error on its homepage under "Test Results":
>>>
>>> Error loading test listing; measurement archive unreachable:
>>> http://192.231.244.54/esmond/perfsonar/archive/
>>>
>>> The pS is running tests and responds to test requests from other pSs in
>>> the same meshconfig file.
>>>
>>> httpd/error_log shows timeouts when I bring up the homepage, e.g.,:
>>>
>>> [Wed Jan 16 11:17:46.585696 2019] [cgi:warn] [pid 21553] [client 128.182.160.119:52235] AH01220: Timeout waiting for output from CGI script /usr/lib/perfsonar/graphs/cgi-bin/graphData.cgi, referer: http://192.231.244.54/toolkit/
>>> [Wed Jan 16 11:17:46.585765 2019] [cgi:error] [pid 21553] [client 128.182.160.119:52235] Script timed out before returning headers: graphData.cgi, referer: http://192.231.244.54/toolkit/
>>> [Wed Jan 16 11:21:48.043369 2019] [cgi:warn] [pid 19343] [client 128.182.160.119:52338] AH01220: Timeout waiting for output from CGI script /usr/lib/perfsonar/graphs/cgi-bin/graphData.cgi, referer: http://192.231.244.54/toolkit/
>>> [Wed Jan 16 11:21:48.043438 2019] [cgi:error] [pid 19343] [client 128.182.160.119:52338] Script timed out before returning headers: graphData.cgi, referer: http://192.231.244.54/toolkit/
>>> [Wed Jan 16 11:33:43.135517 2019] [cgi:warn] [pid 23720] [client 128.182.160.119:52536] AH01220: Timeout waiting for output from CGI script /usr/lib/perfsonar/graphs/cgi-bin/graphData.cgi, referer: http://192.231.244.54/toolkit/
>>> [Wed Jan 16 11:33:43.135577 2019] [cgi:error] [pid 23720] [client 128.182.160.119:52536] Script timed out before returning headers: graphData.cgi, referer: http://192.231.244.54/toolkit/
>>>
>>> esmond/django.log:
>>>
>>> 2019-01-16 10:32:05,652 [ERROR] /usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/exception.py: Internal Server Error: /esmond/perfsonar/archive/7c9ba2a49ac24a42833efaeec83e6ab7/
>>> Traceback (most recent call last):
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/exception.py", line 42, in inner
>>>     response = get_response(request)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/base.py", line 249, in _legacy_get_response
>>>     response = self._get_response(request)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, in _get_response
>>>     response = self.process_exception_by_middleware(e, request)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, in _get_response
>>>     response = wrapped_callback(request, *callback_args, **callback_kwargs)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
>>>     return view_func(*args, **kwargs)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/viewsets.py", line 90, in view
>>>     return self.dispatch(request, *args, **kwargs)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/views.py", line 489, in dispatch
>>>     response = self.handle_exception(exc)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/views.py", line 449, in handle_exception
>>>     self.raise_uncaught_exception(exc)
>>>   File "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/views.py", line 486, in dispatch
>>>     response = handler(request, *args, **kwargs)
>>>   File "/usr/lib/esmond/esmond/api/perfsonar/api_v2.py", line 887, in update
>>>     check_connection()
>>>   File "/usr/lib/esmond/esmond/api/perfsonar/api_v2.py", line 78, in check_connection
>>>     db = CASSANDRA_DB(get_config(get_config_path()))
>>>   File "/usr/lib/esmond/esmond/cassandra.py", line 134, in __init__
>>>     "at %s - %s" % (config.cassandra_servers[0], e))
>>> ConnectionException: "System Manager can't connect to Cassandra at localhost:9160 - Could not connect to localhost:9160"
>>>
>>> I've tried re-installing the Toolkit, but end up in the same situation.
>>> Oddly, a new install works correctly for ~5 minutes (i.e., all the tests
>>> are listed and linked on the homepage), but then it eventually starts
>>> returning the error that the archive is unreachable.
>>>
>>> Thoughts on how to debug or what could be wrong?
>>
>>
>> Hi Kathy,
>>
>> is there anything special in /var/log/cassandra/ ?
>>
>> Are there many tests to load in your host?
>>
>> Regards,
>>
>> --
>> Szymon Trocha
>> Poznań Supercomputing & Networking Center
>> General NOC phone +48 61-858-2015 | noc.pcss.pl <http://noc.pcss.pl>
>> Personal desk phone +48 61-858-2022
>> We sent you this e-mail in response to your inquiry or in connection
>> with a service we offer. Sending correspondence to the PCSS Management
>> Center, or reporting an issue by phone, constitutes consent to the
>> processing of personal data by the Institute of Bioorganic Chemistry of
>> the Polish Academy of Sciences in Poznań, address: ul. Z. Noskowskiego
>> 12/14, 61-704 Poznań. Details can be found in our Privacy Policy
>> <http://noc.pcss.pl/index.html#PolitykaPrywatnosci>. | This message has
>> been sent as a part of communication with PSNC NOC or your service request
>> sent to us. For more information read our Privacy Policy
>> <http://noc.psnc.pl/index.html#PrivacyPolicy>.
>>

--
Michael Johnson
GlobalNOC DevOps Engineer
