
perfsonar-user - Re: [perfsonar-user] Can't access pS test results from homepage

Subject: perfSONAR User Q&A and Other Discussion

  • From: Andrew Lake
  • To: Kathy Benninger
  • Cc: Michael Johnson
  • Subject: Re: [perfsonar-user] Can't access pS test results from homepage
  • Date: Thu, 24 Jan 2019 09:12:06 -0800

Hi Kathy,

Sorry, I realized I dropped both the user list and Michael from the CC. My apologies, I must have been going too fast. Both are added back now.

Thanks,
Andy


On January 24, 2019 at 4:51:30 AM, Kathy Benninger wrote:

Hi Andy,

I haven't heard from Michael and was wondering if he had recommendations on how to debug the apparent time-out problem I'm seeing. His email address didn't show up in your reply so I can't bug him directly  :-) .

Thanks,
Kathy


On 1/18/2019 3:49 PM, Andrew Lake wrote:
Hi Kathy,

Thanks for the addresses. Cassandra appears to be working fine: I can get data out of your archive. Examples:


For some reason the JavaScript is timing out when creating that listing. Michael, CC'ed, may have some insight since he is our JavaScript expert.
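
To tell a slow archive apart from a slow web page, it may help to time a request to the archive directly, outside the browser. The sketch below is only an illustration: the helper name `probe_archive`, the 30-second timeout, and the choice to bypass any configured HTTP proxy are assumptions, and it is not how the Toolkit's own JavaScript or graphData.cgi fetches data.

```python
import json
import time
import urllib.error
import urllib.request


def probe_archive(url, timeout=30.0):
    """GET `url` and report how long the request took.

    Hypothetical helper for debugging. It returns a dict with the elapsed
    seconds and, when the body is a JSON list, the number of records.
    """
    # Assumption: talk to the host directly, ignoring proxy environment variables.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    start = time.monotonic()
    try:
        with opener.open(url, timeout=timeout) as resp:
            body = resp.read()
            status = resp.status
    except (urllib.error.URLError, OSError) as exc:
        # Covers HTTP error statuses, refused connections, and timeouts.
        return {"ok": False, "seconds": time.monotonic() - start, "error": str(exc)}
    elapsed = time.monotonic() - start
    try:
        data = json.loads(body)
        records = len(data) if isinstance(data, list) else None
    except ValueError:
        # The body was not JSON (for example, an HTML page).
        records = None
    return {"ok": status == 200, "seconds": elapsed, "status": status, "records": records}
```

Calling `probe_archive("http://192.231.244.54/esmond/perfsonar/archive/")` with the URL from the error message would show whether the archive answers at all and roughly how long it takes. A response time close to the web server's CGI timeout would point at the archive query rather than the browser code.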

Thanks,
Andy



On January 18, 2019 at 3:23:46 PM, Kathy Benninger wrote:

Hi Andy,

I tried the restart, but still get the same error on the homepage:
Error loading test listing; measurement archive unreachable:
http://192.231.244.54/esmond/perfsonar/archive/

It looks like cassandra is running after the restart:

[root@ps-test log]# pkill -9 -f java
[root@ps-test log]# systemctl restart cassandra


[root@ps-test log]# systemctl status cassandra
● cassandra.service - SYSV: Starts and stops Cassandra
Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
Active: active (exited) since Fri 2019-01-18 14:49:57 EST; 4s ago
Docs: man:systemd-sysv-generator(8)
Process: 10129 ExecStop=/etc/rc.d/init.d/cassandra stop (code=exited,
status=1/FAILURE)
Process: 10185 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited,
status=0/SUCCESS)

Jan 18 14:49:57 ps-test.psc.edu cassandra[10129]: Jan 18 14:49:52
ps-test.psc.edu su[10139]: (to cassandra) root on none
Jan 18 14:49:57 ps-test.psc.edu cassandra[10129]: Jan 18 14:49:52
ps-test.psc.edu cassandra[10129]: Shutdown Cassandra: bash: line 0: kill:
(6169) - No such process
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: cassandra.service: control
process exited, code=exited status=1
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Stopped SYSV: Starts and stops
Cassandra.
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Unit cassandra.service entered
failed state.
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: cassandra.service failed.
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Starting SYSV: Starts and stops
Cassandra...
Jan 18 14:49:57 ps-test.psc.edu su[10194]: (to cassandra) root on none
Jan 18 14:49:57 ps-test.psc.edu cassandra[10185]: Starting Cassandra: OK
Jan 18 14:49:57 ps-test.psc.edu systemd[1]: Started SYSV: Starts and stops
Cassandra.
[root@ps-test log]#
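
A note on reading the output above: for a SysV init script wrapped by systemd, `active (exited)` generally means only that the start script returned successfully, not that a Cassandra process is still running. Since the esmond traceback quoted later in this thread fails on localhost:9160, one quick cross-check is to test whether anything accepts TCP connections on that port. The following is a minimal Python sketch; the helper name `port_is_open` and the 3-second timeout are illustrative choices, not part of perfSONAR or esmond.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`.

    Hypothetical helper: it only checks that something is listening,
    not that the listener is a healthy Cassandra instance.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Running `port_is_open("localhost", 9160)` a few times after a restart (for example once a minute) could show whether the port opens and then disappears again, which would match the "works for ~5 minutes" behaviour described in the original report.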

Do you have some other ideas about what to try? The fact that the problem
persists after a Toolkit reinstall really has me baffled.

Thanks,
Kathy


On 1/18/2019 2:11 PM, Andrew Lake wrote:
> Hi Kathy,
>
> Have you tried restarting cassandra? It’s not uncommon for it to get stuck
> or stop running with little indication unfortunately. Usually the following
> is a pretty sure-fire way to make sure cassandra is stopped and then restart it:
>
> pkill -9 -f java
> systemctl restart cassandra
>
> Thanks,
> Andy
>
>
> On January 18, 2019 at 11:15:17 AM, Kathy Benninger wrote:
>
>> Attached are the cassandra/cassandra.log and cassandra/system.log files as
>> it's likely I could be staring at the problem and not recognizing it.
>>
>> Thanks,
>> Kathy
>>
>>
>> On 1/17/2019 1:33 PM, Kathy Benninger wrote:
>> > Hi Szymon,
>> >
>> > cassandra/cassandra.log and cassandra/system.log look similar to log files
>> > on other hosts in the mesh. There's nothing that jumps out as clearly
>> > indicating a problem.
>> >
>> > There are 11 hosts in the mesh. Each runs both throughput and latency tests
>> > on the same interface.
>> >
>> > Kathy
>> >
>> >
>> > On 1/17/2019 9:34 AM, Szymon Trocha wrote:
>> >> On 17.01.2019 at 15:11, Kathy Benninger wrote:
>> >>> A perfSONAR built with NetInstall Toolkit v4.1.5-1.el7 from late December
>> >>> is reporting the following error on its homepage under "Test Results":
>> >>>
>> >>>   Error loading test listing; measurement archive unreachable:
>> >>> http://192.231.244.54/esmond/perfsonar/archive/
>> >>>
>> >>> The pS is running tests and responds to test requests from other pSs in
>> >>> the same meshconfig file.
>> >>>
>> >>> httpd/error_log shows timeouts when I bring up the homepage, e.g.,:
>> >>>
>> >>> [Wed Jan 16 11:17:46.585696 2019] [cgi:warn] [pid 21553] [client
>> >>> 128.182.160.119:52235] AH01220: Timeout waiting for output from CGI
>> >>> script /usr/lib/perfsonar/graphs/cgi-bin/graphData.cgi, referer:
>> >>> http://192.231.244.54/toolkit/
>> >>> [Wed Jan 16 11:17:46.585765 2019] [cgi:error] [pid 21553] [client
>> >>> 128.182.160.119:52235] Script timed out before returning headers:
>> >>> graphData.cgi, referer: http://192.231.244.54/toolkit/
>> >>> [Wed Jan 16 11:21:48.043369 2019] [cgi:warn] [pid 19343] [client
>> >>> 128.182.160.119:52338] AH01220: Timeout waiting for output from CGI
>> >>> script /usr/lib/perfsonar/graphs/cgi-bin/graphData.cgi, referer:
>> >>> http://192.231.244.54/toolkit/
>> >>> [Wed Jan 16 11:21:48.043438 2019] [cgi:error] [pid 19343] [client
>> >>> 128.182.160.119:52338] Script timed out before returning headers:
>> >>> graphData.cgi, referer: http://192.231.244.54/toolkit/
>> >>> [Wed Jan 16 11:33:43.135517 2019] [cgi:warn] [pid 23720] [client
>> >>> 128.182.160.119:52536] AH01220: Timeout waiting for output from CGI
>> >>> script /usr/lib/perfsonar/graphs/cgi-bin/graphData.cgi, referer:
>> >>> http://192.231.244.54/toolkit/
>> >>> [Wed Jan 16 11:33:43.135577 2019] [cgi:error] [pid 23720] [client
>> >>> 128.182.160.119:52536] Script timed out before returning headers:
>> >>> graphData.cgi, referer: http://192.231.244.54/toolkit/
>> >>>
>> >>> esmond/django.log:
>> >>>
>> >>> 2019-01-16 10:32:05,652 [ERROR]
>> >>> /usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/exception.py:
>> >>> Internal Server Error:
>> >>> /esmond/perfsonar/archive/7c9ba2a49ac24a42833efaeec83e6ab7/
>> >>> Traceback (most recent call last):
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/exception.py",
>> >>> line 42, in inner
>> >>>     response = get_response(request)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/base.py",
>> >>> line 249, in _legacy_get_response
>> >>>     response = self._get_response(request)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/base.py",
>> >>> line 187, in _get_response
>> >>>     response = self.process_exception_by_middleware(e, request)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/django/core/handlers/base.py",
>> >>> line 185, in _get_response
>> >>>     response = wrapped_callback(request, *callback_args, **callback_kwargs)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/django/views/decorators/csrf.py",
>> >>> line 58, in wrapped_view
>> >>>     return view_func(*args, **kwargs)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/viewsets.py",
>> >>> line 90, in view
>> >>>     return self.dispatch(request, *args, **kwargs)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/views.py",
>> >>> line 489, in dispatch
>> >>>     response = self.handle_exception(exc)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/views.py",
>> >>> line 449, in handle_exception
>> >>>     self.raise_uncaught_exception(exc)
>> >>>   File
>> >>> "/usr/lib/esmond/lib/python2.7/site-packages/rest_framework/views.py",
>> >>> line 486, in dispatch
>> >>>     response = handler(request, *args, **kwargs)
>> >>>   File "/usr/lib/esmond/esmond/api/perfsonar/api_v2.py", line 887, in update
>> >>>     check_connection()
>> >>>   File "/usr/lib/esmond/esmond/api/perfsonar/api_v2.py", line 78, in
>> >>> check_connection
>> >>>     db = CASSANDRA_DB(get_config(get_config_path()))
>> >>>   File "/usr/lib/esmond/esmond/cassandra.py", line 134, in __init__
>> >>>     "at %s - %s" % (config.cassandra_servers[0], e))
>> >>> ConnectionException: "System Manager can't connect to Cassandra at
>> >>> localhost:9160 - Could not connect to localhost:9160"
>> >>>
>> >>> I've tried re-installing the Toolkit, but end up in the same situation.
>> >>> Oddly, a new install works correctly for ~5 minutes (i.e., all the tests
>> >>> are listed and linked on the homepage), but then it eventually starts
>> >>> returning the error that the archive is unreachable.
>> >>>
>> >>> Thoughts on how to debug or what could be wrong?
>> >>
>> >>
>> >> Hi Kathy,
>> >>
>> >> is there anything special in /var/log/cassandra/ ?
>> >>
>> >> Are there many tests to load in your host?
>> >>
>> >> Regards,
>> >>
>> >> --
>> >> Szymon Trocha
>> >> Poznań Supercomputing & Networking Center
>> >> General NOC phone +48 61-858-2015 | noc.pcss.pl <http://noc.pcss.pl>
>> >> Personal desk phone +48 61-858-2022
>> >> We sent you this e-mail in response to your inquiry or in connection with
>> >> a service we offer. Sending correspondence to the PCSS Management Center
>> >> or reporting an issue by telephone constitutes consent to the processing
>> >> of personal data by the Institute of Bioorganic Chemistry of the Polish
>> >> Academy of Sciences in Poznań, address: ul. Z. Noskowskiego 12/14,
>> >> 61-704 Poznań. Details can be found in our Privacy Policy
>> >> <http://noc.pcss.pl/index.html#PolitykaPrywatnosci>. | This message has
>> >> been sent as a part of communication with PSNC NOC or your service request
>> >> sent to us. For more information read our Privacy Policy
>> >> <http://noc.psnc.pl/index.html#PrivacyPolicy>.
>> >>
>> --
>> To unsubscribe from this list:
>> https://lists.internet2.edu/sympa/signoff/perfsonar-user



