perfsonar-user - RE: [perfsonar-user] pscheduler problems on Rocky 9 after reboot
Subject: perfSONAR User Q&A and Other Discussion
- From: "Contardo, Gianni Carlo" <>
- To: Andrew Gallo <>, "" <>
- Subject: RE: [perfsonar-user] pscheduler problems on Rocky 9 after reboot
- Date: Thu, 11 Jan 2024 18:36:14 +0000
After upgrading to the most recent release we are having a similar issue. We
are on RHEL 7.9 running the toolkit bundle. When we upgraded, everything
seemed fine. After we rebooted a couple of the measurement hosts for
reasons unrelated to perfSONAR, the toolkit web page shows pscheduler as "Not
Running". That's something we haven't seen before.
Gianni
-----Original Message-----
From:
<> On Behalf Of Andrew Gallo
Sent: Thursday, January 11, 2024 10:29 AM
To:
Subject: [perfsonar-user] pscheduler problems on Rocky 9 after reboot
Greetings,
I've installed perfsonar toolkit on a fresh install of Rocky 9.
I had some one-way latency tests working, but after a reboot, the tests
aren't running.
I can see that some pscheduler services aren't starting as expected:
> systemctl --failed
> UNIT LOAD ACTIVE SUB DESCRIPTION
>
> ● pscheduler-archiver.service loaded failed failed pScheduler server -
> archiver
> ● pscheduler-runner.service loaded failed failed pScheduler server -
> runner
> ● pscheduler-ticker.service loaded failed failed pScheduler server -
> ticker
The logs for pscheduler-ticker are at the end of this message.
If I restart the services (via systemctl reset-failed), everything reports
running, but the tests still aren't running.
The hosts can communicate:
> [root@supp-synclab agallo]# pscheduler ping 162.250.137.9
> 162.250.137.9: pScheduler is alive
> [root@supp-synclab agallo]# pscheduler clock 162.250.137.9
> clock 2024-01-11T13:25:49.820681-05:00 synchronized localhost
> clock 2024-01-11T13:25:49.859667-05:00 synchronized 162.250.137.9
> difference PT0.038986S safe
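As a quick check of the arithmetic in the output above, the reported difference should equal the gap between the two timestamps. A minimal Python sketch using only the standard library; the variable names are illustrative and are not part of pScheduler:

```python
from datetime import datetime

# Timestamps copied from the `pscheduler clock` output above.
local = datetime.fromisoformat("2024-01-11T13:25:49.820681-05:00")
remote = datetime.fromisoformat("2024-01-11T13:25:49.859667-05:00")

# Gap between the two reported clocks, in seconds.
difference = (remote - local).total_seconds()

# Print it in the same ISO 8601 duration style that pscheduler uses.
print(f"PT{difference:.6f}S")
```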
The below logs seem to indicate a problem connecting to the postgres unix
socket. The socket exists:
> total 4.0K
> srwxrwxrwx 1 postgres postgres 0 Jan 11 12:59 .s.PGSQL.5432=
> -rw------- 1 postgres postgres 61 Jan 11 12:59 .s.PGSQL.5432.lock
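A directory listing only shows that the socket file exists, not that anything is accepting connections on it. The following is a minimal sketch, assuming Python 3 and only the standard library, that separates those cases. The function name `probe_unix_socket` and its return strings are hypothetical and are not part of pScheduler or PostgreSQL; the path is the one named in the error message in the logs.

```python
import os
import socket
import stat


def probe_unix_socket(path, timeout=2.0):
    """Return 'missing', 'not-a-socket', 'refused', or 'accepting' for path."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # Same condition as "No such file or directory" in the traceback.
        return "missing"
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        try:
            s.connect(path)
        except OSError:
            # The file exists but nothing is listening, or access is denied.
            return "refused"
    return "accepting"


if __name__ == "__main__":
    print(probe_unix_socket("/var/run/postgresql/.s.PGSQL.5432"))
```

This only tests the socket layer. It does not speak the PostgreSQL protocol, so "accepting" does not by itself prove that the database is ready for queries.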
Not sure what the problem is.
Thank you.
> journalctl -u pscheduler-ticker.service
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Main process exited, code=exited, status=1/FAILURE
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Failed with result 'exit-code'.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Scheduled restart job, restart counter is at 4.
> Jan 11 12:58:33 acad-synclab systemd[1]: Stopped pScheduler server - ticker.
> Jan 11 12:58:33 acad-synclab systemd[1]: Starting pScheduler server - ticker...
> Jan 11 12:58:33 acad-synclab systemd[1]: Started pScheduler server - ticker.
> Jan 11 12:58:33 acad-synclab ticker[1493]: ticker INFO Started
> Jan 11 12:58:33 acad-synclab ticker[1493]: Exception in thread Thread-1:
> Jan 11 12:58:33 acad-synclab ticker[1493]: Traceback (most recent call last):
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
> Jan 11 12:58:33 acad-synclab ticker[1493]: Traceback (most recent call last):
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 155, in <module>
> Jan 11 12:58:33 acad-synclab ticker[1493]: main_program()
> Jan 11 12:58:33 acad-synclab ticker[1493]: self.run()
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 115, in main_program
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib64/python3.9/threading.py", line 917, in run
> Jan 11 12:58:33 acad-synclab ticker[1493]: db = pscheduler.pg_connection(dsn)
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib/python3.9/site-packages/pscheduler/db.py", line 43, in pg_connection
> Jan 11 12:58:33 acad-synclab ticker[1493]: self._target(*self._args, **self._kwargs)
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 109, in <lambda>
> Jan 11 12:58:33 acad-synclab ticker[1493]: pg = psycopg2.connect(dsn)
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib64/python3.9/site-packages/psycopg2/__init__.py", line 127, in connect
> Jan 11 12:58:33 acad-synclab ticker[1493]: target=lambda: http_queue_maintainer(log))
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 58, in http_queue_maintainer
> Jan 11 12:58:33 acad-synclab ticker[1493]: conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
> Jan 11 12:58:33 acad-synclab ticker[1493]: conn = pscheduler.pg_connection(dsn)
> Jan 11 12:58:33 acad-synclab ticker[1493]: psycopg2.OperationalError: could not connect to server: No such file or directory
> Jan 11 12:58:33 acad-synclab ticker[1493]: Is the server running locally and accepting
> Jan 11 12:58:33 acad-synclab ticker[1493]: connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib/python3.9/site-packages/pscheduler/db.py", line 43, in pg_connection
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Main process exited, code=exited, status=1/FAILURE
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Failed with result 'exit-code'.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Scheduled restart job, restart counter is at 5.
> Jan 11 12:58:33 acad-synclab systemd[1]: Stopped pScheduler server - ticker.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Start request repeated too quickly.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Failed with result 'exit-code'.
> Jan 11 12:58:33 acad-synclab systemd[1]: Failed to start pScheduler server - ticker.
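The failures above are all logged at 12:58:33, while the socket listing earlier shows a timestamp of 12:59. If both come from the same host and boot (an assumption, since the prompts and logs show different hostnames), one possible reading is that the ticker started before PostgreSQL had created its socket and used up its restart attempts in the meantime. The sketch below shows the general idea of waiting for a Unix socket with a deadline before connecting. It is not how pScheduler handles startup, and the function name `wait_for_unix_socket` and its parameters are hypothetical.

```python
import socket
import time


def wait_for_unix_socket(path, timeout=30.0, interval=0.5):
    """Poll until something accepts connections on path.

    Returns True once a connection succeeds, or False if the deadline
    passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
                s.settimeout(interval)
                s.connect(path)
            return True
        except OSError:
            # Socket file missing, or nothing is listening on it yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_unix_socket("/var/run/postgresql/.s.PGSQL.5432")
    print("ready" if ready else "timed out")
```

A bounded wait like this turns a burst of immediate failures into a single delayed start, which avoids the "Start request repeated too quickly" condition when the dependency is merely slow rather than broken.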
--
________________________________
Andrew Gallo
The George Washington University