
perfsonar-user - RE: [perfsonar-user] pscheduler problems on Rocky 9 after reboot

Subject: perfSONAR User Q&A and Other Discussion

  • From: "Contardo, Gianni Carlo" <>
  • To: Andrew Gallo <>, "" <>
  • Subject: RE: [perfsonar-user] pscheduler problems on Rocky 9 after reboot
  • Date: Thu, 11 Jan 2024 18:36:14 +0000

After upgrading to the most recent release we are having a similar issue. We
are on RHEL 7.9 running the toolkit bundle. When we upgraded, everything
seemed fine. After we rebooted a couple of the measurement hosts for
reasons unrelated to perfSONAR, the toolkit web page shows pscheduler as "Not
Running". That's something we haven't seen before.



Gianni



-----Original Message-----
From:
<> On Behalf Of Andrew Gallo
Sent: Thursday, January 11, 2024 10:29 AM
To:
Subject: [perfsonar-user] pscheduler problems on Rocky 9 after reboot

Greetings,

I've installed perfsonar toolkit on a fresh install of Rocky 9.

I had some one-way latency tests working, but after a reboot, the tests
aren't running.

I can see that some pscheduler services aren't starting as expected:

> systemctl --failed
> UNIT LOAD ACTIVE SUB DESCRIPTION
>
> ● pscheduler-archiver.service loaded failed failed pScheduler server - archiver
> ● pscheduler-runner.service loaded failed failed pScheduler server - runner
> ● pscheduler-ticker.service loaded failed failed pScheduler server - ticker


The logs for pscheduler-ticker are at the end of this message.

If I restart the services (via systemctl reset-failed), everything reports
running, but still, tests aren't running.

The hosts can communicate:
> [root@supp-synclab agallo]# pscheduler ping 162.250.137.9
> 162.250.137.9: pScheduler is alive
> [root@supp-synclab agallo]# pscheduler clock 162.250.137.9
> clock 2024-01-11T13:25:49.820681-05:00 synchronized localhost
> clock 2024-01-11T13:25:49.859667-05:00 synchronized 162.250.137.9
> difference PT0.038986S safe



The below logs seem to indicate a problem connecting to the postgres unix
socket. The socket exists:
> total 4.0K
> srwxrwxrwx 1 postgres postgres 0 Jan 11 12:59 .s.PGSQL.5432=
> -rw------- 1 postgres postgres 61 Jan 11 12:59 .s.PGSQL.5432.lock

Not sure what the problem is.
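
One detail in the output above may be relevant: the socket listing is timestamped Jan 11 12:59, while the ticker failures in the logs at the end of this message are stamped 12:58:33 and report "No such file or directory". That would be consistent with the daemons starting before PostgreSQL had created its socket, though this is an inference from the timestamps and not a confirmed diagnosis. The following is a minimal sketch, using only the Python standard library, of how one might tell a missing socket file apart from one that exists but is not accepting connections. The function name `probe_unix_socket` and its return strings are made up for illustration and are not part of pScheduler or PostgreSQL.

```python
import os
import socket
import stat


def probe_unix_socket(path, timeout=2.0):
    """Classify a Unix domain socket path (hypothetical helper).

    Returns one of: "missing", "not-a-socket", "refused", "accepting".
    """
    try:
        mode = os.stat(path).st_mode
    except (FileNotFoundError, NotADirectoryError):
        # Corresponds to the "No such file or directory" case in the logs.
        return "missing"
    if not stat.S_ISSOCK(mode):
        return "not-a-socket"
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        try:
            s.connect(path)
        except OSError:
            # The file exists but nothing is accepting connections on it.
            return "refused"
    return "accepting"


if __name__ == "__main__":
    # Path taken from the error message in the logs.
    print(probe_unix_socket("/var/run/postgresql/.s.PGSQL.5432"))
```

This only checks that something accepts a connection on the socket. It does not speak the PostgreSQL protocol, so it cannot confirm that the database itself is ready to serve queries.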

Thank you.


> journalctl -u pscheduler-ticker.service
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Main process exited, code=exited, status=1/FAILURE
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Failed with result 'exit-code'.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Scheduled restart job, restart counter is at 4.
> Jan 11 12:58:33 acad-synclab systemd[1]: Stopped pScheduler server - ticker.
> Jan 11 12:58:33 acad-synclab systemd[1]: Starting pScheduler server - ticker...
> Jan 11 12:58:33 acad-synclab systemd[1]: Started pScheduler server - ticker.
> Jan 11 12:58:33 acad-synclab ticker[1493]: ticker INFO Started
> Jan 11 12:58:33 acad-synclab ticker[1493]: Exception in thread Thread-1:
> Jan 11 12:58:33 acad-synclab ticker[1493]: Traceback (most recent call last):
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib64/python3.9/threading.py", line 980, in _bootstrap_inner
> Jan 11 12:58:33 acad-synclab ticker[1493]: Traceback (most recent call last):
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 155, in <module>
> Jan 11 12:58:33 acad-synclab ticker[1493]: main_program()
> Jan 11 12:58:33 acad-synclab ticker[1493]: self.run()
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 115, in main_program
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib64/python3.9/threading.py", line 917, in run
> Jan 11 12:58:33 acad-synclab ticker[1493]: db = pscheduler.pg_connection(dsn)
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib/python3.9/site-packages/pscheduler/db.py", line 43, in pg_connection
> Jan 11 12:58:33 acad-synclab ticker[1493]: self._target(*self._args, **self._kwargs)
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 109, in <lambda>
> Jan 11 12:58:33 acad-synclab ticker[1493]: pg = psycopg2.connect(dsn)
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib64/python3.9/site-packages/psycopg2/__init__.py", line 127, in connect
> Jan 11 12:58:33 acad-synclab ticker[1493]: target=lambda: http_queue_maintainer(log))
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/libexec/pscheduler/daemons/ticker", line 58, in http_queue_maintainer
> Jan 11 12:58:33 acad-synclab ticker[1493]: conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
> Jan 11 12:58:33 acad-synclab ticker[1493]: conn = pscheduler.pg_connection(dsn)
> Jan 11 12:58:33 acad-synclab ticker[1493]: psycopg2.OperationalError: could not connect to server: No such file or directory
> Jan 11 12:58:33 acad-synclab ticker[1493]: Is the server running locally and accepting
> Jan 11 12:58:33 acad-synclab ticker[1493]: connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
> Jan 11 12:58:33 acad-synclab ticker[1493]: File "/usr/lib/python3.9/site-packages/pscheduler/db.py", line 43, in pg_connection
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Main process exited, code=exited, status=1/FAILURE
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Failed with result 'exit-code'.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Scheduled restart job, restart counter is at 5.
> Jan 11 12:58:33 acad-synclab systemd[1]: Stopped pScheduler server - ticker.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Start request repeated too quickly.
> Jan 11 12:58:33 acad-synclab systemd[1]: pscheduler-ticker.service: Failed with result 'exit-code'.
> Jan 11 12:58:33 acad-synclab systemd[1]: Failed to start pScheduler server - ticker.
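
In the log above, every restart attempt falls within the same second (12:58:33), and systemd gives up once the restart counter reaches 5 with "Start request repeated too quickly". As a rough illustration of the alternative, the sketch below retries a connection with a growing delay between attempts. It is a generic pattern and not how the pScheduler daemons behave; the name `connect_with_retry` and its parameters are assumptions introduced here.

```python
import time


def connect_with_retry(connect, attempts=5, delay=1.0, retry_on=(Exception,)):
    """Call connect() until it succeeds or attempts run out (hypothetical helper).

    connect  -- zero-argument callable that returns a connection or raises
    attempts -- maximum number of calls, must be at least 1
    delay    -- seconds to sleep after the first failure; doubled each time
    retry_on -- exception types that trigger another attempt
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
                delay *= 2  # back off so retries do not all land in one second
    # Every attempt failed; surface the most recent error to the caller.
    raise last_error
```

With psycopg2, which appears in the traceback, a call might look like `connect_with_retry(lambda: psycopg2.connect(dsn), retry_on=(psycopg2.OperationalError,))`, where `dsn` stands for whatever connection string is in use.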





--
________________________________
Andrew Gallo
The George Washington University


