perfsonar-user - Re: [perfsonar-user] pScheduler Internal Error on a mobile node
Subject: perfSONAR User Q&A and Other Discussion
- From: Elicia Heera <>
- To: Antoine Delvaux <>
- Subject: Re: [perfsonar-user] pScheduler Internal Error on a mobile node
- Date: Mon, 16 Oct 2017 15:12:12 +0200
Hi Antoine,
Thank you for getting back to me.
You are correct: when I checked the status of PostgreSQL, it showed as failed, and I was unable to restart the service. I dug around the logs and PostgreSQL files as you suggested and found that "pg_hba.conf" was blank. I copied one from a working device and this resolved all the issues.
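For anyone who hits the same symptom, here is a minimal Python 3 sketch of the check that found the problem: it lists the lines in a pg_hba.conf-style file that are neither blank nor comments, so an empty result means the file carries no access rules at all. The function name `active_hba_rules` is made up for illustration, and the path in the usage comment is only an assumption for a postgresql-9.5 install on CentOS 7; use whatever path your own install has.

```python
def active_hba_rules(path):
    """Return the non-blank, non-comment lines of a pg_hba.conf-style file."""
    rules = []
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            # Skip empty lines and full-line comments.
            if stripped and not stripped.startswith("#"):
                rules.append(stripped)
    return rules

# Example usage (the path is an assumption, adjust it for your system):
# rules = active_hba_rules("/var/lib/pgsql/9.5/data/pg_hba.conf")
# if not rules:
#     print("pg_hba.conf has no active rules")
```

An empty list here matches what I saw on the broken device; a working device should return at least one rule line.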
I am running CentOS7 and I installed from the NetInstall ISO image, so it's quite strange. I've never had this issue before with the ISO image, so I never thought to check the PostgreSQL status or logs.
Thank you for your help though!
Kind regards
Elicia Heera
Network Engineer
On Mon, Oct 16, 2017 at 2:43 PM, Antoine Delvaux <> wrote:
Hello Elicia,
The error messages you see in the logs seem to indicate that PostgreSQL is not running on your machine. Can you try to start/restart it? If you cannot have it running, it would be good to check the logs in /var/lib/pgsql/ for any error.
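As a quick way to confirm the symptom independently of pScheduler, the following is a minimal Python 3 sketch that tries to open a TCP connection to the address and port named in the log messages (127.0.0.1, port 5432). The helper name `port_is_open` and the 2-second timeout are illustrative choices, not part of pScheduler or PostgreSQL.

```python
import socket

def port_is_open(host="127.0.0.1", port=5432, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False

if __name__ == "__main__":
    state = "accepting" if port_is_open() else "refusing"
    print("127.0.0.1:5432 is %s connections" % state)
```

If this reports that connections are refused, the problem is most likely with the PostgreSQL service itself rather than with the pScheduler daemons.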
pscheduler requires postgresql-9.5 to run.
Can you confirm you are running on CentOS7? Did you install from the ISO image or from the bundle packages?
Thanks,
-- --
Antoine Delvaux Systems Engineer
Poznań Supercomputing & Network Center Skype: toninb
GÉANT project Tel: +221.703368313
http://www.geant.org XMPP:
PGP fingerprint: DC65 0D8B 6938 9229 33C3 18CA 4EB6 09D3 A333 3378
> Le 16 oct. 2017 à 11:58, Elicia Heera <> a écrit :
>
> Hi Everyone,
>
> I need some help with the pScheduler server: I am getting a number of errors and am not sure how to resolve them. I recently did a fresh install of perfSONAR on a FIT-PC 3 device. Since this device was deployed to site, the pScheduler process has been giving numerous errors, including Internal error on local pScheduler server when I run pscheduler task idle --duration PT2S.
>
> Under the pscheduler log:
> Oct 16 13:33:38 localhost journal: safe_run/scheduler ERROR Restarting
> Oct 16 13:33:38 localhost journal: safe_run/scheduler ERROR Program threw an exception after 0:00:00.001245
> Oct 16 13:33:38 localhost journal: safe_run/scheduler ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1"$
> Oct 16 13:33:38 localhost journal: safe_run/scheduler ERROR Waiting 19.75 seconds before restarting
> Oct 16 13:33:38 localhost journal: safe_run/runner ERROR Restarting
> Oct 16 13:33:38 localhost journal: safe_run/runner ERROR Program threw an exception after 0:00:00.001344
> Oct 16 13:33:38 localhost journal: safe_run/runner ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1" an$
> Oct 16 13:33:38 localhost journal: safe_run/runner ERROR Waiting 19.75 seconds before restarting
> Oct 16 13:33:38 localhost journal: safe_run/archiver ERROR Restarting
> Oct 16 13:33:38 localhost journal: safe_run/archiver ERROR Program threw an exception after 0:00:00.001308
> Oct 16 13:33:38 localhost journal: safe_run/archiver ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1" $
> Oct 16 13:33:38 localhost journal: safe_run/archiver ERROR Waiting 19.75 seconds before restarting
> Oct 16 13:33:39 localhost journal: safe_run/ticker ERROR Restarting
> Oct 16 13:33:39 localhost journal: safe_run/ticker ERROR Program threw an exception after 0:00:00.005828
> Oct 16 13:33:39 localhost journal: safe_run/ticker ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1" an$
> Oct 16 13:33:39 localhost journal: safe_run/ticker ERROR Waiting 19.75 seconds before restarting
> Oct 16 13:33:58 localhost journal: safe_run/scheduler ERROR Restarting
> Oct 16 13:33:58 localhost journal: safe_run/scheduler ERROR Program threw an exception after 0:00:00.001213
> Oct 16 13:33:58 localhost journal: safe_run/scheduler ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1"$
> Oct 16 13:33:58 localhost journal: safe_run/scheduler ERROR Waiting 20.0 seconds before restarting
> Oct 16 13:33:58 localhost journal: safe_run/runner ERROR Restarting
> Oct 16 13:33:58 localhost journal: safe_run/runner ERROR Program threw an exception after 0:00:00.001320
> Oct 16 13:33:58 localhost journal: safe_run/runner ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1" an$
> Oct 16 13:33:58 localhost journal: safe_run/runner ERROR Waiting 20.0 seconds before restarting
> Oct 16 13:33:58 localhost journal: safe_run/archiver ERROR Restarting
> Oct 16 13:33:58 localhost journal: safe_run/archiver ERROR Program threw an exception after 0:00:00.001256
> Oct 16 13:33:58 localhost journal: safe_run/archiver ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1" $
> Oct 16 13:33:58 localhost journal: safe_run/archiver ERROR Waiting 20.0 seconds before restarting
> Oct 16 13:33:59 localhost journal: safe_run/ticker ERROR Restarting
> Oct 16 13:33:59 localhost journal: safe_run/ticker ERROR Program threw an exception after 0:00:00.005256
> Oct 16 13:33:59 localhost journal: safe_run/ticker ERROR Exception: OperationalError: could not connect to server: Connection refused#012#011Is the server running on host "127.0.0.1" an$
> Oct 16 13:33:59 localhost journal: safe_run/ticker ERROR Waiting 20.0 seconds before restarting
>
> I get these errors when checking the status of any of the pscheduler services. All show active (running):
> pscheduler-runner.service - pScheduler server - runner
> Loaded: loaded (/usr/lib/systemd/system/pscheduler-runner.service; enabled; vendor preset: disabled)
> Active: active (running) since Mon 2017-10-16 13:20:41 SAST; 16min ago
> Main PID: 1254 (runner)
> CGroup: /system.slice/pscheduler-runner.service
> └─1254 /usr/bin/python /usr/libexec/pscheduler/daemons/runner --daemon --pid-file /var/run/pscheduler-runner.pid --dsn @/etc/pscheduler/database/database-dsn
>
> Oct 16 13:36:02 localhost.localdomain runner[1254]: safe_run/runner ERROR Exception: OperationalError: could not connect to server: Connection refused
> Is the server running on host "127.0.0.1" and accepting
> TCP/IP connections on port 5432?...
> Oct 16 13:36:02 localhost.localdomain runner[1254]: safe_run/runner ERROR Waiting 21.5 seconds before restarting
> Oct 16 13:36:23 localhost.localdomain runner[1254]: safe_run/runner ERROR Restarting
> Oct 16 13:36:23 localhost.localdomain runner[1254]: safe_run/runner ERROR Program threw an exception after 0:00:00.001386
> Oct 16 13:36:23 localhost.localdomain runner[1254]: safe_run/runner ERROR Exception: OperationalError: could not connect to server: Connection refused
> Is the server running on host "127.0.0.1" and accepting
> TCP/IP connections on port 5432?...
> Oct 16 13:36:23 localhost.localdomain runner[1254]: safe_run/runner ERROR Waiting 21.75 seconds before restarting
> Oct 16 13:36:45 localhost.localdomain runner[1254]: safe_run/runner ERROR Restarting
> Oct 16 13:36:45 localhost.localdomain runner[1254]: safe_run/runner ERROR Program threw an exception after 0:00:00.000983
> Oct 16 13:36:45 localhost.localdomain runner[1254]: safe_run/runner ERROR Exception: OperationalError: could not connect to server: Connection refused
> Is the server running on host "127.0.0.1" and accepting
> TCP/IP connections on port 5432?...
> Oct 16 13:36:45 localhost.localdomain runner[1254]: safe_run/runner ERROR Waiting 22.0 seconds before restarting
>
>
> I have tried reinstalling the pscheduler server and restarting all the processes with no luck.
>
> Any help would be appreciated.
>
> Sincerely
> Elicia Heera
> Network Engineer
>
>