
perfsonar-user - Re: [perfsonar-user] help with configuration

Subject: perfSONAR User Q&A and Other Discussion




  • From: Petr Šesták
  • To: Mark Feit
  • Subject: Re: [perfsonar-user] help with configuration
  • Date: Thu, 08 Aug 2024 15:53:59 +0200

Thank you very much for your investigation.

If you would like to take a look, here are the archiver warnings from the last 48 hours:

● pscheduler-archiver.service - pScheduler server - archiver
Loaded: loaded (/usr/lib/systemd/system/pscheduler-archiver.service; enabled; preset: disabled)
Active: active (running) since Thu 2024-08-01 10:16:25 CEST; 1 week 0 days ago
Process: 3212225 ExecStartPre=/bin/mkdir -p /var/pscheduler-server/archiver (code=exited, status=0/SUCCESS)
Process: 3212226 ExecStartPre=/bin/chmod 755 /var/pscheduler-server/archiver (code=exited, status=0/SUCCESS)
Process: 3212227 ExecStartPre=/bin/mkdir -p /var/pscheduler-server/archiver/tmp (code=exited, status=0/SUCCESS)
Process: 3212228 ExecStartPre=/bin/chmod 700 /var/pscheduler-server/archiver/tmp (code=exited, status=0/SUCCESS)
Process: 3212229 ExecStartPre=/bin/chown -R pscheduler:pscheduler /var/pscheduler-server/archiver (code=exited, status=0/SUCCESS)
Process: 3212230 ExecStartPre=/bin/sh -c if [ -r /etc/pscheduler/daemons/archiver.conf ]; then opts=$(sed -e 's/#.*$//' /etc/pscheduler/daemons/archiver.conf); echo OPTIONS=$opts > /var/pscheduler-server/archiver/options; chown pscheduler:pscheduler /var/pscheduler-server/archiver/options; fi (code=exited, status=0/SUCCESS)
Main PID: 3212233 (archiver)
Tasks: 2 (limit: 819046)
Memory: 32.5M
CPU: 4min 15.654s
CGroup: /system.slice/pscheduler-archiver.service
└─3212233 /usr/bin/python3 /usr/libexec/pscheduler/daemons/archiver --dsn @/etc/pscheduler/database/database-dsn

Aug 06 23:48:16 ps01-l.farm.particle.cz archiver[3681367]: archiver WARNING 3668: Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/9c5cc719-9a8a-4eca-8b6d-7ea6699348b1/runs/79d7df53-7e82-41c3-a94c-635bbfde6b33 to http: Possibly-permanent archiver error: Program failed to start after 3 tries: Timed out waiting for read
Aug 06 23:48:16 ps01-l.farm.particle.cz archiver[3681367]: archiver WARNING 3668: Gave up archiving https://ps01-l.farm.particle.cz/pscheduler/tasks/9c5cc719-9a8a-4eca-8b6d-7ea6699348b1/runs/79d7df53-7e82-41c3-a94c-635bbfde6b33 to http
Aug 07 09:35:13 ps01-l.farm.particle.cz archiver[4007687]: archiver WARNING 3725: Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/9c5cc719-9a8a-4eca-8b6d-7ea6699348b1/runs/4ce426fc-e3cb-4e33-af19-f74fe19e9044 to http: Possibly-permanent archiver error: Program failed to start after 3 tries: Timed out waiting for read
Aug 07 09:35:13 ps01-l.farm.particle.cz archiver[4007687]: archiver WARNING 3725: Gave up archiving https://ps01-l.farm.particle.cz/pscheduler/tasks/9c5cc719-9a8a-4eca-8b6d-7ea6699348b1/runs/4ce426fc-e3cb-4e33-af19-f74fe19e9044 to http
Aug 07 17:03:16 ps01-l.farm.particle.cz archiver[62923]: archiver WARNING 3777: Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/fe680baf-6ff9-45e0-93aa-4674549be839/runs/52866a05-2be5-4519-8579-9bc6e959a00a to http: Possibly-permanent archiver error: Program failed to start after 3 tries: Timed out waiting for read
Aug 07 17:03:16 ps01-l.farm.particle.cz archiver[62923]: archiver WARNING 3777: Gave up archiving https://ps01-l.farm.particle.cz/pscheduler/tasks/fe680baf-6ff9-45e0-93aa-4674549be839/runs/52866a05-2be5-4519-8579-9bc6e959a00a to http
Aug 08 03:45:18 ps01-l.farm.particle.cz archiver[428097]: archiver WARNING 3840: Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/fe680baf-6ff9-45e0-93aa-4674549be839/runs/1753cb64-e97e-42d7-bfba-1eaac1c57f69 to http: Possibly-permanent archiver error: Program failed to start after 3 tries: Timed out waiting for read
Aug 08 03:45:18 ps01-l.farm.particle.cz archiver[428097]: archiver WARNING 3840: Gave up archiving https://ps01-l.farm.particle.cz/pscheduler/tasks/fe680baf-6ff9-45e0-93aa-4674549be839/runs/1753cb64-e97e-42d7-bfba-1eaac1c57f69 to http
Aug 08 07:06:57 ps01-l.farm.particle.cz archiver[550612]: archiver WARNING 3863: Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/b08ffc1f-01a6-4452-8886-34d246de2adc/runs/86ad9630-fde2-4ab2-8f58-b5b46cfd7b22 to http: Possibly-permanent archiver error: Program failed to start after 3 tries: Timed out waiting for read
Aug 08 07:06:57 ps01-l.farm.particle.cz archiver[550612]: archiver WARNING 3863: Gave up archiving https://ps01-l.farm.particle.cz/pscheduler/tasks/b08ffc1f-01a6-4452-8886-34d246de2adc/runs/86ad9630-fde2-4ab2-8f58-b5b46cfd7b22 to http
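
To see which tasks account for the warnings above, the journal lines can be grouped by task UUID. This is only a minimal sketch: it assumes the log text has been saved from the journal (for example with `journalctl -u pscheduler-archiver`), and the regular expression and function name are illustrative, not part of pScheduler.

```python
import re
from collections import Counter

# Hypothetical pattern for the "Failed to archive" lines shown above.
# "Gave up archiving" lines are deliberately not matched, so each failed
# run is counted once.
FAILED = re.compile(
    r"Failed to archive \S+/pscheduler/tasks/(?P<task>[0-9a-f-]{36})"
    r"/runs/(?P<run>[0-9a-f-]{36}) to (?P<archiver>\w+):"
)


def count_failures_by_task(lines):
    """Return a Counter mapping task UUID -> number of failed archive attempts."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[match.group("task")] += 1
    return counts


# Three lines copied from the log above.
SAMPLE = [
    "Aug 06 23:48:16 ps01-l.farm.particle.cz archiver[3681367]: archiver WARNING 3668: "
    "Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/"
    "9c5cc719-9a8a-4eca-8b6d-7ea6699348b1/runs/79d7df53-7e82-41c3-a94c-635bbfde6b33 "
    "to http: Possibly-permanent archiver error: Program failed to start after 3 tries: "
    "Timed out waiting for read",
    "Aug 06 23:48:16 ps01-l.farm.particle.cz archiver[3681367]: archiver WARNING 3668: "
    "Gave up archiving https://ps01-l.farm.particle.cz/pscheduler/tasks/"
    "9c5cc719-9a8a-4eca-8b6d-7ea6699348b1/runs/79d7df53-7e82-41c3-a94c-635bbfde6b33 to http",
    "Aug 07 17:03:16 ps01-l.farm.particle.cz archiver[62923]: archiver WARNING 3777: "
    "Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/"
    "fe680baf-6ff9-45e0-93aa-4674549be839/runs/52866a05-2be5-4519-8579-9bc6e959a00a "
    "to http: Possibly-permanent archiver error: Program failed to start after 3 tries: "
    "Timed out waiting for read",
]

failures = count_failures_by_task(SAMPLE)
for task, count in failures.most_common():
    print(task, count)
```

Run against the full excerpt, this would show that the five failures come from only three tasks, which may help narrow down what to inspect through the API.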

regards,
Petr Šesták

On 2024-08-07 17:04, Mark Feit wrote:
Petr Šesták writes:

Aug 05 10:26:13 ps01-l.farm.particle.cz runner[2405952]: runner WARNING 39765: Unable to retrieve run https://130.184.244.8/pscheduler/tasks/6c36ec1f-e51a-4d6a-bb9f-540162709dcf/runs/c9b8849a-9e85-4765-b4c7-d6ae4e18cc1e: 400: Operation timed out after 14500 milliseconds with 0 bytes received

The host at the other end (dmz-perf.uark.edu) or the network between
it and your system has problems. I’ll cover that below.

Aug 05 10:27:54 ps01-l.farm.particle.cz scheduler[2398130]: scheduler INFO 23438: Posting non-starting run at 2024-08-05T19:21:56+02:00 for task e68c4cfa-6133-48eb-8ed9-b4cfbddf5a51: Gave up after too many scheduling conflicts.

This happens when the scheduler runs into a conflict on each of its
repeated attempts to schedule a run and elects to stop trying. It
should be a very rare event.

This task was still in your system, so I pulled it out through the API
and determined that the other end is the system at UArk. After
running a couple of tests and other checks, I noticed that its clock
is not synchronized and has drifted by nearly nine minutes:

$ pscheduler clock ps01-l.farm.particle.cz 130.184.244.8
clock 2024-08-07T15:05:34.982705+02:00 synchronized ps01-l.farm.particle.cz
clock 2024-08-07T08:56:50.204645-04:00 unsynchronized 130.184.244.8
difference PT8M44.77806S unsafe

That will make it unable to properly run tests involving multiple
pSchedulers (e.g., throughput). The clock on your system is correctly
synchronized (checked against one of mine). The UArk system also
appears to have some intermittent problems scheduling and running
tests. Based on that, I’d say your system is fine and that you
should disregard any errors that result from tasks involving the
UArk system.
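
The reported difference can be checked by hand from the two timestamps in the `pscheduler clock` output. The sketch below assumes the difference is simply the gap between the two reported times once both are converted to a common time zone; the function name is illustrative and not part of pScheduler.

```python
from datetime import datetime


def clock_offset_seconds(first, second):
    """Return the absolute gap in seconds between two ISO 8601 timestamps.

    Both timestamps carry UTC offsets, so the subtraction is done on
    timezone-aware values and the differing zones cancel out.
    """
    a = datetime.fromisoformat(first)
    b = datetime.fromisoformat(second)
    return abs((a - b).total_seconds())


# Timestamps copied from the pscheduler clock output above.
offset = clock_offset_seconds(
    "2024-08-07T15:05:34.982705+02:00",  # ps01-l.farm.particle.cz
    "2024-08-07T08:56:50.204645-04:00",  # 130.184.244.8
)
minutes, seconds = divmod(offset, 60)
print(f"{int(minutes)} min {seconds:.3f} s")
```

The result is a little under nine minutes, in line with the PT8M44.77806S that pScheduler flagged as unsafe.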

Aug 03 19:50:33 ps01-l.farm.particle.cz archiver[1055237]: archiver WARNING 3204: Failed to archive https://ps01-l.farm.particle.cz/pscheduler/tasks/e24f7abc-21d2-4cb3-88dc-67d30cc38989/runs/74f58e57-178b-4112-9485-13dff578d041 to http: Possibly-permanent archiver error: Program failed to start after 3 tries: Timed out waiting for read

This task is no longer in your system, so I wasn’t able to look at what
it was trying to do. If you can point me at something from the last
48 hours, I can take a look at it.

Hope that helps.

--Mark