
perfsonar-user - Re: [perfsonar-user] Docker testpoint is not downloading the meshconfig file

Subject: perfSONAR User Q&A and Other Discussion

  • From: Johann Hugo <>
  • To: Sowmya Balasubramanian <>
  • Cc: "" <>
  • Subject: Re: [perfsonar-user] Docker testpoint is not downloading the meshconfig file
  • Date: Wed, 8 Jun 2022 16:07:40 +0200

Thanks a lot to everyone who helped me.

After adding proper reverse DNS for my docker testpoint, everything is up and running.

Regards
Johann


On Tue, Jun 7, 2022 at 9:50 PM Sowmya Balasubramanian <> wrote:
Hi Johann,

psconfig relies on reverse dns to match the addresses and configure tests. 

You can also try setting the match address with psconfig agentctl pscheduler match-addresses <host-name>
and then restarting the psconfig agent.
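
A minimal sketch of those two steps, assuming the agent runs under the psconfig-pscheduler-agent systemd unit shown elsewhere in this thread. The wrapper name set_match_address is hypothetical and only groups the two commands; it is not a perfSONAR tool.

```shell
#!/bin/sh
# set_match_address is a hypothetical wrapper, not a perfSONAR command.
# It runs the agentctl command quoted above, then restarts the agent.
set_match_address() {
    if [ -z "$1" ]; then
        echo "usage: set_match_address <host-name>" >&2
        return 1
    fi
    psconfig agentctl pscheduler match-addresses "$1" &&
        systemctl restart psconfig-pscheduler-agent
}

# Usage (inside the container, as root):
#   set_match_address ps-100-100g.perfsonar.ac.za
#   psconfig pscheduler-stats   # then check "Total tasks managed by agent"
```

The hostname in the usage comment is the one from the forward-DNS lookup quoted in this thread; substitute your own.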

Regards,
Sowmya


On Tue, Jun 7, 2022 at 9:36 AM Johann Hugo <> wrote:
Hi Andy

Those directories were missing:
[root@ps-100-100g /]# ls -l /etc/perfsonar/psconfig/
total 12
-rw-r--r-- 1 root root 2082 May 24 18:01 lsregistrationdaemon.conf
-rw-r--r-- 1 root root 1703 Mar 18 13:00 pscheduler-agent-logger.conf
-rw-r--r-- 1 root root  221 Jun  7 16:26 pscheduler-agent.json

Adding them fixed the issue:
[root@ps-100-100g psconfig]# systemctl status psconfig-pscheduler-agent
● psconfig-pscheduler-agent.service - pSConfig PScheduler Agent
   Loaded: loaded (/usr/lib/systemd/system/psconfig-pscheduler-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2022-06-07 17:16:09 SAST; 22s ago

With tcpdump I can see it's fetching the config file, but it's not recognizing any tests from the mesh config file.

[root@ps-100-100g /]# psconfig pscheduler-stats
Agent Last Run Start Time: 2022/06/07 15:18:41
Agent Last Run End Time: 2022/06/07 15:18:47
Agent Last Run Process ID (PID): 397
Agent Last Run Log GUID: 1D292306-E675-11EC-9098-E485E82F1FF7
Total tasks managed by agent: 0

Is reverse DNS required for it to work? Only forward DNS is working at the moment.
I did add the reverse IPs in /etc/hosts, but the address resolves to some Docker names:
[root@ps-100-100g /]# host 155.232.40.194
194.40.232.155.in-addr.arpa domain name pointer testpoint_macvlan.perfsonar-testpoint-docker_LAN1.
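
One way to check whether forward and reverse DNS agree is sketched below. It assumes the `host` utility used above is available in the container; the helper names ptr_names and reverse_matches are hypothetical and not part of perfSONAR.

```shell
#!/bin/sh
# Hypothetical helpers for comparing forward and reverse DNS; not part of perfSONAR.

# Print the PTR target(s) found in `host <ip>` output read on stdin,
# lower-cased and without the trailing dot.
ptr_names() {
    awk '/domain name pointer/ { n = tolower($NF); sub(/\.$/, "", n); print n }'
}

# Succeed if the address ($2) reverse-resolves to the expected hostname ($1).
reverse_matches() {
    expected=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')
    host "$2" | ptr_names | grep -Fqx "$expected"
}

# Usage:
#   reverse_matches ps-100-100g.perfsonar.ac.za 155.232.40.194 \
#       && echo "reverse DNS OK" || echo "reverse DNS mismatch"
```

With the output shown above, the PTR target is a Docker-generated name rather than ps-100-100g.perfsonar.ac.za, so this check would report a mismatch.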

With psconfig translate I can see the tests are listed in the config file (attached). 

I see this log file doesn't exist: 
/var/log/perfsonar/configdaemon.log

Is this file only created with a toolkit install, but not with a testpoint ?

Regards
Johann

On Tue, Jun 7, 2022 at 4:53 PM Andrew Lake <> wrote:
Hi Johann,

Based on the line where the code is failing, I think the problem is likely that the /etc/perfsonar/psconfig directory is missing some subdirectories. It should look like the following:

[root@perfsonar-docker /]# ls -l /etc/perfsonar/psconfig/
total 20
drwxr-xr-x 2 perfsonar perfsonar 4096 Apr  5 20:01 archives.d
-rw-r--r-- 1 perfsonar perfsonar 1703 Apr  4 17:25 pscheduler-agent-logger.conf
-rw-r--r-- 1 perfsonar perfsonar   21 Apr  4 17:25 pscheduler-agent.json
drwxr-xr-x 2 perfsonar perfsonar 4096 Apr  5 20:02 pscheduler.d
drwxr-xr-x 2 perfsonar perfsonar 4096 Apr  5 20:01 transforms.d

I think if you add an empty archives.d, pscheduler.d and transforms.d then you should be good. We may need to update the repo to include those by default.
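
A minimal sketch of that fix, assuming the directories should be owned by the perfsonar user with mode 755 as in the listing above. The function name ensure_psconfig_dirs and its optional base-directory argument are hypothetical, added only so the steps can be tried against a scratch directory first.

```shell
#!/bin/sh
# Create the subdirectories the pSConfig pScheduler agent expects.
# ensure_psconfig_dirs is a hypothetical helper, not a perfSONAR command.
ensure_psconfig_dirs() {
    base="${1:-/etc/perfsonar/psconfig}"
    for d in archives.d pscheduler.d transforms.d; do
        mkdir -p "$base/$d" || return 1
        chmod 755 "$base/$d"
        # Match the ownership in the listing, if that user exists on this system.
        if id perfsonar >/dev/null 2>&1; then
            chown perfsonar:perfsonar "$base/$d" 2>/dev/null || true
        fi
    done
}

# Usage (inside the container, as root):
#   ensure_psconfig_dirs
#   systemctl restart psconfig-pscheduler-agent
```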

Thanks,
Andy

On June 7, 2022 at 10:21:53 AM, Garnizov, Ivan () wrote:

Hello Johann,

 

Please tell us where you fetched this Docker image of the pS Testpoint from.

 

Regards,

Ivan Garnizov

 

GEANT WP6T3: pS development team

GEANT WP7T1: pS deployments GN Operations

GEANT WP9T2: Software governance in GEANT

 

 

 

From: Johann Hugo [mailto:]
Sent: Tuesday, June 7, 2022 3:00 PM
To: Garnizov, Ivan (RRZE) <>
Cc:
Subject: Re: [perfsonar-user] Docker testpoint is not downloading the meshconfig file

 

Hi Ivan

 

The JSON file is not the problem. It's accessible and valid when I verify it by hand:

[root@ps-100-100g /]# psconfig validate http://perf-pwa.sanren.ac.za/pub/config/sanren_mesh_2019.json?format=psconfig
Loading template ...... OK
Validating JSON schema ...... OK
Verifying object references ...... OK
pScheduler Validation (Quick) ...... OK
pSConfig JSON is valid

 

The problem is that the pSConfig PScheduler Agent fails and the config file is never fetched.

[root@ps-100-100g /]# systemctl status psconfig*
● psconfig-pscheduler-agent.service - pSConfig PScheduler Agent
   Loaded: loaded (/usr/lib/systemd/system/psconfig-pscheduler-agent.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2022-06-07 12:09:08 SAST; 2h 44min ago
  Process: 160 ExecStart=/usr/lib/perfsonar/bin/psconfig_pscheduler_agent --config=/etc/perfsonar/psconfig/pscheduler-agent.json --logger=/etc/perfsonar/psconfig/pscheduler-agent-logger.conf --pidfile=/var/run/psconfig-pscheduler-agent.pid --user=perfsonar --group=perfsonar (code=exited, status=0/SUCCESS)
 Main PID: 380 (code=exited, status=25)

Jun 07 12:09:08 ps-100-100g systemd[1]: Starting pSConfig PScheduler Agent...
Jun 07 12:09:08 ps-100-100g systemd[1]: Started pSConfig PScheduler Agent.
Jun 07 12:09:08 ps-100-100g systemd[1]: psconfig-pscheduler-agent.service: main process exited, code=exited, status=25/n/a
Jun 07 12:09:08 ps-100-100g systemd[1]: Unit psconfig-pscheduler-agent.service entered failed state.
Jun 07 12:09:08 ps-100-100g systemd[1]: psconfig-pscheduler-agent.service failed.

 

Regards

Johann

 

On Tue, Jun 7, 2022 at 1:04 PM Garnizov, Ivan <> wrote:

Hi Johann,

 

Please check the current state with these commands:

 

-          curl http://perf-pwa.sanren.ac.za/pub/config/sanren_mesh_2019.json?format=psconfig

-          psconfig validate http://perf-pwa.sanren.ac.za/pub/config/sanren_mesh_2019.json?format=psconfig

 

I believe these should give you an idea of what’s wrong on the system.

 

 

Regards,

Ivan Garnizov

 

GEANT WP6T3: pS development team

GEANT WP7T1: pS deployments GN Operations

GEANT WP9T2: Software governance in GEANT

 

 

 

From: [mailto:] On Behalf Of Johann Hugo
Sent: Tuesday, June 7, 2022 10:48 AM
To:
Subject: [perfsonar-user] Docker testpoint is not downloading the meshconfig file

 

Hi all

 

My Docker testpoint is not downloading its meshconfig file (verified with tcpdump on both the Docker host and on my PWA server).

 

[root@ps-100-100g /]# psconfig remote list
=== pScheduler Agent ===
[
   {
      "bind-address" : "155.232.40.194",
      "url" : "http://perf-pwa.sanren.ac.za/pub/config/sanren_mesh_2019.json?format=psconfig"
   }
]

 

/var/log/perfsonar/psconfig-pscheduler-agent-tasks.log is empty

/var/log/perfsonar/psconfig-pscheduler-agent-transactions.log is empty

/var/log/perfsonar/psconfig-pscheduler-agent.log has the following error:
2022/06/07 07:46:27 ERROR pid=387 prog=main::__ANON__ line=131 msg=Died: /etc/perfsonar/psconfig/pscheduler.d watcher creation failed at /usr/lib/perfsonar/bin/psconfig_pscheduler_agent line 160

No errors in /var/log/messages
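
For a longer agent log, the error messages can be pulled out with a small filter. This is a sketch that assumes every error line follows the same "ERROR ... msg=..." layout as the single line quoted above; the function name agent_errors is hypothetical.

```shell
#!/bin/sh
# Print the msg= portion of ERROR lines from a psconfig agent log read on stdin.
# agent_errors is a hypothetical helper, not a perfSONAR command.
agent_errors() {
    sed -n 's/^.* ERROR .*msg=\(.*\)$/\1/p'
}

# Usage:
#   agent_errors < /var/log/perfsonar/psconfig-pscheduler-agent.log
```

Here the message names /etc/perfsonar/psconfig/pscheduler.d as the path the agent could not watch, which is a hint to check whether that directory exists.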

 

DNS inside the container is working fine and it can resolve perf-pwa.sanren.ac.za

Only forward DNS is configured for the container and it's working fine

[root@ps-100-100g /]# host ps-100-100g.perfsonar.ac.za
ps-100-100g.perfsonar.ac.za has address 155.232.40.194

 

ntpd is running on the docker host. 

lsregistration is working fine

The mesh config file is built on a PWA server. All my other servers + Maddash are happy with the mesh config file

I can download the meshconfig with "psconfig translate http://perf-pwa.sanren.ac.za/pub/config/sanren_mesh_2019.json?format=psconfig" and it looks fine.

 

My server setup:

Ubuntu 20.04.4, with a dual 100g Ethernet interface card

DNS + ntpd is running on the host. 

The container uses MACVLAN networking, on its own dedicated Ethernet adapter and it starts up using docker-compose

The testpoint image is set to image: perfsonar/testpoint:latest

The timezone is configured inside docker-compose.yml

 

Any ideas on how to debug this ?

 

Regards

Johann

 

--

SANReN Engineer

South African National Research Network (SANReN)

National Integrated Cyber Infrastructure System (NICIS)

CSIR NextGen Enterprises and Institutions Cluster

Office: 012 841 2066
Website: www.sanren.ac.za / www.csir.co.za

--
To unsubscribe from this list: https://lists.internet2.edu/sympa/signoff/perfsonar-user



