
perfsonar-user - Re: [perfsonar-user] Psconfig-maddash-agent 5.0.6 can no longer build grids

  • From: "White, Doug" <>
  • To: Andrew Lake <>
  • Cc: "" <>
  • Subject: Re: [perfsonar-user] Psconfig-maddash-agent 5.0.6 can no longer build grids
  • Date: Fri, 1 Dec 2023 07:06:19 +0000

Andrew,
    All my Measurement Hosts are running CentOS 7.9, including the problematic Central Management/MaDDash system. None of the other systems are seeing this anomaly, only the Central Management/MaDDash server. As an example, accessing the “toolkit” webpage on each server provides test results/graphs. However, the “toolkit” webpage on the Central Management/MaDDash server displays an error that it cannot connect to the archive (paraphrasing here).

FYI, Doug.

Sent from my iPhone

On Nov 30, 2023, at 8:25 AM, Andrew Lake <> wrote:


What OS is your logstash instance? It looks like it’s just exiting right away, which is strange. I have a couple of potentially OS-dependent ideas.


On November 29, 2023 at 8:59:48 PM, White, Doug () wrote:

Andrew,
I’ve narrowed my problem down to the Archive having a problem on my Central Management/MaDDash server. Hopefully, you can shed some light on the following log entries that are filling my log.

Thanks, Doug.

[2023-11-29T17:47:17,664][INFO ][logstash.runner          ] Log4j configuration path used is: /etc/logstash/log4j2.properties
[2023-11-29T17:47:17,671][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.17.9", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.18+10 on 11.0.18+10 +indy +jit [linux-x86_64]"}
[2023-11-29T17:47:17,673][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[2023-11-29T17:47:17,697][FATAL][org.logstash.Logstash    ] Logstash stopped processing because of an error: (SystemExit) exit
org.jruby.exceptions.SystemExit: (SystemExit) exit
        at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:747) ~[jruby-complete-9.2.20.1.jar:?]
        at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:710) ~[jruby-complete-9.2.20.1.jar:?]
        at usr.share.logstash.lib.bootstrap.environment.<main>(/usr/share/logstash/lib/bootstrap/environment.rb:94) ~[?:?]
[2023-11-29T17:47:30,618][INFO ][logstash.runner          ] Log4j configuration path used is: /etc/logstash/log4j2.properties
[2023-11-29T17:47:30,625][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.17.9", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.18+10 on 11.0.18+10 +indy +jit [linux-x86_64]"}
[2023-11-29T17:47:30,627][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djdk.io.File.enableADS=true, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -Djruby.regexp.interruptible=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true]
[2023-11-29T17:47:30,651][FATAL][org.logstash.Logstash    ] Logstash stopped processing because of an error: (SystemExit) exit
org.jruby.exceptions.SystemExit: (SystemExit) exit
        at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:747) ~[jruby-complete-9.2.20.1.jar:?]
        at org.jruby.RubyKernel.exit(org/jruby/RubyKernel.java:710) ~[jruby-complete-9.2.20.1.jar:?]
        at usr.share.logstash.lib.bootstrap.environment.<main>(/usr/share/logstash/lib/bootstrap/environment.rb:94) ~[?:?]
[2023-11-29T17:47:43,723][INFO ][logstash.runner          ] Log4j configuration path used is: /etc/logstash/log4j2.properties
[2023-11-29T17:47:43,731][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.17.9", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.18+10 on 11.0.18+10 +indy +jit [linux-x86_64]"}
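A SystemExit this early usually means Logstash bailed out before loading any pipeline, which typically points at a settings, permissions, or config-syntax problem rather than the pipeline itself. A hedged diagnostic sketch, assuming the standard RPM layout that perfSONAR's bundled Logstash uses (the guard lets it run harmlessly on a host without Logstash):

```shell
# Hypothetical diagnostic sketch; paths assume the RPM layout
# (/usr/share/logstash, /etc/logstash) -- adjust if yours differs.
LS_BIN=/usr/share/logstash/bin/logstash
if [ -x "$LS_BIN" ]; then
  # Validate settings and pipeline config without starting the daemon;
  # errors the service log swallows are printed to the terminal here.
  sudo -u logstash "$LS_BIN" --path.settings /etc/logstash --config.test_and_exit
else
  echo "logstash not installed on this host"
fi
```

Running the config test as the `logstash` user (not root) also surfaces file-permission problems that only bite when the service starts.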

On Nov 22, 2023, at 1:20 PM, White, Doug <> wrote:

Andrew,
I’m continuing to work on this, trying various things, but it can definitely wait until next week. So don’t worry about it for now, go have yourself an awesome Thanksgiving, and we’ll connect again next week.

Thanks for all your assistance, Doug.

On Nov 22, 2023, at 11:55 AM, White, Doug <> wrote:

Andrew,
Thank you for your prompt response. The information you provided led me to resolve the immediate problem I was having and now I’m onto the next problem. But before I get to that let me explain more about our setup.

All of our Measurement Hosts get their testing configuration from a Central Management Host and post their test results both locally and to the Central Management Host. This Central Management Host is also hosting MaDDash. On the Measurement Hosts we specify archiving with two separate JSON files in /etc/perfsonar/psconfig/pscheduler/archives.d/. The first is the default file placed there when the software is installed. The second was generated from the output of the “/usr/lib/perfsonar/archive/perfsonar-scripts/psconfig_archive.sh -a none -n isgpscore.ipac.caltech.edu” command. Since the Central Management Host (running MaDDash) does not use a remote archive and only archives locally, we were using the default “http_logstash.json” file.
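For readers following along, the archive definitions referenced above are plain JSON files, and a malformed one can quietly break archiving. A quick sketch for syntax-checking such a directory; the sample file below is purely illustrative (it mirrors the pScheduler “http” archiver shape of an `archiver` key plus a `data` object with `_url` and `op`), not the actual stock http_logstash.json:

```shell
# Create an illustrative archive definition in a temp directory and
# syntax-check every .json file there. The sample content is a sketch,
# not the canonical stock file on a real host.
dir=$(mktemp -d)
cat > "$dir/http_logstash.json" <<'EOF'
{
  "archiver": "http",
  "data": {
    "_url": "https://localhost/logstash",
    "op": "put"
  }
}
EOF
for f in "$dir"/*.json; do
  if python3 -m json.tool "$f" > /dev/null 2>&1; then
    echo "valid JSON: $f"
  else
    echo "invalid JSON: $f"
  fi
done
```

On a real measurement host, point the loop at /etc/perfsonar/psconfig/pscheduler/archives.d/ instead of the temporary directory.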

Now a little bit about our MaDDash configuration. In an effort to have only one file to edit when making configuration changes in our environment, the “maddash.yaml” file is generated by pointing at the central configuration file “/etc/perfsonar/psconfig/pscheduler.d/central-management.json” and applying transforms to get what we need for maddash.yaml. Apparently, this process also uses the /etc/perfsonar/psconfig/archives.d/ directory. I took the information you provided, modified the http_logstash.json file in this directory, and now the psconfig-maddash-agent builds the maddash.yaml file correctly:

# psconfig maddash-stats
Agent Last Run Start Time: 2023/11/22 11:11:50
Agent Last Run End Time: 2023/11/22 11:13:06
Agent Last Run Process ID (PID): 2893
Agent Last Run Log GUID: FD8FF0F2-896A-11EE-955D-AF70775AE083
Total grids managed by agent: 5
From remote definitions: 5
    /etc/perfsonar/psconfig/pscheduler.d/central-management.json: 5

After this modification, our grids have returned in the web UI. However (now onto the next problem), all of our test results within the grids are “Purple” and say “INTERNAL SERVER ERROR” when you hover over them. Is this a case of just needing to wait a couple of hours for tests to populate? I no longer have an Esmond archive on this system: at some point in the past, while auto-updating, this server became corrupted, and I had to rebuild it from scratch this week. In addition, is there anything about what I’ve done here that you would suggest I do differently?

Thanks again for your input. It is greatly appreciated.

Cheers, Doug.

On Nov 22, 2023, at 5:58 AM, Andrew Lake <> wrote:

Hi Doug,

It sounds like you need to change the archiver definition in your central pSConfig JSON template file. I’m guessing it either does not have any archive definitions or it has an “http” archiver defined but does not contain the following to tell MaDDash where to get data in a format it understands:

"_meta": {
    }

See the docs here for more details on generating the archive definition: https://docs.perfsonar.net/multi_ma_install.html#scenario-3-writing-to-a-remote-archive-with-ip-authentication 

If none of the above helps, send me the URL to the pSConfig json (i.e. the one you’d get from “psconfig remote list”) and I’ll see if I notice anything.

Thanks,
Andy


On November 22, 2023 at 5:19:08 AM, White, Doug () wrote:

We’ve been upgrading all of our Measurement Hosts to perfSONAR 5.0.6. In conjunction with this we updated our MaDDash implementation to 5.0.6, but after doing so the psconfig-maddash-agent can no longer successfully build a maddash.yaml file with the grids we had before the upgrade. A psconfig maddash-stats shows 0 grids:

# psconfig maddash-stats
Agent Last Run Start Time: 2023/11/21 17:32:46
Agent Last Run End Time: 2023/11/21 17:32:49
Agent Last Run Process ID (PID): 1772
Agent Last Run Log GUID: 0A619144-88D7-11EE-92C2-9548F7BB3364
Total grids managed by agent: 0

And the psconfig-maddash-agent.log shows warnings and errors we have not seen before:

2023/11/13 15:37:54 INFO pid=12453 prog=main:: line=176 guid=AAAF8468-827D-11EE-B3FC-EFAA0437A9E4 msg=Running agent...
2023/11/13 15:37:54 WARN pid=12453 prog=main::__ANON__ line=123 guid=AAAF8468-827D-11EE-B3FC-EFAA0437A9E4 msg=Warned: Use of uninitialized value $archive_accessor in scalar chomp at /usr/lib/perfsonar/bin/../lib/perfSONAR_PS/PSConfig/MaDDash/Agent.pm line 923.
2023/11/13 15:37:54 WARN pid=12453 prog=main::__ANON__ line=123 guid=AAAF8468-827D-11EE-B3FC-EFAA0437A9E4 msg=Warned: Use of uninitialized value $archive_accessor in substitution (s///) at /usr/lib/perfsonar/bin/../lib/perfSONAR_PS/PSConfig/MaDDash/Agent.pm line 924.
2023/11/13 15:37:54 ERROR pid=12453 prog=perfSONAR_PS::PSConfig::MaDDash::Agent::_build_check line=721 guid=AAAF8468-827D-11EE-B3FC-EFAA0437A9E4 task_name=IPAC - Throughput_10Gig_0_NS config_url=/etc/perfsonar/psconfig/pscheduler.d/central-management.json config_src=remote viz_type=ps-graphs check_type=ps-nagios-throughput grid_name=IPAC - NS Throughput 10Gig - IPERF3 msg=Unable to find suitable archive between isgpscore and isgpspns01. Check the plugin requirements as well as any selectors you may have defined and verify they match your configuration.
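As a quick sanity check when the agent reports “Unable to find suitable archive”, the files it consults can be checked for presence and JSON validity. A sketch using the paths named in this thread (on another host, substitute your own; the loop degrades gracefully when files are absent):

```shell
# Check that the template and archive definitions the
# psconfig-maddash-agent reads are present and parse as JSON.
# Paths are the ones mentioned in this thread.
for f in /etc/perfsonar/psconfig/pscheduler.d/central-management.json \
         /etc/perfsonar/psconfig/archives.d/*.json; do
  if [ -f "$f" ] && python3 -m json.tool "$f" > /dev/null 2>&1; then
    echo "ok: $f"
  else
    echo "missing or invalid: $f"
  fi
done
```

A file that is present but flagged invalid is worth fixing before digging into the agent's archive-selector logic.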

Any input to point me in the right direction would be greatly appreciated.

Thanks, Doug.