
perfsonar-user - Re: [perfsonar-user] Logstash killing CPU on pS-Toolkit node


  • From: Andrew Lake <>
  • To: perfsonar-user <>, Kathy Benninger <>
  • Subject: Re: [perfsonar-user] Logstash killing CPU on pS-Toolkit node
  • Date: Wed, 10 May 2023 07:56:08 -0700

Hi Kathy,

Anything interesting in "/var/log/logstash/logstash-plain.log"? Interesting would include frequent errors or anything like that. It's possible it's doing something unexpected.
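
A quick way to check (a sketch, assuming the default log location) is something like:

  sudo grep -ciE 'error|warn' /var/log/logstash/logstash-plain.log
  sudo grep -iE 'error|warn' /var/log/logstash/logstash-plain.log | tail -n 20

If the log is mostly quiet, the load may just be JVM startup/JIT activity; if it is full of repeated connection or pipeline errors, Logstash may be retrying something (its OpenSearch output, for example) in a tight loop.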

Thanks,
Andy 


On May 9, 2023 at 9:37:00 PM, Kathy Benninger () wrote:

Greetings!

I have a bare-metal server that was up to date and running fine with the
version 4 pS-Toolkit. After the 5.0 upgrade, perfSONAR is reporting high CPU
utilization by logstash. All testing has stopped (actually, pscheduler testing
doesn't run, and I believe I also have to fix the limits.conf issue, though I
never modified the limits file).
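
One way to see what limits and state the running service actually has (a
sketch, assuming the standard "logstash" service/unit name) is:

  # limits applied to the running logstash JVM, if it is up
  cat /proc/$(pgrep -f logstash | head -n 1)/limits

  # service status and recent log messages
  systemctl status logstash
  journalctl -u logstash --since "1 hour ago" | tail -n 50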

The OS is CentOS Linux release 7.9.2009 (Core)

Example utilization:

[benninge@ps100g-10g ~]$ top -H
top - 13:41:53 up  1:55,  2 users,  load average: 3.33, 3.11, 3.10
Threads: 542 total,   6 running, 536 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.4 us,  2.0 sy, 72.1 ni, 25.5 id,  0.0 wa,  0.0 hi, 0.0 si,  0.0 st
KiB Mem : 47884028 total, 19617196 free, 27611884 used,   654948 buff/cache
KiB Swap: 52428796 total, 52428796 free,        0 used. 19800300 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM TIME+ COMMAND
 42145 logstash  39  19 3889836 456264  14888 R 91.4  1.0 0:02.76 java
 42158 logstash  39  19 3889836 456264  14888 R 71.2  1.0 0:02.15 C2 CompilerThre
 42153 logstash  39  19 3889836 456264  14888 R 70.2  1.0 0:02.12 C2 CompilerThre
 42154 logstash  39  19 3889836 456264  14888 R 46.4  1.0 0:01.40 C1 CompilerThre
 42147 logstash  39  19 3889836 456264  14888 R  6.6  1.0 0:00.20 CMS Main Thread
 42146 logstash  39  19 3889836 456264  14888 S  1.7  1.0 0:00.05 GC Thread#0
  7325 benninge  20   0  121236   2792   1452 S  0.7  0.0 0:53.08 htop
 40457 benninge  20   0  160652   2712   1568 R  0.7  0.0 0:01.20 top
 42162 logstash  39  19 3889836 456264  14888 S  0.7  1.0 0:00.02 GC Thread#2
     9 root      20   0       0      0      0 S  0.3  0.0 0:11.79 rcu_sched
  1387 opensea+  20   0   26.6g  23.1g  23432 S  0.3 50.7 0:01.52 G1 Service
  1540 opensea+  20   0   26.6g  23.1g  23432 S  0.3 50.7 0:03.36 pa-collectors-t
  4261 opensea+  20   0   26.6g  23.1g  23432 S  0.3 50.7 0:00.25 opensearch[ps10
  6144 opensea+  20   0   26.6g  23.1g  23432 S  0.3 50.7 0:00.02 opensearch[ps10
  3054 root      20   0  638544  15176   5096 S  0.3  0.0 0:03.51 f2b/f.sshd
 24256 root      20   0  638544  15176   5096 S  0.3  0.0 0:02.45 f2b/observer
  3184 perfson+  20   0  313844  27952   5276 S  0.3  0.1 0:33.51 python3
  7148 benninge  20   0  157500   2552   1188 S  0.3  0.0 0:00.72 sshd
 10376 benninge  20   0  157500   2560   1204 S  0.3  0.0 0:00.35 sshd
 42148 logstash  39  19 3889836 456264  14888 S  0.3  1.0 0:00.01 VM Thread
 42161 logstash  39  19 3889836 456264  14888 S  0.3  1.0 0:00.01 GC Thread#1
 42163 logstash  39  19 3889836 456264  14888 S  0.3  1.0 0:00.01 GC Thread#3
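
Most of the busiest logstash threads above are JVM JIT compiler (C2/C1
CompilerThread) and GC threads rather than pipeline workers. Two sketches for
checking whether the load settles or stays pinned (assuming Logstash's HTTP
monitoring API is on its default port 9600):

  # per-thread CPU for the logstash JVM only
  top -H -p $(pgrep -u logstash -f java | head -n 1)

  # pipeline/event stats from the Logstash monitoring API
  curl -s 'http://localhost:9600/_node/stats/pipelines?pretty' | head -n 60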

I saw that Cas D'Angelo had reported a similar problem on 24-Apr-2023, but I
do not see a resolution. This node was not an archive host, but it was a
standalone host storing its own measurements.

Thanks for any guidance!

Kathy Benninger
Pittsburgh Supercomputing Center



