perfsonar-user - RE: [perfsonar-user] meshconfig-agent-tasks not scheduling tasks regularly
Subject: perfSONAR User Q&A and Other Discussion
- From: "Garnizov, Ivan (RRZE)" <>
- To: Casey Russell <>, "" <>
- Subject: RE: [perfsonar-user] meshconfig-agent-tasks not scheduling tasks regularly
- Date: Mon, 16 Oct 2017 07:23:08 +0000
Hello Casey,

Please share your version of the perfSONAR software.
Are you able to observe a pattern in the issue (timewise)?
Do the systems automatically recover the flow of measurements, or what steps are required for the schedule to recover?
Do you have anything specific in your meshconfig-agent.conf file, or are you using the defaults? More specifically, have you adjusted the interval parameters in the conf file?

Regards,
Ivan

From: [mailto:]
On Behalf Of Casey Russell

Group,

I mentioned it some time back, when I thought it was a problem with my 4 lower-powered hosts running out of CPU, but I've been chasing it ever since and it's hitting my larger hosts as well. Ever since I upgraded to 4.0 several months ago, I've had an issue where, regularly, my hosts stop scheduling tests from the mesh. My dashboard today shows a mess of hosts that failed to schedule tests last night; some of them are on their second (or more) continuous day.

I can't figure out if this is a problem with the mesh config file or on the hosts, although since it's spread everywhere (even a newly installed CentOS7 host), I'm leaning toward some problem in the mesh config file. I'm not sure what to give you that will help, so below you'll find some diagnostic commands from an affected host this morning that is only running bandwidth tests, with none of the latency tests scheduled. Any ideas or help is appreciated.

Since the latency tests were never scheduled, I don't have anything from the API to show you. The mesh config file is at:

[root@ps-ksu-bw crussell]# pscheduler schedule
2017-10-13T09:47:54-05:00 - 2017-10-13T09:48:23-05:00 (Pending)
throughput --duration PT20S --source ps-fhsu-bw.perfsonar.kanren.net --ip-version 4 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 (Run with tool 'iperf3')

2017-10-13T09:49:33-05:00 - 2017-10-13T09:49:52-05:00 (Pending)
throughput --bandwidth 920000000 --duration PT10S --source ps-esu-bw.perfsonar.kanren.net --ip-version 4 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 --udp (Run with tool 'iperf3')

2017-10-13T09:52:08-05:00 - 2017-10-13T09:52:27-05:00 (Pending)
throughput --bandwidth 920000000 --duration PT10S --source ps-bryant-bw.perfsonar.kanren.net --ip-version 6 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 --udp (Run with tool 'iperf3')

2017-10-13T09:58:44-05:00 - 2017-10-13T09:59:03-05:00 (Pending)
throughput --bandwidth 920000000 --duration PT10S --source ps-bryant-bw.perfsonar.kanren.net --ip-version 4 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 --udp (Run with tool 'iperf3')

2017-10-13T10:07:36-05:00 - 2017-10-13T10:08:05-05:00 (Pending)
throughput --duration PT20S --source ps-ku-bw.perfsonar.kanren.net --ip-version 6 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 (Run with tool 'iperf3')

2017-10-13T10:08:38-05:00 - 2017-10-13T10:08:57-05:00 (Pending)
throughput --bandwidth 920000000 --duration PT10S --source ps-ku-bw.perfsonar.kanren.net --ip-version 6 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 --udp (Run with tool 'iperf3')

2017-10-13T10:10:18-05:00 - 2017-10-13T10:10:47-05:00 (Pending)
throughput --duration PT20S --source ps-esu-bw.perfsonar.kanren.net --ip-version 6 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 (Run with tool 'iperf3')

2017-10-13T10:10:49-05:00 - 2017-10-13T10:11:08-05:00 (Pending)
throughput --bandwidth 920000000 --duration PT10S --source ps-esu-bw.perfsonar.kanren.net --ip-version 6 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 --udp (Run with tool 'iperf3')

2017-10-13T10:16:39-05:00 - 2017-10-13T10:17:08-05:00 (Pending)
throughput --duration PT20S --source ps-fhsu-bw.perfsonar.kanren.net --ip-version 6 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 (Run with tool 'iperf3')

2017-10-13T10:36:46-05:00 - 2017-10-13T10:37:15-05:00 (Pending)
throughput --duration PT20S --source ps-esu-bw.perfsonar.kanren.net --ip-version 4 --dest ps-ksu-bw.perfsonar.kanren.net --parallel 1 (Run with tool 'iperf3')

[root@ps-ksu-bw crussell]# service pscheduler-runner status
runner (pid 13073) is running...
[root@ps-ksu-bw crussell]# service pscheduler-ticker status
ticker (pid 13071) is running...
[root@ps-ksu-bw crussell]# service pscheduler-archiver status
archiver (pid 13078) is running...
[root@ps-ksu-bw crussell]# service pscheduler-server status
pscheduler-server: unrecognized service
[root@ps-ksu-bw crussell]# service pscheduler-scheduler status
scheduler (pid 13090) is running...

[root@ps-ksu-bw crussell]# ps -ax | grep pscheduler
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
 3448 pts/0 S+ 0:00 grep pscheduler
 8236 ? Ss 0:17 postgres: pscheduler pscheduler 127.0.0.1(41520) idle
13071 ? Sl 0:42 /usr/bin/python /usr/libexec/pscheduler/daemons/ticker --daemon --pid-file /var/run/pscheduler-ticker.pid --dsn @/etc/pscheduler/database/database-dsn
13073 ? Sl 21:20 /usr/bin/python /usr/libexec/pscheduler/daemons/runner --daemon --pid-file /var/run/pscheduler-runner.pid --dsn @/etc/pscheduler/database/database-dsn
13075 ? Ss 1:20 postgres: pscheduler pscheduler 127.0.0.1(48114) idle
13076 ? Ss 9:40 postgres: pscheduler pscheduler 127.0.0.1(48116) idle
13078 ? S 67:00 /usr/bin/python /usr/libexec/pscheduler/daemons/archiver --daemon --pid-file /var/run/pscheduler-archiver.pid --dsn @/etc/pscheduler/database/database-dsn
13079 ? Ss 360:11 postgres: pscheduler pscheduler 127.0.0.1(48118) idle
13081 ? Ss 8:31 postgres: pscheduler pscheduler 127.0.0.1(48122) idle
13083 ? Ss 0:00 postgres: pscheduler pscheduler 127.0.0.1(48126) idle
13090 ? Sl 65:19 /usr/bin/python /usr/libexec/pscheduler/daemons/scheduler --daemon --pid-file /var/run/pscheduler-scheduler.pid --dsn @/etc/pscheduler/database/database-dsn
13108 ? Ss 115:36 postgres: pscheduler pscheduler 127.0.0.1(48132) idle
13114 ? Ss 0:00 postgres: pscheduler pscheduler 127.0.0.1(48136) idle
28737 ? Ss 0:01 postgres: pscheduler pscheduler 127.0.0.1(55217) idle
[root@ps-ksu-bw crussell]#

[root@ps-ksu-bw crussell]# service perfsonar-meshconfig-agent
usage: /etc/init.d/perfsonar-meshconfig-agent (start|stop|restart|help)
start - start perfSONAR MeshConfig Agent
stop - stop perfSONAR MeshConfig Agent
restart - restart perfSONAR MeshConfig Agent if running by sending a SIGHUP or start if not running
status - Indicates if the service is running
help - this screen

[root@ps-ksu-bw crussell]# service perfsonar-meshconfig-agent restart
/etc/init.d/perfsonar-meshconfig-agent stop: perfSONAR MeshConfig Agent stopped
waiting...
/usr/lib/perfsonar/bin/perfsonar_meshconfig_agent --config=/etc/perfsonar/meshconfig-agent.conf --pidfile=/var/run/perfsonar-meshconfig-agent.pid --logger=/etc/perfsonar/meshconfig-agent-logger.conf --user=perfsonar --group=perfsonar
/etc/init.d/perfsonar-meshconfig-agent start: perfSONAR MeshConfig Agent started

[root@ps-ksu-bw crussell]# tail -n 50 /var/log/perfsonar/meshconfig-agent.log
2017/10/12 20:10:55 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 3 new tasks, and deleted 0 old tasks
2017/10/12 21:10:10 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 1 new tasks, and deleted 0 old tasks
2017/10/13 03:10:37 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 2 new tasks, and deleted 0 old tasks
2017/10/13 04:10:40 (8826) WARN> perfsonar_meshconfig_agent:430 main:: - Problem determining which pscheduler to submit test to for deletion, skipping test throughput/iperf3(ps-ksu-bw.perfsonar.kanren.net->ps-fhsu-bw.perfsonar.kanren.net):
500 Internal Server Error: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or misconfiguration and was unable to complete your request.</p>
<p>Please contact the server administrator at root@localhost to inform them of the time this error occurred, and the actions you performed just before this error.</p>
<p>More information about this error may be available in the server error log.</p>
</body></html>
2017/10/13 07:11:39 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 5 new tasks, and deleted 0 old tasks
2017/10/13 09:20:23 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 97 new tasks, and deleted 0 old tasks
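
For anyone trying to spot the pattern Ivan asks about, it may help to reduce the meshconfig-agent.log excerpt above to a timeline of task additions and warnings. What follows is only a rough sketch, not a perfSONAR tool: the regular expressions are inferred from the handful of log lines quoted in this message, the function names (summarize, addition_gaps) are made up for illustration, and the sample lines are copied from the excerpt. Other agent versions may log in a different format.

```python
import re
from datetime import datetime

# Line layout inferred from the log excerpt in this thread (an assumption):
#   2017/10/13 07:11:39 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - message
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \((?P<pid>\d+)\) "
    r"(?P<level>[A-Z]+)> \S+ \S+ - (?P<msg>.*)$"
)
ADDED_RE = re.compile(r"Added (\d+) new tasks, and deleted (\d+) old tasks")


def summarize(lines):
    """Turn raw log lines into a list of event dicts, skipping continuation lines."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. the HTML body of the 500 error has no timestamp
        counts = ADDED_RE.search(m.group("msg"))
        events.append({
            "time": datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S"),
            "level": m.group("level"),
            "added": int(counts.group(1)) if counts else None,
            "deleted": int(counts.group(2)) if counts else None,
        })
    return events


def addition_gaps(events):
    """Return (time, tasks added, hours since previous addition) per 'Added' event."""
    out, prev = [], None
    for e in events:
        if e["added"] is None:
            continue
        gap = (e["time"] - prev).total_seconds() / 3600 if prev else None
        out.append((e["time"], e["added"], gap))
        prev = e["time"]
    return out


# Sample lines copied from the log excerpt above.
SAMPLE = """\
2017/10/13 03:10:37 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 2 new tasks, and deleted 0 old tasks
2017/10/13 04:10:40 (8826) WARN> perfsonar_meshconfig_agent:430 main:: - Problem determining which pscheduler to submit test to for deletion, skipping test throughput/iperf3(ps-ksu-bw.perfsonar.kanren.net->ps-fhsu-bw.perfsonar.kanren.net):
500 Internal Server Error: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
2017/10/13 07:11:39 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 5 new tasks, and deleted 0 old tasks
2017/10/13 09:20:23 (8826) INFO> perfsonar_meshconfig_agent:438 main:: - Added 97 new tasks, and deleted 0 old tasks
""".splitlines()

if __name__ == "__main__":
    events = summarize(SAMPLE)
    for when, added, gap in addition_gaps(events):
        gap_text = "first entry" if gap is None else f"{gap:.1f} h since previous"
        print(f"{when}  added={added:3d}  ({gap_text})")
    warn_count = sum(1 for e in events if e["level"] == "WARN")
    print(f"warnings: {warn_count}")
```

To run it against a real host, the SAMPLE list could be replaced with the lines read from /var/log/perfsonar/meshconfig-agent.log. A large burst of additions right after a restart, like the 97 tasks in the last line, would stand out in the output.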