perfsonar-user - Re: [perfsonar-user] new install problems with test results and adding tests
Subject: perfSONAR User Q&A and Other Discussion
- From: "Fligor, Debbie"
- To: Michael Johnson
- Cc: "Fligor, Debbie"
- Subject: Re: [perfsonar-user] new install problems with test results and adding tests
- Date: Wed, 6 Mar 2019 15:31:50 +0000
Thanks Michael,
Sorry to put so much into one email. The problem with the task file showed
up partway through working on the “why don’t the test results show up”
problem, so I ended up with everything rolled together and wasn’t sure which
things might be inter-related. I’ve provided more details on the 3 issues
below. In short: the updates were on but not working, and now they are on and
working. Removing and replacing the gui-tasks.conf file doesn’t help. And the
graphs not showing up with the older code but working after an upgrade would
be hard to troubleshoot.
> On Mar 5, 2019, at 14:39, Michael Johnson wrote:
>
> Hi Debbie,
>
> That's a lot all at once, I will try to answer all your questions, but let
> me know if I've missed anything. Answers inline below:
>
>
>> On Mar 5, 2019, at 8:00 AM, Fligor, Debbie wrote:
>>
>> Hi everyone,
>>
>> First thing I figured out was that even though the button for auto update
>> was set, it hadn’t done any updates on either system. So I did by-hand yum
>> update and it got everything up to the current versions as best I can tell
>> (psconfig is 4.1.6-1el7). So that’s my first ask - what else do I need to
>> do for auto update to work?
>>
>
> Setting auto-updates via the GUI should work; does it show that
> auto-updates are enabled ("green" switch)?
Yes, they both showed a green switch before and after using yum update to
bring them up to the current version.
>
> Regardless, you can enable/disable auto updates from the commandline. This
> is handled by a service called yum-cron
>
> To view the current status of auto-updates:
> $ sudo systemctl status yum-cron
>
This is on.
● yum-cron.service - Run automatic yum updates as a cron job
Loaded: loaded (/usr/lib/systemd/system/yum-cron.service; enabled; vendor preset: disabled)
Active: active (exited) since Mon 2019-03-04 23:08:46 CST; 21h ago
Process: 12029 ExecStart=/bin/touch /var/lock/subsys/yum-cron (code=exited, status=0/SUCCESS)
Main PID: 12029 (code=exited, status=0/SUCCESS)
CGroup: /system.slice/yum-cron.service
Mar 04 23:08:46 res-perfsonar.techservices.illinois.edu systemd[1]: Starting Run automatic yum updates as a cron job...
Mar 04 23:08:46 res-perfsonar.techservices.illinois.edu systemd[1]: Started Run automatic yum updates as a cron job.
I noticed that the bottom two lines look like they came out of a log file, so
I looked for similar messages:
[root@res-perfsonar log]# grep " yum updates" *
grep: audit: Is a directory
boot.log-20181207: Starting Run automatic yum updates as a cron job...
boot.log-20181207:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20181208: Starting Run automatic yum updates as a cron job...
boot.log-20181208:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20190205: Starting Run automatic yum updates as a cron job...
boot.log-20190205:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20190209: Starting Run automatic yum updates as a cron job...
boot.log-20190209:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20190302: Starting Run automatic yum updates as a cron job...
boot.log-20190302:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20190302: Starting Run automatic yum updates as a cron job...
boot.log-20190302:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20190305: Starting Run automatic yum updates as a cron job...
boot.log-20190305:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20190305: Starting Run automatic yum updates as a cron job...
boot.log-20190305:[ OK ] Started Run automatic yum updates as a cron job.
boot.log-20190305: Starting Run automatic yum updates as a cron job...
boot.log-20190305:[ OK ] Started Run automatic yum updates as a cron job.
grep: cacti: Is a directory
grep: cassandra: Is a directory
grep: chrony: Is a directory
grep: cups: Is a directory
grep: esmond: Is a directory
grep: httpd: Is a directory
messages:Mar 4 10:27:51 res-perfsonar systemd: Starting Run automatic yum
updates as a cron job...
messages:Mar 4 10:27:51 res-perfsonar systemd: Started Run automatic yum
updates as a cron job.
messages:Mar 4 22:13:40 res-perfsonar systemd: Stopping Run automatic yum
updates as a cron job...
messages:Mar 4 22:15:05 res-perfsonar systemd: Starting Run automatic yum
updates as a cron job...
messages:Mar 4 22:15:05 res-perfsonar systemd: Started Run automatic yum
updates as a cron job.
messages:Mar 4 23:08:46 res-perfsonar systemd: Starting Run automatic yum
updates as a cron job...
messages:Mar 4 23:08:46 res-perfsonar systemd: Started Run automatic yum
updates as a cron job.
messages-20190210:Feb 8 09:05:45 res-perfsonar systemd: Starting Run
automatic yum updates as a cron job...
messages-20190210:Feb 8 09:05:45 res-perfsonar systemd: Started Run
automatic yum updates as a cron job.
messages-20190303:Mar 1 13:29:58 res-perfsonar systemd: Stopping Run
automatic yum updates as a cron job...
messages-20190303:Mar 1 13:31:26 res-perfsonar systemd: Starting Run
automatic yum updates as a cron job...
messages-20190303:Mar 1 13:31:26 res-perfsonar systemd: Started Run
automatic yum updates as a cron job.
messages-20190303:Mar 1 14:00:59 res-perfsonar systemd: Starting Run
automatic yum updates as a cron job...
messages-20190303:Mar 1 14:00:59 res-perfsonar systemd: Started Run
automatic yum updates as a cron job.
[root@res-perfsonar log]# ls messages*
messages messages-20190210 messages-20190217 messages-20190224 messages-20190303
As you can see, it appeared to be turning on at every boot. If I grep for
“yum” I get a lot more hits, showing the cron entry that runs yum-hourly, but
the only log entries that show “yum[pid]: Updated” lines are from 3/4, when I
ran it by hand, and 3 entries from 3/6. There is nothing at all in the
earlier logs. It appears the same on both systems.
So it looks like it’s working now. I’m not sure there’s anything left to dig
into here. I will not trust it in the future without checking on it, but I
probably should have been checking on it anyway.
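The check described above (the service is enabled, but did yum ever apply anything?) can be sketched as two small shell functions. This is a minimal sketch, not a perfSONAR tool: the function names are made up for illustration, and it assumes a stock CentOS 7 yum-cron install where the settings live in /etc/yum/yum-cron.conf and yum logs applied updates to syslog as "yum[pid]: Updated: ...". Whether the toolkit changes those defaults is not something this thread confirms.

```shell
# Count syslog lines in which yum reports an applied update, for example
#   "Mar  4 22:10:01 host yum[12345]: Updated: <package>"
# Usage: count_yum_updates FILE...
count_yum_updates() (
    # [[] and []] match a literal "[" and "]"; -h drops file names.
    # Errors such as "Is a directory" are discarded.
    grep -hE 'yum[[][0-9]+[]]: Updated: ' "$@" 2>/dev/null | wc -l | tr -d ' '
)

# Print the value of apply_updates from a yum-cron config file.
# The default path is an assumption based on a stock CentOS 7 install.
# Usage: yum_cron_apply_setting [FILE]
yum_cron_apply_setting() (
    conf=${1:-/etc/yum/yum-cron.conf}
    # Commented-out lines do not match; the last matching line wins.
    sed -n 's/^[[:space:]]*apply_updates[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" | tail -n 1
)
```

Running count_yum_updates /var/log/messages* gives a single number to compare before and after a day of waiting, and yum_cron_apply_setting with no argument shows whether yum-cron is configured to install updates or only to download and report them.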
> To enable auto-updates
> $ sudo systemctl enable yum-cron
>
> Docs here:
> http://docs.perfsonar.net/manage_update.html#managing-automatic-updates-from-the-command-line
>
>>
>> Then I added some ping tests so that I should start seeing results sooner.
>> I started getting errors when I changed tests, and would hit cancel, and
>> try again, and fairly soon on that host clicking on the tests tab got me a
>> spinning load circle that never resolved. So I thought I’d corrupted a
>> database or something on one of the two servers.
>
> FWIW, there is no database backend for the test configs.
Thanks, good to know.
[snipped details]
> Indeed, you should not need to hand-edit gui-tasks.conf. I think two things
> are happening. First, there might be something invalid in that config file.
> The easiest way to fix this is probably to just delete it (don't forget to
> back it up first), and then re-create it, as an empty file.
>
> $ sudo rm -f /var/lib/perfsonar/toolkit/gui-tasks.conf
> $ sudo touch /var/lib/perfsonar/toolkit/gui-tasks.conf
>
>
> And, when you edited the file it may have gotten saved with the wrong
> permissions, at which point the web backend can't edit it. On my system, it
> looks like this:
>
> $ ls -l /var/lib/perfsonar/toolkit/gui-tasks.conf
> -rw-r--r-- 1 perfsonar perfsonar 6867 Sep 28 15:02 /var/lib/perfsonar/toolkit/gui-tasks.conf
>
> Its owner and group are "perfsonar", and it's rw for user, and read-only
> for everyone else. You could restore those permissions like this:
> $ sudo chmod 0644 /var/lib/perfsonar/toolkit/gui-tasks.conf
> $ sudo chown perfsonar:perfsonar /var/lib/perfsonar/toolkit/gui-tasks.conf
>
>
Just to be clear, I had made throughput and latency tests, saved, edited,
etc. with no problem on Friday 3/1, before upgrading. All the problems
happened as I was troubleshooting the lack of graphs, and I think that they
happened after the yum upgrade on Monday 3/4.
When I first moved the file to gui-tasks.conf- and tried to recreate it, the
ownership was wrong (root was owner and group). So I moved the file back,
made a copy for backup, and used vi to edit it and blank it out. After that
it would let me add and save configurations, for a little while.
For example, I just copied my current config to a backup file, and then added
a host as a latency test. It saved with no problem. Then I added the same
host as a ping test, went to save it, and got “Error - Internal Server
Error”. This host was probably already in a ping list, so I dismissed the
error and hit cancel. Everything looked fine from the GUI. Then I chose to
edit an existing test, and just changed the name. I clicked “okay” then
“save” and again got an internal server error. At this point I can either
blank the file, or edit it if I don’t want to have to start from scratch.
I’ve been able to replicate this 100% on both servers, starting from an empty
gui-tasks.conf file. To be sure it was really empty (and vi didn’t leave
anything hidden), I just did it again, starting from touching a file, setting
user/group, and checking permissions:
[root@res-perfsonar toolkit]# mv gui-tasks.conf gui-tasks.conf--
[root@res-perfsonar toolkit]# ls
gui-tasks.conf-- gui-tasks.conf-2019-03-05-2048 gui-tasks.conf-bak
[root@res-perfsonar toolkit]# touch gui-tasks.conf
[root@res-perfsonar toolkit]# chown perfsonar gui-tasks.conf
[root@res-perfsonar toolkit]# chgrp perfsonar gui-tasks.conf
[root@res-perfsonar toolkit]# ls -l
total 24
-rw-r--r-- 1 perfsonar perfsonar 0 Mar 5 20:56 gui-tasks.conf
-rw-r--r-- 1 perfsonar perfsonar 5020 Mar 5 20:55 gui-tasks.conf--
-rw-r--r-- 1 root root 4135 Mar 5 20:48 gui-tasks.conf-2019-03-05-2048
-rw-r--r-- 1 root root 5940 Mar 4 22:20 gui-tasks.conf-bak
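The reset steps in this exchange (back up, blank the file, restore mode and ownership) can be collected into one function so that no step is forgotten. This is a sketch under assumptions: reset_config is a hypothetical helper name, the backup naming scheme is arbitrary, and the perfsonar:perfsonar owner and 0644 mode are the values quoted earlier in the thread. It truncates the file in place instead of removing and re-creating it, so an existing file keeps its owner and group.

```shell
# Back up FILE, then leave an empty FILE with mode 0644 in its place.
# If OWNER (user:group) is given, chown the file as well; that needs root.
# Prints the path of the backup copy when one was made.
# Usage: reset_config FILE [OWNER]
reset_config() (
    file=$1
    owner=${2:-}
    backup="$file.$(date +%Y-%m-%d-%H%M%S).bak"

    # Keep the old contents (and timestamps) before changing anything.
    if [ -e "$file" ]; then
        cp -p "$file" "$backup" || exit 1
    fi

    # Truncate in place (or create), then set permissions explicitly.
    : > "$file" || exit 1
    chmod 0644 "$file" || exit 1
    if [ -n "$owner" ]; then
        chown "$owner" "$file" || exit 1
    fi

    if [ -e "$backup" ]; then
        printf '%s\n' "$backup"
    fi
    exit 0
)
```

Run as root, a call such as reset_config /var/lib/perfsonar/toolkit/gui-tasks.conf perfsonar:perfsonar would reproduce the session above in one step, with the previous contents kept next to the file.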
I added throughput tests and ping tests, and saved it with no problem. Then I
added a latency test, and got a server error. This one is slightly different:
[Tue Mar 05 20:58:57.931578 2019] [cgi:error] [pid 289353] [client
107.152.10.165:1877] AH01215: [Tue Mar 5 20:58:57 2019] regular_testing.cgi:
Can't call method "json" on an undefined value at
/usr/lib/perfsonar/web-ng/root/admin/services/../../../../lib/perfSONAR_PS/NPToolkit/Config/RegularTesting.pm
line 128., referer:
https://res-perfsonar.techservices.illinois.edu/toolkit/auth/admin/tests.cgi
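When reproducing this, it can help to pull only the relevant CGI errors out of the Apache error log right after each failed save. The following is a minimal sketch: cgi_errors is a hypothetical helper, and the log file to pass in is an assumption (the thread does not say which httpd log this line came from).

```shell
# Print Apache "[cgi:error]" lines that mention SCRIPT from the given logs.
# Usage: cgi_errors SCRIPT FILE...
cgi_errors() (
    script=$1
    shift
    # -F treats both patterns as fixed strings, so "[" and "." are literal.
    grep -hF '[cgi:error]' "$@" 2>/dev/null | grep -F "$script"
)
```

For example, cgi_errors regular_testing.cgi followed by the path of the httpd error log would list every failure like the one above, which makes it easier to line up each GUI action with the error it triggered.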
If I dismiss the error and then press “cancel” instead of save, the
gui-tasks.conf file does not appear to revert: it still has the stanza for
the latency test, and saving still fails.
While I’m sure I could have done something that’s messing up the ability to
cancel out of errors, if there’s any use in watching this happen, it’s really
easy to replicate, and I’d be happy to set up a screen share to find out why
it can’t keep adding tests, on the off chance it’s not my install but an
actual problem.
> At that point, hopefully you will be able to create a configuration, and
> save it.
>
>>
>> And once I had that solved, the host that I had deleted and re-installed
>> perfsonar-toolkit on started showing test results (on the dashboard and
>> the esmond archive), but the other host did not. Rebooting didn’t make the
>> tests show up. So on that host I also uninstalled and reinstalled
>> perfsonar-toolkit and now the test results show up. On this one I’m
>> curious as to why this worked, but since it’s working and continued to
>> update all night, I’m not looking for a better way to do it.
>
> My guess is that maybe you didn't wait long enough for the test results to
> show up, and reinstalling/rebooting most likely was just a coincidence.
> Hard to say.
I successfully made and saved tests on Friday, and there were still none
displaying on Monday. It wasn’t until Monday that I upgraded versions and
started having problems saving tests. When I added ping tests (and finally
got them to save) to the host that I had reinstalled on earlier, the first
two showed up almost immediately, but they didn’t show up on the other
server. I gave it more than 5 minutes, ran some of the cassandra and could
see them in the log file for pscheduler. While it could have been
coincidence, it really didn’t seem like it. There’s no point in really going
after this one; there’s nothing I can do to replicate it that I can think
of, short of re-installing the servers from the old image and seeing if it
happens again. I’d rather not spend the time that would take.
>
>> Now I’m moving on to system tuning.
>>
>> One last minor question for anyone that got this far. How do I set the
>> primary (default) interface so it’s what I want? One system is picking the
>> 100G, the other the 1G copper port. On both systems the 100G has the
>> default route for v4 and v6.
>
>
> You can set the primary interface in
> /usr/lib/perfsonar/web-ng/etc/web_admin.conf by setting the
> "primary_interface" directive.
thanks.
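Before setting "primary_interface", it may be worth confirming which interface actually carries the default route on each host. The sketch below parses the output of the standard iproute2 command "ip route show default"; default_route_devs is a hypothetical helper name, and nothing here reads or writes web_admin.conf.

```shell
# Read "ip route show default" style output on stdin and print the
# interface name that follows the "dev" keyword on each default route.
default_route_devs() {
    awk '$1 == "default" { for (i = 2; i < NF; i++) if ($i == "dev") print $(i + 1) }'
}
```

Piping "ip -4 route show default" and then "ip -6 route show default" through default_route_devs prints the v4 and v6 default-route interfaces, which can then be compared with the interface the toolkit picked.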
>
> Hope this helps.
>
> Thanks,
> Michael
>
> Michael Johnson
> GlobalNOC DevOps Engineer
>
>
--
-debbie
Debbie Fligor, n9dn Lead Network Engineer @ Univ. of Il