
perfsonar-user - Re: [perfsonar-user] Maddash WebUI does not detect VM agent down on dashboard when VM is powered off

Subject: perfSONAR User Q&A and Other Discussion

  • From: Andrew Lake <>
  • To: Mike Cruzz <>
  • Cc: perfsonar-user <>
  • Subject: Re: [perfsonar-user] Maddash WebUI does not detect VM agent down on dashboard when VM is powered off
  • Date: Tue, 4 Aug 2020 15:31:04 -0400

Hi,

What problem are you trying to fix? Is it that your MaDDash checks weren't updating? If the check is indeed reporting UNKNOWN, then the box should flip to orange. If that is not happening, then you likely have to wait longer for MaDDash to update, or something else is going on. Whatever it is, updating the script to report 100% loss when there is no data is not the right way to do it; you are just creating work for yourself with no benefit, and it will probably make things worse and misleading.
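
One way to confirm what the check itself reports is to look at its exit code as well as its text. The following is only a minimal sketch: it assumes check_ping_loss.pl follows the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and the helper names status_name and run_check are made up for illustration.

```shell
#!/bin/sh
# Map a Nagios-style plugin exit code to its state name.
# Codes 0-3 are the standard Nagios plugin convention; whether this
# particular plugin follows it is an assumption to verify.
status_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "INVALID($1)" ;;
    esac
}

# Run any check command, let its own output pass through unchanged,
# then print the state that its exit code maps to.
run_check() {
    "$@"
    rc=$?
    echo "exit code $rc => $(status_name "$rc")"
    return "$rc"
}
```

Calling run_check with the same check_ping_loss.pl command line quoted below would append one summary line to the plugin's normal output. If that line says UNKNOWN while the dashboard box stays green, the thing to look at is how often MaDDash re-runs the check, not the check script.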

Thanks,
Andy 

On Tue, Aug 4, 2020 at 11:05 AM Mike Cruzz <> wrote:
Hi Ivan

I have been checking this further and I have a theory of why the alerts don't
come through.

When I run the nagios check_ping_loss.pl script after disconnecting the VM
NIC, I get the following output.

[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS UNKNOWN - Unable to find any tests with data in the given time range where source is 2.2.2.2 and destination is 3.3.3.3

[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS UNKNOWN - Unable to find any tests with data in the given time range where source is 2.2.2.2 and destination is 3.3.3.3

[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS UNKNOWN - Unable to find any tests with data in the given time range where source is 2.2.2.2 and destination is 3.3.3.3


This of course is accurate as there is no data in the esmond archive.

When I re-enable the nic the data starts to populate.


[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS OK - Average loss is 0.00% | Count=1;; Min=0;; Max=0;; Average=0;; Standard_Deviation=0;;

[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS OK - Average loss is 0.00% | Count=1;; Min=0;; Max=0;; Average=0;; Standard_Deviation=0;;

[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS OK - Average loss is 0.00% | Count=1;; Min=0;; Max=0;; Average=0;; Standard_Deviation=0;;

[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS OK - Average loss is 0.00% | Count=1;; Min=0;; Max=0;; Average=0;; Standard_Deviation=0;;

[root@perf7 ~]# /usr/lib64/nagios/plugins/check_ping_loss.pl -u https://1.1.1.1/esmond/perfsonar/archive -s 2.2.2.2 -d 3.3.3.3 -r 100 -c 0.1 -w 0.001 -t 60
Subroutine JSON::PP::Boolean::("" redefined at /usr/share/perl5/overload.pm line 49.
Subroutine JSON::PP::Boolean::(eq redefined at /usr/share/perl5/overload.pm line 49.
PS_CHECK_PING_LOSS OK - Average loss is 0.00% | Count=1;; Min=0;; Max=0;; Average=0;; Standard_Deviation=0;;

I guess I will need to use an additional script as a filter: if it sees
"PS_CHECK_PING_LOSS UNKNOWN" for a 60-second interval, it flags that as 100%
ping loss.

I don't suppose you can share any other elegant way to do this?
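
For illustration only, a filter of this kind could read the state word from the plugin's status line and keep "no data" as its own condition instead of rewriting it as a loss percentage. This is a rough sketch: the function names plugin_state and classify and the labels they print are hypothetical, and it assumes the status line has the form "PS_CHECK_PING_LOSS STATE - message" as in the output above.

```shell
#!/bin/sh
# Extract the state word from one line of plugin output, e.g.
# "PS_CHECK_PING_LOSS UNKNOWN - Unable to find ..." gives "UNKNOWN".
plugin_state() {
    # The state is the second whitespace-separated field of the line.
    printf '%s\n' "$1" | awk '{ print $2 }'
}

# Turn a status line into a label for alerting. The labels are
# made up for this sketch; UNKNOWN stays a distinct "no-data"
# condition and is not converted into a loss figure.
classify() {
    case "$(plugin_state "$1")" in
        OK) echo "ok" ;;
        WARNING|CRITICAL) echo "loss-threshold-exceeded" ;;
        UNKNOWN) echo "no-data" ;;
        *) echo "unrecognized" ;;
    esac
}
```

Passing the UNKNOWN line shown above to classify prints no-data, and passing the OK line prints ok. The Perl "Subroutine ... redefined" warnings are separate lines and would need to be filtered out before the status line is handed to either function.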

Thanks again.

