
perfsonar-user - Re: [perf-node-users] [perfsonar-user] Regular Testing services showing "Not Running"

Subject: perfSONAR User Q&A and Other Discussion

  • From: Soichi Hayashi <>
  • To: "" <>, Performance Node Users <>
  • Subject: Re: [perf-node-users] [perfsonar-user] Regular Testing services showing "Not Running"
  • Date: Fri, 27 Sep 2013 16:59:52 -0400

I believe the bwcollector and bwmaster processes are already running:

2013-09-27 20:37:29 UTC [root@perfsonar-bw:/opt/perfsonar_ps/traceroute_ma/etc]# ps -ef | grep buoy
497       2257     1  0 19:32 ?        00:00:00 /opt/perfsonar_ps/perfsonarbuoy_ma/bin/bwcollector.pl:master
497       2264     1 99 19:32 ?        01:04:15 /opt/perfsonar_ps/perfsonarbuoy_ma/bin/bwmaster.pl:master

But I restarted the master anyway:

2013-09-27 20:39:39 UTC [root@perfsonar-bw:/opt/perfsonar_ps/traceroute_ma/etc]# /etc/init.d/perfsonarbuoy_bw_master start
perfSONAR-BUOY BWCTL Measurement Service Started

> That may give a clue if they don't start/stop cleanly.
So no clue here?

By the way, the init script said "perfSONAR-BUOY BWCTL Measurement Service Stopped", but I couldn't start it again because of "bwmaster.pl:24644 still running...". I had to run kill -9 on the master and then start it again. Is something wrong with the init script?

Anyway, I still see nothing under "Active" throughput tests after the restart. I've also rebooted the machine, with no change.

> Have you configured regular traceroute tests?  If you haven't, then it will show as not running.
I see traceroute tests listed in the OSG mesh config (http://myosg.grid.iu.edu/pfmesh/json) for perfsonar-lt, and both the perfsonar-bw and -lt instances use this config in their mesh agent config. Do you mean I need to do something else to get this configured?

> There is a longer write up here on lots of other fun steps:
http://psps.perfsonar.net/toolkit/FAQs.html#Q54
Q:My (OWAMP|BWCTL) measurements have stopped, and I notice mysql errors in the logs. What should I do?

So you want me to go through this MySQL troubleshooting? Or do you mean I should read through all the FAQs? By the way, my MySQL DB *did* crash in the past, and I followed similar steps to recover (actually, I think I've rebuilt it from scratch at least once since this happened).

Soichi
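
As an aside on the "still running" problem above, a gentler alternative to going straight to kill -9 might look like the sketch below. It is only an illustration and not part of the perfSONAR init script: the function name stop_by_pidfile and the 10-second default grace period are made up here, and the PID file path has to be passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper (not toolkit code): stop the process named in a PID file.
# Sends TERM first, waits up to GRACE seconds, and only then sends KILL.
# Usage: stop_by_pidfile PIDFILE [GRACE]
stop_by_pidfile() {
    pidfile=$1
    grace=${2:-10}
    [ -r "$pidfile" ] || { echo "no pid file: $pidfile" >&2; return 1; }
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || { echo "empty pid file: $pidfile" >&2; return 1; }
    # If TERM cannot be delivered, the process is gone (or not ours to signal).
    kill -TERM "$pid" 2>/dev/null || return 0
    i=0
    while [ "$i" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0   # exited after TERM
        sleep 1
        i=$((i + 1))
    done
    echo "pid $pid did not exit after TERM, sending KILL" >&2
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

On perfsonar-bw this would be called with the relocated PID file mentioned later in the thread, for example stop_by_pidfile /usr/local/perfsonar/perfsonarbuoy_ma/bwctl/bwmaster.pid, before running the init script's start action.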







On Fri, Sep 27, 2013 at 4:00 PM, Jason Zurawski <> wrote:
Hi Soichi;

To answer your questions:

>> 1) When I go to throughput service page > https://perfsonar-bw.grid.iu.edu/serviceTest/index.cgi?eventType=bwctl
>> I see no entry under "Active Tests" for perfsonar-bw. Do you know how to enable all inactive tests?

Are the bwcollector and bwmaster processes running? If not, start them (or perhaps just restart them completely):

sudo /etc/init.d/perfsonarbuoy_bw_collector restart
sudo /etc/init.d/perfsonarbuoy_bw_master restart

That may give a clue if they don't start/stop cleanly.  Sometimes iperf zombies stick around, and you need to go kill them (a likely problem).  There is a longer write-up here on lots of other fun steps:

http://psps.perfsonar.net/toolkit/FAQs.html#Q54

>> 2) For both -bw and -lt instances, I still see "Traceroute Regular Testing" showing "Not Running". Do I need to worry?

Have you configured regular traceroute tests?  If you haven't, then it will show as not running. If you have, you may need to follow steps similar to those in the link I sent above (just for the traceroute tools).

Thanks;

-jason

On Sep 27, 2013, at 3:45 PM, Soichi Hayashi <> wrote:

> I see. I did the following:
>
> 2013-09-27 19:31:07 UTC [root@perfsonar-bw:/var/lib]# ls -la perfsonar
> lrwxrwxrwx 1 root root 20 Sep 27 19:31 perfsonar -> /usr/local/perfsonar
> (on both perfsonar-bw and perfsonar-lt)
>
> I reverted the config change, rebooted them, and now I see "Running" next to perfSONAR-BUOY Regular Testing.
>
> I still have 2 issues, however:
>
> 1) When I go to throughput service page > https://perfsonar-bw.grid.iu.edu/serviceTest/index.cgi?eventType=bwctl
> I see no entry under "Active Tests" for perfsonar-bw. Do you know how to enable all inactive tests?
>
> 2) For both -bw and -lt instances, I still see "Traceroute Regular Testing" showing "Not Running". Do I need to worry?
>
> Soichi
>
>
> On Fri, Sep 27, 2013 at 3:27 PM, Jason Zurawski <> wrote:
> Hi Soichi;
>
> Correct, if you changed the defaults the web page is unlikely to work.  There are two options:
>
>  - Edit the web page to point to the new locations. On a regular pS Performance Toolkit it is located here (this may be different for your custom setup): /opt/perfsonar_ps/toolkit/web/root/gui/services/index.cgi
>
>  - Symlink the /var locations to the /usr/local locations that you are using.
>
> Thanks;
>
> -jason
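
The symlink option above could be sketched as follows. This is an assumption-laden illustration, not a documented toolkit procedure: the function name link_data_dir and the ".orig" backup suffix are invented here, and the services would presumably need to be stopped before the directory is moved.

```shell
#!/bin/sh
# Hypothetical helper (not toolkit code): make LINK a symlink to TARGET.
# An existing symlink at LINK is replaced; an existing file or directory
# is moved aside to LINK.orig.<shell pid> instead of being deleted.
# Usage: link_data_dir TARGET LINK
link_data_dir() {
    target=$1   # where the data actually lives
    link=$2     # default path the web page looks at
    [ -d "$target" ] || { echo "target missing: $target" >&2; return 1; }
    if [ -L "$link" ]; then
        rm -- "$link"
    elif [ -e "$link" ]; then
        mv -- "$link" "$link.orig.$$"
    fi
    ln -s -- "$target" "$link"
}
```

With the paths used in this thread, the call would be link_data_dir /usr/local/perfsonar /var/lib/perfsonar, run as root on both hosts.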
>
> On Sep 27, 2013, at 3:18 PM, Soichi Hayashi <> wrote:
>
> > Jason,
> >
> > I have the following in /opt/perfsonar_ps/perfsonarbuoy_ma/etc
> > (perfsonar-bw)
> > > BWDataDir       /usr/local/perfsonar/perfsonarbuoy_ma/bwctl
> >
> > (perfsonar-lt)
> > > OWPDataDir      /usr/local/perfsonar/perfsonarbuoy_ma/owamp
> >
> > These reconfigurations were needed to prevent perfSONAR from running out of disk space on the default /var partition.
> >
> > The config creates the PID file in the following location (for perfsonar-bw)
> > > /usr/local/perfsonar/perfsonarbuoy_ma/bwctl/bwmaster.pid
> > which contains the correct PID:
> >
> > 2013-09-27 19:11:47 UTC [root@perfsonar-bw:/usr/local/perfsonar/perfsonarbuoy_ma/bwctl]# cat bwmaster.pid
> > 1850
> > 2013-09-27 19:11:50 UTC [root@perfsonar-bw:/usr/local/perfsonar/perfsonarbuoy_ma/bwctl]# ps -ef | grep 1850
> > 497       1850     1 99 17:44 ?        01:27:31 /opt/perfsonar_ps/perfsonarbuoy_ma/bin/bwmaster.pl:master
> >
> > I am guessing that the web interface is using the default location to look for the PID file, and incorrectly determining that the processes are not running?
> >
> > Thanks!
> > Soichi
> >
> >
> > On Fri, Sep 27, 2013 at 12:09 PM, Jason Zurawski <> wrote:
> > Hi Soichi;
> >
> > The local services page is designed to look for PID files and process names for the various services.  In the case of Latency testing:
> >
> > > my $pSB_owamp_master_pid        = "/var/lib/perfsonar/perfsonarbuoy_ma/owamp/powmaster.pid";
> > > my $pSB_owamp_master_pname      = "powmaster";
> > > my $pSB_owamp_collector_pid     = "/var/lib/perfsonar/perfsonarbuoy_ma/owamp/upload/powcollector.pid";
> > > my $pSB_owamp_collector_pname   = "powcollector";
> >
> > And in the case of Bandwidth testing:
> >
> > > my $pSB_bwctl_master_pid    = "/var/lib/perfsonar/perfsonarbuoy_ma/bwctl/bwmaster.pid";
> > > my $pSB_bwctl_master_pname      = "bwmaster";
> > > my $pSB_bwctl_collector_pid     = "/var/lib/perfsonar/perfsonarbuoy_ma/bwctl/upload/bwcollector.pid";
> > > my $pSB_bwctl_collector_pname   = "bwcollector";
> >
> > The check will report 'not running' if there is no PID file, or if the process with the PID from that file doesn't match the expected process name.  That would be the first place to look (and naturally the old fashioned 'reboot' has been known to fix this).
> >
> > With regard to your Bandwidth test machine - I don't see any tests in an active state via the results page, so that may be additional troubleshooting you will want to do.
> >
> > Thanks;
> >
> > -jason
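
The check described in the message above (a PID file must exist, and the process with that PID must match the expected name) can be approximated in shell as below. This is a rough sketch and not the toolkit's actual Perl code in index.cgi: the function name check_service is hypothetical, and reading /proc/PID/cmdline is a Linux-specific assumption.

```shell
#!/bin/sh
# Hypothetical helper (not toolkit code): report "Running" only if PIDFILE
# exists, holds a numeric PID, and that process's command line contains PNAME.
# Usage: check_service PIDFILE PNAME
check_service() {
    pidfile=$1
    pname=$2
    [ -r "$pidfile" ] || { echo "Not Running"; return 1; }
    pid=$(cat "$pidfile")
    # Reject an empty or non-numeric PID file.
    case $pid in
        ''|*[!0-9]*) echo "Not Running"; return 1 ;;
    esac
    # No /proc entry means no process with that PID (Linux only).
    [ -r "/proc/$pid/cmdline" ] || { echo "Not Running"; return 1; }
    # cmdline is NUL-separated; turn it into one line before matching.
    if tr '\0' ' ' < "/proc/$pid/cmdline" | grep -q -- "$pname"; then
        echo "Running"
    else
        echo "Not Running"
        return 1
    fi
}
```

Running check_service against the default path (/var/lib/perfsonar/perfsonarbuoy_ma/bwctl/bwmaster.pid) and then against the relocated one, both with the name bwmaster, is a quick way to see whether the page is simply looking in the wrong directory.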
> >
> > On Sep 27, 2013, at 11:38 AM, Soichi Hayashi <> wrote:
> >
> > > Hello.
> > >
> > > I have the following perfSONAR instances (installed via RPM):
> > >
> > > > http://perfsonar-lt.grid.iu.edu/toolkit/
> > > > http://perfsonar-bw.grid.iu.edu/toolkit/
> > >
> > > For -lt, I see the following:
> > > perfSONAR-BUOY Regular Testing (One-Way Latency)[1]   Not Running
> > > And on the -bw instance, I see the following:
> > > perfSONAR-BUOY Regular Testing (Throughput)[1]        Not Running
> > > I am not sure if these messages are a red herring, but these services are enabled (via the UI) and the collectors are running, for example on -lt:
> > >
> > > 2013-09-27 15:36:49 UTC [root@perfsonar-lt:/var/log/perfsonar]# ps -ef | grep colle
> > > 497        485   511  3 15:01 ?        00:01:24 /opt/perfsonar_ps/perfsonarbuoy_ma/bin/powcollector.pl:handle_req[129.79.53.52]
> > > 497        511     1  0 01:06 ?        00:00:00 /opt/perfsonar_ps/perfsonarbuoy_ma/bin/powcollector.pl:master
> > >
> > > How can I troubleshoot this issue?
> > > Soichi



