perfsonar-user - [perfsonar-user] Re: perfS0NAR Services not running

Subject: perfSONAR User Q&A and Other Discussion

  • From: "Yao,Rong" <>
  • To: "Garnizov, Ivan (RRZE)" <>, "" <>
  • Cc: "Moye,Roger V" <>, "Adams,Andrew M" <>, William J Allen <>
  • Subject: [perfsonar-user] Re: perfS0NAR Services not running
  • Date: Tue, 14 Jun 2016 18:10:33 +0000

Greetings, Ivan,

Thank you for your reply. My colleague and I have been troubleshooting for a while.

I restarted the three services several times; the following messages are from the most recent restart:

/var/log/messages

Jun 14 11:45:02 r1prpps01 ntpdate[22023]: step time server 10.113.39.40 offset -0.014549 sec

Jun 14 11:52:52 r1prpps01 fail2ban.filter[4260]: INFO [sshd] Found 172.18.38.136

Jun 14 11:56:04 r1prpps01 bwctld[3571]: FILE=bwctld.c, LINE=2751, bwctld: exiting...

Jun 14 11:56:04 r1prpps01 bwctld[3571]: FILE=bwctld.c, LINE=2805, bwctld: exited.

Jun 14 11:56:14 r1prpps01 bwctld[22461]: FILE=time.c, LINE=148, NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.

Jun 14 11:56:14 r1prpps01 owampd[22463]: NTP: Status UNSYNC (clock offset issues likely)

Jun 14 11:56:14 r1prpps01 owampd[22463]: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.

Jun 14 11:56:35 r1prpps01 owampd[12594]: FILE=owampd.c, LINE=1848, owampd: exiting...

Jun 14 11:56:35 r1prpps01 owampd[12594]: FILE=owampd.c, LINE=1895, owampd: exited.

Jun 14 11:56:45 r1prpps01 owampd[22504]: FILE=time.c, LINE=112, NTP: Status UNSYNC (clock offset issues likely)

Jun 14 11:56:45 r1prpps01 owampd[22504]: FILE=time.c, LINE=118, NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.

Jun 14 12:00:01 r1prpps01 root: 12:00:01 up 20:22, 1 user, load average: 0.02, 0.05, 0.01

Jun 14 12:00:02 r1prpps01 ntpdate[23154]: step time server 10.111.39.40 offset -0.014596 sec

Jun 14 12:15:01 r1prpps01 root: 12:15:01 up 20:37, 1 user, load average: 0.00, 0.03, 0.00

Jun 14 12:15:02 r1prpps01 ntpdate[23731]: step time server 10.113.39.40 offset -0.014555 sec

Jun 14 12:29:41 r1prpps01 owampd[24263]: FILE=sapi.c, LINE=303, Connection to ([mdaccps01.mdanderson.edu]:861) from ([mdaccps01.mdanderson.edu]:46111)

Jun 14 12:30:01 r1prpps01 root: 12:30:01 up 20:52, 1 user, load average: 0.12, 0.10, 0.03

Jun 14 12:30:02 r1prpps01 ntpdate[24300]: step time server 10.113.39.40 offset -0.014498 sec

Jun 14 12:31:34 r1prpps01 fail2ban.filter[4260]: INFO [sshd] Found 172.18.38.136


/var/log/perfsonar/owamp_bwctl.log

Jun 14 11:56:04 r1prpps01 bwctld[3571]: FILE=bwctld.c, LINE=2751, bwctld: exiting...

Jun 14 11:56:04 r1prpps01 bwctld[3571]: FILE=bwctld.c, LINE=2805, bwctld: exited.

Jun 14 11:56:14 r1prpps01 bwctld[22461]: FILE=time.c, LINE=148, NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.

Jun 14 11:56:35 r1prpps01 owampd[12594]: FILE=owampd.c, LINE=1848, owampd: exiting...

Jun 14 11:56:35 r1prpps01 owampd[12594]: FILE=owampd.c, LINE=1895, owampd: exited.

Jun 14 11:56:45 r1prpps01 owampd[22504]: FILE=time.c, LINE=112, NTP: Status UNSYNC (clock offset issues likely)

Jun 14 11:56:45 r1prpps01 owampd[22504]: FILE=time.c, LINE=118, NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.

Jun 14 12:29:41 r1prpps01 owampd[24263]: FILE=sapi.c, LINE=303, Connection to ([mdaccps01.mdanderson.edu]:861) from ([mdaccps01.mdanderson.edu]:46111)


/var/log/esmond/esmond.log

2016-06-13 15:42:48,838 [INFO] /usr/lib/esmond/esmond/cassandra.py: Schema check done

2016-06-13 15:42:48,843 [INFO] /usr/lib/esmond/esmond/cassandra.py: Connected to ['localhost:9160']

2016-06-13 16:13:13,365 [INFO] /usr/lib/esmond/esmond/cassandra.py: Checking/creating column families

2016-06-13 16:13:13,366 [INFO] /usr/lib/esmond/esmond/cassandra.py: Schema check done

2016-06-13 16:13:13,372 [INFO] /usr/lib/esmond/esmond/cassandra.py: Connected to ['localhost:9160']


/var/log/cassandra/cassandra.log

  INFO 11:57:08,273 Logging initialized

 INFO 11:57:08,297 Loading settings from file:/etc/cassandra/default.conf/cassandra.yaml

 INFO 11:57:08,456 Data files directories: [/var/lib/cassandra/data]

 INFO 11:57:08,457 Commit log directory: /var/lib/cassandra/commitlog

 INFO 11:57:08,457 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap

 INFO 11:57:08,458 disk_failure_policy is stop

 INFO 11:57:08,458 commit_failure_policy is stop

 INFO 11:57:08,461 Global memtable threshold is enabled at 1956MB

 INFO 11:57:08,526 Not using multi-threaded compaction

 INFO 11:57:08,655 Loading settings from file:/etc/cassandra/default.conf/cassandra.yaml

 INFO 11:57:08,665 Loading settings from file:/etc/cassandra/default.conf/cassandra.yaml

 INFO 11:57:08,672 JVM vendor/version: OpenJDK 64-Bit Server VM/1.8.0_91

 WARN 11:57:08,672 OpenJDK is not recommended. Please upgrade to the newest Oracle Java release

 INFO 11:57:08,672 Heap size: 8207532032/8207532032

(There seem to be no errors.)



Regarding NTP, our server currently synchronizes with institutional time servers; I do not use external time servers.
According to Step 3 of the current documentation (http://docs.perfsonar.net/install_centos.html),
the command /usr/lib/perfsonar/scripts/system_environment/enable_ntpd appears to be outdated. Is there no such script?
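
The excerpts above show bwctld and owampd reporting NTP problems while ntpdate periodically steps the clock. As a minimal sketch (not part of perfSONAR; the function name `find_ntp_warnings` and the regular expression are assumptions made for illustration), the NTP-related daemon messages could be pulled out of a syslog-style file like this:

```python
import re

# Matches syslog lines such as:
# "Jun 14 11:56:14 r1prpps01 owampd[22463]: NTP: Status UNSYNC (clock offset issues likely)"
SYSLOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<daemon>[\w.\-]+)\[(?P<pid>\d+)\]:\s+"
    r"(?P<message>.*)$"
)


def find_ntp_warnings(lines):
    """Return (timestamp, daemon, message) for bwctld/owampd lines that mention NTP."""
    hits = []
    for line in lines:
        match = SYSLOG_RE.match(line.strip())
        if not match:
            continue  # not a "daemon[pid]:" style line
        if match.group("daemon") not in ("bwctld", "owampd"):
            continue  # ignore ntpdate, fail2ban, and other sources
        message = match.group("message")
        if "NTP:" in message:
            # Keep the text from "NTP:" onward, dropping any FILE=/LINE= prefix.
            ntp_text = message[message.index("NTP:"):]
            hits.append((match.group("stamp"), match.group("daemon"), ntp_text))
    return hits


# Sample lines copied from the /var/log/messages excerpt above.
sample = [
    "Jun 14 11:56:14 r1prpps01 bwctld[22461]: FILE=time.c, LINE=148, NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.",
    "Jun 14 11:56:14 r1prpps01 owampd[22463]: NTP: Status UNSYNC (clock offset issues likely)",
    "Jun 14 12:00:02 r1prpps01 ntpdate[23154]: step time server 10.111.39.40 offset -0.014596 sec",
]

for stamp, daemon, text in find_ntp_warnings(sample):
    print(stamp, daemon, text)
```

To run it against the real file, pass an open file object, for example `find_ntp_warnings(open("/var/log/messages"))`. The sketch only filters messages; it does not check whether ntpd itself is running.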

Our server is a RHEL 6 box. It has dual network interfaces so that it can be reached from two networks.
I am not sure whether this confuses perfSONAR at some point.

I very much appreciate your input!

Rong

From: "Garnizov, Ivan (RRZE)" <>
Date: Monday, June 13, 2016 at 3:19 AM
To: Rong Yao <>, "" <>
Cc: "Moye,Roger V" <>, "Adams,Andrew M" <>, William J Allen <>
Subject: RE: perfS0NAR Services not running

Hi Rong,

 

The web_admin.log is a very high level log for the web interface of the toolkit.

In order to give us a better picture of the current case, could you please restart:

-          sudo service bwctl-server restart

-          sudo service owamp-server restart

-          sudo service cassandra restart

 

and provide us with the output of the commands above, plus log excerpts starting a little before the restart, from:

-          /var/log/messages

-          /var/log/perfsonar/owamp_bwctl.log

-          /var/log/esmond/esmond.log

-          /var/log/cassandra/cassandra.log

 

Please tell us how you got to this state. Is this a new install or an upgrade? What, in your understanding, led to this state of the toolkit?

 

Please note the above commands are not sufficient to restore the full operation of the toolkit.

 

Regards,

Ivan

 

From: [] On Behalf Of Yao,Rong
Sent: Friday, June 10, 2016 23:23
To:
Cc: Moye,Roger V; Adams,Andrew M; William J Allen
Subject: [perfsonar-user] perfS0NAR Services not running

 

Greetings, 

 

The perfSONAR web interface shows the services bwctl, regular_test, owamp, and esmond as “Not Running”. Please see the attached screenshot.

I looked at web_admin.log and saw these error messages:

 

2016/06/10 15:39:55 (31270) ERROR> BWCTL.pm:646 perfSONAR_PS::NPToolkit::Config::BWCTL::get_port_range - No port range for peer_ports
2016/06/10 15:39:55 (31270) ERROR> BWCTL.pm:646 perfSONAR_PS::NPToolkit::Config::BWCTL::get_port_range - No port range for iperf_ports
2016/06/10 15:39:55 (31270) ERROR> BWCTL.pm:646 perfSONAR_PS::NPToolkit::Config::BWCTL::get_port_range - No port range for iperf3_ports
2016/06/10 15:39:55 (31270) ERROR> BWCTL.pm:646 perfSONAR_PS::NPToolkit::Config::BWCTL::get_port_range - No port range for nuttcp_ports
2016/06/10 15:39:55 (31270) ERROR> BWCTL.pm:646 perfSONAR_PS::NPToolkit::Config::BWCTL::get_port_range - No port range for thrulay_ports
2016/06/10 15:39:55 (31270) ERROR> BWCTL.pm:646 perfSONAR_PS::NPToolkit::Config::BWCTL::get_port_range - No port range for owamp_ports
2016/06/10 15:39:55 (31270) ERROR> BWCTL.pm:646 perfSONAR_PS::NPToolkit::Config::BWCTL::get_port_range - No port range for test_ports
2016/06/10 15:39:57 (31269) ERROR> Host.pm:342 perfSONAR_PS::NPToolkit::DataService::Host::get_details - Unable to find host record in LS using hostname r1prpps01.mdanderson.edu
I do see that those values are defined in /etc/bwctl-server/bwctl-server.conf:
# bwctl control channel
peer_port       6001-6200
# bwctl measurement test ports
test_port       5001-5900
Can anyone please advise what’s wrong here? 
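
For comparison with the error messages, here is a minimal sketch of reading the port-range settings from text in the style of bwctl-server.conf. The function name `read_port_ranges` and the parsing rules are assumptions made for illustration, not the toolkit's own parser. Note that the errors name keys such as `peer_ports` and `test_ports`, while the file shown uses `peer_port` and `test_port`; whether that difference is related to the errors is not established here.

```python
import re


def read_port_ranges(conf_text):
    """Collect '<name>_port <low>-<high>' style settings as {name: (low, high)}."""
    ranges = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip trailing or whole-line comments
        if not line:
            continue  # skip blank and comment-only lines
        match = re.match(r"^(\w+_ports?)\s+(\d+)\s*-\s*(\d+)$", line)
        if match:
            low, high = int(match.group(2)), int(match.group(3))
            if low <= high:
                ranges[match.group(1)] = (low, high)
    return ranges


# The two settings quoted from /etc/bwctl-server/bwctl-server.conf above.
conf = """\
# bwctl control channel
peer_port       6001-6200
# bwctl measurement test ports
test_port       5001-5900
"""

ranges = read_port_ranges(conf)
print(ranges)

# Check which of the key names from the error messages appear verbatim in the file.
for key in ("peer_ports", "test_ports", "iperf_ports"):
    print(key, "defined" if key in ranges else "not defined")
```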
Thanks,
Rong

------------------------

Rong Yao

Research IS & Technology Services

University of Texas MD Anderson Cancer Center

Tel: (713) 563-2687

 
 

The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems.




