
perfsonar-user - RE: [perfsonar-user] Puzzling problem with ntp and bwctl

Subject: perfSONAR User Q&A and Other Discussion




  • From: "Garnizov, Ivan (RRZE)" <>
  • To: Casey Russell <>, "" <>
  • Subject: RE: [perfsonar-user] Puzzling problem with ntp and bwctl
  • Date: Fri, 14 Oct 2016 11:31:48 +0000

Hi Casey,

 

I assume no events from the NTP daemon are being reported in the syslog, right?

Still, with each restart you should be getting some messages there.

The only thing that comes to mind is to increase the debug level for the NTP daemon and see whether anything there is making it die.

You also do not mention whether you get a successful “shutting down” message when you restart ntpd.
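
To check the syslog point above, a small filter can pull the ntpd entries out of the system log. This is only a sketch in Python under stated assumptions: the log path /var/log/messages is assumed (the usual default on RHEL/CentOS-style systems, which the /etc/sysconfig paths in this thread suggest), and the function name ntpd_events is made up for illustration.

```python
import re

# Matches the syslog program tag "ntpd:" or "ntpd[PID]:".
NTPD_TAG = re.compile(r"\bntpd(?:\[\d+\])?:")


def ntpd_events(lines):
    """Return only the syslog lines that were logged by ntpd."""
    return [line.rstrip("\n") for line in lines if NTPD_TAG.search(line)]


if __name__ == "__main__":
    # Assumed log location; adjust for the local syslog configuration.
    with open("/var/log/messages", errors="replace") as log:
        for event in ntpd_events(log):
            print(event)
```

Comparing the entries around a restart with those from roughly 8 hours later may show whether the daemon logs anything at the point where BWCTL starts refusing tests.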

 

 

Regards,

Ivan

 

 

 

From: [mailto:] On Behalf Of Casey Russell
Sent: Thursday, 13 October 2016 16:37
To:
Subject: [perfsonar-user] Puzzling problem with ntp and bwctl

 

Group,

 

     I've got a problem with a few of my nodes. I know what event precipitated it; I just don't know exactly what's going on or how to fix it.  Here are the symptoms:

 

1.  I fix the problem and for the next 8 hours or so, things work fine.  

 

2.  After about 8 hours, throughput tests stop working altogether (scheduled or manual).  "ntpstat" indicates that the system is properly synced, but BWCTL refuses to run a test, complaining that ntp is out of sync. 

 

3.  To fix the problem, all I have to do is restart ntpd, and throughput tests work again for about 8 hours.  (For clarity, I've added the -x flag to /etc/sysconfig/ntpd to make sure that restarting the daemon forces a sync.)

 

4.  Even when the node is "working", during a test, BWCTL still displays a warning that:   "NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good."

 

What began this problem is that I installed 10G interfaces in these nodes, which already had two 1G NICs.  I created an appropriate ifcfg-xxx file in /etc/sysconfig/network-scripts and then added a third line to /etc/rc.local to run the mod_interface_route script for this interface.

 

I did this on 3 nodes: 2 of them exhibit the problem, and one has worked just fine for the last 2 weeks.  I can't for the life of me find any differences in the way I implemented them (although clearly something is different).  

 

Below my signature, you'll find some output that displays the problem.  I don't find anything out of place in the regulartesting.log file, but I may be missing something in another relevant file.  I'll be glad to provide more if asked. 

 

Does anyone have insight into what BWCTL looks for in NTPD, how they tie together, and what might be going on here?
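
As background to this question: STA_NANO and STA_UNSYNC are status bits of the kernel clock, reported in the status field that ntp_adjtime() returns, with values defined in the Linux <sys/timex.h> header. The sketch below decodes such a status word in Python. It assumes, without confirmation from this thread, that BWCTL bases its two messages on these kernel flags rather than on a query to ntpd itself; the function names and the message strings are hypothetical.

```python
# Status bits as defined in the Linux <sys/timex.h> header.
STA_PLL = 0x0001     # kernel phase-locked loop updates are enabled
STA_UNSYNC = 0x0040  # kernel considers the clock unsynchronized
STA_NANO = 0x2000    # kernel clock resolution is nanoseconds


def describe_status(status):
    """Decode the timex status word into the flags of interest here."""
    return {
        "pll": bool(status & STA_PLL),
        "unsynchronized": bool(status & STA_UNSYNC),
        "nano": bool(status & STA_NANO),
    }


def bwctl_style_verdict(status):
    """Hypothetical mirror of the warning and the refusal seen in the output.

    Assumption: a missing STA_NANO only produces a warning, while
    STA_UNSYNC makes the test get skipped.
    """
    flags = describe_status(status)
    messages = []
    if not flags["nano"]:
        messages.append("warning: STA_NANO not set")
    if flags["unsynchronized"]:
        messages.append("error: clock unsynchronized")
    return messages


if __name__ == "__main__":
    # Illustrative status words, not values captured from the nodes in this thread.
    for status in (0x2001, 0x0001, 0x0041):
        print(hex(status), describe_status(status), bwctl_style_verdict(status))
```

Under that assumption, a node could look synchronized from the daemon's point of view while the kernel status word still carries STA_UNSYNC or lacks STA_NANO, so comparing the kernel status on a working and a failing node would be one place to look.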

 

Sincerely,

Casey Russell

Network Engineer

KanREN

Phone: 785-856-9809

2029 Becker Drive, Suite 282
Lawrence, Kansas 66047


 

ps-ku-bw is the node with the problem:

 

[crussell@ps-wsu-bw ~]$ bwctl -4 -s ps-bryant-bw.kanren.net -c ps-ku-bw.kanren.net

bwctl: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.

bwctl: NTP is unsynchronized. Skipping test. Use -a to run anyway.

 

[crussell@ps-wsu-bw ~]$ ntpstat

synchronised to NTP server (139.78.97.128) at stratum 2

   time correct to within 19 ms

   polling server every 128 s

 

[crussell@ps-wsu-bw ~]$ sudo service ntpd restart

[sudo] password for crussell:

Shutting down ntpd:                                        [  OK  ]

Starting ntpd:                                             [  OK  ]

 

### 5 minute pause to let NTP sync get down to sub 20ms #####

 

[crussell@ps-wsu-bw ~]$ bwctl -4 -s ps-bryant-bw.kanren.net -c ps-ku-bw.kanren.net

bwctl: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.

bwctl: Using tool: iperf3

bwctl: 17 seconds until test results available

 

SENDER START

Connecting to host 164.113.32.18, port 5827

[ 14] local 164.113.32.10 port 54402 connected to 164.113.32.18 port 5827

[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd

[ 14]   0.00-1.00   sec   119 MBytes  1000 Mbits/sec    0   1.05 MBytes

[ 14]   1.00-2.00   sec   118 MBytes   987 Mbits/sec    0   1.05 MBytes

[ 14]   2.00-3.00   sec   118 MBytes   988 Mbits/sec    0   1.05 MBytes

[ 14]   3.00-4.00   sec   118 MBytes   988 Mbits/sec    0   1.07 MBytes

[ 14]   4.00-5.00   sec   118 MBytes   987 Mbits/sec    0   1.07 MBytes

[ 14]   5.00-6.00   sec   118 MBytes   988 Mbits/sec    0   1.07 MBytes

[ 14]   6.00-7.00   sec   118 MBytes   987 Mbits/sec    0   1.08 MBytes

[ 14]   7.00-8.00   sec   118 MBytes   987 Mbits/sec    0   1.08 MBytes

[ 14]   8.00-9.00   sec   118 MBytes   986 Mbits/sec    0   1.08 MBytes

[ 14]   9.00-10.00  sec   118 MBytes   988 Mbits/sec    0   1.08 MBytes

- - - - - - - - - - - - - - - - - - - - - - - - -

[ ID] Interval           Transfer     Bandwidth       Retr

[ 14]   0.00-10.00  sec  1.15 GBytes   988 Mbits/sec    0             sender

[ 14]   0.00-10.00  sec  1.15 GBytes   987 Mbits/sec                  receiver

 

iperf Done.

 

SENDER END

 



