perfsonar-user - Re: [perfsonar-user] Puzzling problem with ntp and bwctl
Subject: perfSONAR User Q&A and Other Discussion
- From: Casey Russell <>
- To: "Garnizov, Ivan (RRZE)" <>
- Subject: Re: [perfsonar-user] Puzzling problem with ntp and bwctl
- Date: Fri, 14 Oct 2016 06:55:14 -0500
Hi Casey,
I assume no events are reported in the syslog for the operation of the NTP daemon, right?
Still, each restart should produce some messages there.
The only thing that comes to mind is to increase the debug level of the NTP daemon and see whether something is making it die.
You also don't mention whether you get a successful "shutting down" message when ntpd restarts.
Regards,
Ivan
From: Casey Russell
Sent: Thursday, 13 October 2016 16:37
Subject: [perfsonar-user] Puzzling problem with ntp and bwctl
Group,
I've got a problem with a few of my nodes. I know what event precipitated it; I just don't know exactly what's going on or how to fix it. Here are the symptoms:
1. I fix the problem and for the next 8 hours or so, things work fine.
2. After about 8 hours, throughput tests stop working altogether (scheduled or manual). "ntpstat" indicates that the system is properly synced, but BWCTL refuses to run a test, complaining that ntp is out of sync.
3. To fix the problem, all I have to do is restart ntpd, and throughput tests work again for about 8 hours. (For clarity, I've added the -x flag to /etc/sysconfig/ntpd to make sure that restarting the daemon forces a sync.)
4. Even when the node is "working", BWCTL still displays this warning during a test: "NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good."
What began this problem is that I installed 10G interfaces in these nodes, which already had two 1G NICs. I created an appropriate ifcfg-xxx file in /etc/sysconfig/network-scripts and then added a third line to /etc/rc.local to run the mod_interface_route script for this interface.
I did this on 3 nodes: 2 of them exhibit the problem, and one has worked just fine for the last 2 weeks. I can't for the life of me find any differences in the way I implemented them (although clearly something is different).
Below my signature you'll find some output that displays the problem. I don't find anything out of place in the regulartesting.log file, but I may be missing something in another relevant file. I'll be glad to provide more if asked.
Does anyone have insight into what BWCTL looks for in NTPD, how they tie together, and what might be going on here?
Sincerely,
Casey Russell
Network Engineer
2029 Becker Drive, Suite 282
Lawrence, Kansas 66047
ps-ku-bw is the node with the problem:
[crussell@ps-wsu-bw ~]$ bwctl -4 -s ps-bryant-bw.kanren.net -c ps-ku-bw.kanren.net
bwctl: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.
bwctl: NTP is unsynchronized. Skipping test. Use -a to run anyway.
[crussell@ps-wsu-bw ~]$ ntpstat
synchronised to NTP server (139.78.97.128) at stratum 2
time correct to within 19 ms
polling server every 128 s
[crussell@ps-wsu-bw ~]$ sudo service ntpd restart
[sudo] password for crussell:
Shutting down ntpd: [ OK ]
Starting ntpd: [ OK ]
### 5 minute pause to let NTP sync get down to sub 20ms #####
[crussell@ps-wsu-bw ~]$ bwctl -4 -s ps-bryant-bw.kanren.net -c ps-ku-bw.kanren.net
bwctl: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.
bwctl: Using tool: iperf3
bwctl: 17 seconds until test results available
SENDER START
Connecting to host 164.113.32.18, port 5827
[ 14] local 164.113.32.10 port 54402 connected to 164.113.32.18 port 5827
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 14] 0.00-1.00 sec 119 MBytes 1000 Mbits/sec 0 1.05 MBytes
[ 14] 1.00-2.00 sec 118 MBytes 987 Mbits/sec 0 1.05 MBytes
[ 14] 2.00-3.00 sec 118 MBytes 988 Mbits/sec 0 1.05 MBytes
[ 14] 3.00-4.00 sec 118 MBytes 988 Mbits/sec 0 1.07 MBytes
[ 14] 4.00-5.00 sec 118 MBytes 987 Mbits/sec 0 1.07 MBytes
[ 14] 5.00-6.00 sec 118 MBytes 988 Mbits/sec 0 1.07 MBytes
[ 14] 6.00-7.00 sec 118 MBytes 987 Mbits/sec 0 1.08 MBytes
[ 14] 7.00-8.00 sec 118 MBytes 987 Mbits/sec 0 1.08 MBytes
[ 14] 8.00-9.00 sec 118 MBytes 986 Mbits/sec 0 1.08 MBytes
[ 14] 9.00-10.00 sec 118 MBytes 988 Mbits/sec 0 1.08 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 14] 0.00-10.00 sec 1.15 GBytes 988 Mbits/sec 0 sender
[ 14] 0.00-10.00 sec 1.15 GBytes 987 Mbits/sec receiver
iperf Done.
SENDER END
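A note on the two bwctl messages in the transcript: their wording (STA_NANO, "unsynchronized") matches the names of kernel clock status flags, which suggests bwctl reads the kernel's clock status word rather than asking ntpd the way ntpstat does. That is an assumption; nothing in this thread confirms it. If it holds, the status word printed by ntptime (shipped with the ntp package) is the thing to compare between a working node and a failing one. The sketch below decodes such a status word. The bit values come from the Linux linux/timex.h header; the function name decode_status is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode a kernel clock status word into the flags
# that the bwctl messages mention. Bit values are from <linux/timex.h>.
STA_PLL=$((0x0001))     # kernel PLL discipline enabled
STA_UNSYNC=$((0x0040))  # kernel considers the clock unsynchronized
STA_NANO=$((0x2000))    # nanosecond resolution (otherwise microsecond)

decode_status() {
    # Accepts decimal or 0x-prefixed hex, e.g. 0x2001.
    local status=$(( $1 ))
    local out
    if (( status & STA_UNSYNC )); then out="unsync"; else out="sync"; fi
    if (( status & STA_NANO )); then out="$out nano"; else out="$out micro"; fi
    if (( status & STA_PLL )); then out="$out pll"; else out="$out nopll"; fi
    echo "$out"
}
```

For example, `decode_status 0x2001` prints `sync nano pll`, while `decode_status 0x0041` prints `unsync micro pll`. On a live node, something like `decode_status "$(ntptime | awk '/status/ {print $2; exit}')"` should work, assuming ntptime prints a line of the form `status 0x2001 (PLL,NANO),`. Running it right after an ntpd restart and again once tests start failing would show whether the kernel flags change while ntpstat still reports a sync.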