
perfsonar-user - RE: [perfsonar-user] Now BWCTL issue....

Subject: perfSONAR User Q&A and Other Discussion




  • From: "Hagen, Skye ()" <>
  • To: "Bruce A. Mah" <>, John Mann <>
  • Cc: Amit Kumar <>, Aaron Brown <>, "" <>
  • Subject: RE: [perfsonar-user] Now BWCTL issue....
  • Date: Tue, 1 Apr 2014 04:27:18 +0000
  • Accept-language: en-US

With NTP, one of the better setups is to call ntpdate during system startup
to set the clock, and then run ntpd to keep the clock in sync. If the clock is
wildly out of sync, ntpd will not correct it.
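
To illustrate why the startup step matters, here is a minimal Python sketch of how the reference ntpd reacts to a measured offset. It assumes the commonly documented defaults (a 128 ms step threshold and a 1000 s panic threshold) and the effect of the -g option on the first adjustment; the function and constant names are made up for this example, and the real daemon also applies timers that are ignored here.

```python
# Assumed defaults of the reference ntpd; check your own ntp.conf and version.
STEP_THRESHOLD_S = 0.128    # offsets above this are stepped rather than slewed
PANIC_THRESHOLD_S = 1000.0  # offsets above this make ntpd give up


def ntpd_reaction(offset_s, first_adjustment_with_g=False):
    """Return 'exit', 'step' or 'slew' for a measured clock offset in seconds.

    first_adjustment_with_g models ntpd started with -g, which lets the
    first adjustment exceed the panic threshold. Simplified illustration only.
    """
    magnitude = abs(offset_s)
    if magnitude > PANIC_THRESHOLD_S and not first_adjustment_with_g:
        return "exit"   # ntpd refuses to touch a clock that is this far off
    if magnitude > STEP_THRESHOLD_S:
        return "step"   # set the clock in one jump
    return "slew"       # nudge the clock gradually


print(ntpd_reaction(3600))        # an hour off, no -g
print(ntpd_reaction(3600, True))  # an hour off, first adjustment with -g
print(ntpd_reaction(0.02))        # 20 ms off
```

Setting the clock once at boot (with ntpdate, or by starting ntpd with -g) keeps the offset out of the "exit" range before the daemon takes over.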

I use 5 servers; this protects against one false chimer and allows for
one to be off-line at the same time. Two is worse than one, unless you set up
ntp to prefer one server.

The interesting thing on his first server is the value of 'reach'. This is a
bit map of the last 8 contact attempts, displayed in octal. So, 352 means
that, working from the oldest attempt to the newest, attempts 8, 7, 6, 4 and
2 got a response. Attempts 5, 3 and the last attempt did not get a response.
(That is, assuming I am interpreting my octal correctly. Remember, there are
three kinds of people in the world. Those that are good at math, and those
that are not. :-) ) This would seem to indicate a congested link, or
discards on the path.
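
The octal arithmetic above can be checked with a short Python sketch. The function name decode_reach is made up for this example; the only assumption is the one already stated, that 'reach' is an 8-bit shift register printed in octal with the newest attempt in the lowest bit.

```python
def decode_reach(reach_octal):
    """Decode the ntpq 'reach' column into a list of booleans, oldest first.

    True means the poll got a response. reach_octal is the string shown by
    ntpq -p, for example "352".
    """
    value = int(reach_octal, 8)
    if not 0 <= value <= 0o377:
        raise ValueError("reach is an 8-bit register (0 to 377 octal)")
    # Bit 7 (most significant) is the oldest attempt, bit 0 the newest.
    return [bool((value >> bit) & 1) for bit in range(7, -1, -1)]


history = decode_reach("352")
print("".join("1" if ok else "0" for ok in history))  # prints 11101010
```

Reading the bits left to right (oldest to newest) gives three responses, a miss, a response, a miss, a response, and a miss on the most recent attempt, which matches the description above. A healthy peer shows 377.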

Skye Hagen
Network Engineer
University of Idaho


________________________________________
From: <> on behalf of Bruce A. Mah <>
Sent: Monday, March 31, 2014 5:29 PM
To: John Mann
Cc: Amit Kumar; Aaron Brown;
Subject: Re: [perfsonar-user] Now BWCTL issue....

If memory serves me right, John Mann wrote:
> Hi,
>
> [ CC: list trimmed ]
>
> If memory serves me ... ntp likes to sync to a group of servers that are
> giving about the same time.
> If it can only see 1 source, it can't decide whether that is a truetimer
> or an outlying falseticker.

Well...if there's only one source, and it's valid, ntpd has to use that
one. (One of the hazards of having only one or two time servers.) I
would expect that perfSONAR host to (eventually) sync with that first
server.

Also it's not clear why he couldn't sync with the public timeservers.
Firewall rules / network ACLs maybe?

> https://tools.ietf.org/html/rfc5905#section-11.1
> NMIN, CMIN ...
>
> Suggestions:
> - Wait. Sometimes ntp comes good after 20 mins / several hours.

Yes, depending on how the local ntpd is configured.

> - Add another ntp "server" (that has sync'd time) to the setup
> - e.g. use a router

I'm trying to resist the temptation to dive into NTP configuration
trivia, but having two servers isn't a whole lot better than one,
because if one of them misbehaves, the client can't tell which one to
trust. My usual practice for generic (i.e. non-perfSONAR) hosts, which
mirrors what I understand to be best practice, is to pick either 3 or 5
servers, with the usual considerations for diversity.
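
A toy Python sketch of the "can't tell which one to trust" problem follows. It is not the selection algorithm ntpd actually uses (see RFC 5905 for that); it only groups servers whose offsets agree within a tolerance. The function name, the server names, and the 10 ms tolerance are all hypothetical.

```python
def largest_agreeing_groups(offsets_ms, tolerance_ms=10.0):
    """Return every largest group of servers whose offsets agree.

    offsets_ms maps a server name to its measured offset in milliseconds.
    More than one group in the result means the client has no basis for
    preferring one group over another. Toy illustration only.
    """
    ordered = sorted(offsets_ms.items(), key=lambda item: item[1])
    groups = []
    for i, (_, low) in enumerate(ordered):
        # Collect every server within tolerance_ms above this one.
        groups.append([name for name, off in ordered[i:] if off - low <= tolerance_ms])
    if not groups:
        return []
    biggest = max(len(group) for group in groups)
    return [group for group in groups if len(group) == biggest]


# Two servers that disagree: a tie, so neither can be trusted over the other.
print(largest_agreeing_groups({"a": 0.0, "b": 500.0}))
# Three servers: the two that agree outvote the outlier.
print(largest_agreeing_groups({"a": 0.0, "b": 2.0, "c": 500.0}))
```

With two disagreeing servers the result is two single-server groups; adding a third lets the agreeing pair stand out, which is the intuition behind picking 3 or 5.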

> - "peer" the ntp clients together so that they can have confidence in
> each other and the primary source

Hrm, a bunch of clients that all peer with each other and get time from
a single server isn't really any better than just going to the single
server. If that server loses sync or goes down, the clients are all
going to lose sync too, eventually, unless some of them are configured
to use their local clocks as high-stratum NTP servers (I am not
recommending that step).

I'm pretty sure the original poster didn't want to set up a local NTP
infrastructure; he just wants to use what's available.

> It is a bit of a black art.

Oh it's not *that* bad. I haven't had to sacrifice any goats for
several years now. :-)

Bruce.

> You might end up with a ntp cloud that regains sync if you reboot one
> node, but if you reboot everything all at once it won't re-sync.
>
> Thanks,
> John
>
>
> On 1 April 2014 07:49, Bruce A. Mah wrote:
>
> If memory serves me right, Amit Kumar wrote:
> > Yes Aaron
>
> If I'm reading the ntpq -p output correctly...
>
> >>>      remote           refid      st t when poll reach   delay   offset  jitter
> >>> ==============================================================================
> >>>  10.255.255.3    10.255.255.35    2 u  519 1024  352    1.396   -4.134  12.624
> >>>  chronos.es.net  .INIT.          16 u    - 1024    0    0.000    0.000   0.000
> >>>  nms-rlat.chic.n .INIT.          16 u    - 1024    0    0.000    0.000   0.000
> >>>  nms-rlat.hous.n .INIT.          16 u    - 1024    0    0.000    0.000   0.000
> >>>  nms-rlat.losa.n .INIT.          16 u    - 1024    0    0.000    0.000   0.000
> >>>  nms-rlat.newy32 .INIT.          16 u    - 1024    0    0.000    0.000   0.000
> >>>  saturn.es.net   .INIT.          16 u    - 1024    0    0.000    0.000   0.000
>
> ...it looks like the host in question isn't synched against 10.255.255.3
> (or anything else for that matter) because there's no "*" in front of
> that line...that indicates a host that is the current time source.
>
> Bruce.
>
>
>
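
The check described in the quoted message (no "*" in front of any line means there is no current time source) can be sketched in Python. The function name current_sync_peer is made up for this example, and the sample text is a shortened copy of the ntpq -p output quoted above; the sketch assumes the tally code is the first character of each peer line.

```python
def current_sync_peer(ntpq_output):
    """Return the name of the peer marked '*' in ntpq -p output, or None."""
    for line in ntpq_output.splitlines():
        # The first column of each peer line is the tally code;
        # '*' marks the peer the host is currently synchronized to.
        if line.startswith("*"):
            return line[1:].split()[0]
    return None


SAMPLE = """\
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.255.255.3    10.255.255.35    2 u  519 1024  352    1.396   -4.134  12.624
 chronos.es.net  .INIT.          16 u    - 1024    0    0.000    0.000   0.000
"""

print(current_sync_peer(SAMPLE))  # prints None: no peer is selected
```

On a host that has synchronized, the same function would return the name from the line that begins with "*".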



