
perfsonar-user - Re: [perfsonar-user] perfSonar 3.3rc4 stuck in bwctl restart loop

Subject: perfSONAR User Q&A and Other Discussion

  • From: Brian Tierney
  • To: Peter van Heusden
  • Subject: Re: [perfsonar-user] perfSonar 3.3rc4 stuck in bwctl restart loop
  • Date: Thu, 23 May 2013 18:06:38 -0700


Good catch! I'll add this to the issue tracker so it gets fixed.


On May 23, 2013, at 2:07 PM, Peter van Heusden wrote:

> Ok, I found the cause of the problem!
>
> Somehow the web user interface is setting the TOS Bits field of the
> bandwidth test I had set up to "NaN". This then leads to the following
> lines (around line 1243) of bwmaster.pl triggering and adding a "-S
> NaN" to the bwctl command line. Then bwctl of course fails and is
> restarted... etc etc.
>
> push @cmd, ( "-S", $val ) if (
>     $val = $conf->get_val(
>         TESTSPEC => $ms->{'TESTSPEC'},
>         ATTR     => 'BWTosBits'
>     )
> );
>
> I manually removed the BWTosBits entry in
> /opt/perfsonar_ps/perfsonarbuoy_ma/etc/owmesh.conf and the restart loop has
> now stopped.
>
> Thanks,
> Peter
>
> On 23/05/2013 22:48, Peter van Heusden wrote:
>> Yes, those work fine. E.g.:
>>
>> [root@ps sysconfig]# bwctl -c 192.168.2.132 -s 192.168.2.104
>> bwctl: Using tool: iperf
>> bwctl: 15 seconds until test results available
>>
>> RECEIVER START
>> bwctl: exec_line: iperf -B ps2.sanbi.ac.za -s -f b -m -p 5176 -t 10
>> bwctl: start_tool: 3578330236.026986
>> ------------------------------------------------------------
>> Server listening on TCP port 5176
>> Binding to local address ps2.sanbi.ac.za
>> TCP window size: 87380 Byte (default)
>> ------------------------------------------------------------
>> [ 15] local 192.168.2.132 port 5176 connected with 192.168.2.104 port 5176
>> [ ID] Interval Transfer Bandwidth
>> [ 15] 0.0-10.0 sec 530448384 Bytes 422283774 bits/sec
>> [ 15] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
>> bwctl: stop_exec: 3578330249.065921
>>
>> RECEIVER END
>>
>> Is there any other log file that I could be looking in? The messages in
>> /var/log/messages don't explain *why* bwctl is being restarted, also there
>> is a bwctl process running the entire time, so I'm not sure *what* is
>> being restarted!
>>
>> Thanks!
>> Peter
>> On 23/05/2013 22:28, Aaron Brown wrote:
>>> Hey Peter,
>>>
>>> Can you run bwctl tests by hand?
>>>
>>> Cheers,
>>> Aaron
>>>
>>> On May 23, 2013, at 4:27 PM, Peter van Heusden wrote:
>>>
>>>> I made that change, and added tock.meraka.csir.co.za - a stratum 1
>>>> server that is about 1000 miles away. ntpq -p now shows:
>>>>
>>>>      remote           refid      st t when poll reach   delay   offset  jitter
>>>> ==============================================================================
>>>> *zibbi.meraka.cs 238.72.153.243   2 u   40   64    1   72.996   -3.658   6.944
>>>> +firewall.sanbi. 41.73.38.11      3 u   22   64    7    0.070   -4.745   0.173
>>>> -ntp0.za.uu.net  216.171.120.36   3 u   20   64    7    5.677    0.208  15.795
>>>> -ntp2.is.co.za   146.64.58.41     2 u   21   64    7    5.525   -9.424  16.250
>>>> +tock.meraka.csi .PPS.            1 u    -   64   15   73.582   -3.607  21.681
>>>>
>>>>
>>>> and bwmaster.pl is still restarting.
>>>>
>>>> :(
>>>>
>>>> On 23/05/2013 21:46, Pedro Queirós wrote:
>>>>> Try this:
>>>>>
>>>>> server ntp1.meraka.csir.co.za iburst minpoll 4 maxpoll 6
>>>>> server ntp.sanbi.ac.za iburst minpoll 4 maxpoll 6
>>>>> server ntp0.za.uu.net iburst minpoll 4 maxpoll 6
>>>>> server ntp2.is.co.za iburst minpoll 4 maxpoll 6
>>>>>
>>>>> Aside from that, it looks good. If possible, try to have access
>>>>> to a good (e.g. low-delay) NTP stratum 1 server near your network.
>>>>>
>>>>> If bwmaster.pl continues restarting, I'd suggest looking into
>>>>> something else - let us know about it!
>>>>>
>>>>> Ah, don't forget to restart ntpd after changing the config file!
>>>>>
>>>>> Pedro
>>>>>
>>>>>
>>>>> On Thu, May 23, 2013 at 8:31 PM, Peter van Heusden wrote:
>>>>> logfile /var/log/ntpd
>>>>> driftfile /var/lib/ntp/ntp.drift
>>>>> statsdir /var/lib/ntp/
>>>>> statistics loopstats peerstats clockstats
>>>>> filegen loopstats file loopstats type day enable
>>>>> filegen peerstats file peerstats type day enable
>>>>> filegen clockstats file clockstats type day enable
>>>>>
>>>>> # You should have at least 4 NTP servers
>>>>>
>>>>> server ntp1.meraka.csir.co.za iburst
>>>>> server ntp.sanbi.ac.za iburst
>>>>> server ntp0.za.uu.net iburst
>>>>> server chronos.es.net iburst
>>>>> server ntp2.is.co.za iburst
>>>>>
>>>>> thanks!
>>>>>
>>>>>
>>>>> On 23/05/2013 21:17, Pedro Queirós wrote:
>>>>>> Peter, from the ntpq -p output I can see your NTP config is faulty.
>>>>>>
>>>>>> Can you provide the /etc/ntp.conf file?
>>>>>>
>>>>>> Kind Regards,
>>>>>> Pedro
>>>>>>
>>>>>>
>>>>>> On Thu, May 23, 2013 at 8:04 PM, Jason Zurawski wrote:
>>>>>> While not true in this case that you sent - static lists fall into
>>>>>> disrepair frequently. A couple of months back we tried this in the
>>>>>> APAN region and found about 1/5 still worked well after being posted
>>>>>> to an 'official' wiki.
>>>>>>
>>>>>> It doesn't hurt to ask those on the front lines that are adopting pS
>>>>>> for some insider info - that request applies to all, even if they are
>>>>>> not in South Africa.
>>>>>>
>>>>>> Thanks;
>>>>>>
>>>>>> -jason
>>>>>>
>>>>>> On May 23, 2013, at 11:58 AM, Michael Sinatra wrote:
>>>>>>
>>>>>> > On 05/23/2013 11:44, Jason Zurawski wrote:
>>>>>> >> Hey Peter;
>>>>>> >>
>>>>>> >> I would nuke the ESnet server too - NTP works best when the servers
>>>>>> >> are within the same timezone/continent, it gives the algorithms a
>>>>>> >> level playing field to choose from.
>>>>>> >
>>>>>> > It's not really the case that a server outside of the timezone or
>>>>>> > continent matters. The key is whether it's more likely than not
>>>>>> > that there will be asymmetry in the path to/from a given NTP server.
>>>>>> > I don't really see much evidence of that with any of the
>>>>>> > servers in the list, except for
>>>>>> > possibly ntp.mtnbusiness....
>>>>>> >
>>>>>> >
>>>>>> >> A related note is that we are always looking to add new servers
>>>>>> >> into our list around the world - if you know of 'open' clocks in
>>>>>> >> the region that we can front-load into the list, that can be
>>>>>> >> arranged.
>>>>>> >
>>>>>> > Well, there's a published list of them here:
>>>>>> >
>>>>>> > http://support.ntp.org/bin/view/Servers/WebHome
>>>>>> >
>>>>>> > :)
>>>>>> >
>>>>>> > michael
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> ESnet/Internet2 Focused Technical Workshop
>>> Network Issues for Life Sciences Research
>>> July 17 - 18, 2013, Berkeley CA
>>> http://events.internet2.edu/2013/ftw-life-sciences/
>>>
>>
>
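
The failure described in this thread comes from a non-numeric TOS value
("NaN") being passed straight through to "-S" on the bwctl command line.
A minimal sketch of guarding against that is shown here. It is written in
Python for illustration only and is not the Perl in bwmaster.pl; the names
parse_tos_bits and build_bwctl_args are hypothetical and are not part of
perfSONAR or bwctl. It assumes a valid TOS value fits in one byte (0-255)
and skips a value of 0, mirroring the truthiness test in the quoted Perl.

```python
def parse_tos_bits(raw):
    """Return the TOS value as an int in 0..255, or None if it is unusable.

    Hypothetical helper: rejects empty, non-numeric ("NaN") and
    out-of-range values instead of passing them on to the command line.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    if not text:
        return None
    try:
        # Base 0 accepts plain decimal ("32") and prefixed hex ("0x20").
        value = int(text, 0)
    except ValueError:
        # Covers "NaN" and any other non-numeric string.
        return None
    if 0 <= value <= 255:
        return value
    return None


def build_bwctl_args(base_args, tos_raw):
    """Append "-S <tos>" only when the configured value is valid and nonzero."""
    args = list(base_args)
    tos = parse_tos_bits(tos_raw)
    if tos:
        args += ["-S", str(tos)]
    return args


if __name__ == "__main__":
    base = ["bwctl", "-c", "192.168.2.132", "-s", "192.168.2.104"]
    # A bad value such as "NaN" leaves the command line untouched.
    print(build_bwctl_args(base, "NaN"))
    # A valid value adds the -S option.
    print(build_bwctl_args(base, "32"))
```

With a check of this kind, a bad BWTosBits entry would be dropped rather
than producing a command line that makes bwctl exit and be restarted.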



