Hi Andy,
Thanks. I did not check the Services Directory at ESnet. My concern is about the pS UI, which checks the LS records, not the Services Directory. Anyway, the problem
has resolved itself and I do not have an example to work on.
Please note that the issue raised in the tracker is related to another error:
500 Can't connect to ****** (connect: timeout)
My case was 403 Forbidden, but anyway, the test behind the status on the front page, which you described, is enough for me.
Now I am not sure what you are tracking down and how I can be of assistance, but at least I know that our operations team does not have to react to the messages
from the log, only to the MP really being missing from the LS. In fact, even then (as I understand it), there are no instructions on what should be done. That is always a problem when operations monitor a service. They need to know how to respond, otherwise they
are just annoyed.
Best regards,
Ivan
From: Andrew Lake [mailto:]
Sent: Thursday, 25 June 2015 14:12
To: Garnizov, Ivan (RRZE)
Cc: perfsonar-user
Subject: RE: [perfsonar-user] ls_registration service 403 Forbidden
Out of curiosity, when it says it's not registered, do you see it here:
http://stats.es.net/ServicesDirectory/? A number of people have reported that the status will waver between Yes and No on the toolkit page. For others it seems to work just fine. There is already an open issue,
https://github.com/perfsonar/toolkit/issues/32, to explore this further. The web page works by asking the LS for a host record for the local host. Clearly it thinks it can't find it
for some reason, but it's still unclear whether that is always because the host is actually not registered or for some other reason. If I knew the exact cause, there wouldn't be an issue and we likely wouldn't be having this conversation :) We hope to track it down by 3.5
final.
On Wed, Jun 24, 2015 at 9:16 AM, Garnizov, Ivan (RRZE) <> wrote:
Hi Andy,
Thanks for asking. In fact I see 10-20 unsuccessful attempts in a row by the service.
I am trying to identify cases where the service fails and is not registered in the LS, and was wondering
whether to go by these events or to use the LS server records instead.
In fact I am often led by the “local services” page, which states “Globally registered: NO”. How
does this check verify the state?
The problem is that once a point is missing from the LS, it is not available in the pS UI and
cannot be used for testing.
From your reply it seems you suggest that I just have to wait for the service to recover by
itself.
Best regards,
Ivan
From: Andrew Lake []
Sent: Wednesday, 24 June 2015 12:33
To: Garnizov, Ivan (RRZE)
Cc: perfsonar-user
Subject: Re: [perfsonar-user] ls_registration service 403 Forbidden
Despite appearances, this may or may not be an issue (or at least a temporary one). This usually happens when the ls_registration_daemon tries to register the same record twice.
It usually corrects itself the next run when the old record expires. Are you trying to debug a missing record or were you fishing through the logs again and found this?
On Wed, Jun 24, 2015 at 4:37 AM, Garnizov, Ivan (RRZE) <> wrote:
Dear perfSONAR developers,
Please give some hints or guidance on the case of the LSRegistrationDaemon getting 403 Forbidden.
1) Is that a service response from the LS server? I have ruled out firewall and connectivity issues.
2) What would cause this, and in which cases and how can one intervene or assist the registration process? Please also include any possible actions on the LS server that would resolve the case.
ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 403 Forbidden
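To see how often this error shows up, the matching lines can be counted. This is only a minimal sketch, assuming the log line format shown above. The function name count_403_failures is hypothetical, and since the log file location is not given in this thread, the path is passed as an argument:

```shell
#!/bin/sh
# Count registration failures that ended in "403 Forbidden".
# $1: path to the registration daemon log (assumption: supply your own path).
count_403_failures() {
    # First keep only the registration-failure lines, then count the 403 ones.
    # "|| true" keeps the exit status at 0 when the count is zero.
    grep 'Problem registering service' "$1" | grep -c '403 Forbidden' || true
}
```

Calling count_403_failures with the log path prints a single number. It does not tell whether the failures were consecutive, because the format of a successful registration line is not shown here.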
Best regards,
Ivan
-----Original Message-----
From: [] On Behalf Of Garnizov, Ivan (RRZE)
Sent: Wednesday, 24 June 2015 10:17
To: Ty Bell; perfsonar-user
Subject: RE: [perfsonar-user] bwping/owamp tests randomly stop and never restart
Hi Ty,
In fact I have reported the same issue for my instances in the issue tracker:
https://github.com/perfsonar/regular-testing/issues/5
Suddenly, for no apparent reason and without any notable event in the logs, the regular_testing service stops collecting data. I have also noted that a single service restart does not help. You have to follow a graceful restart, meaning:
sudo service regular_testing stop
sudo service postgresql stop
sudo service cassandra restart
sudo service postgresql start
sudo service regular_testing start
This immediately fixes all measurements. I have tested that on 2 hosts.
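The same order can be wrapped in a small function so that a failed step stops the sequence. This is only a sketch of the steps listed above; the function name graceful_restart and the RUN variable are hypothetical, added here so the commands can be printed with RUN=echo before running them for real:

```shell
#!/bin/sh
# RUN is a hypothetical hook: leave it unset to use sudo,
# or set RUN=echo to print the commands instead of executing them.
RUN="${RUN:-sudo}"

graceful_restart() {
    # Stop the testing service first, then cycle the databases,
    # and bring the testing service back last. "&&" aborts on the first failure.
    $RUN service regular_testing stop &&
    $RUN service postgresql stop &&
    $RUN service cassandra restart &&
    $RUN service postgresql start &&
    $RUN service regular_testing start
}
```

After sourcing the file, set RUN=echo and call graceful_restart to get a dry run; calling it with RUN left at its default executes the five commands via sudo.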
We still might be in different scenarios, although my issue is also around the latency tests.
Best regards,
Ivan
-----Original Message-----
From: [] On Behalf Of Ty Bell
Sent: Tuesday, 23 June 2015 16:41
To: perfsonar-user
Subject: Re: [perfsonar-user] bwping/owamp tests randomly stop and never restart
All my hosts are running the same (latest) versions of the tools and they're all sync'd with the same NTP sources. Instead of restarting the whole regular testing service, I've taken to killing the individual bwping process; regular testing then fires up a new
process and everything clears up.
--Ty
> On Apr 23, 2015, at 3:29 PM, Amit Khare <> wrote:
>
> Hi Ty,
>
> Are all your hosts running the same version of the toolkit? We have had
> similar issues with one of the older toolkit releases. I would also
> check whether the hosts are properly synced with the NTP server(s). Thanks,
>
> Amit
> ---------------------------------------------------------------------------
> Amit Khare | Network Engineer | CANARIE Inc | 45 O'Connor St., Suite
> 500, Ottawa, ON K1P 1A4 | Office: 613-943-5377│Cell: 613-404-8696│CANARIE NOC:
> 613-944-5612│www.canarie.ca
>
>
>
>
>
>
> On 2015-04-23, 15:19, "Ty Bell" <> wrote:
>
>> Hi All,
>>
>> Wondering if this is something anyone else has observed. I have 10
>> hosts in a mesh all running owamp tests, and randomly (maybe once a
>> week) I’ll check on the mesh and see two hosts have stopped testing
>> in one direction. It’s never the same hosts, and never the same
>> direction, seems totally random. I can execute tests from the command
>> line and they run just fine. I’ve looked around for hung owamp
>> processes or daemon restarts and haven’t found anything.
>>
>> The only resolution I’ve found is to restart regular testing on both
>> hosts.
>>
>> Thanks,
>> --Ty
>>