perfsonar-user - Re: [perf-node-users] [perfsonar-user] LSRegistrationDaemon duplicate checksum lookup/host/ error

Subject: perfSONAR User Q&A and Other Discussion

  • From: Jason Zurawski
  • To: "Lixin Liu"
  • Cc: Performance Node Users
  • Subject: Re: [perf-node-users] [perfsonar-user] LSRegistrationDaemon duplicate checksum lookup/host/ error
  • Date: Mon, 30 Sep 2013 17:04:10 -0400

Hi Lixin;

Two questions:

1) Are these all 3.3 or 3.3.1 hosts?

2) What are the contents of this file on each of the hosts?
/opt/SimpleLS/bootstrap/etc/service_url
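
One way to collect that on each host might look like the following. This is only a sketch: show_service_url is a made-up helper name (not part of perfSONAR), and the default path is simply the one given above.

```shell
#!/bin/sh
# Sketch only: show_service_url is a hypothetical helper, not a perfSONAR tool.
# The default path is the one asked about in this message.
SERVICE_URL_FILE="/opt/SimpleLS/bootstrap/etc/service_url"

show_service_url() {
    # An optional first argument overrides the default path.
    file="${1:-$SERVICE_URL_FILE}"
    if [ -r "$file" ]; then
        # Label the output with the node name so results from
        # several hosts can be told apart when pasted together.
        echo "== $(uname -n): $file =="
        cat "$file"
    else
        echo "cannot read $file" >&2
        return 1
    fi
}
```

Calling show_service_url with no arguments on each host and pasting the output into a reply would cover the second question.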

Thanks;

-jason

On Sep 30, 2013, at 5:00 PM, "Lixin Liu" wrote:

> Hi Jason,
>
> Looking at the ls_registration_daemon.log on a number of our hosts,
> it appears we have some difficulty connecting to your directory
> servers. All of these hosts report "400 Bad Request" errors
> every hour.
>
> Thanks,
>
> Lixin.
>
> 2013/09/30 14:06:00 (15279) INFO> Interface.pm:127
> perfSONAR_PS::LSRegistrationDaemon::Interface::build_checksum - Checksum is
> ca4JrKF9a0bW+9fMVcgcSQ
> 2013/09/30 14:06:00 (15279) INFO> Base.pm:178
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record 'p2p1' is up,
> registering
> 2013/09/30 14:06:00 (15279) ERROR> Base.pm:226
> perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering
> service. Will retry full registration next time: 403 Forbidden
> 2013/09/30 14:06:00 (15279) WARN> daemon.pl:251 main::__ANON__ - Warned:
> Use of uninitialized value in sort at
> /opt/perfsonar_ps/ls_registration_daemon/bin/../lib/perfSONAR_PS/LSRegistrationDaemon/Host.pm
> line 334.
> 2013/09/30 14:06:00 (15279) WARN> daemon.pl:251 main::__ANON__ - Warned:
> Use of uninitialized value in join or string at
> /opt/perfsonar_ps/ls_registration_daemon/bin/../lib/perfSONAR_PS/LSRegistrationDaemon/Host.pm
> line 334.
> 2013/09/30 14:06:00 (15279) INFO> Host.pm:309
> perfSONAR_PS::LSRegistrationDaemon::Host::build_checksum - Checksum is
> 0TtxBBUueOQpqJhpT/fdBA
> 2013/09/30 14:06:00 (15279) INFO> Base.pm:178
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record
> 'bdw-ucalgary.westgrid.ca' is up, registering
> 2013/09/30 14:06:00 (15279) ERROR> Base.pm:226
> perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering
> service. Will retry full registration next time: 400 Bad Request
> 2013/09/30 14:06:00 (15279) INFO> Service.pm:223
> perfSONAR_PS::LSRegistrationDaemon::Service::build_checksum - Checksum is
> d7gIByEaY/6Ic3BkjzOyJg
> 2013/09/30 14:06:00 (15279) INFO> Base.pm:178
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record 'University of
> Calgary Ping Responder' is up, registering
> 2013/09/30 14:06:01 (15279) ERROR> Base.pm:226
> perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering
> service. Will retry full registration next time: 400 Bad Request
> 2013/09/30 14:06:01 (15279) INFO> Service.pm:223
> perfSONAR_PS::LSRegistrationDaemon::Service::build_checksum - Checksum is
> vu6Wo2l/GUk7CbRvFFovow
> 2013/09/30 14:06:01 (15279) INFO> Base.pm:178
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record 'University of
> Calgary Traceroute Responder' is up, registering
> 2013/09/30 14:06:01 (15279) ERROR> Base.pm:226
> perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering
> service. Will retry full registration next time: 400 Bad Request
> 2013/09/30 14:06:01 (15279) INFO> Base.pm:194
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record 'University of
> Calgary OWAMP Server' is down
> 2013/09/30 14:06:01 (15279) INFO> Service.pm:223
> perfSONAR_PS::LSRegistrationDaemon::Service::build_checksum - Checksum is
> TBtP9qqBGm8nKPruWIySXQ
> 2013/09/30 14:06:01 (15279) INFO> Base.pm:178
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record 'University of
> Calgary BWCTL Server' is up, registering
> 2013/09/30 14:06:01 (15279) ERROR> Base.pm:226
> perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering
> service. Will retry full registration next time: 400 Bad Request
> 2013/09/30 14:06:01 (15279) INFO> Service.pm:223
> perfSONAR_PS::LSRegistrationDaemon::Service::build_checksum - Checksum is
> JZMxKg3PkoEbBRTKtu8uog
> 2013/09/30 14:06:01 (15279) INFO> Base.pm:178
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record 'University of
> Calgary NDT Server' is up, registering
> 2013/09/30 14:06:01 (15279) ERROR> Base.pm:226
> perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering
> service. Will retry full registration next time: 400 Bad Request
> 2013/09/30 14:06:01 (15279) INFO> Service.pm:223
> perfSONAR_PS::LSRegistrationDaemon::Service::build_checksum - Checksum is
> XxvLx+/jXY4OUM7up8PJLA
> 2013/09/30 14:06:01 (15279) INFO> Base.pm:178
> perfSONAR_PS::LSRegistrationDaemon::Base::refresh - Record 'University of
> Calgary NPAD Server' is up, registering
> 2013/09/30 14:06:01 (15279) ERROR> Base.pm:226
> perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering
> service. Will retry full registration next time: 400 Bad Request
>
>> -----Original Message-----
>> From: Jason Zurawski
>> Sent: September-30-13 9:33 AM
>> To: Lixin Liu
>> Cc: Performance Node Users
>> Subject: Re: [perf-node-users] [perfsonar-user] LSRegistrationDaemon
>> duplicate checksum lookup/host/ error
>>
>> Lixin;
>>
>> There are several directory servers that manage registrations, not just
>> the one. I am able to see some of the hosts you mention below. Note that
>> a host sometimes registers with an IP address if it lacked a DNS name to
>> start with:
>>
>>> bdw-usask.westgrid.ca
>> http://ndb1.internet2.edu:8090/lookup/records?host-name=206.12.26.19
>>
>>> bdw-sfu.westgrid.ca
>> http://antg.es.net:8090/lookup/records?host-name=bdw-sfu.westgrid.ca
>>
>>> bdw-ucalgary.westgrid.ca
>> *cannot find*
>>
>>> bdw-ubc.westgrid.ca
>> http://antg.es.net:8090/lookup/records?host-name=206.12.24.189
>>
>>> bdw-uvic.westgrid.ca
>> *cannot find*
>>
>>> lat-sfu.westgrid.ca
>> I see this:
>> http://ps4.es.net:9095/lookup/records?host-name=ps-latency.sfu.westgrid.ca
>> - did the host's name change recently?
>>
>>> lat-ucalgary.westgrid.ca
>> *cannot find*
>>
>>> lat-ubc.westgrid.ca
>> *cannot find*
>>
>>> lat-uvic.westgrid.ca
>>
>> http://antg.es.net:8090/lookup/records?host-name=lat-uvic.westgrid.ca
>>
>> You can try to remove the database on the hosts as you did before if the
>> hosts had a recent DNS or IP address change. You can also check to be
>> sure there are no firewalls blocking registration.
>>
>> Thanks;
>>
>> -jason
>>
>> On Sep 30, 2013, at 12:07 PM, Lixin Liu wrote:
>>
>>> Hi Jason,
>>>
>>> Thank you very much. Indeed these two are on the list now.
>>>
>>> However, a number of hosts that were previously in the list
>>> disappeared from it over the weekend. They are:
>>>
>>> bdw-usask.westgrid.ca
>>> bdw-sfu.westgrid.ca
>>> bdw-ucalgary.westgrid.ca
>>> bdw-ubc.westgrid.ca
>>> bdw-uvic.westgrid.ca
>>> lat-sfu.westgrid.ca
>>> lat-ucalgary.westgrid.ca
>>> lat-ubc.westgrid.ca
>>> lat-uvic.westgrid.ca
>>>
>>> Your URL:
>>>
>>> http://ndb1.internet2.edu:8090/lookup/records?host-name=<hostname>
>>>
>>> displays only [].
>>>
>>> Should I try to restart ls_registration_daemon on them? Do I need
>>> to remove the database?
>>>
>>> Thanks again.
>>>
>>> Lixin.
>>>
>>>
>>>> -----Original Message-----
>>>> From: Jason Zurawski
>>>> Sent: September-30-13 6:35 AM
>>>> To: Lixin Liu
>>>> Cc: Performance Node Users
>>>> Subject: Re: [perf-node-users] [perfsonar-user] LSRegistrationDaemon
>>>> duplicate checksum lookup/host/ error
>>>>
>>>> Hi Lixin;
>>>>
>>>> We were able to track down an operational issue on our end with the
>>>> caching service - as I previously noted your two servers are showing
>>>> up, so this is not a problem on your end:
>>>>
>>>> http://ndb1.internet2.edu:8090/lookup/records?host-name=bdw-umanitoba.westgrid.ca
>>>> http://ndb1.internet2.edu:8090/lookup/records?host-name=lat-umanitoba.westgrid.ca
>>>>
>>>> They were not being placed into the cache, though, because the Internet2
>>>> LS had a typo in the URL. If you do the following:
>>>>
>>>> sudo /etc/init.d/ls_cache_daemon restart
>>>>
>>>> It will re-download the cache file and your hosts should appear in the
>>>> Global Listing.
>>>>
>>>> Thanks;
>>>>
>>>> -jason
>>>>
>>>> On Sep 29, 2013, at 12:07 PM, Lixin Liu wrote:
>>>>
>>>>> Hi Jason,
>>>>>
>>>>> I still do not see lat-umanitoba.westgrid.ca and
>>>>> bdw-umanitoba.westgrid.ca in the Global Performance Services, and I
>>>>> can't see them in the ComputeCanada community. I even did what you
>>>>> suggested (stop registration, remove db and start registration) on
>>>>> bdw-umanitoba yesterday and am still unable to see them.
>>>>>
>>>>> Is there anything else I can check?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Lixin.
>>>>>
>>>>> On 2013-09-27 10:58 AM, "Jason Zurawski" wrote:
>>>>>
>>>>>> Hi Lixin;
>>>>>>
>>>>>> Thanks for the clarification, I understand your specific question
>>>>>> now. There are a couple of delays, let me try to explain them:
>>>>>>
>>>>>> - Delay between when the LS daemon runs on the local server, and
>>>>>> talks to the remote server
>>>>>> - Delay between when the remote server talks to a caching service (an
>>>>>> optimization we added - used to generate the GUI information for
>>>>>> 'Global Performance Services') and creates a new file for everyone to
>>>>>> download
>>>>>> - Delay between when the local server downloads the latest copy of
>>>>>> the cached content to populate the GUI
>>>>>>
>>>>>> In general the answer is that it takes a couple of hours for this all
>>>>>> to happen, as little as 2-3 if the timing is right, as many as 6-8 if
>>>>>> it is wrong. Since we know that your host is in the directory via that
>>>>>> REST query that was shown below, there is nothing to worry about. It
>>>>>> will just take some time to get into the GUI displays, and there isn't
>>>>>> much that can be done to speed that up.
>>>>>>
>>>>>> The SELinux issue shouldn't be the root cause, but I believe we
>>>>>> either set this to permissive or disabled.
>>>>>>
>>>>>> Thanks;
>>>>>>
>>>>>> -jason
>>>>>>
>>>>>> On Sep 27, 2013, at 1:40 PM, "Lixin Liu" wrote:
>>>>>>
>>>>>>> Hi Jason,
>>>>>>>
>>>>>>> Sorry I should mention that the latency host is down right now.
>>>>>>> Someone will take a look at the machine itself. Will let you
>>>>>>> know when it comes up.
>>>>>>>
>>>>>>> So it looks like the bandwidth host is registered, but how long do I
>>>>>>> need to wait to see its services listed in "Global Performance
>>>>>>> Services"?
>>>>>>>
>>>>>>> I noticed these two sites have one thing in common: SELinux is
>>>>>>> enabled. I disabled it, but I am not sure if that is the root cause
>>>>>>> of our service registration issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Lixin.
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Jason Zurawski
>>>>>>>> Sent: September-27-13 10:26 AM
>>>>>>>> To: Lixin Liu
>>>>>>>> Cc: Performance Node Users
>>>>>>>> Subject: Re: [perf-node-users] Re: [perfsonar-user]
>>>>>>>> LSRegistrationDaemon duplicate checksum lookup/host/ error
>>>>>>>>
>>>>>>>> Hi Lixin;
>>>>>>>>
>>>>>>>> Looking in one of the global servers, I do see your bdw host:
>>>>>>>>
>>>>>>>> http://ndb1.internet2.edu:8090/lookup/records?host-name=bdw-umanitoba.westgrid.ca
>>>>>>>>
>>>>>>>> Unfortunately you are correct in that the latency host has not
>>>>>>>> registered, I don't see that anywhere. Can you send the latest log
>>>>>>>> message for lat-umanitoba.westgrid.ca again? Everything after the
>>>>>>>> point where you did the removal of the db file and the restart.
>>>>>>>>
>>>>>>>> Thanks;
>>>>>>>>
>>>>>>>> -jason
>>>>>>>>
>>>>>>>> On Sep 27, 2013, at 1:02 PM, "Lixin Liu" wrote:
>>>>>>>>
>>>>>>>>> Hi Jason,
>>>>>>>>>
>>>>>>>>> Thanks for your information.
>>>>>>>>>
>>>>>>>>> I rebooted two hosts (at the University of Saskatchewan) and they are now
>>>>>>>>> showing up in the list of Global Services.
>>>>>>>>>
>>>>>>>>> I still have an issue with two other hosts (at the University of
>>>>>>>>> Manitoba). I followed your suggestion. It has been more than two
>>>>>>>>> hours, but I still do not see this host in the list. However, there
>>>>>>>>> was a network (BGP) problem earlier today that may have affected
>>>>>>>>> the registration.
>>>>>>>>>
>>>>>>>>> Here is the log file (hostname bdw-umanitoba.westgrid.ca).
>>>>>>>>>
>>>>>>>>> The NTP server on the host you mentioned was changed by the local
>>>>>>>>> admin to use local NTP servers, but there were only two servers in
>>>>>>>>> the config. I added two more, so it should be fine now.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Lixin.
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Jason Zurawski
>>>>>>>>>> Sent: September-27-13 6:19 AM
>>>>>>>>>> To: Lixin Liu
>>>>>>>>>> Cc: Performance Node Users
>>>>>>>>>> Subject: Re: [perf-node-users] Re: [perfsonar-user]
>>>>>>>>>> LSRegistrationDaemon duplicate checksum lookup/host/ error
>>>>>>>>>>
>>>>>>>>>> And naturally I meant "rm -f
>>>>>>>>>> /var/lib/perfsonar/ls_registration_daemon/lsKey.db" for the cache
>>>>>>>>>> file ...
>>>>>>>>>>
>>>>>>>>>> Thanks;
>>>>>>>>>>
>>>>>>>>>> -jason
>>>>>>>>>>
>>>>>>>>>> On Sep 27, 2013, at 9:17 AM, Jason Zurawski wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Lixin;
>>>>>>>>>>>
>>>>>>>>>>> In looking at the logs, we don't see anything out of the
>>>>>>>>>>> ordinary. The 'duplicate' message you see is actually just
>>>>>>>>>>> unfortunate wording on our part - it means 'renew' the
>>>>>>>>>>> registration instead of making a new one. Since you noted you
>>>>>>>>>>> changed DNS/IP, it may be an issue of a stale cache. Try the
>>>>>>>>>>> following steps:
>>>>>>>>>>>
>>>>>>>>>>>> sudo /etc/init.d/ls_registration_daemon stop
>>>>>>>>>>>> /var/lib/perfsonar/ls_registration_daemon/lsKey.db
>>>>>>>>>>>> sudo /etc/init.d/ls_registration_daemon start
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This will stop the service, delete the local db, and then start
>>>>>>>>>>> it all over. After a couple of hours the information should show
>>>>>>>>>>> up, or at a minimum the logs will let us know if there is
>>>>>>>>>>> something else locally bad that is going on.
>>>>>>>>>>>
>>>>>>>>>>> As an aside, it appears that your NTP is not configured; some of
>>>>>>>>>>> the tools may not work until that is fixed. You may want to run
>>>>>>>>>>> the NTP configuration script again.
>>>>>>>>>>>
>>>>>>>>>>> Thanks;
>>>>>>>>>>>
>>>>>>>>>>> -jason
>>>>>>>>>>>
>>>>>>>>>>> On Sep 26, 2013, at 3:43 PM, "Lixin Liu" wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Sorry, I didn't realize that the link is password protected,
>>>>>>>>>>>> since I was already logged in as admin.
>>>>>>>>>>>>
>>>>>>>>>>>> Here is the log.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Lixin.
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: On Behalf Of Lixin Liu
>>>>>>>>>>>>> Sent: September-26-13 12:36 PM
>>>>>>>>>>>>> Subject: [perfsonar-user] LSRegistrationDaemon duplicate
>>>>>>>>>>>>> checksum lookup/host/ error
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On a number of our perfSONAR hosts, we get the error
>>>>>>>>>>>>>
>>>>>>>>>>>>> perfSONAR_PS::LSRegistrationDaemon::Base::find_duplicate -
>>>>>>>>>>>>> Found duplicate checksum lookup/host
>>>>>>>>>>>>>
>>>>>>>>>>>>> and the hosts do not show up in "Global Service and Data View".
>>>>>>>>>>>>> These cases may be related to the IP and DNS changes, but I
>>>>>>>>>>>>> think the hosts are configured correctly.
>>>>>>>>>>>>>
>>>>>>>>>>>>> How do I correct the problem?
>>>>>>>>>>>>>
>>>>>>>>>>>>> You can see the log from this link:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://lat-umanitoba.westgrid.ca/toolkit/admin/logs/ls_registration_daemon.log
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Lixin.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> =======================
>>>>>>>>>>>>> Lixin Liu
>>>>>>>>>>>>> IT Services
>>>>>>>>>>>>> Simon Fraser University
>>>>>>>>>>>>
>>>>>>>>>>>> <ls_registration_daemon.log>
>>>>>>>>> <ls_registration_daemon.log>
>>>>>
>>>>> <ls_registration_daemon.log>


