perfsonar-user - RE: [perf-node-users] [perfsonar-user] LSRegistrationDaemon duplicate checksum lookup/host/ error

Subject: perfSONAR User Q&A and Other Discussion

  • From: "Lixin Liu" <>
  • To: <>, "'Performance Node Users'" <>
  • Subject: RE: [perf-node-users] [perfsonar-user] LSRegistrationDaemon duplicate checksum lookup/host/ error
  • Date: Mon, 30 Sep 2013 09:07:54 -0700 (PDT)

Hi Jason,

Thank you very much. Indeed these two are on the list now.

However, a number of hosts that were previously in the list
disappeared from it over the weekend. They are:

bdw-usask.westgrid.ca
bdw-sfu.westgrid.ca
bdw-ucalgary.westgrid.ca
bdw-ubc.westgrid.ca
bdw-uvic.westgrid.ca
lat-sfu.westgrid.ca
lat-ucalgary.westgrid.ca
lat-ubc.westgrid.ca
lat-uvic.westgrid.ca

Your URL:

http://ndb1.internet2.edu:8090/lookup/records?host-name=<hostname>

displays only [] for each of them.
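
For anyone repeating this check, the per-host lookup can be scripted. This is
only a minimal sketch: it assumes curl is installed and that "no record" comes
back as the literal [] shown above. The function names are illustrative and
not part of perfSONAR.

```shell
#!/bin/sh
# Base URL taken from the lookup query quoted in this thread.
LS_BASE="http://ndb1.internet2.edu:8090/lookup/records"

# Build the lookup URL for one host name (hypothetical helper).
lookup_url() {
    printf '%s?host-name=%s\n' "$LS_BASE" "$1"
}

# Classify a response body. Assumption: an empty JSON list means the
# lookup service holds no record for the host.
record_status() {
    body=$(printf '%s' "$1" | tr -d ' \t\r\n')
    if [ -z "$body" ]; then
        echo "no-response"
    elif [ "$body" = "[]" ]; then
        echo "missing"
    else
        echo "registered"
    fi
}

# Query every host given as an argument and print "<host> <status>".
check_hosts() {
    for host in "$@"; do
        body=$(curl -s --max-time 10 "$(lookup_url "$host")")
        printf '%s %s\n' "$host" "$(record_status "$body")"
    done
}

# Example (performs network requests, so it is left commented out):
# check_hosts bdw-sfu.westgrid.ca lat-sfu.westgrid.ca
```

A host reported as "missing" here has no record in the lookup service itself,
which is a different condition from a host that is registered but not yet in
the cached Global Listing.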

Should I try to restart ls_registration_daemon on them? Do I need
to remove the database?
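
For reference, the reset sequence suggested earlier in this thread (stop the
daemon, remove lsKey.db, start the daemon) could be wrapped as below. This is
a sketch, not a confirmed fix for the hosts above. The paths are the ones
quoted in the thread; running rm through sudo and the DRY_RUN switch are my
own assumptions.

```shell
#!/bin/sh
# Init script and key database paths as quoted in this thread.
LS_INIT="/etc/init.d/ls_registration_daemon"
LS_DB="/var/lib/perfsonar/ls_registration_daemon/lsKey.db"

# Run a command, or only print it when DRY_RUN=1 (the default here,
# so that nothing is changed by accident).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop the daemon, remove the local key database, start the daemon again.
# Assumption: removing the file requires root, hence sudo on the rm as well.
reset_registration() {
    run sudo "$LS_INIT" stop &&
    run sudo rm -f "$LS_DB" &&
    run sudo "$LS_INIT" start
}

# Example: preview the commands first, then run them for real.
# reset_registration
# DRY_RUN=0 reset_registration
```

By default the function only prints the three commands, so the sequence can be
reviewed before it touches the registration database.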

Thanks again.

Lixin.


> -----Original Message-----
> From: Jason Zurawski
> [mailto:]
> Sent: September-30-13 6:35 AM
> To: Lixin Liu
> Cc:
> ;
> Performance Node Users
> Subject: Re: [perf-node-users] [perfsonar-user] LSRegistrationDaemon
> duplicate checksum lookup/host/ error
>
> Hi Lixin;
>
> We were able to track down an operational issue on our end with the
> caching service - as I previously noted your two servers are showing up,
> so this is not a problem on your end:
>
> http://ndb1.internet2.edu:8090/lookup/records?host-name=bdw-umanitoba.westgrid.ca
> http://ndb1.internet2.edu:8090/lookup/records?host-name=lat-umanitoba.westgrid.ca
>
> They were not being placed into the cache, though, because the Internet2 LS
> had a typo in the URL. If you do the following:
>
> sudo /etc/init.d/ls_cache_daemon restart
>
> It will re-download the cache file and your hosts should appear in the
> Global Listing.
>
> Thanks;
>
> -jason
>
> On Sep 29, 2013, at 12:07 PM, Lixin Liu
> <>
> wrote:
>
> > Hi Jason,
> >
> > I still do not see lat-umanitoba.westgrid.ca and bdw-umanitoba.westgrid.ca
> > in the Global Performance Services, and I can't see them in the
> > ComputeCanada community. I even did what you suggested (stop registration,
> > remove db and start registration) on bdw-umanitoba yesterday and am still
> > unable to see them.
> >
> > Is there anything else I can check?
> >
> > Thanks,
> >
> > Lixin.
> >
> > On 2013-09-27 10:58 AM, "Jason Zurawski"
> > <>
> > wrote:
> >
> >> Hi Lixin;
> >>
> >> Thanks for the clarification, I understand your specific question now.
> >> There are a couple of delays, let me try to explain them:
> >>
> >> - Delay between when the LS daemon runs on the local server, and talks
> >> to the remote server
> >> - Delay between when the remote server talks to a caching service (an
> >> optimization we added - used to generate the GUI information for
> >> 'Global Performance Services') and creates a new file for everyone to
> >> download
> >> - Delay between when the local server downloads the latest copy of the
> >> cached content to populate the GUI
> >>
> >> In general the answer is that it takes a couple of hours for this all
> >> to happen, as little as 2-3 if the timing is right, as many as 6-8 if
> >> it is wrong. Since we know that your host is in the directory via that REST
> >> query that was shown below, there is nothing to worry about. It will
> >> just take some time to get into the GUI displays, and there isn't much
> >> that can be done to speed that up.
> >>
> >> The SElinux issue shouldn't be the root cause, but I believe we either
> >> set this to permissive or disabled.
> >>
> >> Thanks;
> >>
> >> -jason
> >>
> >> On Sep 27, 2013, at 1:40 PM, "Lixin Liu"
> >> <>
> >> wrote:
> >>
> >>> Hi Jason,
> >>>
> >>> Sorry, I should have mentioned that the latency host is down right now.
> >>> Someone will take a look at the machine itself. Will let you
> >>> know when it comes up.
> >>>
> >>> So it looks like the bandwidth host is registered, but how long do I
> >>> need to wait to see its services listed in "Global Performance
> >>> Services"?
> >>>
> >>> I noticed these two sites have one thing in common: SELinux is
> >>> enabled. I disabled it, but I am not sure if that is the root cause of
> >>> our service registration issue.
> >>>
> >>> Thanks,
> >>>
> >>> Lixin.
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Jason Zurawski
> >>>> [mailto:]
> >>>> Sent: September-27-13 10:26 AM
> >>>> To: Lixin Liu
> >>>> Cc:
> >>>> ;
> >>>> Performance Node Users
> >>>> Subject: Re: [perf-node-users] Re: [perfsonar-user]
> >>>> LSRegistrationDaemon
> >>>> duplicate checksum lookup/host/ error
> >>>>
> >>>> Hi Lixin;
> >>>>
> >>>> Looking in one of the global servers, I do see your bdw host:
> >>>>
> >>>> http://ndb1.internet2.edu:8090/lookup/records?host-name=bdw-umanitoba.westgrid.ca
> >>>>
> >>>> Unfortunately you are correct in that the latency host has not
> >>>> registered; I don't see that anywhere. Can you send the latest log
> >>>> message for lat-umanitoba.westgrid.ca again? Everything after the point
> >>>> where you did the removal of the db file and the restart.
> >>>>
> >>>> Thanks;
> >>>>
> >>>> -jason
> >>>>
> >>>> On Sep 27, 2013, at 1:02 PM, "Lixin Liu"
> >>>> <>
> >>>> wrote:
> >>>>
> >>>>> Hi Jason,
> >>>>>
> >>>>> Thanks for your information.
> >>>>>
> >>>>> I rebooted two hosts (at the University of Saskatchewan) and they are now
> >>>>> showing up in the list of Global Services.
> >>>>>
> >>>>> I still have an issue with two other hosts (at the University of Manitoba).
> >>>>> I followed your suggestion. It has been more than two hours, but I
> >>>>> still do not see this host in the list. However, there was a network
> >>>>> (BGP) problem earlier today that may have affected the registration.
> >>>>>
> >>>>> Here is the log file (hostname bdw-umanitoba.westgrid.ca).
> >>>>>
> >>>>> The NTP server on the host you mentioned was changed by the local
> >>>>> admin to use local NTP servers, but there were only two servers in
> >>>>> the config. I added two more and it should be fine now.
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Lixin.
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Jason Zurawski
> >>>>>> [mailto:]
> >>>>>> Sent: September-27-13 6:19 AM
> >>>>>> To: Lixin Liu
> >>>>>> Cc:
> >>>>>> ;
> >>>>>> Performance Node Users
> >>>>>> Subject: Re: [perf-node-users] Re: [perfsonar-user]
> >>>>>> LSRegistrationDaemon
> >>>>>> duplicate checksum lookup/host/ error
> >>>>>>
> >>>>>> And naturally I meant "rm -f
> >>>>>> /var/lib/perfsonar/ls_registration_daemon/lsKey.db" for the cache
> >>>>>> file ...
> >>>>>>
> >>>>>> Thanks;
> >>>>>>
> >>>>>> -jason
> >>>>>>
> >>>>>> On Sep 27, 2013, at 9:17 AM, Jason Zurawski
> >>>>>> <>
> >>>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Lixin;
> >>>>>>>
> >>>>>>> In looking at the logs, we don't see anything out of the ordinary;
> >>>>>>> the 'duplicate' message you see is actually just unfortunate wording
> >>>>>>> on our part - it means 'renew' the registration instead of making a
> >>>>>>> new one. Since you noted you changed DNS/IP, it may be an issue of a
> >>>>>>> stale cache. Try the following steps:
> >>>>>>>
> >>>>>>>> sudo /etc/init.d/ls_registration_daemon stop
> >>>>>>>> /var/lib/perfsonar/ls_registration_daemon/lsKey.db
> >>>>>>>> sudo /etc/init.d/ls_registration_daemon start
> >>>>>>>
> >>>>>>>
> >>>>>>> This will stop the service, delete the local db, and then start it
> >>>>>>> all over. After a couple of hours the information should show up, or
> >>>>>>> at a minimum the logs will let us know if there is something else
> >>>>>>> locally bad that is going on.
> >>>>>>>
> >>>>>>> As an aside, it appears that your NTP is not configured, some of
> >>>>>>> the tools may not work until that is fixed. You may want to run the
> >>>>>>> NTP configuration script again.
> >>>>>>>
> >>>>>>> Thanks;
> >>>>>>>
> >>>>>>> -jason
> >>>>>>>
> >>>>>>> On Sep 26, 2013, at 3:43 PM, "Lixin Liu"
> >>>>>>> <>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Sorry, I didn't realize that the link is password protected and I
> >>>>>>>> was already logged in as admin.
> >>>>>>>>
> >>>>>>>> Here is the log.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Lixin.
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> [
> >>>>>>>>> ]
> >>>>>>>>> On Behalf Of Lixin Liu
> >>>>>>>>> Sent: September-26-13 12:36 PM
> >>>>>>>>> To:
> >>>>>>>>>
> >>>>>>>>> Subject: [perfsonar-user] LSRegistrationDaemon duplicate
> >>>>>>>>> checksum
> >>>>>>>>> lookup/host/ error
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> On a number of our perfSONAR hosts, we get
> >>>>>>>>>
> >>>>>>>>> perfSONAR_PS::LSRegistrationDaemon::Base::find_duplicate - Found duplicate checksum lookup/host
> >>>>>>>>>
> >>>>>>>>> error and hosts do not show up in "Global Service and Data View".
> >>>>>>>>> These cases may be related to the IP and DNS changes, but I think
> >>>>>>>>> the hosts are configured correctly.
> >>>>>>>>>
> >>>>>>>>> How do I correct the problem?
> >>>>>>>>>
> >>>>>>>>> You can see the log from this link:
> >>>>>>>>>
> >>>>>>>>> https://lat-umanitoba.westgrid.ca/toolkit/admin/logs/ls_registration_daemon.log
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> Lixin.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> =======================
> >>>>>>>>> Lixin Liu
> >>>>>>>>> IT Services
> >>>>>>>>> Simon Fraser University
> >>>>>>>>
> >>>>>>>> <ls_registration_daemon.log>
> >>>>> <ls_registration_daemon.log>
> >
> > <ls_registration_daemon.log>
>
> -----
>
> Jason Zurawski, Science Engagement Engineer
> ESnet
>
> office: [+1-510-486-6483]
> mobile: [+1-703-981-2494]
> http://www.es.net/zurawski
>
> Supercomputing Conference (SC13)
> November 17 - 22, 2013, Denver, CO
> http://sc13.supercomputing.org
>
