perfsonar-user - Re: [perfsonar-user] Re: Registration service problem
- From: "Andrew Lake" <>
- To: "Brian Candler" <>
- Cc: "Szymon Trocha" <>,
- Subject: Re: [perfsonar-user] Re: Registration service problem
- Date: Tue, 08 Sep 2015 12:55:56 -0700 (PDT)
On Tue, Sep 8, 2015 at 3:44 PM, Brian Candler <> wrote:
On 08/09/2015 21:47, Brian Candler wrote:
> I can try pointing to a different locator, but first is there anyone
> who can check logs on ps-sls.sanren.ac.za to see if the locator
> service is having problems? What might cause the 403 / 500 errors?
Actually, without changing anything on this side, I see at least one
node is now trying to register to Australia - but is also getting
various errors.
[root@pfsnr ~]# grep ": [0-9][0-9][0-9] " /var/log/perfsonar/ls_registration_daemon.log | tail
2015/09/08 21:46:19 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:47:19 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 Can't connect to nsw-brwy-sls1.aarnet.net.au:8090 (connect: timeout)
2015/09/08 21:48:21 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:49:24 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:50:26 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:51:31 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:52:36 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:53:39 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:54:46 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:55:51 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 Can't connect to nsw-brwy-sls1.aarnet.net.au:8090 (connect: timeout)
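To see at a glance how the failures split between the two error types, the log can be tallied by message. This is only a sketch: the function name summarize_errors is made up for illustration, and it assumes each entry in the daemon log sits on a single line containing the phrase "Will retry full registration next time:" as in the output above.

```shell
#!/bin/sh
# summarize_errors (hypothetical helper): reads a registration daemon
# log on stdin and prints each distinct failure message with its count,
# most frequent first.
summarize_errors() {
    # Keep only the text after the fixed phrase, then count duplicates.
    sed -n 's/.*Will retry full registration next time: //p' |
        sort | uniq -c | sort -rn
}

# Usage (path taken from the grep command above):
#   summarize_errors < /var/log/perfsonar/ls_registration_daemon.log
```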
It's not clear to me whether "500 read timeout" is a locally-generated
error, or an actual 500 HTTP error. But the "connect: timeout" looks
like it could be locally generated.
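One rough way to tell the two cases apart is to probe the same URL with curl under explicit time limits and look at both curl's exit status and the HTTP status it reports. curl exits with 28 when one of its own timeouts fires and reports status 000 when it never received an HTTP response, whereas a genuine server-side 500 shows up as exit 0 with status 500. The names classify_probe and probe_ls and the 10 s / 30 s limits are illustrative assumptions, not part of perfSONAR.

```shell
#!/bin/sh
# classify_probe EXIT_CODE HTTP_CODE (hypothetical helper)
# Turns curl's exit status and reported HTTP status into a verdict.
classify_probe() {
    exit_code=$1
    http_code=$2
    if [ "$exit_code" -eq 28 ]; then
        echo "local timeout (no complete response from server)"
    elif [ "$exit_code" -ne 0 ]; then
        echo "local failure (curl exit $exit_code)"
    elif [ "$http_code" -ge 500 ]; then
        echo "server returned HTTP $http_code"
    else
        echo "server answered with HTTP $http_code"
    fi
}

# probe_ls URL (hypothetical helper)
# Runs one bounded request and classifies the outcome.
probe_ls() {
    url=$1
    http_code=$(curl -s -o /dev/null -w '%{http_code}' \
        --connect-timeout 10 --max-time 30 "$url")
    exit_code=$?    # exit status of the curl command substitution
    classify_probe "$exit_code" "$http_code"
}

# Usage:
#   probe_ls http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
```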
However, this node does look like it's having problems.
[root@pfsnr ~]# ping -c10 nsw-brwy-sls1.aarnet.net.au
PING nsw-brwy-sls1.aarnet.net.au (182.255.120.9) 56(84) bytes of data.
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=1 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=2 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=3 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=4 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=5 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=6 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=7 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=8 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=9 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=10 ttl=48 time=445 ms
--- nsw-brwy-sls1.aarnet.net.au ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9510ms
rtt min/avg/max/mdev = 445.548/445.694/445.938/0.438 ms
[root@pfsnr ~]# curl -v nsw-brwy-sls1.aarnet.net.au:8090
* About to connect() to nsw-brwy-sls1.aarnet.net.au port 8090 (#0)
* Trying 182.255.120.9...
<< hangs for 10-20 seconds >>
connected
* Connected to nsw-brwy-sls1.aarnet.net.au (182.255.120.9) port 8090 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: nsw-brwy-sls1.aarnet.net.au:8090
> Accept: */*
>
<< hangs here - indefinitely? >>
Ditto for curl -v http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
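To put numbers on where the request stalls (during the TCP connect, or while waiting for the first byte of the reply), curl's --write-out timers can be used instead of watching the verbose output. The following is a sketch only: describe_phases and time_phases are hypothetical helper names, and the 30 s / 60 s limits are arbitrary choices so that the command cannot hang indefinitely.

```shell
#!/bin/sh
# describe_phases NAMELOOKUP CONNECT STARTTRANSFER (hypothetical helper)
# Arguments are curl's cumulative timers in seconds. Prints the time
# spent in the TCP connect and the wait for the first response byte.
describe_phases() {
    awk -v n="$1" -v c="$2" -v s="$3" 'BEGIN {
        # curl reports 0.000 for a phase that was never reached.
        if (c + 0 == 0) { print "tcp_connect=none first_byte_wait=none"; exit }
        printf "tcp_connect=%.3fs ", c - n
        if (s + 0 == 0) print "first_byte_wait=none"
        else printf "first_byte_wait=%.3fs\n", s - c
    }'
}

# time_phases URL (hypothetical helper)
# Runs one bounded request and reports the two phases.
time_phases() {
    url=$1
    # Unquoted on purpose: split the three timers into $1 $2 $3.
    set -- $(curl -s -o /dev/null --connect-timeout 30 --max-time 60 \
        -w '%{time_namelookup} %{time_connect} %{time_starttransfer}' "$url")
    describe_phases "$1" "$2" "$3"
}

# Usage:
#   time_phases http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
```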
If I try this from a host in the UK, I get something very strange which
I've never seen before:
brian@deploy2:~$ time curl -v http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
* About to connect() to nsw-brwy-sls1.aarnet.net.au port 8090 (#0)
* Trying 182.255.120.9...
* connected
* Connected to nsw-brwy-sls1.aarnet.net.au (182.255.120.9) port 8090 (#0)
> GET /lookup/records/ HTTP/1.1
> User-Agent: curl/7.26.0
> Host: nsw-brwy-sls1.aarnet.net.au:8090
> Accept: */*
>
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
...
(about once per second)
But if I try "curl -v http://ps-west.es.net:8090/lookup/records/"
instead, what I see is a few "additional stuff not fine" lines, followed
by a big splurge of JSON, which I think is the entire locator database.
Back on the perfsonar node which is showing the problem:
[root@pfsnr ~]# grep http /var/log/perfsonar/ls_registration_daemon.log | grep -v kenet
2015/09/08 12:05:05 (8604) INFO> daemon.pl:170 main:: - Initial LS URL set to http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
2015/09/08 18:05:48 (8675) INFO> daemon.pl:349 main::handle_site - LS URL changed to http://ps-sls.sanren.ac.za:8090/lookup/records
[root@pfsnr ~]#
So it looks like it's learned the ZA URL again, but logs show it's still
trying to connect to the AU one.
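The mismatch can be checked mechanically by pulling the most recently announced LS URL and the most recently failing host out of the same log and comparing them. This is a sketch under assumptions: last_ls_url and last_failed_host are made-up names, and the patterns rely on the message wording seen in the log excerpts above ("LS URL set to", "LS URL changed to", "Can't connect to"), with one log entry per line.

```shell
#!/bin/sh
# last_ls_url (hypothetical helper): reads the daemon log on stdin and
# prints the URL from the latest "LS URL set to/changed to" message.
last_ls_url() {
    sed -n 's/.*LS URL [a-z]* to \(http[^ ]*\).*/\1/p' | tail -n 1
}

# last_failed_host (hypothetical helper): reads the daemon log on stdin
# and prints host:port from the latest "Can't connect to" error.
last_failed_host() {
    sed -n "s/.*Can't connect to \([^ ]*\).*/\1/p" | tail -n 1
}

# Usage:
#   log=/var/log/perfsonar/ls_registration_daemon.log
#   echo "configured: $(last_ls_url < "$log")"
#   echo "failing:    $(last_failed_host < "$log")"
```

If the host in the second line does not appear in the URL on the first, the daemon is still contacting a locator other than the one it last logged.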
All very odd!
Regards,
Brian.