perfsonar-user - Re: [perfsonar-user] Re: Registration service problem
Subject: perfSONAR User Q&A and Other Discussion
- From: Brian Candler <>
- To: Szymon Trocha <>
- Cc:
- Subject: Re: [perfsonar-user] Re: Registration service problem
- Date: Tue, 8 Sep 2015 22:43:56 +0300
On 08/09/2015 21:47, Brian Candler wrote:
> I can try pointing to a different locator, but first is there anyone who can check logs on ps-sls.sanren.ac.za to see if the locator service is having problems? What might cause the 403 / 500 errors?
Actually, without changing anything on this side, I see at least one node is now trying to register to Australia - but is also getting various errors.
[root@pfsnr ~]# grep ": [0-9][0-9][0-9] " /var/log/perfsonar/ls_registration_daemon.log | tail
2015/09/08 21:46:19 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:47:19 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 Can't connect to nsw-brwy-sls1.aarnet.net.au:8090 (connect: timeout)
2015/09/08 21:48:21 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:49:24 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:50:26 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:51:31 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:52:36 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:53:39 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:54:46 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 read timeout
2015/09/08 21:55:51 (8675) ERROR> Base.pm:304 perfSONAR_PS::LSRegistrationDaemon::Base::register - Problem registering service. Will retry full registration next time: 500 Can't connect to nsw-brwy-sls1.aarnet.net.au:8090 (connect: timeout)
It's not clear to me whether "500 read timeout" is a locally-generated error, or an actual 500 HTTP error. But the "connect: timeout" looks like it could be locally generated.
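One way to test that guess against the log is sketched below. It rests on an assumption the log itself does not confirm: that the daemon's HTTP client is Perl's LWP, which as far as I know reports its own connect and read failures as a synthetic 500 response with messages such as "read timeout" or "Can't connect to host:port (...)". The names classify, summarize and CLIENT_SIDE_PREFIXES are made up for this illustration.

```python
import re
from collections import Counter

# Trailing "<status> <message>" part of a registration error line.
ERROR_RE = re.compile(r"next time: (\d{3}) (.*)$")

# ASSUMPTION: these prefixes mark errors synthesised by the HTTP client
# (LWP-style) rather than sent by the remote server.
CLIENT_SIDE_PREFIXES = ("read timeout", "Can't connect to")


def classify(line):
    """Return 'client-side', 'server response', or None for unrelated lines."""
    m = ERROR_RE.search(line)
    if m is None:
        return None
    status, message = m.group(1), m.group(2).strip()
    if status == "500" and message.startswith(CLIENT_SIDE_PREFIXES):
        return "client-side"
    return "server response"


def summarize(lines):
    """Count how many error lines fall into each class."""
    return Counter(c for c in map(classify, lines) if c is not None)
```

Feeding it the lines of /var/log/perfsonar/ls_registration_daemon.log would show at a glance whether any of the 500s are likely to have come from the server at all; a 403 would always land in the "server response" bucket.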
However, this node (nsw-brwy-sls1.aarnet.net.au) does look like it's having problems.
[root@pfsnr ~]# ping -c10 nsw-brwy-sls1.aarnet.net.au
PING nsw-brwy-sls1.aarnet.net.au (182.255.120.9) 56(84) bytes of data.
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=1 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=2 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=3 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=4 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=5 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=6 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=7 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=8 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=9 ttl=48 time=445 ms
64 bytes from nsw-brwy-sls1.aarnet.net.au (182.255.120.9): icmp_seq=10 ttl=48 time=445 ms
--- nsw-brwy-sls1.aarnet.net.au ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9510ms
rtt min/avg/max/mdev = 445.548/445.694/445.938/0.438 ms
[root@pfsnr ~]# curl -v nsw-brwy-sls1.aarnet.net.au:8090
* About to connect() to nsw-brwy-sls1.aarnet.net.au port 8090 (#0)
* Trying 182.255.120.9...
<< hangs for 10-20 seconds >>
connected
* Connected to nsw-brwy-sls1.aarnet.net.au (182.255.120.9) port 8090 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: nsw-brwy-sls1.aarnet.net.au:8090
> Accept: */*
>
<< hangs here - indefinitely? >>
Ditto for curl -v http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
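To put numbers on the two hangs seen above (slow connect, then no response), a small probe with explicit timeouts can report which stage stalls and for how long. This is only a sketch: the function name probe, its parameters and the 30-second defaults are chosen for illustration, and it sends a bare GET much like curl does rather than whatever the registration daemon actually sends.

```python
import socket
import time


def probe(host, port, path="/", connect_timeout=30.0, read_timeout=30.0):
    """Time the TCP connect and the wait for the first response bytes."""
    result = {"stage": None, "connect_secs": None,
              "first_byte_secs": None, "status_line": None}
    started = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=connect_timeout)
    except socket.timeout:
        result["stage"] = "connect timeout"
        return result
    except OSError as exc:
        result["stage"] = "connect failed: %s" % exc
        return result
    result["connect_secs"] = time.monotonic() - started

    with sock:
        sock.settimeout(read_timeout)
        request = ("GET %s HTTP/1.1\r\nHost: %s:%d\r\nAccept: */*\r\n"
                   "Connection: close\r\n\r\n" % (path, host, port))
        sent = time.monotonic()
        try:
            sock.sendall(request.encode("ascii"))
            data = sock.recv(4096)  # first chunk of the response, if any
        except socket.timeout:
            result["stage"] = "read timeout"
            return result
        except OSError as exc:
            result["stage"] = "read failed: %s" % exc
            return result

    result["first_byte_secs"] = time.monotonic() - sent
    if not data:
        result["stage"] = "closed without response"
    else:
        result["stage"] = "responded"
        result["status_line"] = data.split(b"\r\n", 1)[0].decode("latin-1")
    return result
```

For example, probe("nsw-brwy-sls1.aarnet.net.au", 8090, "/lookup/records/") would be the equivalent of the curl command above, and running it from both the affected node and a host elsewhere gives directly comparable connect and first-byte times.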
If I try this from a host in the UK, I get something very strange which I've never seen before:
brian@deploy2:~$ time curl -v http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
* About to connect() to nsw-brwy-sls1.aarnet.net.au port 8090 (#0)
* Trying 182.255.120.9...
* connected
* Connected to nsw-brwy-sls1.aarnet.net.au (182.255.120.9) port 8090 (#0)
> GET /lookup/records/ HTTP/1.1
> User-Agent: curl/7.26.0
> Host: nsw-brwy-sls1.aarnet.net.au:8090
> Accept: */*
>
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
* additional stuff not fine transfer.c:1037: 0 0
...
(about once per second)
But if I try "curl -v http://ps-west.es.net:8090/lookup/records/" instead, what I see is a few "additional stuff not fine" lines, followed by a big splurge of JSON, which I think is the entire locator database.
Back on the perfsonar node which is showing the problem:
[root@pfsnr ~]# grep http /var/log/perfsonar/ls_registration_daemon.log | grep -v kenet
2015/09/08 12:05:05 (8604) INFO> daemon.pl:170 main:: - Initial LS URL set to http://nsw-brwy-sls1.aarnet.net.au:8090/lookup/records/
2015/09/08 18:05:48 (8675) INFO> daemon.pl:349 main::handle_site - LS URL changed to http://ps-sls.sanren.ac.za:8090/lookup/records
[root@pfsnr ~]#
So it looks like it's learned the ZA URL again, but logs show it's still trying to connect to the AU one.
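The mismatch can be checked mechanically by replaying the log in time order and flagging every connect failure aimed at a host other than the one in the most recently announced LS URL. The sketch below assumes the two line formats shown above ("LS URL set/changed to ..." and "Can't connect to host:port"); stale_targets and the regular expressions are names invented here, and it only sees the lines it is given, so any LS URL lines filtered out beforehand (such as the kenet ones) would be missing from the picture.

```python
import re
from urllib.parse import urlparse

# "<date> <time> ... LS URL set to <url>" or "... LS URL changed to <url>"
URL_RE = re.compile(r"^(\S+ \S+) .*LS URL (?:set|changed) to (\S+)")
# "<date> <time> ... Can't connect to <host>:<port>"
CONNECT_RE = re.compile(r"^(\S+ \S+) .*Can't connect to (\S+?):(\d+)")


def stale_targets(lines):
    """Return (timestamp, contacted_host, announced_host) for each connect
    failure that targets a host other than the latest announced LS host."""
    events = []
    for line in lines:
        m = URL_RE.search(line)
        if m:
            events.append((m.group(1), "url", urlparse(m.group(2)).hostname))
            continue
        m = CONNECT_RE.search(line)
        if m:
            events.append((m.group(1), "connect", m.group(2)))

    # Timestamps like "2015/09/08 21:47:19" sort correctly as plain strings.
    events.sort(key=lambda e: e[0])

    current = None
    stale = []
    for timestamp, kind, host in events:
        if kind == "url":
            current = host
        elif current is not None and host != current:
            stale.append((timestamp, host, current))
    return stale
```

Note that "read timeout" lines carry no host name, so only the "Can't connect to" failures can be attributed this way.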
All very odd!
Regards,
Brian.