perfsonar-user - Re: [perfsonar-user] debugging ls_(cache|registration)_daemon

Subject: perfSONAR User Q&A and Other Discussion

  • From: Jason Zurawski
  • To: Carl Hayter
  • Cc: perfsonar-user
  • Subject: Re: [perfsonar-user] debugging ls_(cache|registration)_daemon
  • Date: Sat, 12 Feb 2011 09:53:20 +0100
  • Organization: Internet2

Hi Carl;

On 2/12/11 7:24 AM, Carl Hayter wrote:
Is hard.

http://www.perfsonar.net/ls.cache.hints contains sources that are down,
and the sources that are up have old differing cache.tgz files.

$ HEAD -e http://psvis0.internet2.edu/perfAdmin/cache.tgz
200 OK
Connection: close
Date: Sat, 12 Feb 2011 05:38:09 GMT
Accept-Ranges: bytes
ETag: "6095b-b033-494dd9a2a7680"
Server: Apache/2.2.3 (Red Hat)
Content-Length: 45107
Content-Type: application/x-gzip
Last-Modified: Fri, 12 Nov 2010 16:30:02 GMT
Client-Date: Sat, 12 Feb 2011 05:38:09 GMT
Client-Peer: 207.75.164.126:80
Client-Response-Num: 1

$ HEAD -e http://ps4.es.net/cache.tgz
200 OK
Connection: close
Date: Sat, 12 Feb 2011 05:39:03 GMT
Accept-Ranges: bytes
ETag: "7ff302-1a36d-4d405965"
Server: Apache/1.3.42 (Unix)
Content-Encoding: x-gzip
Content-Length: 107373
Content-Type: application/x-tar
Last-Modified: Wed, 26 Jan 2011 17:27:01 GMT
Client-Date: Sat, 12 Feb 2011 05:39:03 GMT
Client-Peer: 198.124.238.242:80
Client-Response-Num: 1

Old Last-Modified times, different Content-Length.

$ tar ztf ps4.tgz | wc -l
22
$ tar ztf psvis0.tgz | wc -l
19

$ diff -r -q ps4 psv | fgrep Only
Only in ps4: list.gridftp
Only in ps4: list.traceroute_ma
Only in ps4: pinger-gui.json


You are 100% correct. It looks like the Internet2 machine has been failing for some time (this is the same machine that wasn't running http a week or so ago, and it has some more problems). I gave it some TLC, and here is the result:

-bash-3.2# HEAD -e http://psvis0.internet2.edu/perfAdmin/cache.tgz
200 OK
Connection: close
Date: Sat, 12 Feb 2011 08:51:56 GMT
Accept-Ranges: bytes
ETag: "607b2-7dae-49c11e445cf40"
Server: Apache/2.2.3 (Red Hat)
Content-Length: 32174
Content-Type: application/x-gzip
Last-Modified: Sat, 12 Feb 2011 08:49:25 GMT
Client-Date: Sat, 12 Feb 2011 08:51:56 GMT
Client-Peer: 207.75.164.126:80
Client-Response-Num: 1


I only noticed when the script I had written stopped fetching new cache
files. I copied from the perfSONAR_PS::LSCacheDaemon::LSCacheHandler
module logic, which seems to handle a case badly.

foreach URL
    if same as last URL used, use HTTP-ETAG and HTTP-Last-Modified values.
    fetch URL
    if ERROR, next URL
    if NEW, save
    last
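
The loop above can be sketched roughly as follows in Python. This is only an illustration of the described behaviour, not the actual LSCacheHandler code (which is Perl); the function name, the `fetch` callback and the status strings "error", "not_modified" and "new" are all made up for the example.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical fetch result: (status, payload), where status is one of
# "error", "not_modified" or "new". These names are assumptions.
FetchResult = Tuple[str, Optional[bytes]]
Fetcher = Callable[[str, Optional[Dict[str, str]]], FetchResult]


def fetch_first_answering(
    urls: Iterable[str],
    fetch: Fetcher,
    last_url: Optional[str] = None,
    validators: Optional[Dict[str, str]] = None,
) -> Tuple[Optional[str], str, Optional[bytes]]:
    """Return (url, status, payload) from the first URL that does not error."""
    for url in urls:
        # ETag / Last-Modified values are only sent to the URL used last time.
        cond = validators if url == last_url else None
        status, payload = fetch(url, cond)
        if status == "error":
            continue  # "if ERROR, next URL"
        # "if NEW, save" is left to the caller; either way the loop ends
        # here ("last"), even when the answer was "not_modified".
        return url, status, payload
    return None, "error", None
```

With a stale but reachable server listed first, this function returns that server's "not_modified" answer and never contacts the later URLs.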

So, if a URL keeps serving up an old cache tar file without error, the
daemon will not look any further for a new cache tar file. My script,
which had been lucky and was fetching a fresh cache tar file every hour,
suddenly started seeing only the old version. In the log below, (NEW|OLD)
indicates whether an updated cache file was fetched, and (OK|FAIL)
indicates whether my server was found in the appropriate list.* files.

...
Thu Feb 10 00:03:02 PST 2011 NEW OK
Thu Feb 10 01:03:02 PST 2011 NEW OK
Thu Feb 10 02:03:02 PST 2011 NEW FAIL
Thu Feb 10 03:03:02 PST 2011 NEW FAIL
Thu Feb 10 04:03:02 PST 2011 NEW FAIL
Thu Feb 10 05:03:03 PST 2011 NEW FAIL
Thu Feb 10 06:03:02 PST 2011 NEW FAIL
Thu Feb 10 07:03:24 PST 2011 NEW FAIL
Thu Feb 10 08:03:23 PST 2011 OLD FAIL
Thu Feb 10 09:03:28 PST 2011 OLD FAIL
Thu Feb 10 10:03:23 PST 2011 OLD FAIL
Thu Feb 10 11:03:23 PST 2011 OLD FAIL
Thu Feb 10 12:03:23 PST 2011 OLD FAIL
Thu Feb 10 13:03:22 PST 2011 OLD FAIL
Thu Feb 10 14:03:22 PST 2011 OLD FAIL
Thu Feb 10 15:03:23 PST 2011 OLD FAIL
Thu Feb 10 16:03:23 PST 2011 OLD FAIL
Thu Feb 10 17:03:23 PST 2011 OLD FAIL
...

It seems that the hints file should contain reachable hosts (as much as
possible), the cache tar files should be the same across the hosts
(and hopefully serve up valid Last-Modified times), and LSCacheDaemon
should try harder to find a current file rather than stopping when it
gets a not-error, not-new cache tar file (maybe get the last modified
times of the URLs in the hints file and choose the most recent).
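
A minimal sketch of that last idea, assuming the Last-Modified header of each hint URL has already been collected (for example with a HEAD request). The function name and the input mapping are hypothetical; unreachable hosts are represented by None and are skipped.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshest_url(last_modified: Mapping[str, Optional[str]]) -> Optional[str]:
    """Pick the URL whose Last-Modified header is the most recent."""
    best_url, best_time = None, None
    for url, header in last_modified.items():
        if header is None:
            continue  # host was down or sent no Last-Modified header
        try:
            stamp = parsedate_to_datetime(header)
        except (TypeError, ValueError):
            continue  # unparsable date: treat like a missing header
        if stamp.tzinfo is None:
            # Treat dates without a usable zone as UTC so they compare cleanly.
            stamp = stamp.replace(tzinfo=timezone.utc)
        if best_time is None or stamp > best_time:
            best_url, best_time = url, stamp
    return best_url


# Header values taken from the two HEAD outputs earlier in this message.
headers = {
    "http://psvis0.internet2.edu/perfAdmin/cache.tgz": "Fri, 12 Nov 2010 16:30:02 GMT",
    "http://ps4.es.net/cache.tgz": "Wed, 26 Jan 2011 17:27:01 GMT",
}
print(freshest_url(headers))  # http://ps4.es.net/cache.tgz
```

This only helps if the servers report honest Last-Modified times, which is the caveat already noted above.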


I will open an issue on this, and we will work on correcting the logic. Thanks for the debugging help!

-jason


