perfsonar-user - RE: [perfsonar-user] perfSONAR host stop report the test results

Subject: perfSONAR User Q&A and Other Discussion

  • From: "Garnizov, Ivan (RRZE)" <>
  • To: Pedro Reis <>, "" <>
  • Subject: RE: [perfsonar-user] perfSONAR host stop report the test results
  • Date: Wed, 14 Dec 2016 12:51:52 +0000
  • Accept-language: en-GB, de-DE, en-US

Hello Pedro,
 
Unfortunately, this is somewhat expected. Once the stalled data expires, there is no easy way to recover it (I am not sure there is any way at all).
Please stand by. The next pS release takes a different approach, which also aims to improve or fix this problem.
 
Regards,
Ivan Garnizov
 
GEANT SA1T2: pS deployments GN Operations
GEANT SA2T3: pS development team
GEANT SA3T5: eduPERT team
 
 
 
-----Original Message-----
From: Pedro Reis []
Sent: Wednesday, 14 December 2016 09:29
To: Garnizov, Ivan (RRZE);
Subject: Re: [perfsonar-user] perfSONAR host stop report the test results
 
Hello Ivan, All,
 
Thanks for the info.
The system eventually recovered after a few hours of processing the stale
lockfiles.
 
Now, I was expecting to see the results in the MA, but I'm getting a big
blank. On both toolkits the readings are there, but it looks like they
didn't manage to (re)send them to the MA, or the MA didn't process the
information correctly :(
 
Com meus melhores cumprimentos | Best Regards
Pedro Reis
Área de Serviços de Rede | Network Services Area
FCT|FCCN
Av. do Brasil, n.º 101
1700-066 Lisboa - Portugal
Telefone|Phone +351 218 440 100; Fax +351 218 472 167
 
On 2016/12/13 09:53, Garnizov, Ivan (RRZE) wrote:
> Hi Pedro,
> Generally, all the scheduled tests on the toolkit generate files in the
> /var/lib/perfsonar/regulartesting/ folder.
> I would suggest first suspending the regulartesting service for a while,
> then stopping all active processes that try to write to that folder and
> (optionally) cleaning up after them.
> A very good option for you could be to simply restart the server, but first
> make a note of the folders that currently exist in /var/lib/perfsonar/regulartesting/
> You might want to clean those up afterwards. You will not have easy options
> for doing this after the restart.
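One way to "make a note" of the current folders before a restart is sketched below. It is only an illustration, not part of the toolkit: the function names `snapshot_queue_dirs` and `format_snapshot` are made up for this example, and the only detail taken from the thread is the /var/lib/perfsonar/regulartesting/ path. The script just reads the directory tree and prints a file count per folder; it deletes nothing.

```python
import os
import sys

# Path quoted in the message above; pass a different one as the first argument.
REGULAR_TESTING_DIR = "/var/lib/perfsonar/regulartesting"


def snapshot_queue_dirs(root=REGULAR_TESTING_DIR):
    """Return {directory relative to root: number of files directly inside it}."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        snapshot[os.path.relpath(dirpath, root)] = len(filenames)
    return snapshot


def format_snapshot(snapshot):
    """Render the snapshot as one 'count  directory' line per folder, sorted by name."""
    return "\n".join(
        f"{count:8d}  {name}" for name, count in sorted(snapshot.items())
    )


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else REGULAR_TESTING_DIR
    if os.path.isdir(root):
        print(format_snapshot(snapshot_queue_dirs(root)))
    else:
        print(f"{root} is not a directory; nothing to record")
```

Redirecting the output to a file before the restart gives a list to compare against afterwards.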
> Regards,
> Ivan Garnizov
> /GEANT SA1T2: pS deployments GN Operations/
> /GEANT SA2T3: pS development team/
> /GEANT SA3T5: eduPERT team/
> -----Original Message-----
> From:
> [] On Behalf Of Pedro Reis
> Sent: Monday, 12 December 2016 16:38
> To:
> Subject: Re: [perfsonar-user] perfSONAR host stop report the test results
> Hello all,
> I would like to report a similar situation.
> I have two toolkits that send their measurements to a single Archive.
> Last Saturday around 8 AM the MA had a slight problem and stopped
> processing information (I had to re-set the Python environment again).
> Since I fixed the MA, the two toolkits have had regulartesting.log
> full of messages like this:
> 2016/12/12 15:11:02 (42410) WARN> regulartesting.pl:103 main::__ANON__ -
> Warned: IPC::DirQueue: killed stale lockfile:
> /var/lib/perfsonar/regulartesting/esmond_latency_<MA_FQDN_HERE>/active/active/50.20161212035623328938.EMjQ4Mw
> at /usr/share/perl5/IPC/DirQueue.pm line 519.
> And the regulartesting process is using up almost all the CPU:
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 1425 perfsona  20   0  237m 112m 1244 R 99.8  0.7   2:29.82 regulartesting.
>   935 cassandr  20   0 21.6g 1.4g  16m S 29.0  9.4   3:20.16 java
> 2505 apache    20   0  673m  46m 4712 S 11.2  0.3   0:03.48 httpd
> I'm hoping the regular testing is just processing the data from the
> failed connections/measurements from when the MA was down,
> because I looked for other errors/problems in the other logs and didn't
> find anything relevant!
> To date I'm still not seeing any new measurements at the MA :(
> Com meus melhores cumprimentos | Best Regards
> Pedro Reis
> Área de Serviços de Rede | Network Services Area
> FCT|FCCN
> Av. do Brasil, n.º 101
> 1700-066 Lisboa - Portugal
> Telefone|Phone +351 218 440 100; Fax +351 218 472 167
> On 2016/06/19 18:15, Lixin Liu wrote:
>> It appears the problem is gone. I'm not sure how, but test results are available again.
>>
>> Thanks,
>>
>> Lixin.
>>
>> On 2016-06-18, 10:14 PM, "Lixin Liu" <> wrote:
>>
>> Hi,
>>
>> One of my latency hosts stopped reporting test results starting sometime early today.
>> I see the load on the process
>>
>>        perfSONAR Regular Testing: Measurement Archive: esmond_latency_localhost
>>
>> is always at 100% and regulartesting.log keeps showing errors like this:
>>
>> 2016/06/18 21:48:23 (2656) WARN> regulartesting.pl:103 main::__ANON__ - Warned: IPC::DirQueue: killed stale lockfile: /var/lib/perfsonar/regulartesting/esmond_latency_localhost/active/active/50.20160618092329343238.EMjI4OQ at /usr/share/perl5/IPC/DirQueue.pm line 519.
>> 2016/06/18 21:48:24 (2656) WARN> regulartesting.pl:103 main::__ANON__ - Warned: IPC::DirQueue: killed stale lockfile: /var/lib/perfsonar/regulartesting/esmond_latency_localhost/active/active/50.20160618092329700337.EMjI5MQ at /usr/share/perl5/IPC/DirQueue.pm line 519.
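To see which queue the warnings concentrate on, they can be tallied per queue directory. The sketch below is a hedged illustration only: the regular expression is derived solely from the two warning lines quoted above, the helper names `queue_name` and `count_stale_locks` are made up for this example, and it assumes each warning sits on a single line, as in the raw log. The log file path is not assumed; it is read from the command line.

```python
import re
import sys
from collections import Counter

# Root path as it appears in the quoted warnings.
QUEUE_ROOT = "/var/lib/perfsonar/regulartesting/"

# Matches the lockfile path in a "killed stale lockfile" warning.
STALE_LOCK_RE = re.compile(r"killed stale lockfile:\s+(?P<path>\S+)")

# The two warnings quoted above, used as sample input.
SAMPLE_LINES = [
    "2016/06/18 21:48:23 (2656) WARN> regulartesting.pl:103 main::__ANON__ - "
    "Warned: IPC::DirQueue: killed stale lockfile: "
    "/var/lib/perfsonar/regulartesting/esmond_latency_localhost/active/active/"
    "50.20160618092329343238.EMjI4OQ at /usr/share/perl5/IPC/DirQueue.pm line 519.",
    "2016/06/18 21:48:24 (2656) WARN> regulartesting.pl:103 main::__ANON__ - "
    "Warned: IPC::DirQueue: killed stale lockfile: "
    "/var/lib/perfsonar/regulartesting/esmond_latency_localhost/active/active/"
    "50.20160618092329700337.EMjI5MQ at /usr/share/perl5/IPC/DirQueue.pm line 519.",
]


def queue_name(lockfile_path, root=QUEUE_ROOT):
    """Return the first path component below root, or None if the path is elsewhere."""
    if not lockfile_path.startswith(root):
        return None
    return lockfile_path[len(root):].split("/", 1)[0]


def count_stale_locks(lines):
    """Count 'killed stale lockfile' warnings per queue directory."""
    counts = Counter()
    for line in lines:
        match = STALE_LOCK_RE.search(line)
        if match:
            name = queue_name(match.group("path"))
            if name is not None:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Log file path supplied by the user; no default location is assumed.
        with open(sys.argv[1], errors="replace") as handle:
            totals = count_stale_locks(handle)
    else:
        totals = count_stale_locks(SAMPLE_LINES)
    for name, count in totals.most_common():
        print(f"{count:8d}  {name}")
```

Running it a few minutes apart shows whether the count is still growing, which may indicate whether the backlog is still being worked through.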
>>
>> I hope someone can help me figure out what needs to be done to resolve this
>> problem. The hostname of the machine is lat-usask.westgrid.ca.
>>
>> Thanks,
>>
>> Lixin Liu
>> Compute Canada & WestGrid
 
 


