
perfsonar-user - RE: [perfsonar-user] perfSONAR host stop report the test results

Subject: perfSONAR User Q&A and Other Discussion

  • From: "Garnizov, Ivan (RRZE)" <>
  • To: Pedro Reis <>, "" <>
  • Subject: RE: [perfsonar-user] perfSONAR host stop report the test results
  • Date: Tue, 13 Dec 2016 09:53:05 +0000

Hi Pedro,
 
Generally, all scheduled tests on the toolkit generate files in the /var/lib/perfsonar/regulartesting/ folder.
I would suggest first suspending the regulartesting service for a while, then stopping any active processes that are still trying to write to that folder, and (optionally) cleaning up after them.
A very good option for you could be to simply restart the server, but first make a note of the folders that currently exist in /var/lib/perfsonar/regulartesting/
You might want to clean those up afterwards, and it will not be easy to identify them once the server has restarted.
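
For noting the existing folders before the restart, something along these lines might help. It is only a minimal sketch: the function names and the snapshot file location are hypothetical, not part of perfSONAR, and it only lists folders; it does not delete anything.

```python
import json
from pathlib import Path


def save_snapshot(root, snapshot_file):
    """Record every directory currently under `root` (relative paths) as JSON."""
    root = Path(root)
    entries = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )
    Path(snapshot_file).write_text(json.dumps(entries, indent=2))
    return entries


def dirs_from_before(root, snapshot_file):
    """Return the recorded directories that still exist under `root`.

    These are the folders that were present when the snapshot was taken,
    i.e. the candidates for manual cleanup after the restart.
    """
    root = Path(root)
    recorded = json.loads(Path(snapshot_file).read_text())
    return [d for d in recorded if (root / d).is_dir()]
```

Before the restart one would call save_snapshot("/var/lib/perfsonar/regulartesting", "/root/regulartesting-dirs.json") (the second path is just an example), and afterwards dirs_from_before(...) with the same arguments to see which of the old folders are still there.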
 
Regards,
Ivan Garnizov
 
GEANT SA1T2: pS deployments GN Operations
GEANT SA2T3: pS development team
GEANT SA3T5: eduPERT team
 
 
 
-----Original Message-----
From: [] On Behalf Of Pedro Reis
Sent: Monday, 12 December 2016 16:38
To:
Subject: Re: [perfsonar-user] perfSONAR host stop report the test results
 
Hello all,
 
I would like to report a similar situation.
 
I have 2 toolkits that send their measurements to a single Archive.
Last Saturday around 8 AM the MA had a slight problem and stopped
processing information (I had to re-set the Python environment again).
 
Since I fixed the MA, the regulartesting.log on both toolkits has been
full of messages like this:
2016/12/12 15:11:02 (42410) WARN> regulartesting.pl:103 main::__ANON__ -
Warned: IPC::DirQueue: killed stale lockfile:
/var/lib/perfsonar/regulartesting/esmond_latency_<MA_FQDN_HERE>/active/active/50.20161212035623328938.EMjQ4Mw
at /usr/share/perl5/IPC/DirQueue.pm line 519.
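
To get a feel for which queue folder these warnings come from, the log could be summarised with a small script like the one below. This is only a sketch under assumptions: it relies on the message format shown above, the function name is made up for illustration, and the location of regulartesting.log has to be supplied by whoever runs it.

```python
import re
from collections import Counter

# Matches the path that follows "killed stale lockfile:" in the warning above.
STALE_RE = re.compile(r"killed stale lockfile:\s*(\S+)")
BASE = "/var/lib/perfsonar/regulartesting/"


def count_stale_lockfiles(lines):
    """Count 'killed stale lockfile' warnings per queue folder under BASE."""
    counts = Counter()
    for line in lines:
        match = STALE_RE.search(line)
        if match is None:
            continue
        path = match.group(1)
        if path.startswith(BASE):
            # First path component below BASE, e.g. "esmond_latency_localhost".
            queue = path[len(BASE):].split("/", 1)[0]
        else:
            queue = path
        counts[queue] += 1
    return counts
```

It can be fed an open file, for example count_stale_lockfiles(open(log_path)), where log_path is wherever regulartesting.log lives on the host.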
 
And the regulartesting process is using up almost all the CPU
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
1425 perfsona  20   0  237m 112m 1244 R 99.8  0.7   2:29.82 regulartesting.
  935 cassandr  20   0 21.6g 1.4g  16m S 29.0  9.4   3:20.16 java
2505 apache    20   0  673m  46m 4712 S 11.2  0.3   0:03.48 httpd
 
 
I'm hoping the regular testing is just processing the data from the
failed connections/measurements from when the MA was down,
because I looked for other errors/problems in the other logs and didn't
find anything relevant!
 
So far I'm still not seeing any new measurements at the MA :(
 
Com meus melhores cumprimentos | Best Regards
Pedro Reis
Área de Serviços de Rede | Network Services Area
FCT|FCCN
Av. do Brasil, n.º 101
1700-066 Lisboa - Portugal
Telefone|Phone +351 218 440 100; Fax +351 218 472 167
 
On 2016/06/19 18:15, Lixin Liu wrote:
> It appears the problem is gone; I'm not sure how, but test results are available again.
>
> Thanks,
>
> Lixin.
>
> On 2016-06-18, 10:14 PM, "Lixin Liu" <> wrote:
>
> Hi,
>
> One of my latency hosts stopped reporting test results starting sometime early today.
> I see the load on the process
>
>        perfSONAR Regular Testing: Measurement Archive: esmond_latency_localhost
>
> is always at 100% and regulartesting.log continues showing errors like this:
>
> 2016/06/18 21:48:23 (2656) WARN> regulartesting.pl:103 main::__ANON__ - Warned: IPC::DirQueue: killed stale lockfile: /var/lib/perfsonar/regulartesting/esmond_latency_localhost/active/active/50.20160618092329343238.EMjI4OQ at /usr/share/perl5/IPC/DirQueue.pm line 519.
> 2016/06/18 21:48:24 (2656) WARN> regulartesting.pl:103 main::__ANON__ - Warned: IPC::DirQueue: killed stale lockfile: /var/lib/perfsonar/regulartesting/esmond_latency_localhost/active/active/50.20160618092329700337.EMjI5MQ at /usr/share/perl5/IPC/DirQueue.pm line 519.
>
> I hope someone could help me to figure out what needs to be done to resolve this
> problem. The hostname of the machine is lat-usask.westgrid.ca.
>
> Thanks,
>
> Lixin Liu
> Compute Canada & WestGrid
 
 


