
perfsonar-user - Re: [perfsonar-user] Lots of "worker-still-working" log messages

Subject: perfSONAR User Q&A and Other Discussion

  • From: Aaron Brown
  • To: Alan Whinery
  • Subject: Re: [perfsonar-user] Lots of "worker-still-working" log messages
  • Date: Tue, 10 Mar 2015 12:33:39 +0000

Hey Alan,

My only guess is that it’s not able to keep up with the writing or something
like that, and it eventually hit a breaking point. Do the ‘active’ files
build up over time?
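
A rough way to check whether entries pile up is to count the job files under each queue directory and note the oldest one. The following is only a sketch, not part of perfSONAR: it assumes the base path and directory layout shown in the listings quoted below, and it assumes the 20-digit field in names such as 50.20150304171242934170.EMTIxN is a YYYYMMDDHHMMSS timestamp followed by microseconds. The names summarize and parse_job_time are made up for illustration.

```python
import os
import re
from datetime import datetime

# Assumption: base directory as mentioned in this thread.
BASE = "/var/lib/perfsonar/regular_testing"

# Assumption: job files are named <priority>.<20-digit timestamp>.<suffix>,
# as in "50.20150304171242934170.EMTIxN" from the listing.
JOB_NAME = re.compile(r"^(\d+)\.(\d{20})\.(.+)$")


def parse_job_time(name):
    """Return the timestamp embedded in a job file name, or None if it does not match."""
    match = JOB_NAME.match(name)
    if match is None:
        return None
    try:
        # 14 digits of date and time, then 6 digits of microseconds.
        return datetime.strptime(match.group(2), "%Y%m%d%H%M%S%f")
    except ValueError:
        return None


def summarize(base):
    """Map '<queue>/<state>' (e.g. 'esmond_latency_localhost/active')
    to (job_count, oldest_timestamp), counting only files that look like jobs."""
    summary = {}
    for dirpath, _dirnames, filenames in os.walk(base):
        times = [t for t in map(parse_job_time, filenames) if t is not None]
        if not times:
            continue
        # Group deeply nested directories under their first two path components.
        parts = os.path.relpath(dirpath, base).split(os.sep)
        key = "/".join(parts[:2])
        count, oldest = summary.get(key, (0, None))
        newest_min = min(times)
        if oldest is None or newest_min < oldest:
            oldest = newest_min
        summary[key] = (count + len(times), oldest)
    return summary


if __name__ == "__main__":
    for key, (count, oldest) in sorted(summarize(BASE).items()):
        print(f"{count:8d}  oldest={oldest}  {key}")
```

Running it a few times, some minutes apart, and comparing the counts would show whether a directory keeps growing.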

Cheers,
Aaron

> On Mar 9, 2015, at 9:21 PM, Alan Whinery wrote:
>
> Hi Aaron,
>
> I did try that on Friday, and it didn't help. Later in the day I ran
> "rm -rf *" in /var/lib/perfsonar/regular_testing with regular_testing
> stopped, and that did help. Sorry to skip to the end, but I didn't want
> Monday to come without some data to query. Working on clients to extract
> results.
>
> -Alan
>
>
>
> On 3/5/2015 4:42 AM, Aaron Brown wrote:
>> Hey Alan,
>>
>> Try removing all the files in the ‘failed’ directory
>> (./esmond_traceroute_localhost/failed/), and restart regular_testing.
>>
>> Cheers,
>> Aaron
>>
>>> On Mar 4, 2015, at 2:05 PM, Alan Whinery wrote:
>>>
>>> On 3/4/2015 8:59 AM, Aaron Brown wrote:
>>>> Do a “find” on that directory. The IPC::DirQueue module creates a bunch
>>>> of deeply nested directories/files.
>>>>
>>>> Cheers,
>>>> Aaron
>>>> [root@uhmanoa-dl regular_testing]# find | wc -l
>>>> 882256
>>> [root@uhmanoa-dl regular_testing]# find
>>> .
>>> ./owamp_7qHIi
>>> ./owamp_7qHIi/.powlock
>>> ./owamp_upwyY
>>> ./owamp_upwyY/.powlock
>>> ./owamp_2hu2R
>>> ./owamp_2hu2R/.powlock
>>> ./owamp_xmZFB
>>> ./owamp_xmZFB/.powlock
>>> ./owamp_lBYgk
>>> ./owamp_lBYgk/.powlock
>>> ./owamp_s7lKI
>>> ./owamp_s7lKI/.powlock
>>> ./owamp_O3Fsh
>>> ./owamp_O3Fsh/.powlock
>>> ./owamp_9rE5X
>>> ./owamp_9rE5X/.powlock
>>> ./owamp_fw0nn
>>> ./owamp_fw0nn/.powlock
>>> ./owamp_9sD7v
>>> ./owamp_9sD7v/.powlock
>>> ./owamp_sqObj
>>> ./owamp_sqObj/.powlock
>>> ./owamp_RdDjR
>>> ./owamp_RdDjR/.powlock
>>> ./owamp_f0BZ7
>>> ./owamp_f0BZ7/.powlock
>>> ./esmond_traceroute_localhost
>>> ./esmond_traceroute_localhost/failed
>>> ./esmond_traceroute_localhost/failed/active
>>> ./esmond_traceroute_localhost/failed/tmp
>>> ./esmond_traceroute_localhost/failed/queue
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304171242934170.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304184801547595.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304183258046449.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304190304313687.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304175751385462.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304185803931956.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304185302785567.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304190304297883.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304180252641367.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304171743163298.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304182757887376.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304190304295024.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304172745463763.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304173747793975.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304180753833117.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304181755156840.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304183759200465.EMTIxN
>>> ./esmond_traceroute_localhost/failed/queue/50.20150304181254976696.EMTIxN
>>> (...)
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150304102312612095.EMTIw
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150302121426894022.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150228083049413614.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150303162126192173.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150303163601128256.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150301234615750704.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150303003842932206.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150303094156425266.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150227135214730842.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150228222430692718.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150227105126392451.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150301151024177772.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150303155054582312.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150302181855084398.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150303064029291617.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150301152731481204.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150228005552798088.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150303120757245309.EMTI1
>>> ./esmond_latency_localhost/active/data/N/Q/50.20150227092609672677.EMTI1
>>> (...)
>>> ./owamp_TZnhf
>>> ./owamp_TZnhf/15591117766712423782_15591118005231958020.owp
>>> ./owamp_TZnhf/.powlock
>>> ./owamp_TZnhf/15591117766712423782_15591118005231958020.sum
>>> ./owamp_uhpRF
>>> ./owamp_uhpRF/15591117982957283008_15591118007035279595.owp
>>> ./owamp_uhpRF/.powlock
>>> ./owamp_uhpRF/15591117982957283008_15591118007035279595.sum
>>> ./owamp_6Jvkt
>>> ./owamp_6Jvkt/.powlock
>>>
>>>
>>>>> On Mar 4, 2015, at 1:44 PM, Alan Whinery wrote:
>>>>>
>>>>> On 3/4/2015 8:40 AM, Alan Whinery wrote:
>>>>>> On 3/4/2015 8:28 AM, Aaron Brown wrote:
>>>>>>> Hey Alan,
>>>>>>>
>>>>>>> Are there a bunch of files in /var/lib/perfsonar/regular_testing?
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Aaron
>>>>> Or perhaps I should have said directories...
>>>>>
>>>>> "ls *" gets 125 directories, 54 regular files.
>>>>>
>>>>> [root@uhmanoa-dl regular_testing]# ls -l
>>>>> total 500
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:11 bwctl_0BQRw
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:17 bwctl_0QH8E
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:32 bwctl_5k0Wq
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:36 bwctl_71FtU
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 bwctl_AhVhc
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:07 bwctl_E9aEh
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:01 bwctl_EV9Z6
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:07 bwctl_EYlKB
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 08:58 bwctl_FlgvC
>>>>> drwx------ 2 perfsonar perfsonar 4096 Feb 19 18:47 bwctl_gBppB
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:15 bwctl_kMDVi
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:38 bwctl_l4ksQ
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:07 bwctl_Mybgb
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:38 bwctl_QzaBA
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:34 bwctl_SAF_T
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:31 bwctl_TljeT
>>>>> drwx------ 2 perfsonar perfsonar 4096 Feb 19 18:45 bwctl_u3Mum
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:36 bwctl_xOkeh
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:33 bwctl_XSDvn
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:16 bwctl_z4m5O
>>>>> drwxr-x--- 4 perfsonar perfsonar 4096 Dec 9 13:11 esmond_latency_localhost
>>>>> drwxr-x--- 4 perfsonar perfsonar 4096 Dec 9 13:11 esmond_throughput_localhost
>>>>> drwxr-x--- 4 perfsonar perfsonar 4096 Dec 9 13:11 esmond_traceroute_localhost
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_1723R
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_2hu2R
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_3eKCy
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_3zqqB
>>>>> drwx------ 2 perfsonar perfsonar 4096 Dec 24 13:59 owamp_46rTZ
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_5JObr
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_5x2ua
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_65dDj
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_6Jvkt
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_6TEyC
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_7jw_T
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_7qHIi
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_9hu1N
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_9nYtA
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_9rE5X
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_9sD7v
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_9TSfJ
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_AKoMY
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:07 owamp_AYzch
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_AZRoN
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_bBAWm
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:07 owamp_BPHTO
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_bSKCn
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_cH9GH
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_ckzKy
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_CpoHI
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 13 02:00 owamp_DH3y4
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_D_nzz
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_E0uGT
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_e5Qff
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_f0BZ7
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 13 02:00 owamp_f8e2s
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_FEodt
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_fFGwc
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_fjTEN
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:07 owamp_fTuzd
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_fuu1J
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_fUwhB
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 20 07:47 owamp_Fvurb
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_fw0nn
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_GI0vV
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_GlieD
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_GO2E6
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_guJHF
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_hmofi
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_iaPC0
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 20 07:41 owamp_j7HFV
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_jaidn
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_k680N
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_kHXyR
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_kSyX4
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_Ky6te
>>>>> drwx------ 2 perfsonar perfsonar 4096 Dec 24 14:00 owamp_L3zmI
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_lBYgk
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_LkSyt
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_lLC2x
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_Lt422
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_lW5Lf
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_Mb18E
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_n37rE
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_nFqfN
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_Nn5HO
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_NpjgZ
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_NsuYG
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:07 owamp_O3Fsh
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_ocz5q
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_OoMkr
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_P1YCC
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_Pa8Np
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 20 07:43 owamp_pbRb1
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:19 owamp_pCnw8
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_pliZ7
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_ptEtT
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_PTOGl
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_QkteT
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_QP8xX
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 20 07:52 owamp_qsQip
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_RdDjR
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_rdIpw
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_RinR3
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_RL0zw
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp__RQ57
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_s7lKI
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:05 owamp_SovaE
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_sqObj
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_Tp8ar
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_tT_gH
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_TZnhf
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_uCLJM
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_uhpRF
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_upwyY
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_VVOtV
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_wPbKx
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_XA5nh
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_xKcgv
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_xmZFB
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_Y0q75
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 3 09:05 owamp_ykbDt
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_YPYl5
>>>>> drwx------ 2 perfsonar perfsonar 4096 Jan 12 12:20 owamp_z9el1
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:40 owamp_ZlR4u
>>>>> drwx------ 2 perfsonar perfsonar 4096 Mar 4 08:39 owamp_zMQVW
>>>>>
>>>>>
>>>>>
>>>>>>>> On Mar 3, 2015, at 6:51 PM, Alan Whinery wrote:
>>>>>>>>
>>>>>>>> I have a host with pS-Performance Toolkit 3.4.1 (3.4.1-1.pSPS)
>>>>>>>> with a mix of powstream and traceroute tests running.
>>>>>>>>
>>>>>>>> Last Friday, at 23:21:12 I started getting a lot of messages like:
>>>>>>>>
>>>>>>>> 2015/02/26 23:21:12 (30583) DEBUG> DirQueue.pm:44
>>>>>>>> perfSONAR_PS::RegularTesting::DirQueue::worker_still_working -
>>>>>>>> worker_still_working called
>>>>>>>>
>>>>>>>> And the graphs stopped updating.
>>>>>>>>
>>>>>>>> I did find the line that logs the message in the source, but it's not
>>>>>>>> commented.
>>>>>>>>
>>>>>>>> So something is full, or stuck? Rebooting had no effect, disk has
>>>>>>>> 858 GB available...
>>>>>>>>
>>>>>>>> There were no instances of this log message prior to 2015/02/26
>>>>>>>> 23:21:12, grepping back through 7 days of logs.
>>>>>>>>
>



