
perfsonar-user - Re: [perfsonar-user] Re: meshconfig-agent-tasks not scheduling tasks regularly

Subject: perfSONAR User Q&A and Other Discussion

  • From: Casey Russell <>
  • To: Mark Feit <>
  • Cc: Larry Blunk <>, "" <>
  • Subject: Re: [perfsonar-user] Re: meshconfig-agent-tasks not scheduling tasks regularly
  • Date: Fri, 20 Oct 2017 17:07:36 -0500

My apologies Mark, I went ahead and rebooted that host about 11:00am CDT this morning to restore services to it.  So by the time you looked, it would have been fine.  

I've placed your script on that same host (and I'll probably follow up and put it on a couple of others).  We'll see if we can catch one in action.




Sincerely,
Casey Russell
Network Engineer
KanREN
phone: 785-856-9809
2029 Becker Drive, Suite 282
Lawrence, Kansas 66047

On Fri, Oct 20, 2017 at 3:39 PM, Mark Feit <> wrote:

Casey Russell writes:

 

    One of my hosts (ps-ku-bw) has failed to schedule tasks today.  This is one of my larger hosts and the MaxClients problem might have actually been the trigger that began the avalanche.  I've left the host broken in case Mark or one of the other developers wants information from it while it's in this failed state.

 

Sorry it took me so long to get to this; I had some catching up to do after TechEx.

 

As of right now, it looks like that host is fine.  I watched it for several minutes using the monitor (pscheduler monitor --host ps-ku-bw.perfsonar.kanren.net) and saw lots of streaming latency running plus a steady diet of trace and the occasional throughput.

 

 At 9:47am yesterday, the httpd error log showed the following:

 

[root@ps-ku-bw crussell]# tail -f /var/log/httpd/error_log

[Wed Oct 18 09:47:29 2017] [error] server reached MaxClients setting, consider raising the MaxClients setting

 

Since pScheduler depends on Apache to do scheduling, I could see this being the cause of runs not being scheduled.  We should focus on what’s causing the number of connections to get that high.  I wrote up a short shell script that pings pScheduler periodically and captures netstat output if the ping fails.  Leave it running for a while on the offending perfSONAR node and it should catch the conditions under which Apache runs out of connections.  You can download it here:  https://gist.github.com/mfeit-internet2/24ad3bb83a6dd3fdef87fb9469f92a4a
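
For readers who cannot reach the gist, the idea can be sketched roughly as follows. This is not the script linked above, only a minimal illustration of the same approach: it assumes `pscheduler ping localhost` is a suitable health check and that `netstat -tan` is available on the host. The names CHECK_CMD, DUMP_CMD, INTERVAL, LOG and check_once are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch only: NOT the script from the gist. All variable and function
# names here are hypothetical and can be overridden from the environment.
: "${CHECK_CMD:=pscheduler ping localhost}"   # assumed health check
: "${DUMP_CMD:=netstat -tan}"                 # connection snapshot to record
: "${INTERVAL:=60}"                           # seconds between checks
: "${LOG:=/tmp/pscheduler-watch.log}"         # where snapshots are appended

# Run the health check once. On failure, append a timestamp and the
# connection snapshot to $LOG and return 1; on success return 0.
# The commands are expanded unquoted on purpose so they split into words.
check_once() {
    if ! $CHECK_CMD >/dev/null 2>&1; then
        {
            echo "=== $(date) check failed: $CHECK_CMD"
            $DUMP_CMD
        } >>"$LOG" 2>&1
        return 1
    fi
    return 0
}

# Loop forever only when invoked with the argument "loop".
if [ "${1:-}" = "loop" ]; then
    while :; do
        check_once
        sleep "$INTERVAL"
    done
fi
```

Saved as, say, watch.sh, it could be started with `sh watch.sh loop` and left running; whatever lands in the log after a failed check shows which connections were open at that moment.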

 

As long as the central MA isn’t having this problem, any run that completes will be archived correctly since the local server isn’t involved in that process.

 

--Mark

 




