
grouper-users - RE: [grouper-users] RE: Status Monitoring - Two Errors


RE: [grouper-users] RE: Status Monitoring - Two Errors


  • From: "Hyzer, Chris" <>
  • To: "Gettes, Michael" <>, "Black, Carey M." <>
  • Cc: Ryan Rumbaugh <>, "" <>
  • Subject: RE: [grouper-users] RE: Status Monitoring - Two Errors
  • Date: Fri, 7 Sep 2018 13:38:03 +0000

I don’t think it is bad to stop loader jobs abruptly, but I agree that when the loader starts again it should continue with in-progress jobs.  Right?  If we wait until work finishes, how do we define “work”, and will it ever really finish?  If it picks back up where it left off, it should be fine, since things are transactional and not marked as complete until they are complete…  thoughts?

 

Thanks

Chris

 

From: [mailto:] On Behalf Of Gettes, Michael
Sent: Monday, August 27, 2018 12:00 PM
To: Black, Carey M. <>
Cc: Ryan Rumbaugh <>;
Subject: Re: [grouper-users] RE: Status Monitoring - Two Errors

 

I’ve always wanted a quiesce capability.  Something that lets all the current work complete but the current loader instance won’t start any new jobs.  This would be needed for all loader daemons or just specific ones so we can safely take instances down.  I have no idea if this is possible with Quartz and haven’t had a chance to look into it.
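
Looking quickly at the stock Quartz API, both pieces of that pattern do appear to exist at the scheduler level. A rough sketch, assuming you can get a handle on the loader’s underlying Scheduler instance (how Grouper exposes that is not verified here):

import org.quartz.Scheduler;
import org.quartz.SchedulerException;

/**
 * Sketch of a loader "quiesce" using stock Quartz APIs: stop firing new
 * triggers, then shut down only after in-flight jobs have finished.
 * Obtaining the loader's Scheduler instance from Grouper is assumed here.
 */
public class LoaderQuiesceSketch {

    public static void quiesceAndStop(Scheduler scheduler) throws SchedulerException {
        // Stop firing new triggers; jobs already executing keep running.
        scheduler.standby();

        // Block until the currently executing jobs complete, then shut down.
        scheduler.shutdown(true);
    }
}

Scheduler.standby() stops new triggers from firing while letting running jobs finish, and shutdown(true) blocks until they do.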

 

/mrg



On Aug 27, 2018, at 11:20 AM, Black, Carey M. <> wrote:

 

Ryan,

 

RE: “I had been restarting the API daemon” … (due to Docker use)

                I have often wondered how the “shutdown process” works for the daemon. Is it “graceful” (letting all running jobs complete before shutdown), or does it just “pull the plug”?

                                I think it just pulls the plug.

                                Which leaves running jobs as “in progress” (in the DB status table), and they refuse to start immediately when the loader restarts. Well, until the “in progress” record(s) get old enough that they are assumed to be dead. Then the jobs will no longer refuse to start.
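
                A minimal way to see whether that is happening is to look for rows still marked in progress in the loader’s status table. Sketch below; the table name (grouper_loader_log), the column names, and the STARTED/RUNNING status values are assumptions from my install, so check them against your schema:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/** List loader runs still marked in progress (table/column names assumed). */
public class InProgressLoaderJobs {

    public static void main(String[] args) throws Exception {
        // Placeholder JDBC URL/credentials; point this at your Grouper database.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/grouper", "grouper", "secret");
             PreparedStatement ps = conn.prepareStatement(
                 "select job_name, status, started_time, ended_time "
               + "from grouper_loader_log "
               + "where status in ('STARTED', 'RUNNING') "
               + "order by started_time")) {
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.printf("%s  %s  started=%s  ended=%s%n",
                        rs.getString("job_name"), rs.getString("status"),
                        rs.getTimestamp("started_time"), rs.getTimestamp("ended_time"));
                }
            }
        }
    }
}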

 

                I say that to say this:

                                If the loader is restarted repeatedly, quickly, and/or often, you may be interrupting the running jobs and leaving them “in progress” (in the DB), which adds more delay before those jobs will start again. But it all depends on how fast/often those things are spinning up and down.

 

                                However, if you are always spinning up new instances (and letting the old ones run for a bit), you may be able to “wait until a good time” to turn them off.

                                Maybe you could cycle out the old instances gracefully by timing it against these settings?

                                “

                                ##################################

                                ## enabled / disabled cron

                                ##################################

                                

                                #quartz cron-like schedule for enabled/disabled daemon.  Note, this has nothing to do with the changelog

                                #leave blank to disable this, the default is 12:01am, 11:01am, 3:01pm every day: 0 1 0,11,15 * * ?

                                changeLog.enabledDisabled.quartz.cron = 0 1 0,11,15 * * ?

                                “
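
                If you want to see exactly when that schedule fires (so the instance cycling can be timed around it), Quartz’s own CronExpression class will evaluate the string; a small sketch:

import java.util.Date;
import org.quartz.CronExpression;

/** Print the next few fire times of the enabled/disabled daemon's cron. */
public class NextFireTimes {

    public static void main(String[] args) throws Exception {
        // Same expression as changeLog.enabledDisabled.quartz.cron above.
        // Quartz fields: sec min hour day-of-month month day-of-week
        CronExpression cron = new CronExpression("0 1 0,11,15 * * ?");
        Date next = new Date();
        for (int i = 0; i < 5; i++) {
            next = cron.getNextValidTimeAfter(next);
            System.out.println(next);
        }
    }
}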

 

 

RE: how to schedule the “deprovisioningDaemon”

 

                Verify that your grouper-loader.base.properties has this block (or add it to your grouper-loader.properties):

                NOTE: it was added to the default base as of GRP-1623 (which maps to grouper_v2_3_0_api_patch_107, and for the UI, grouper_v2_3_0_ui_patch_44). You are likely past those patches… but just saying. :)

                “

                #####################################

                ## Deprovisioning Job

                #####################################

                otherJob.deprovisioningDaemon.class = edu.internet2.middleware.grouper.app.deprovisioning.GrouperDeprovisioningJob

                otherJob.deprovisioningDaemon.quartzCron = 0 0 2 * * ?

                “
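
                If you would rather not wait for the 2am cron to confirm the job runs, you can kick it off once by hand. The gsh helper used earlier in this thread should work here too: loaderRunOneJob("OTHER_JOB_deprovisioningDaemon"). A Java sketch of the same thing is below; I am assuming the loader API call behind that helper is GrouperLoader.runOnceByJobName, so verify it against your Grouper version:

import edu.internet2.middleware.grouper.GrouperSession;
import edu.internet2.middleware.grouper.app.loader.GrouperLoader;

/** Run the deprovisioning daemon once, outside its quartz schedule (sketch). */
public class RunDeprovisioningOnce {

    public static void main(String[] args) {
        // Loader jobs run with full privileges, so use a root session.
        GrouperSession session = GrouperSession.startRootSession();
        try {
            // Assumed to be the API call behind gsh's loaderRunOneJob(...).
            GrouperLoader.runOnceByJobName(session, "OTHER_JOB_deprovisioningDaemon");
        } finally {
            GrouperSession.stopQuietly(session);
        }
    }
}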

 

HTH.

 

-- 

Carey Matthew 

 

From:  <> On Behalf Of Ryan Rumbaugh
Sent: Monday, August 27, 2018 10:12 AM
To: 
Subject: [grouper-users] RE: Status Monitoring - Two Errors

 

An update to this issue that may be helpful to others…

 

Before I left the office on Friday I ran the gsh command loaderRunOneJob("CHANGE_LOG_changeLogTempToChangeLog"), and now the number of rows in the grouper_change_log_entry_temp table is zero! I had tried running that before, but really didn’t see much of anything happening. Maybe I was just too impatient.

 

Now when accessing grouper/status?diagnosticType=all, the only error is related to “OTHER_JOB_deprovisioningDaemon”. If anyone has any tips on how to get that kick-started, it would be greatly appreciated.

 

 

--

Ryan Rumbaugh

 

From:  <> On Behalf Of Ryan Rumbaugh
Sent: Friday, August 24, 2018 9:15 AM
To: 
Subject: [grouper-users] Status Monitoring - Two Errors

 

Good morning,

 

We would like to begin monitoring the status of grouper by using the diagnostic pages at grouper/status?diagnosticType=all, but before doing so I would like to take care of the two issues shown below.
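
The status servlet is just an HTTP endpoint, so most monitoring tools can poll it directly; roughly the kind of check we have in mind is sketched below (host and context path are placeholders, and anything other than HTTP 200 is treated as a failure):

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Minimal poller for the Grouper status page (host/path are placeholders). */
public class GrouperStatusCheck {

    public static void main(String[] args) throws IOException {
        URL url = new URL("https://grouper.example.edu/grouper/status?diagnosticType=all");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(60_000);

        // The diagnostic page runs its checks during the GET; treat any
        // non-200 response as a failed check for alerting purposes.
        int code = conn.getResponseCode();
        if (code != 200) {
            System.err.println("Grouper status check failed: HTTP " + code);
            System.exit(2);
        }
        System.out.println("Grouper status check OK");
    }
}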

 

Can anyone provide tips/suggestions on how to fix the two failures for CHANGE_LOG_changeLogTempToChangeLog and  OTHER_JOB_deprovisioningDaemon?

 

We had a Java heap issue late last week which I believe caused the “grouper_change_log_entry_temp” table to keep growing. It’s at 69,886 rows currently while earlier this week it was at 50k. Thanks for any insight.

 

 

 

2 errors in the diagnostic tasks:

 

DiagnosticLoaderJobTest, Loader job CHANGE_LOG_changeLogTempToChangeLog

 

DiagnosticLoaderJobTest, Loader job OTHER_JOB_deprovisioningDaemon

 

 

 

Error stack for: loader_CHANGE_LOG_changeLogTempToChangeLog

java.lang.RuntimeException: Cant find a success in job CHANGE_LOG_changeLogTempToChangeLog since: 2018/08/16 14:19:22.000, expecting one in the last 30 minutes
    at edu.internet2.middleware.grouper.j2ee.status.DiagnosticLoaderJobTest.doTask(DiagnosticLoaderJobTest.java:175)
    at edu.internet2.middleware.grouper.j2ee.status.DiagnosticTask.executeTask(DiagnosticTask.java:78)
    at edu.internet2.middleware.grouper.j2ee.status.GrouperStatusServlet.doGet(GrouperStatusServlet.java:180)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:230)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:192)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
    at org.owasp.csrfguard.CsrfGuardFilter.doFilter(CsrfGuardFilter.java:110)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:192)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:478)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:80)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:624)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341)
    at org.apache.coyote.ajp.AjpProcessor.service(AjpProcessor.java:478)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:798)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1441)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)

 

 

Error stack for: loader_OTHER_JOB_deprovisioningDaemon

java.lang.RuntimeException: Cant find a success in job OTHER_JOB_deprovisioningDaemon, expecting one in the last 3120 minutes
    at edu.internet2.middleware.grouper.j2ee.status.DiagnosticLoaderJobTest.doTask(DiagnosticLoaderJobTest.java:173)
    at edu.internet2.middleware.grouper.j2ee.status.DiagnosticTask.executeTask(DiagnosticTask.java:78)
    at edu.internet2.middleware.grouper.j2ee.status.GrouperStatusServlet.doGet(GrouperStatusServlet.java:180)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:230)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:192)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
    at org.owasp.csrfguard.CsrfGuardFilter.doFilter(CsrfGuardFilter.java:110)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:192)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:478)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:80)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:624)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341)
    at org.apache.coyote.ajp.AjpProcessor.service(AjpProcessor.java:478)
    at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
    at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:798)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1441)
    at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)

 

--

Ryan Rumbaugh

 



