
grouper-users - RE: [grouper-users] database loader job with START_TO_START_INTERVAL

Subject: Grouper Users - Open Discussion List

  • From: Chris Hyzer <>
  • To: Scott Koranda <>, grouper-users <>
  • Subject: RE: [grouper-users] database loader job with START_TO_START_INTERVAL
  • Date: Thu, 17 Jul 2014 01:22:05 +0000
  • Accept-language: en-US

Quartz is 1.6.0

The Grouper job implements StatefulJob so multiple instances won't run at once:

/**
* class which will run a loader job
* implements StatefulJob so multiple dont run at once
*/
public class GrouperLoaderJob implements Job, StatefulJob {
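
For illustration only, here is a minimal sketch of that non-concurrent
behavior using plain JDK classes rather than Quartz. The class and method
names (SerializedJobSketch, fire, peakOverlap) are hypothetical and are not
part of Grouper or Quartz. The lock stands in for the scheduler holding back
a second firing of the same job until the first one completes, which is how
the Quartz 1.x documentation describes StatefulJob.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class SerializedJobSketch {

    // One lock per job definition; a second firing waits here.
    private final ReentrantLock lock = new ReentrantLock();
    private final AtomicInteger running = new AtomicInteger();
    private final AtomicInteger maxRunning = new AtomicInteger();

    /** Runs the body, delaying this firing until any earlier one is done. */
    public void fire(Runnable body) {
        lock.lock();
        try {
            int now = running.incrementAndGet();
            maxRunning.accumulateAndGet(now, Math::max);
            try {
                body.run();
            } finally {
                running.decrementAndGet();
            }
        } finally {
            lock.unlock();
        }
    }

    /** Fires the job from several threads; returns the peak overlap seen. */
    public static int peakOverlap(int firings) {
        SerializedJobSketch job = new SerializedJobSketch();
        ExecutorService pool = Executors.newFixedThreadPool(firings);
        for (int i = 0; i < firings; i++) {
            pool.execute(() -> job.fire(() -> pause(50)));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return job.maxRunning.get();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak overlap: " + peakOverlap(4)); // prints "peak overlap: 1"
    }
}
```

Note that in this sketch an overlapping firing is delayed, not dropped: it
runs as soon as the earlier one finishes.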

Just curious, what is it in the three loader jobs that requires them to run
in sequence? I would think if new groups are seldom added and the third job
uses a previous job's groups as members, it would generally still work fine
and would catch up in the next hour, right? All the memberships in the
existing groups would work fine. Can you describe a case where you need the
dependency in the jobs?

I'm not aware of other people having this issue, but I wonder whether, if one
job depended on another, we could have it sleep when it noticed the other job
was running, until that job was done... I would think this would be a rare
requirement, but I'm interested to hear about it :) Or there may be another
way to chain them together.
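
A minimal sketch of the "sleep until the other job is done" idea, in plain
Java. This is not an existing Grouper feature. The running flag, the job
names, and the polling interval are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class DependentJobSketch {

    // Hypothetical flag: true while the upstream ("students") job is running.
    private final AtomicBoolean upstreamRunning = new AtomicBoolean(false);
    private final List<String> log =
            Collections.synchronizedList(new ArrayList<String>());

    /** Upstream job: does its work, then clears the running flag. */
    void runUpstream() {
        try {
            pause(100); // stand-in for the real load
            log.add("students done");
        } finally {
            upstreamRunning.set(false);
        }
    }

    /** Dependent job: sleeps in short intervals while upstream is running. */
    void runDependent(long pollMillis) {
        while (upstreamRunning.get()) {
            pause(pollMillis);
        }
        log.add("instructors done");
    }

    public static List<String> demo() {
        DependentJobSketch sketch = new DependentJobSketch();
        // Mark upstream as running before either thread starts.
        sketch.upstreamRunning.set(true);
        Thread upstream = new Thread(sketch::runUpstream);
        Thread dependent = new Thread(() -> sketch.runDependent(10));
        dependent.start();
        upstream.start();
        try {
            upstream.join();
            dependent.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(sketch.log);
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [students done, instructors done]
    }
}
```

Polling keeps the sketch short. A real implementation would also need a
timeout so the dependent job cannot wait forever on a stuck upstream job.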

Thanks,
Chris

-----Original Message-----
From: Scott Koranda
[mailto:]

Sent: Wednesday, July 16, 2014 2:30 PM
To: Chris Hyzer; grouper-users
Subject: Re: [grouper-users] database loader job with START_TO_START_INTERVAL

Hi,

On Wed, Jul 16, 2014 at 1:04 PM, Chris Hyzer
<>
wrote:
> Why do you not just have loader jobs with a quartz-cron schedule to run
> these hourly?

My concern is that on some days due to new courses being added or
removed the load will be heavy and take much longer than an hour to
complete.

I do not know if the loader scheduler is sophisticated enough to not
run a second instance of a loader job when one is already running, nor
how gracefully the race condition would be handled by Grouper.

> Sorry, I know you explained it, but can you expand? I think quartz will
> not run the same job twice if already running.

Can you tell me precisely which version of Quartz is used in Grouper
2.1.5? I will then try to read up and determine what the expected
behavior is. A pointer to the right place in the source would also be
helpful so I can see what the implementation looks like (unless you
already have this documented, in which case I apologize--I looked for
it but could not find it so please point me to it).

If Quartz will not run the same job twice if already running then that
will really help (and I will document it on the wiki).

> Also, for the dependencies, if the students job isn’t done when the
> instructors job runs, it shouldn’t really matter since it just adds groups
> to other groups and will catch up if there is a race condition in the next
> hour...

It has more to do with latency--if I know the jobs run in series
without any delay in between them then the latency is just that much
reduced and the changes make it to the change log and the custom
change log consumer faster.
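
Running the jobs back to back, with a guard that turns away a new start
while a run is still in progress, could be sketched as follows. This is
plain Java for illustration, not GSH or the loader itself. The class
SeriesRunSketch and its methods are hypothetical, and each Runnable is a
placeholder for one loader job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class SeriesRunSketch {

    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Runs the jobs in order; returns false if a run is already active. */
    public boolean runAll(List<Runnable> jobs) {
        if (!running.compareAndSet(false, true)) {
            return false; // a previous run has not finished yet
        }
        try {
            for (Runnable job : jobs) {
                job.run(); // the next job starts as soon as this one returns
            }
            return true;
        } finally {
            running.set(false);
        }
    }

    public static List<String> demo() {
        SeriesRunSketch runner = new SeriesRunSketch();
        List<String> log = new ArrayList<>();
        List<Runnable> jobs = Arrays.asList(
            () -> log.add("students"),
            () -> {
                log.add("instructors");
                // A start attempt arriving mid-run is turned away.
                log.add("second start accepted: "
                        + runner.runAll(Collections.<Runnable>emptyList()));
            },
            () -> log.add("all"));
        runner.runAll(jobs);
        // Once the run has finished, a new start is accepted again.
        log.add("next start accepted: "
                + runner.runAll(Collections.<Runnable>emptyList()));
        return log;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```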

> Once the jobs are set up they should run quickly and should generally be
> done by the time the next one is scheduled.

My testing shows it depends on the number of rows that need to be
added and deleted. During a term turnover it can be a substantial
increase in time.

Thanks,

Scott

>
> Thanks,
> Chris
>
> -----Original Message-----
> From: Scott Koranda
> [mailto:]
> Sent: Tuesday, July 15, 2014 4:41 PM
> To: Chris Hyzer
> Cc: grouper-users
> Subject: Re: [grouper-users] database loader job with
> START_TO_START_INTERVAL
>
> Hi,
>
> I have decided to take this outside of the loader and instead use the
> 'monit' utility to check once every hour if a GSH script is running.
> The GSH script runs the 3 loader jobs in series that together
> provision the groups we need provisioned (students, instructors, then
> 'all', much like the Penn example in the documentation).
>
> monit will check every hour if the GSH script is running and if not it
> will start it again.
>
> Rather than creating, running, and then destroying the loader jobs on
> the fly I prefer to use loaderRunOneJob() for jobs that are
> permanently saved in Grouper. The issue then is to make sure the
> loader never actually runs the jobs.
>
> I have tried to do that by setting the grouperLoaderQuartzCron for the
> jobs to "0 0 0 * * ? 2099" so that, theoretically, they should only
> run in the year 2099 (by which time I hope to be retired and no longer
> responsible for any group provisioning).
>
> Is there any reason to expect that Grouper and the loader will not
> respect that cron configuration?
>
> Thanks,
>
> Scott
>
> On Wed, Jul 9, 2014 at 2:08 PM, Scott Koranda
> <>
> wrote:
>> Understood.
>>
>> The issue is that we want the loader jobs to run as often as possible
>> to help beat down provisioning latency. But at certain times of the
>> year (when semesters turn over) the amount of work, and hence the
>> amount of time it takes for the loader job to run, spikes. The danger
>> then is that two instances run at the same time.
>>
>> I do not see how to reconcile those requirements/needs with CRON so I
>> was testing the START_TO_START_INTERVAL functionality, but I did not
>> expect random start times.
>>
>> I appreciate any recommendations you can make.
>>
>> Thanks,
>>
>> Scott
>>
>> On Wed, Jul 9, 2014 at 2:03 PM, Chris Hyzer
>> <>
>> wrote:
>>> Hmmmm, yes, you would :)
>>>
>>> For a while now I have always used cron scheduling. Pick a random time
>>> for it to run and run it every hour at that time (so they don’t all run
>>> at the same time). Then you can have a better idea about this :)
>>>
>>> Thanks,
>>> Chris
>>>
>>>
>>> -----Original Message-----
>>> From: Scott Koranda
>>> [mailto:]
>>> Sent: Wednesday, July 09, 2014 3:01 PM
>>> To: Chris Hyzer
>>> Cc: grouper-users
>>> Subject: Re: [grouper-users] database loader job with
>>> START_TO_START_INTERVAL
>>>
>>> Thanks.
>>>
>>> If I kick one off manually do I have to worry about the possibility of
>>> two running simultaneously, or will having one instance running
>>> prevent the other one from running?
>>>
>>> Thanks,
>>>
>>> Scott
>>>
>>> On Wed, Jul 9, 2014 at 1:58 PM, Chris Hyzer
>>> <>
>>> wrote:
>>>> It doesn’t necessarily start when the loader starts:
>>>>
>>>> //start time is a random offset within the interval seconds
>>>> int startSeconds = (int)(Math.random() * intervalSeconds);
>>>> Date startTime = new Date(System.currentTimeMillis() +
>>>> (startSeconds*1000));
>>>>
>>>> Don’t want all START_TO_STARTs to start when the loader starts or you
>>>> could have performance problems :)
>>>>
>>>> If you want to kick one off manually, you can do that from GSH:
>>>>
>>>> https://spaces.internet2.edu/display/Grouper/GrouperShell+%28gsh%29#GrouperShell%28gsh%29-Loader
>>>>
>>>> Thanks,
>>>> Chris
>>>>
>>>> -----Original Message-----
>>>> From: [mailto:] On Behalf Of Scott Koranda
>>>> Sent: Wednesday, July 09, 2014 1:46 PM
>>>> To: grouper-users
>>>> Subject: [grouper-users] database loader job with START_TO_START_INTERVAL
>>>>
>>>> Hi,
>>>>
>>>> I created a loader job that uses START_TO_START_INTERVAL with an
>>>> interval of 3600 seconds.
>>>>
>>>> My understanding is that when I restart the loader process the loader
>>>> job should start immediately, and then run again one hour after it
>>>> completes.
>>>>
>>>> Is that correct?
>>>>
>>>> I do not see any evidence in grouper_error.log (at the INFO level)
>>>> that the loader started the job after it was restarted. Should I?
>>>>
>>>> Thanks,
>>>>
>>>> Scott
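
The randomized first start described earlier in this thread can be tried
out with a self-contained sketch like the one below. It mirrors the quoted
calculation, but the class and method names are hypothetical, and the
Random parameter is added here only so the result can be reproduced in a
test.

```java
import java.util.Date;
import java.util.Random;

public class RandomStartSketch {

    /** First start time: now plus a random offset in [0, intervalSeconds). */
    public static Date firstStart(long nowMillis, int intervalSeconds, Random random) {
        int startSeconds = (int) (random.nextDouble() * intervalSeconds);
        // 1000L keeps the multiplication in long arithmetic.
        return new Date(nowMillis + startSeconds * 1000L);
    }

    /** Delay in milliseconds before the first start, for a given seed. */
    public static long delayMillis(int intervalSeconds, long seed) {
        long now = 0L;
        return firstStart(now, intervalSeconds, new Random(seed)).getTime() - now;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Date start = firstStart(now, 3600, new Random());
        System.out.println("first run in " + (start.getTime() - now) / 1000 + " s");
    }
}
```

With an interval of 3600 seconds the first run lands anywhere from 0 to
3599 seconds after the loader starts, so no run appearing in the log right
after a restart is consistent with this calculation.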


