
grouper-dev - Re: [grouper-dev] ldap errors and real time provisioning

Subject: Grouper Developers Forum

  • From: Tom Zeller
  • To: Grouper Dev
  • Subject: Re: [grouper-dev] ldap errors and real time provisioning
  • Date: Tue, 19 Jun 2012 12:24:56 -0500

I am adding a retryOnError option to the psp change log consumer. What
should the default be?

Currently, retryOnError is false, meaning a failed change log entry is
not retried.

Should retryOnError be true for 2.1.1?

Thanks,
TomZ
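
As a rough illustration of what the option would change, here is a minimal
sketch. The names RetryOnErrorSketch, Entry, and processBatch are hypothetical
and are not the psp or Grouper API. The only behavior taken from this thread is
that a consumer reports the last change log index it processed, and that
entries after that index are offered again on the next run.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch; not the actual psp change log consumer. */
final class RetryOnErrorSketch {

    /** A change log entry reduced to a sequence number and a payload. */
    static final class Entry {
        final long sequenceNumber;
        final String payload;

        Entry(long sequenceNumber, String payload) {
            this.sequenceNumber = sequenceNumber;
            this.payload = payload;
        }
    }

    /**
     * Processes a batch and returns the sequence number of the last entry
     * considered done. Entries after that number are assumed to be offered
     * again on the next scheduled run.
     */
    static long processBatch(List<Entry> batch, long lastDone,
                             boolean retryOnError, Consumer<Entry> provisioner) {
        for (Entry entry : batch) {
            try {
                provisioner.accept(entry);
            } catch (RuntimeException e) {
                if (retryOnError) {
                    // Block: stop here so this same entry is tried again next run.
                    return lastDone;
                }
                // Otherwise log the failure and continue with the next entry.
                System.err.println("Ignoring failure on entry "
                        + entry.sequenceNumber + ": " + e.getMessage());
            }
            lastDone = entry.sequenceNumber;
        }
        return lastDone;
    }
}
```

With retryOnError set to true, a failure on entry 2 of a batch 1..3 returns 1,
so entries 2 and 3 wait for the next run. With false, the method returns 3 and
the failed entry is left for a full sync to repair.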

On Thu, May 31, 2012 at 1:56 PM, Michael R. Gettes wrote:
> See https://bugs.internet2.edu/jira/browse/GRP-799
>
> I hope it is sufficient.
>
> /mrg
>
> On May 31, 2012, at 12:45, Chris Hyzer wrote:
>
>> The change log is designed for this behavior if you implement the consumer
>> this way (i.e. after Michael submits a jira, TomZ could put that switch
>> in).  Just return the last index of the change log that was processed, and
>> it will do nothing until the next minute, and will then try that same
>> record again.  If we want an error queue, maybe that could be built into
>> the change log so other consumers could benefit as well.  If TomZ does
>> implement Michael's request, it would probably be nice if the full sync
>> would somehow update the current change log index to the max index, so
>> that if real-time was stalled due to a missing subject it would start up
>> again after the full sync at the point where the full sync started.  If
>> the incrementals were stalled for some reason (for longer than a certain
>> period of time), you would be notified, I believe, via the grouper
>> diagnostics if you have that hooked up to nagios or whatever.
>>
>> Thanks,
>> Chris
>>
>>
>> -----Original Message-----
>> On Behalf Of Tom Zeller
>> Sent: Thursday, May 31, 2012 12:08 PM
>> To: Grouper Dev
>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>
>> Submit a bug or improvement to jira so we can estimate implementation.
>>
>> For this particular scenario, I think most of the work involves
>> defining "failure", which will most likely be some sort of
>> javax.naming.NamingException. The simplest thing to do may be to
>> (block and) retry any NamingException. Another option may be to make
>> decisions based on the error message of the NamingException.
>>
>> The configuration should probably reside in grouper-loader.properties,
>> near other change log consumer settings. Perhaps a toggle,
>> onNamingException = retry | ignore.
>>
>> Right now, NamingExceptions are ignored, meaning they are logged and
>> the next change log record is processed.
>>
>> Or, maybe the configuration property should consist of actions
>> followed by comma separated exceptions or error messages:
>>
>> retry=NamingException, commit failed
>> ignore=AttributeInUseException
>>
>> Not sure about that last one, hopefully someone has a better idea.
>>
>> TomZ
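
A minimal sketch of how the last idea in the message above (actions followed by
comma separated exception names or error messages) might be evaluated. The
property keys retry and ignore come from the message. The class
ErrorActionSketch, the matching rules (simple class name of the exception or
any of its superclasses, or a substring of the message), checking ignore before
retry, and the default of ignore are all assumptions made for illustration.

```java
import java.util.Properties;
import javax.naming.NamingException;

/** Hypothetical sketch; not part of psp or Grouper. */
final class ErrorActionSketch {

    enum Action { RETRY, IGNORE }

    private final Properties props;

    ErrorActionSketch(Properties props) {
        this.props = props;
    }

    /** Checks "ignore" first so a specific subclass can override a broad retry rule. */
    Action actionFor(NamingException e) {
        if (matches(props.getProperty("ignore", ""), e)) return Action.IGNORE;
        if (matches(props.getProperty("retry", ""), e)) return Action.RETRY;
        return Action.IGNORE; // assumed default: log and continue
    }

    /** True if any comma separated token names the exception type or occurs in its message. */
    private static boolean matches(String csv, NamingException e) {
        String message = String.valueOf(e.getMessage());
        for (String token : csv.split(",")) {
            String t = token.trim();
            if (t.isEmpty()) continue;
            if (hasSimpleName(e.getClass(), t) || message.contains(t)) return true;
        }
        return false;
    }

    /** True if the class or one of its superclasses has the given simple name. */
    private static boolean hasSimpleName(Class<?> type, String simpleName) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            if (c.getSimpleName().equals(simpleName)) return true;
        }
        return false;
    }
}
```

With the two example lines from the message loaded into a Properties object, an
AttributeInUseException maps to IGNORE even though it is also a
NamingException, while a plain NamingException carrying "commit failed" maps to
RETRY.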
>>
>> On Thu, May 31, 2012 at 10:27 AM, Michael R. Gettes wrote:
>>> What can I do to convince you to, at the very least, provide an option to
>>> block on failures?  It is how I would want to run it.
>>>
>>> /mrg
>>>
>>> On May 31, 2012, at 10:53, Tom Zeller wrote:
>>>
>>>> For 2.1.0, I decided to avoid blocking and rely on full
>>>> synchronizations, which may be scheduled in grouper-loader.properties,
>>>> to repair real time provisioning failures.
>>>>
>>>> When I was dealing with error handling in the psp change log consumer,
>>>> I thought of the Northern Exposure episode where the computer prompts
>>>> "Abort, Retry, Fail ?" and the user is unable to answer (freaks out)
>>>> and turns off the computer.
>>>>
>>>> I felt that blocking change log processing was probably the least
>>>> desirable option.
>>>>
>>>> A failure queue is interesting, but it may be important to preserve
>>>> the order of operations, so we'll need to think that through. We might
>>>> need to configurably map provisioned target exceptions to abort |
>>>> retry | fail | ignore handling.
>>>>
>>>> In this particular scenario, we would need to map the "commit failed"
>>>> ldap error to "retry", probably waiting some configurable interval
>>>> (60s, 5min, ?) before retrying.
>>>>
>>>> TomZ
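
The idea of mapping "commit failed" to retry with a configurable wait could
look roughly like the following. RetryWithIntervalSketch, callWithRetry, the
maxAttempts cap, and the use of Callable are assumptions for illustration; the
message above proposes only the error match and the wait interval.

```java
import java.util.concurrent.Callable;
import javax.naming.NamingException;

/** Hypothetical sketch; not part of psp or Grouper. */
final class RetryWithIntervalSketch {

    /**
     * Runs the operation, retrying while it fails with a NamingException whose
     * message contains retryMarker. Any other failure is rethrown immediately,
     * as is the last failure once maxAttempts is reached.
     */
    static <T> T callWithRetry(Callable<T> operation, String retryMarker,
                               long intervalMillis, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (NamingException e) {
                boolean retryable = e.getMessage() != null
                        && e.getMessage().contains(retryMarker);
                if (!retryable || attempt >= maxAttempts) {
                    throw e;
                }
                // The calling thread waits here, so later entries are held back.
                Thread.sleep(intervalMillis);
            }
        }
    }
}
```

A caller might pass "commit failed" as the marker and 60000 as the interval.
Because the wait happens on the calling thread, this variant blocks change log
processing for as long as the target keeps failing.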
>>>>
>>>> On Thu, May 31, 2012 at 9:30 AM, Gagné Sébastien wrote:
>>>>> I was asking myself the same question. Maybe a group missing from the
>>>>> LDAP; it could have been manually deleted by another application. Maybe a
>>>>> missing subject? (But that would be caught in Grouper before the LDAP
>>>>> request.)
>>>>>
>>>>> We are still experimenting with the provisioning and the grouper loader,
>>>>> and we had many occasions where data didn't match (login vs full DN).
>>>>> That might affect my current impression.  When the configuration is
>>>>> done correctly I suppose the data will always match.
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Michael R. Gettes
>>>>> Sent: May 31, 2012 10:17
>>>>> To: Gagné Sébastien
>>>>> Cc: Lynn Garrison; Grouper Dev
>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>
>>>>> what kind of "bad data" are you considering?
>>>>>
>>>>> /mrg
>>>>>
>>>>> On May 31, 2012, at 9:56, Gagné Sébastien wrote:
>>>>>
>>>>>> I agree that would be an interesting feature, but the reaction should
>>>>>> depend on the LDAP error. Some errors could be caused by bad data in
>>>>>> one record, and these shouldn't block the provisioning of all the other
>>>>>> changelog entries. I think this is where an error queue might be useful:
>>>>>> you try them all, and if one has bad data it goes into the error queue
>>>>>> to be retried later, while all the others still complete successfully.
>>>>>> Of course, if the ldap server has a problem you'll have a huge error
>>>>>> queue, but those entries would have been waiting in the changelog
>>>>>> anyway. I think it's important for the error queue to be retried
>>>>>> periodically.
>>>>>> There's the PSP daily full sync that kinda solves this problem. If you
>>>>>> enable it, all the failed transactions will be synced later, once the
>>>>>> ldap server is back online. I believe this sync isn't based on
>>>>>> the changelog but on a diff between Grouper and the LDAP.
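
The error queue described in the message above can be sketched as follows. The
class ErrorQueueSketch and its methods are hypothetical, and the provisioner is
modeled as a Predicate that returns true on success; none of this is an
existing psp or Grouper feature.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** Hypothetical sketch; not part of psp or Grouper. */
final class ErrorQueueSketch<E> {

    private final Deque<E> failed = new ArrayDeque<>();
    private final Predicate<E> provisioner; // returns true on success

    ErrorQueueSketch(Predicate<E> provisioner) {
        this.provisioner = provisioner;
    }

    /** Tries one entry; on failure parks it instead of blocking later entries. */
    void process(E entry) {
        if (!provisioner.test(entry)) {
            failed.addLast(entry);
        }
    }

    /** Retries each parked entry once, in the order it failed; returns how many remain. */
    int retryFailed() {
        int pending = failed.size();
        for (int i = 0; i < pending; i++) {
            E entry = failed.removeFirst();
            if (!provisioner.test(entry)) {
                failed.addLast(entry);
            }
        }
        return failed.size();
    }
}
```

A scheduler would call retryFailed periodically. Note that a parked entry is
applied after entries that originally followed it, so this sketch does not
preserve the order of operations.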
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> On Behalf Of Michael R. Gettes
>>>>>> Sent: May 31, 2012 09:31
>>>>>> To: Lynn Garrison
>>>>>> Cc: Grouper Dev
>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>
>>>>>> +1 to this request.  Failures should block processing.  I view this as
>>>>>> similar to data replication: the idea is to keep the data in sync, and
>>>>>> if there are problems in the sync process, they should block or, at
>>>>>> the very least, be placed into an error queue. I hate the error queue
>>>>>> notion but I do realize lots of products do things this way these days.
>>>>>>
>>>>>> /mrg
>>>>>>
>>>>>> On May 31, 2012, at 9:26, Lynn Garrison wrote:
>>>>>>
>>>>>>>      Is there a way to stop the real time provisioning if there are
>>>>>>> problems with the ldap server?  We moved to testing real time
>>>>>>> provisioning with openldap.  During the provisioning testing, the
>>>>>>> file system became full and ldap updates started returning errors.
>>>>>>>
>>>>>>>
>>>>>>> 2012-05-31 09:15:16,001: [DefaultQuartzScheduler_Worker-8] ERROR
>>>>>>> BaseSpmlProvider.execute(388) -  - Target 'psp' - Modify XML:
>>>>>>> <modifyResponse xmlns='urn:oasis:names:tc:SPML:2:0' status='failure'
>>>>>>> requestID='2012/05/31-09:15:15.993' error='customError'>
>>>>>>> <errorMessage>[LDAP: error code 80 - commit failed]</errorMessage>
>>>>>>> </modifyResponse>
>>>>>>>
>>>>>>>      psp continued to process the change log events.  By the time we
>>>>>>> realized what was happening, all the change log events had been
>>>>>>> processed and only half the members were provisioned to the group.
>>>>>>>
>>>>>>>
>>>>>>> Lynn
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>


