grouper-dev - Re: [grouper-dev] ldap errors and real time provisioning

Subject: Grouper Developers Forum


From: Tom Zeller
To: "Michael R. Gettes"
Cc: Grouper Dev
Subject: Re: [grouper-dev] ldap errors and real time provisioning
Date: Tue, 19 Jun 2012 16:37:03 -0500

I'll commit retryOnError = false to grouper-loader.properties for now.

Thanks.
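
For anyone following along, here is a rough sketch of the semantics under discussion, not the actual psp code. It assumes a consumer that returns the sequence number of the last change log entry it considers handled, as Chris describes in the quoted thread; Entry, processBatch, and the provisioner callback are hypothetical names made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a change log entry; not the Grouper class. */
final class Entry {
    final long sequenceNumber;
    final String description;

    Entry(long sequenceNumber, String description) {
        this.sequenceNumber = sequenceNumber;
        this.description = description;
    }
}

final class RetryOnErrorSketch {

    /**
     * Processes entries in order and returns the sequence number of the last
     * entry considered handled. With retryOnError = true, processing stops at
     * the first failure so the same entry is seen again on the next run.
     * With retryOnError = false, the failure is logged and skipped.
     */
    static long processBatch(List<Entry> batch, long lastProcessed,
                             boolean retryOnError, Consumer<Entry> provisioner) {
        long last = lastProcessed;
        for (Entry entry : batch) {
            try {
                provisioner.accept(entry);
            } catch (RuntimeException e) {
                System.err.println("Failed entry " + entry.sequenceNumber
                        + ": " + e.getMessage());
                if (retryOnError) {
                    return last; // block: do not advance past the failure
                }
                // otherwise fall through and treat the entry as handled
            }
            last = entry.sequenceNumber;
        }
        return last;
    }

    public static void main(String[] args) {
        List<Entry> batch = Arrays.asList(
                new Entry(1, "add member a"),
                new Entry(2, "add member b"),
                new Entry(3, "add member c"));

        // Simulated target that fails on the second entry.
        Consumer<Entry> failsOnTwo = entry -> {
            if (entry.sequenceNumber == 2) {
                throw new IllegalStateException("[LDAP: error code 80 - commit failed]");
            }
        };

        System.out.println(processBatch(batch, 0, true, failsOnTwo));  // prints 1
        System.out.println(processBatch(batch, 0, false, failsOnTwo)); // prints 3
    }
}
```

With retryOnError = false the index advances past the failed entry, so only a later full sync would repair it; with true, nothing after the failure is provisioned until the failing entry succeeds.
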

On Tue, Jun 19, 2012 at 12:54 PM, Michael R. Gettes wrote:
> I recommend retryOnError be false by default. Setting retryOnError to
> true should, I believe, be a conscious choice and be clearly documented.
> I won't put up a fight if others feel strongly for the opposite.
>
> /mrg
>
> On Jun 19, 2012, at 13:24, Tom Zeller wrote:
>
>> I am adding a retryOnError option to the psp change log consumer. What
>> should the default be?
>>
>> Currently, retryOnError is false, meaning a failed change log entry is
>> not retried.
>>
>> Should retryOnError be true for 2.1.1?
>>
>> Thanks,
>> TomZ
>>
>> On Thu, May 31, 2012 at 1:56 PM, Michael R. Gettes wrote:
>>> See https://bugs.internet2.edu/jira/browse/GRP-799
>>>
>>> I hope it is sufficient.
>>>
>>> /mrg
>>>
>>> On May 31, 2012, at 12:45, Chris Hyzer wrote:
>>>
>>>> The change log is designed for this behavior if you implement the
>>>> consumer this way (i.e. after Michael submits a jira, TomZ could put
>>>> that switch in). Just return the index of the last change log entry
>>>> that was processed; the consumer will do nothing until the next
>>>> minute and will then try that same record again. If we want an error
>>>> queue, maybe that could be built into the change log so other
>>>> consumers could benefit as well.
>>>>
>>>> If TomZ does implement Michael's request, it would probably be nice if
>>>> the full sync would somehow update the current change log index to the
>>>> max index, so that if real-time was stuck due to a missing subject it
>>>> would start up again after the full sync at the point where the full
>>>> sync started. If the incrementals were stalled for some reason (for
>>>> longer than a certain period of time), I believe you would be notified
>>>> via the grouper diagnostics, if you have that hooked up to nagios or
>>>> whatever.
>>>>
>>>> Thanks,
>>>> Chris
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Tom Zeller
>>>> Sent: Thursday, May 31, 2012 12:08 PM
>>>> To: Grouper Dev
>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>
>>>> Submit a bug or improvement to jira so we can estimate implementation.
>>>>
>>>> For this particular scenario, I think most of the work involves
>>>> defining "failure", which will most likely be some sort of
>>>> javax.naming.NamingException. The simplest thing to do may be to
>>>> (block and) retry any NamingException. Another option may be to make
>>>> decisions based on the error message of the NamingException.
>>>>
>>>> The configuration should probably reside in grouper-loader.properties,
>>>> near other change log consumer settings. Perhaps a toggle,
>>>> onNamingException = retry | ignore.
>>>>
>>>> Right now, NamingExceptions are ignored, meaning they are logged and
>>>> the next change log record is processed.
>>>>
>>>> Or, maybe the configuration property should consist of actions, each
>>>> followed by comma-separated exceptions or error messages:
>>>>
>>>> retry=NamingException, commit failed
>>>> ignore=AttributeInUseException
>>>>
>>>> Not sure about that last one, hopefully someone has a better idea.
>>>>
>>>> TomZ
>>>>
>>>> On Thu, May 31, 2012 at 10:27 AM, Michael R. Gettes wrote:
>>>>> What can I do to convince you to, at the very least, provide an option
>>>>> to block on failures? It is how I would want to run it.
>>>>>
>>>>> /mrg
>>>>>
>>>>> On May 31, 2012, at 10:53, Tom Zeller wrote:
>>>>>
>>>>>> For 2.1.0, I decided to avoid blocking and rely on full
>>>>>> synchronizations, which may be scheduled in grouper-loader.properties,
>>>>>> to repair real time provisioning failures.
>>>>>>
>>>>>> When I was dealing with error handling in the psp change log consumer,
>>>>>> I thought of the Northern Exposure episode where the computer prompts
>>>>>> "Abort, Retry, Fail ?" and the user is unable to answer (freaks out)
>>>>>> and turns off the computer.
>>>>>>
>>>>>> I felt that blocking change log processing was probably the least
>>>>>> desirable option.
>>>>>>
>>>>>> A failure queue is interesting, but it may be important to preserve
>>>>>> the order of operations, so we'll need to think that through. We might
>>>>>> need to configurably map provisioned target exceptions to abort |
>>>>>> retry | fail | ignore handling.
>>>>>>
>>>>>> In this particular scenario, we would need to map the "commit failed"
>>>>>> ldap error to "retry", probably waiting some configurable interval
>>>>>> (60s, 5min, ?) before retrying.
>>>>>>
>>>>>> TomZ
>>>>>>
>>>>>> On Thu, May 31, 2012 at 9:30 AM, Gagné Sébastien wrote:
>>>>>>> I was asking myself the same question. Maybe a group missing from
>>>>>>> LDAP; it could have been manually deleted by another application.
>>>>>>> Maybe a missing subject? (But that would be caught in Grouper before
>>>>>>> the LDAP request.)
>>>>>>>
>>>>>>> We are still experimenting with the provisioning and the grouper
>>>>>>> loader, and we had many occasions where data didn't match (login vs
>>>>>>> full DN). That might affect my current impression. When the
>>>>>>> configuration is done correctly, I suppose the data will always match.
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Michael R. Gettes
>>>>>>> Sent: May 31, 2012 10:17
>>>>>>> To: Gagné Sébastien
>>>>>>> Cc: Lynn Garrison; Grouper Dev
>>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>>
>>>>>>> what kind of "bad data" are you considering?
>>>>>>>
>>>>>>> /mrg
>>>>>>>
>>>>>>> On May 31, 2012, at 9:56, Gagné Sébastien wrote:
>>>>>>>
>>>>>>>> I agree that would be an interesting feature, but the reaction should
>>>>>>>> depend on the LDAP error. Some errors could be caused by bad data in
>>>>>>>> one record, and these shouldn't block the provisioning of all the
>>>>>>>> other changelog entries. I think this is where an error queue might
>>>>>>>> be useful: you try them all, and if one has bad data, it goes into
>>>>>>>> the error queue to be retried later, while all the others still
>>>>>>>> complete successfully. Of course, if the ldap server has a problem
>>>>>>>> you'll have a huge error queue, but those entries would have been
>>>>>>>> waiting in the changelog anyway. I think it's important for the error
>>>>>>>> queue to be retried periodically.
>>>>>>>>
>>>>>>>> There's the PSP daily full sync that more or less solves this problem.
>>>>>>>> If you enable it, all the failed transactions will be synced later,
>>>>>>>> once the ldap server is back online. I believe this sync isn't based
>>>>>>>> on the changelog but on a diff between Grouper and the LDAP.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Michael R. Gettes
>>>>>>>> Sent: May 31, 2012 09:31
>>>>>>>> To: Lynn Garrison
>>>>>>>> Cc: Grouper Dev
>>>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>>>
>>>>>>>> +1 to this request. Failures should block processing. I view this as
>>>>>>>> similar to data replication: the idea is to keep the data in sync,
>>>>>>>> and if there are problems in the sync process, they should block
>>>>>>>> or, at the very least, be placed into an error queue. I hate the
>>>>>>>> error queue notion, but I do realize lots of products do things this
>>>>>>>> way these days.
>>>>>>>>
>>>>>>>> /mrg
>>>>>>>>
>>>>>>>> On May 31, 2012, at 9:26, Lynn Garrison wrote:
>>>>>>>>
>>>>>>>>> Is there a way to stop the real time provisioning if there are
>>>>>>>>> problems with the ldap server? We moved to testing real time
>>>>>>>>> provisioning with openldap. During the provisioning testing, the
>>>>>>>>> file system became full and ldap updates started returning errors.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2012-05-31 09:15:16,001: [DefaultQuartzScheduler_Worker-8] ERROR
>>>>>>>>> BaseSpmlProvider.execute(388) - - Target 'psp' - Modify XML:
>>>>>>>>> <modifyResponse xmlns='urn:oasis:names:tc:SPML:2:0' status='failure'
>>>>>>>>> requestID='2012/05/31-09:15:15.993' error='customError'>
>>>>>>>>> <errorMessage>[LDAP: error code 80 - commit failed]</errorMessage>
>>>>>>>>> </modifyResponse>
>>>>>>>>>
>>>>>>>>> psp continued to process the change log events. By the time
>>>>>>>>> we realized what was happening, all the change log events had been
>>>>>>>>> processed and only half the members were provisioned to the group.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Lynn
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>
>
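
The retry | ignore mapping floated in the thread could look roughly like the following. This is only a sketch under assumptions: the retry and ignore property names come from the example in the thread, but the matching rules (most specific exception class first, then a substring match on the error message, then a fallback) and all class and method names are hypothetical and are not anything in psp.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;
import javax.naming.CommunicationException;
import javax.naming.directory.AttributeInUseException;

final class ErrorActionSketch {

    enum Action { RETRY, IGNORE }

    /** Reads "retry=..." and "ignore=..." into a token-to-action map. */
    static Map<String, Action> parse(Properties props) {
        Map<String, Action> rules = new LinkedHashMap<>();
        for (Action action : Action.values()) {
            String key = action.name().toLowerCase(Locale.ROOT);
            for (String token : props.getProperty(key, "").split(",")) {
                String trimmed = token.trim();
                if (!trimmed.isEmpty()) {
                    rules.put(trimmed, action);
                }
            }
        }
        return rules;
    }

    /** Picks an action for an error, falling back when no rule matches. */
    static Action classify(Throwable error, Map<String, Action> rules, Action fallback) {
        // 1. Walk the class hierarchy so a specific exception beats its parent.
        for (Class<?> c = error.getClass(); c != null; c = c.getSuperclass()) {
            Action byClass = rules.get(c.getSimpleName());
            if (byClass != null) {
                return byClass;
            }
        }
        // 2. Otherwise look for a configured fragment in the error message.
        String message = error.getMessage() == null
                ? "" : error.getMessage().toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Action> rule : rules.entrySet()) {
            if (message.contains(rule.getKey().toLowerCase(Locale.ROOT))) {
                return rule.getValue();
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("retry", "NamingException, commit failed");
        props.setProperty("ignore", "AttributeInUseException");
        Map<String, Action> rules = parse(props);

        // Specific subclass rule wins over the generic NamingException rule.
        System.out.println(classify(
                new AttributeInUseException("value exists"), rules, Action.IGNORE)); // IGNORE
        // No rule of its own, so the NamingException parent rule applies.
        System.out.println(classify(
                new CommunicationException("connection refused"), rules, Action.IGNORE)); // RETRY
        // Not a NamingException, matched on the message fragment instead.
        System.out.println(classify(
                new RuntimeException("[LDAP: error code 80 - commit failed]"),
                rules, Action.IGNORE)); // RETRY
    }
}
```

Checking the class hierarchy from the most specific class upward is one way to let ignore=AttributeInUseException coexist with retry=NamingException, since AttributeInUseException is itself a NamingException.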
