grouper-dev - Re: [grouper-dev] ldap errors and real time provisioning

Subject: Grouper Developers Forum

From: "Michael R. Gettes" <>
To: Tom Zeller <>
Cc: Grouper Dev <>
Subject: Re: [grouper-dev] ldap errors and real time provisioning
Date: Tue, 19 Jun 2012 17:54:51 +0000

I recommend retryOnError be false by default. Setting retryOnError to true
should, I believe, be a conscious choice, and it should be clearly documented.
I won't put up a fight if others feel strongly for the opposite.

/mrg

On Jun 19, 2012, at 13:24, Tom Zeller wrote:

> I am adding a retryOnError option to the psp change log consumer, what
> should the default be ?
>
> Currently, retryOnError is false, meaning do not retry a change log entry.
>
> Should retryOnError be true for 2.1.1 ?
>
> Thanks,
> TomZ
>
> On Thu, May 31, 2012 at 1:56 PM, Michael R. Gettes wrote:
>> See https://bugs.internet2.edu/jira/browse/GRP-799
>>
>> I hope it is sufficient.
>>
>> /mrg
>>
>> On May 31, 2012, at 12:45, Chris Hyzer wrote:
>>
>>> The change log is designed for this behavior if you implement the
>>> consumer this way (i.e. after Michael submits a jira, TomZ could put that
>>> switch in). Just return the index of the last change log entry that was
>>> processed successfully; the consumer will do nothing until the next
>>> minute and will then try the failed record again. If we want an error
>>> queue, maybe it could be built into the change log so other consumers
>>> could benefit as well. If TomZ does implement Michael's request, it
>>> would probably be nice if the full sync somehow updated the current
>>> change log index to the max index, so that if real-time provisioning
>>> was stuck due to a missing subject, it would start up again after the
>>> full sync at the point where the full sync started. If the incrementals
>>> were stalled for some reason (for longer than a certain period of time),
>>> I believe you would be notified via the grouper diagnostics, if you
>>> have that hooked up to nagios or whatever.
>>>
>>> Thanks,
>>> Chris
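
The "return the last index that was processed" behavior Chris describes can be sketched roughly as follows. This is a minimal illustration, not the psp consumer's code: the class name, the method processBatch, and the use of plain sequence numbers with a LongPredicate standing in for the provisioning call are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongPredicate;

public class BlockingConsumerSketch {

    /**
     * Processes change log sequence numbers in order and stops at the first failure.
     * Returns the sequence number of the last entry provisioned successfully,
     * or lastProcessed unchanged if the first entry in the batch failed.
     * (Hypothetical helper; "provision" stands in for the real provisioning call.)
     */
    public static long processBatch(List<Long> batch, long lastProcessed, LongPredicate provision) {
        long last = lastProcessed;
        for (long sequenceNumber : batch) {
            if (!provision.test(sequenceNumber)) {
                break; // leave this entry and all later ones for the next run
            }
            last = sequenceNumber;
        }
        return last;
    }

    public static void main(String[] args) {
        // Hypothetical batch; the simulated target rejects entry 102.
        List<Long> batch = Arrays.asList(101L, 102L, 103L);
        long last = processBatch(batch, 100L, seq -> seq != 102L);
        System.out.println("last processed = " + last); // prints "last processed = 101"
    }
}
```

Because the returned index stops just before the failed entry, the next scheduled run would start with that same entry again, which is the blocking behavior requested in the thread.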
>>>
>>>
>>> -----Original Message-----
>>> From: [mailto:] On Behalf Of Tom Zeller
>>> Sent: Thursday, May 31, 2012 12:08 PM
>>> To: Grouper Dev
>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>
>>> Submit a bug or improvement to jira so we can estimate implementation.
>>>
>>> For this particular scenario, I think most of the work involves
>>> defining "failure", which will most likely be some sort of
>>> javax.naming.NamingException. The simplest thing to do may be to
>>> (block and) retry any NamingException. Another option may be to make
>>> decisions based on the error message of the NamingException.
>>>
>>> The configuration should probably reside in grouper-loader.properties,
>>> near other change log consumer settings. Perhaps a toggle,
>>> onNamingException = retry | ignore.
>>>
>>> Right now, NamingExceptions are ignored, meaning they are logged and
>>> the next change log record is processed.
>>>
>>> Or, maybe the configuration property should consist of actions
>>> followed by comma separated exceptions or error messages
>>>
>>> retry=NamingException, commit failed
>>> ignore=AttributeInUseException
>>>
>>> Not sure about that last one, hopefully someone has a better idea.
>>>
>>> TomZ
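
One way to read the proposed "retry=..." / "ignore=..." properties is sketched below. It is only an illustration under assumptions: the class ErrorActionSketch, the Action enum, and the matching rules (a token matches an exception's simple class name, a superclass name, or a substring of its message; ignore rules are checked before retry rules) are not from the thread. The fallback of IGNORE mirrors the current behavior described above, where errors are logged and the next record is processed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import javax.naming.NamingException;
import javax.naming.directory.AttributeInUseException;

public class ErrorActionSketch {

    public enum Action { RETRY, IGNORE }

    /** Splits a comma separated property value into trimmed, non-empty tokens. */
    static List<String> tokens(Properties props, String key) {
        List<String> out = new ArrayList<>();
        for (String token : props.getProperty(key, "").split(",")) {
            if (!token.trim().isEmpty()) {
                out.add(token.trim());
            }
        }
        return out;
    }

    /** True if the token names the error's class or a superclass, or occurs in its message. */
    static boolean matches(String token, Throwable error) {
        for (Class<?> c = error.getClass(); c != null; c = c.getSuperclass()) {
            if (c.getSimpleName().equals(token)) {
                return true;
            }
        }
        String message = error.getMessage();
        return message != null && message.contains(token);
    }

    /** Assumption: ignore rules win, so a specific subclass is not caught by a broad retry rule. */
    public static Action decide(Properties props, Throwable error) {
        for (String token : tokens(props, "ignore")) {
            if (matches(token, error)) return Action.IGNORE;
        }
        for (String token : tokens(props, "retry")) {
            if (matches(token, error)) return Action.RETRY;
        }
        return Action.IGNORE; // unmatched errors: log and move on
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("retry", "NamingException, commit failed");
        props.setProperty("ignore", "AttributeInUseException");
        System.out.println(decide(props, new NamingException("[LDAP: error code 80 - commit failed]")));
        System.out.println(decide(props, new AttributeInUseException("value already present")));
    }
}
```

Checking ignore rules first is a design choice made for this sketch: AttributeInUseException is a subclass of NamingException, so a retry-first order would never reach the ignore rule.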
>>>
>>> On Thu, May 31, 2012 at 10:27 AM, Michael R. Gettes wrote:
>>>> What can I do to convince you to, at the very least, provide an option
>>>> to block on failures? It is how I would want to run it.
>>>>
>>>> /mrg
>>>>
>>>> On May 31, 2012, at 10:53, Tom Zeller wrote:
>>>>
>>>>> For 2.1.0, I decided to avoid blocking and rely on full
>>>>> synchronizations, which may be scheduled in grouper-loader.properties,
>>>>> to repair real time provisioning failures.
>>>>>
>>>>> When I was dealing with error handling in the psp change log consumer,
>>>>> I thought of the Northern Exposure episode where the computer prompts
>>>>> "Abort, Retry, Fail ?" and the user is unable to answer (freaks out)
>>>>> and turns off the computer.
>>>>>
>>>>> I felt that blocking change log processing was probably the least
>>>>> desirable option.
>>>>>
>>>>> A failure queue is interesting, but it may be important to preserve
>>>>> the order of operations, so we'll need to think that through. We might
>>>>> need to configurably map provisioned target exceptions to abort |
>>>>> retry | fail | ignore handling.
>>>>>
>>>>> In this particular scenario, we would need to map the "commit failed"
>>>>> ldap error to "retry", probably waiting some configurable interval
>>>>> (60s, 5min, ?) before retrying.
>>>>>
>>>>> TomZ
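
A blocking retry with a configurable wait, as suggested for the "commit failed" case, might look something like this. It is a sketch under assumptions: the class and method names are invented for the example, the maxAttempts cap is an addition (the thread only mentions a configurable interval), and the error is recognized by a substring of the NamingException message.

```java
import java.util.concurrent.Callable;
import javax.naming.NamingException;

public class RetrySketch {

    /**
     * Runs the operation, retrying while it fails with a NamingException whose
     * message contains "commit failed". Sleeps waitMillis between attempts and
     * rethrows once maxAttempts is reached or the error is not retryable.
     */
    public static <T> T retryOnCommitFailed(Callable<T> operation, int maxAttempts, long waitMillis)
            throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (NamingException e) {
                String message = e.getMessage();
                boolean retryable = message != null && message.contains("commit failed");
                if (!retryable || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(waitMillis); // block, so later entries are not processed out of order
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated target: fails twice with the error from the thread, then succeeds.
        String result = retryOnCommitFailed(() -> {
            if (++calls[0] < 3) {
                throw new NamingException("[LDAP: error code 80 - commit failed]");
            }
            return "provisioned";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In practice the wait would be on the order of the intervals mentioned above (60s, 5min) rather than the few milliseconds used in the demo.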
>>>>>
>>>>> On Thu, May 31, 2012 at 9:30 AM, Gagné Sébastien wrote:
>>>>>> I was asking myself the same question. Maybe a missing group in the
>>>>>> LDAP, it could be manually deleted by another application. Maybe a
>>>>>> missing subject ? (but that would be caught in Grouper before the LDAP
>>>>>> request).
>>>>>>
>>>>>> We are still experimenting with the provisioning and the grouper
>>>>>> loader and we had many occasions where data didn't match (login vs full
>>>>>> DN). That might affect my current impression. When the configuration
>>>>>> is done correctly I suppose the data will always match.
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Michael R. Gettes [mailto:]
>>>>>> Sent: May 31, 2012 10:17
>>>>>> To: Gagné Sébastien
>>>>>> Cc: Lynn Garrison; Grouper Dev
>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>
>>>>>> what kind of "bad data" are you considering?
>>>>>>
>>>>>> /mrg
>>>>>>
>>>>>> On May 31, 2012, at 9:56, Gagné Sébastien wrote:
>>>>>>
>>>>>>> I agree that would be an interesting feature, but the reaction should
>>>>>>> depend on the LDAP error. Some errors could be because of bad data in
>>>>>>> one record and these shouldn't block the provisioning of all the other
>>>>>>> changelog entries. I think this is where an error queue might be
>>>>>>> useful; you try them all and if one has bad data, it will be in the
>>>>>>> error queue to retry later, but all the others will still complete
>>>>>>> successfully. Of course if the ldap server has a problem you'll have a
>>>>>>> huge error queue, but they would have been waiting in the changelog
>>>>>>> anyway. I think it's important for the error queue to be retried
>>>>>>> periodically.
>>>>>>>
>>>>>>> There's the PSP daily full sync that kind of solves this problem. If
>>>>>>> you enable it, all the failed transactions will be synched later when
>>>>>>> the ldap server is back online. I believe this sync isn't based
>>>>>>> on the changelog but on a diff between Grouper and the LDAP.
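
The error queue idea described above (try every entry, set failures aside, retry them periodically) could be sketched as follows. The class ErrorQueueSketch and its methods are hypothetical, entries are reduced to sequence numbers, and the sketch deliberately does not address the ordering concern raised elsewhere in the thread: a queued entry may be applied after later entries have already succeeded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.function.LongPredicate;

public class ErrorQueueSketch {

    private final Deque<Long> errorQueue = new ArrayDeque<>();

    /** Tries every entry in the batch; failures are queued instead of blocking the rest. */
    public void process(List<Long> batch, LongPredicate provision) {
        for (long sequenceNumber : batch) {
            if (!provision.test(sequenceNumber)) {
                errorQueue.addLast(sequenceNumber);
            }
        }
    }

    /** Retries each queued entry once, oldest first; entries that still fail are re-queued. */
    public void retryQueued(LongPredicate provision) {
        int pending = errorQueue.size();
        for (int i = 0; i < pending; i++) {
            long sequenceNumber = errorQueue.removeFirst();
            if (!provision.test(sequenceNumber)) {
                errorQueue.addLast(sequenceNumber);
            }
        }
    }

    /** Snapshot of the entries currently waiting for a retry. */
    public List<Long> queued() {
        return new ArrayList<>(errorQueue);
    }

    public static void main(String[] args) {
        ErrorQueueSketch consumer = new ErrorQueueSketch();
        // Simulated bad data in entry 2; entries 1 and 3 still complete.
        consumer.process(Arrays.asList(1L, 2L, 3L), seq -> seq != 2L);
        System.out.println("queued after first pass: " + consumer.queued());
        // Later, once the data is fixed, the periodic retry drains the queue.
        consumer.retryQueued(seq -> true);
        System.out.println("queued after retry: " + consumer.queued());
    }
}
```

A periodic job (for example the same scheduler that runs the consumer) would call retryQueued; how often, and when to give up on an entry, are left open here as they are in the thread.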
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: [mailto:] On Behalf Of Michael R. Gettes
>>>>>>> Sent: May 31, 2012 09:31
>>>>>>> To: Lynn Garrison
>>>>>>> Cc: Grouper Dev
>>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>>
>>>>>>> +1 to this request. failures should block processing. i view this
>>>>>>> similar to data replication - the idea is to keep the data in sync
>>>>>>> and if there are problems in the sync process, they should block, or,
>>>>>>> at the very least, be placed into an error queue. I hate the error
>>>>>>> queue notion but I do realize lots of products do things this way
>>>>>>> these days.
>>>>>>>
>>>>>>> /mrg
>>>>>>>
>>>>>>> On May 31, 2012, at 9:26, Lynn Garrison wrote:
>>>>>>>
>>>>>>>> Is there a way to stop the real time provisioning if there are
>>>>>>>> problems with the ldap server? We moved to testing real time
>>>>>>>> provisioning with openldap. During the provisioning testing, the
>>>>>>>> file system became full and ldap updates started returning errors.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2012-05-31 09:15:16,001: [DefaultQuartzScheduler_Worker-8] ERROR
>>>>>>>> BaseSpmlProvider.execute(388) - - Target 'psp' - Modify XML:
>>>>>>>> <modifyResponse xmlns='urn:oasis:names:tc:SPML:2:0' status='failure'
>>>>>>>> requestID='2012/05/31-09:15:15.993' error='customError'>
>>>>>>>> <errorMessage>[LDAP: error code 80 - commit failed]</errorMessage>
>>>>>>>> </modifyResponse>
>>>>>>>>
>>>>>>>> psp continued to process the change log events. By the time we
>>>>>>>> realized what was happening, all the change log events had been
>>>>>>>> processed and only half the members had been provisioned to the group.
>>>>>>>>
>>>>>>>>
>>>>>>>> Lynn
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
