grouper-dev - Re: [grouper-dev] ldap errors and real time provisioning

Subject: Grouper Developers Forum

From: Tom Zeller
To: Grouper Dev
Subject: Re: [grouper-dev] ldap errors and real time provisioning
Date: Wed, 20 Jun 2012 16:35:38 -0500

By default, no, the psp change log consumer does not consider subject
attributes when determining whether or not an add or delete membership
event should be processed.

The rule I have followed is that the psp uses the change log entry
only to determine provisioning. It does not perform any additional
queries to grouper for group or subject attributes or whatever.

My reason for the above rule is, as I have said before, to avoid
discrepancies in data because of time delays.

Of course, if the consensus is to disregard this rule, it would be
easy enough for the psp change log consumer to query grouper for
objects whose identifiers are present in a change log entry.
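
A minimal sketch of what such a lookup-based check might look like, in
plain Java. Everything here is an assumption made for illustration: the
class name, the lookup function, and the "netidStatus" attribute used in
the usage note are hypothetical and are not part of the psp or Grouper
APIs. The lookup function stands in for whatever query to grouper the
consumer would perform with the subject identifier taken from the change
log entry.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch only: not the psp or Grouper API. */
final class MembershipEventFilter {

    // Stand-in for a query to grouper: subject id -> subject attributes, if found.
    private final Function<String, Optional<Map<String, String>>> subjectLookup;
    private final String attributeName;
    private final String requiredValue;

    MembershipEventFilter(Function<String, Optional<Map<String, String>>> subjectLookup,
                          String attributeName,
                          String requiredValue) {
        this.subjectLookup = subjectLookup;
        this.attributeName = attributeName;
        this.requiredValue = requiredValue;
    }

    /**
     * Returns true if an add or delete membership event for this subject
     * should be provisioned, false if it should be skipped. Subjects that
     * cannot be found, or that lack the attribute, are skipped.
     */
    boolean shouldProvision(String subjectId) {
        return subjectLookup.apply(subjectId)
                .map(attributes -> requiredValue.equals(attributes.get(attributeName)))
                .orElse(false);
    }
}
```

A consumer built this way would call shouldProvision with the subject
identifier from each membership entry, for example with attributeName
"netidStatus" and requiredValue "active". Note that the lookup sees the
subject as it is when the entry is processed, not as it was when the
change was logged, which is the time-delay discrepancy described above.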

On Wed, Jun 20, 2012 at 9:32 AM, Shilen Patel wrote:
> So the case that we have is that one of our directories is only used for
> authentication so we avoid putting user objects there for users without
> active netids. These users can still be added to Grouper groups though
> and it is often likely for users that are transitioning between states
> (where the transition time may be a day or more). If the PSP can look at
> attributes of the subject during add/delete membership events to determine
> whether to skip that event, I think that would be okay for this case. Is
> that possible TomZ?
>
>
> I still wonder if this could come up later on though. I could imagine
> there being an application that only has a subset of users, some of which
> can be determined by attributes of the subject while others are more
> ad-hoc. For instance, an application may have all students/faculty but
> only staff that individually request to use the application. The ad-hoc
> users may be added to a Grouper group (which the PSP could use to make its
> determination -- assuming that's possible?) but perhaps those users are
> added directly into the application or through some other means so Grouper
> has no way of knowing whether they are there or not. Another case may be
> if user objects get created in the application the first time the user
> uses that application, so again Grouper has no way of knowing. Here I'm
> thinking about applications like Confluence, though that's probably a bad
> example since it can get group memberships from LDAP. Anyways, I don't
> feel too strongly about this at the moment since I don't have a specific
> use case in mind.
>
> Thanks!
>
> -- Shilen
>
>
> On 6/20/12 8:54 AM, "Michael R. Gettes" wrote:
>
>>excellent point on the multiple targets. We have a homegrown replication
>>environment I hope to divest ourselves of in the next 2 years.
>>Experience indicates treating each target separately is desirable. Block
>>the target having trouble but not the others.
>>
>>As for the not all subjects case for Duke, I am sure there is a valid
>>reason for wanting to do what you do but I respectfully question
>>supporting this case. If you are wanting to slice communities to
>>different targets then there should be valid data to determine how to
>>slice it. Then, if a not found condition occurs, it is an error. Did
>>you do this at Duke because the data wasn't available to Grouper to slice the
>>communities to the various targets?
>>
>>These issues all boil down to data integrity and a "replication mindset".
>> Violating replication with all sorts of exceptions really isn't
>>replication. I know, some will say this is provisioning, not
>>replication. Yes, I can agree it falls under the notion of provisioning
>>as this is the larger function you are trying to achieve but the
>>mechanism employed is a form of replication and I believe replication is
>>much closer to a binary function of "it works or it doesn't". I also
>>believe this mindset is easier to support functionally and technically
>>(explaining to people and writing the code).
>>
>>/mrg
>>
>>On Jun 20, 2012, at 8:35, Shilen Patel wrote:
>>
>>> I had an action item from the last call to comment on error handling. I
>>> agree with where this is going and I think having a way of blocking
>>> rather than ignoring errors would be very helpful. This should make
>>> incremental provisioning more reliable and therefore have less need for
>>> running the bulk sync often, which can be very expensive depending on
>>> the number of objects that you have in Grouper and the performance of
>>> your target.
>>>
>>> I would just add that there are probably some errors that should be
>>> treated differently than other errors. At Duke, we use the change log
>>> to directly provision 3 different clusters of directories and not all
>>> of them have all subjects. So for instance, it would be nice if a
>>> subject not found in a provisioning target could either be retried or
>>> ignored depending on configuration.
>>>
>>> Also, if you have multiple provisioning targets configured with the PSP,
>>> would an error on one target end up blocking updates to all other
>>> targets until the one target is fixed? I suppose that would depend on
>>> whether the PSP uses one change log consumer vs multiple? Is that
>>> possible? Along those lines, it would be nice if these options could be
>>> different based on the target.
>>>
>>> Thanks!
>>>
>>> -- Shilen
>>>
>>>
>>> On 6/19/12 5:37 PM, "Tom Zeller" wrote:
>>>
>>>> I'll commit retryOnError = false to grouper-loader.properties for now.
>>>>
>>>> Thanks.
>>>>
>>>> On Tue, Jun 19, 2012 at 12:54 PM, Michael R. Gettes wrote:
>>>>> I recommend retryOnError be false by default. RetryOnError true, I
>>>>> believe, should be something someone consciously changes and clearly
>>>>> documented. I won't put up a fight if others feel strongly for the
>>>>> opposite.
>>>>>
>>>>> /mrg
>>>>>
>>>>> On Jun 19, 2012, at 13:24, Tom Zeller wrote:
>>>>>
>>>>>> I am adding a retryOnError option to the psp change log consumer.
>>>>>> What should the default be?
>>>>>>
>>>>>> Currently, retryOnError is false, meaning do not retry a change log
>>>>>> entry.
>>>>>>
>>>>>> Should retryOnError be true for 2.1.1 ?
>>>>>>
>>>>>> Thanks,
>>>>>> TomZ
>>>>>>
>>>>>> On Thu, May 31, 2012 at 1:56 PM, Michael R. Gettes wrote:
>>>>>>> See https://bugs.internet2.edu/jira/browse/GRP-799
>>>>>>>
>>>>>>> I hope it is sufficient.
>>>>>>>
>>>>>>> /mrg
>>>>>>>
>>>>>>> On May 31, 2012, at 12:45, Chris Hyzer wrote:
>>>>>>>
>>>>>>>> The change log is designed for this behavior if you implement the
>>>>>>>> consumer this way (i.e. after Michael submits a jira, TomZ could
>>>>>>>> put that switch in). Just return the last index of the change log
>>>>>>>> that was processed, and it will do nothing until the next minute,
>>>>>>>> and will try that same record again. Maybe if we want an error
>>>>>>>> queue that could be built into the change log so other consumers
>>>>>>>> could benefit as well. If TomZ does implement Michael's request, it
>>>>>>>> would probably be nice if the full sync would somehow update the
>>>>>>>> current change log index to the max index so if real-time was
>>>>>>>> stuffed due to missing subject that it would startup again after
>>>>>>>> the full sync at the point where the full sync started... if the
>>>>>>>> incrementals were stalled for some reason (for longer than a
>>>>>>>> certain period of time), you would be notified I believe via the
>>>>>>>> grouper diagnostics if you have that hooked up to nagios or
>>>>>>>> whatever...
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> On Behalf Of: Tom Zeller
>>>>>>>> Sent: Thursday, May 31, 2012 12:08 PM
>>>>>>>> To: Grouper Dev
>>>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>>>
>>>>>>>> Submit a bug or improvement to jira so we can estimate
>>>>>>>> implementation.
>>>>>>>>
>>>>>>>> For this particular scenario, I think most of the work involves
>>>>>>>> defining "failure", which will most likely be some sort of
>>>>>>>> javax.naming.NamingException. The simplest thing to do may be to
>>>>>>>> (block and) retry any NamingException. Another option may be to
>>>>>>>> make decisions based on the error message of the NamingException.
>>>>>>>>
>>>>>>>> The configuration should probably reside in
>>>>>>>> grouper-loader.properties,
>>>>>>>> near other change log consumer settings. Perhaps a toggle,
>>>>>>>> onNamingException = retry | ignore.
>>>>>>>>
>>>>>>>> Right now, NamingExceptions are ignored, meaning they are logged
>>>>>>>> and the next change log record is processed.
>>>>>>>>
>>>>>>>> Or, maybe the configuration property should consist of actions
>>>>>>>> followed by comma separated exceptions or error messages
>>>>>>>>
>>>>>>>> retry=NamingException, commit failed
>>>>>>>> ignore=AttributeInUseException
>>>>>>>>
>>>>>>>> Not sure about that last one, hopefully someone has a better idea.
>>>>>>>>
>>>>>>>> TomZ
>>>>>>>>
>>>>>>>> On Thu, May 31, 2012 at 10:27 AM, Michael R. Gettes wrote:
>>>>>>>>> What can I do to convince you to, in the very least, provide an
>>>>>>>>> option to block on failures? It is how I would want to run it.
>>>>>>>>>
>>>>>>>>> /mrg
>>>>>>>>>
>>>>>>>>> On May 31, 2012, at 10:53, Tom Zeller wrote:
>>>>>>>>>
>>>>>>>>>> For 2.1.0, I decided to avoid blocking and rely on full
>>>>>>>>>> synchronizations, which may be scheduled in
>>>>>>>>>> grouper-loader.properties,
>>>>>>>>>> to repair real time provisioning failures.
>>>>>>>>>>
>>>>>>>>>> When I was dealing with error handling in the psp change log
>>>>>>>>>> consumer,
>>>>>>>>>> I thought of the Northern Exposure episode where the computer
>>>>>>>>>> prompts
>>>>>>>>>> "Abort, Retry, Fail ?" and the user is unable to answer (freaks
>>>>>>>>>> out)
>>>>>>>>>> and turns off the computer.
>>>>>>>>>>
>>>>>>>>>> I felt that blocking change log processing was probably the least
>>>>>>>>>> desirable option.
>>>>>>>>>>
>>>>>>>>>> A failure queue is interesting, but it may be important to
>>>>>>>>>> preserve the order of operations, so we'll need to think that
>>>>>>>>>> through. We might need to configurably map provisioned target
>>>>>>>>>> exceptions to abort | retry | fail | ignore handling.
>>>>>>>>>>
>>>>>>>>>> In this particular scenario, we would need to map the "commit
>>>>>>>>>> failed" ldap error to "retry", probably waiting some configurable
>>>>>>>>>> interval (60s, 5min, ?) before retrying.
>>>>>>>>>>
>>>>>>>>>> TomZ
>>>>>>>>>>
>>>>>>>>>> On Thu, May 31, 2012 at 9:30 AM, Gagné Sébastien wrote:
>>>>>>>>>>> I was asking myself the same question. Maybe a missing group in
>>>>>>>>>>> the LDAP, it could be manually deleted by another application.
>>>>>>>>>>> Maybe a missing subject ? (but that would be caught in Grouper
>>>>>>>>>>> before the LDAP request).
>>>>>>>>>>>
>>>>>>>>>>> We are still experimenting with the provisioning and the grouper
>>>>>>>>>>> loader and we had many occasions where data didn't match (login
>>>>>>>>>>> vs full DN). That might affect my current impression. When the
>>>>>>>>>>> configuration is done correctly I suppose the data will always
>>>>>>>>>>> match.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Michael R. Gettes
>>>>>>>>>>> Sent: May 31, 2012 10:17
>>>>>>>>>>> To: Gagné Sébastien
>>>>>>>>>>> Cc: Lynn Garrison; Grouper Dev
>>>>>>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>>>>>>
>>>>>>>>>>> what kind of "bad data" are you considering?
>>>>>>>>>>>
>>>>>>>>>>> /mrg
>>>>>>>>>>>
>>>>>>>>>>> On May 31, 2012, at 9:56, Gagné Sébastien wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I agree that would be an interesting feature, but the reaction
>>>>>>>>>>>> should depend on the LDAP error. Some errors could be because of
>>>>>>>>>>>> bad data in one record and these shouldn't block the
>>>>>>>>>>>> provisioning of all the other changelog entries. I think this is
>>>>>>>>>>>> where an error queue might be useful; you try them all and if
>>>>>>>>>>>> one has bad data, it will be in the error queue to retry later,
>>>>>>>>>>>> but all the others will still complete successfully. Of course
>>>>>>>>>>>> if the ldap server has a problem you'll have a huge error queue,
>>>>>>>>>>>> but they would have been waiting in the changelog anyway. I
>>>>>>>>>>>> think it's important for the error queue to be retried
>>>>>>>>>>>> periodically.
>>>>>>>>>>>>
>>>>>>>>>>>> There's the PSP daily full sync that kinda solves this problem.
>>>>>>>>>>>> If you enable it, all the failed transactions will be synched
>>>>>>>>>>>> later when the ldap server will be back online. I believe this
>>>>>>>>>>>> sync isn't based on the changelog but on a diff between Grouper
>>>>>>>>>>>> and the LDAP.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> On Behalf Of: Michael R. Gettes
>>>>>>>>>>>> Sent: May 31, 2012 09:31
>>>>>>>>>>>> To: Lynn Garrison
>>>>>>>>>>>> Cc: Grouper Dev
>>>>>>>>>>>> Subject: Re: [grouper-dev] ldap errors and real time provisioning
>>>>>>>>>>>>
>>>>>>>>>>>> +1 to this request. failures should block processing. i view
>>>>>>>>>>>> this similar to data replication - the idea is to keep the data
>>>>>>>>>>>> in sync and if there are problems in the sync process, they
>>>>>>>>>>>> should block, or, in the very least, be placed into an error
>>>>>>>>>>>> queue. I hate the error queue notion but I do realize lots of
>>>>>>>>>>>> products do things this way these days.
>>>>>>>>>>>>
>>>>>>>>>>>> /mrg
>>>>>>>>>>>>
>>>>>>>>>>>> On May 31, 2012, at 9:26, Lynn Garrison wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a way to stop the real time provisioning if there
>>>>>>>>>>>>> are problems with the ldap server? We moved to testing real
>>>>>>>>>>>>> time provisioning with openldap. During the provisioning
>>>>>>>>>>>>> testing, the file system became full and ldap updates started
>>>>>>>>>>>>> returning errors.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2012-05-31 09:15:16,001: [DefaultQuartzScheduler_Worker-8]
>>>>>>>>>>>>> ERROR BaseSpmlProvider.execute(388) - - Target 'psp' - Modify
>>>>>>>>>>>>> XML:
>>>>>>>>>>>>> <modifyResponse xmlns='urn:oasis:names:tc:SPML:2:0'
>>>>>>>>>>>>> status='failure'
>>>>>>>>>>>>> requestID='2012/05/31-09:15:15.993' error='customError'>
>>>>>>>>>>>>> <errorMessage>[LDAP: error code 80 - commit
>>>>>>>>>>>>> failed]</errorMessage>
>>>>>>>>>>>>> </modifyResponse>
>>>>>>>>>>>>>
>>>>>>>>>>>>> psp continued to process the change log events. By the
>>>>>>>>>>>>> time we realized what was happening, all the change log events
>>>>>>>>>>>>> had been processed and only half the members were provisioned
>>>>>>>>>>>>> to the group.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Lynn
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>
>
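
The two ideas running through the thread, returning the last processed
change log index so that a failed entry is retried on the next run, and
mapping target errors to retry or ignore, can be sketched together in
plain Java. This is a rough illustration under stated assumptions, not
the psp implementation: the class, the Provisioner interface, and the
message-fragment matching are all hypothetical, and the real change log
consumer API is not shown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of block-vs-ignore error handling; not the psp code. */
final class RetryingConsumerSketch {

    enum Action { RETRY, IGNORE }

    /** Stand-in for a call that provisions one change log entry to a target. */
    interface Provisioner {
        void provision(long sequenceNumber) throws Exception;
    }

    // Error message fragment -> action, checked in insertion order.
    private final Map<String, Action> actionsByMessageFragment = new LinkedHashMap<>();
    private final Action defaultAction;

    RetryingConsumerSketch(Action defaultAction) {
        this.defaultAction = defaultAction;
    }

    /** Registers an action for errors whose message contains the fragment. */
    RetryingConsumerSketch on(String messageFragment, Action action) {
        actionsByMessageFragment.put(messageFragment, action);
        return this;
    }

    /** Picks the action for an error, falling back to the default. */
    Action classify(Exception error) {
        String message = String.valueOf(error.getMessage());
        for (Map.Entry<String, Action> entry : actionsByMessageFragment.entrySet()) {
            if (message.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return defaultAction;
    }

    /**
     * Processes entries in order and returns the sequence number of the last
     * entry that is finished. On a RETRY error the loop stops, so the failed
     * entry and everything after it are seen again on the next run. On an
     * IGNORE error the entry counts as finished and processing continues.
     */
    long processBatch(long lastProcessed, List<Long> sequenceNumbers, Provisioner target) {
        long last = lastProcessed;
        for (long sequenceNumber : sequenceNumbers) {
            try {
                target.provision(sequenceNumber);
            } catch (Exception error) {
                if (classify(error) == Action.RETRY) {
                    return last; // block here; order of operations is preserved
                }
                // IGNORE: a real consumer would log the error before moving on
            }
            last = sequenceNumber;
        }
        return last;
    }
}
```

With a default of IGNORE and "commit failed" mapped to RETRY, a batch of
entries 11, 12, 13 in which entry 12 fails with the LDAP "commit failed"
error would return 11, so entries 12 and 13 stay pending. Because the
loop stops at the first retried entry, this sketch blocks all later
entries for that consumer, which is the trade-off debated in the thread.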
