
grouper-users - RE: [grouper-users] Random ldappc failures

Subject: Grouper Users - Open Discussion List

  • From: Chris Hyzer <>
  • To: Paul Engle <>, "" <>
  • Subject: RE: [grouper-users] Random ldappc failures
  • Date: Fri, 31 Jul 2009 21:41:55 -0400

Hey,

I've got some bad news for everyone, sorry about this: there is a race
condition in the Field cache that can cause this null pointer, and it is an
API problem, not an LDAPPC problem. Paul, if you can build a new 1.4
grouper.jar and try it out, that should verify that it is fixed. It's weird
that you are the only user who has noticed this...

Here is the bug report:

https://bugs.internet2.edu/jira/browse/GRP-303

You can get latest 1.4 grouper with these commands:

cvs -d:pserver::/home/cvs/i2mi login

cvs -d:pserver::/home/cvs/i2mi export -r GROUPER_1_4_BRANCH grouper

This is a race condition: if the expiring cache is cleared in between two
specific lines of code, a null pointer exception will happen. This is most
likely to occur when there are many calls happening sequentially, e.g. ldappc
or other loaders, but it could happen in any env.
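
To illustrate the kind of race described above, here is a minimal sketch. It is
not the actual Grouper code: the class FieldCacheSketch, its methods, and the
Runnable hook that simulates the cache expiring "in between two lines" are all
hypothetical names made up for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an expiring field cache; NOT Grouper's real code.
class FieldCacheSketch {

    // name -> field id; an expiry (or another caller) may clear this at any time
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    void put(String name, String fieldId) {
        cache.put(name, fieldId);
    }

    void clear() {
        cache.clear();
    }

    // Racy check-then-act: the cache is consulted twice. The hook simulates
    // the cache being cleared between the two lines.
    int racyLength(String name, Runnable betweenLines) {
        if (cache.containsKey(name)) {
            betweenLines.run();
            // get() returns null if the cache was cleared, so this throws
            // a NullPointerException
            return cache.get(name).length();
        }
        return -1;
    }

    // Safer: read the cache once into a local and only use the local.
    int safeLength(String name, Runnable betweenLines) {
        String fieldId = cache.get(name);
        betweenLines.run();
        return fieldId == null ? -1 : fieldId.length();
    }
}
```

Calling racyLength("members", cache::clear) on a populated cache throws the
null pointer, while safeLength with the same hook does not, because it never
goes back to the cache after the first read.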

This requires getting latest and rebuilding the grouper.jar, and placing it
in your WS, ldappc, UI, loader, etc.

If you don't want to rebuild the grouper.jar, another way to reduce the chance
of this happening is to add the following to your grouper.ehcache.xml (or
change it if you have it already):

<cache name="edu.internet2.middleware.grouper.FieldFinder.fieldCache"
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="86400"
timeToLiveSeconds="86400"
overflowToDisk="false"
/>

This makes the cache last for a day instead of the default, which is a minute.
For steady-state production deployments this is probably OK, but if you are
deleting and recreating the same group attribute or membership list, this
will cause problems... Note that if a field or attribute isn't found, the
cache is cleared, so don't worry if you are only adding new fields. Deletes or
changes would be a problem...
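
A rough model of why adds are fine but deletes and changes are not with a
long-lived cache, assuming the "clear on miss" behavior described above. The
class ClearOnMissSketch and its fields are hypothetical and only stand in for
the real cache and database:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "clear the cache on a miss"; names are illustrative only.
class ClearOnMissSketch {

    // stands in for the database of fields
    final Map<String, String> store = new HashMap<>();

    // stands in for the long-lived cache (expiry itself is not modeled here)
    private Map<String, String> cached = new HashMap<>();

    String find(String name) {
        String value = cached.get(name);
        if (value == null) {
            // Miss: drop the whole cache and reload, so newly added fields are seen.
            cached = new HashMap<>(store);
            value = cached.get(name);
        }
        return value;
    }
}
```

In this model a newly added field is found right away, because the miss
triggers a reload. A changed or deleted field keeps returning the old cached
value until the cache expires or some other miss forces a reload.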

Sorry about that. Let me know if you need more information. Paul, thanks for
all your work in troubleshooting this; if you can let us know whether this
works, that would be great. :)

Thanks!
Chris

> -----Original Message-----
> From: Chris Hyzer
> Sent: Wednesday, July 29, 2009 2:51 PM
> To: 'Paul Engle';
>
> Subject: RE: [grouper-users] Random ldappc failures
>
> > Tom,
> > Sorry for the delay. Yes, you are correct. On a failure, I
> > don't get that log message.
> >
> > I have managed to catch it in the act with an eclipse
> > debugger attached to the process. Given the text and format of
> > the fatal log message, I put in a breakpoint in line 128 of
> > LdappcGrouperProvisioner.java, where the general Exception is
> > caught in provisionGroups(). Unfortunately, by the time it gets
> > to this point, most of the useful context of what it was doing
> > is gone. The exception it's catching, however, looks very odd.
> >
> > It's a NullPointerException, but the detailMessage and
> > stackTrace for the exception are both null. And the cause is
> > another, similar NullPointerException with null detailMessage &
> > stackTrace & cause that's a NullPointerException, etc. ad
> > nauseam.
>
> Since error handling and logging are of interest to me, let me join
> this conversation... :)
>
> If you can get it to stop there again while debugging in eclipse, open
> the eclipse view "expressions", and add an expression:
>
> e.printStackTrace()
>
> That should print the stack with the cause to the console.
>
> Or, if you can, just change:
>
> FROM:
>
> catch (NameNotFoundException nnfe)
> {
>     ErrorLog.fatal(this.getClass(), "Grouper Provision Failed: "
>         + nnfe.getMessage() + " Exception data: " + nnfe.toString());
> }
> catch (Exception e)
> {
>     ErrorLog.fatal(this.getClass(), "Grouper Provision Failed: "
>         + e.getMessage());
> }
>
> TO:
>
> catch (NameNotFoundException nnfe)
> {
>     ErrorLog.fatal(this.getClass(), "Grouper Provision Failed: "
>         + nnfe.getMessage() + " Exception data: " + nnfe.toString()
>         + ", " + ExceptionUtils.getFullStackTrace(nnfe));
> }
> catch (Exception e)
> {
>     ErrorLog.fatal(this.getClass(), "Grouper Provision Failed: "
>         + e.getMessage() + ", " + ExceptionUtils.getFullStackTrace(e));
> }
>
> You will also need to add this import to the top:
>
> import org.apache.commons.lang.exception.ExceptionUtils;
>
> Alternatively, or in addition, you could add nnfe.printStackTrace() and
> e.printStackTrace() if you want to see this in the console instead of the
> error log.
>
> I made these changes in 1.4, so once they are propagated to public cvs
> (usually takes an hour or two), you can get the latest 1.4 branch,
> build, and use that. I also changed the logging in a bunch of other
> places to show stacktraces...
>
> You will know it is ready when this file shows as updated today:
>
> http://viewvc.internet2.edu/viewvc.py/grouper/src/grouper/edu/internet2/middleware/ldappc/ConfigManager.java?view=log&pathrev=GROUPER_1_4_BRANCH
>
> You can get latest with these commands:
>
> cvs -d:pserver::/home/cvs/i2mi login
> cvs -d:pserver::/home/cvs/i2mi export -r GROUPER_1_4_BRANCH grouper
>
> Thanks,
> Chris


