grouper-users - RE: [grouper-users] Penn's organizational hierarchy

Subject: Grouper Users - Open Discussion List


  • From: Chris Hyzer <>
  • To: Niels van Dijk <>
  • Cc: Grouper Users Mailing List <>, "" <>
  • Subject: RE: [grouper-users] Penn's organizational hierarchy
  • Date: Fri, 22 May 2009 08:38:09 -0400

Btw, the loader could be tuned to be more efficient, though its performance
seems acceptable at the moment. Right now one query is run to load the
source data, whether it is a simple loader job or a group list job. Then,
for each group managed (one group for a simple job, many groups for a group
list job), there is one query to list all current members in the registry
(note: I'm not sure whether subjects are resolved; I hope not). The two
lists are reconciled, and any inserts or deletes are performed.
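
To make the reconcile step concrete, here is a minimal sketch in Python. It
is not the loader's actual code: the function name and the idea of passing
plain lists of subject ids are assumptions for illustration only.

```python
def reconcile(source_members, registry_members):
    """Compare the loader query results with the registry for one group.

    Both arguments are iterables of subject ids (hypothetical inputs).
    Returns (to_insert, to_delete) as sorted lists.
    """
    source = set(source_members)
    registry = set(registry_members)
    to_insert = sorted(source - registry)   # in the source, missing from the registry
    to_delete = sorted(registry - source)   # in the registry, no longer in the source
    return to_insert, to_delete


inserts, deletes = reconcile(["a", "b", "c"], ["b", "c", "d"])
print(inserts, deletes)
```

Members present on both sides are left alone, so a run with no source
changes produces no writes.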

For group list jobs, I think it would be much faster if there were one
query against the registry to get the memberships of all groups in the list.
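
As a rough sketch of what that could look like on the consuming side (in
Python, with a hypothetical row shape of (group_name, subject_id) pairs
coming back from the single registry query):

```python
from collections import defaultdict


def memberships_by_group(rows):
    """Bucket (group_name, subject_id) rows from one query by group name."""
    grouped = defaultdict(set)
    for group_name, subject_id in rows:
        grouped[group_name].add(subject_id)
    return dict(grouped)


rows = [("org:a", "s1"), ("org:a", "s2"), ("org:b", "s1")]
print(memberships_by_group(rows))
```

Each group's current members can then be looked up in memory instead of
with one registry query per group.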

I think we are OK with holding everything in memory (e.g. for Penn's org
list, that would be 28k subjectIds and group names). That should be 3 megs
of data (less, depending on Java String pooling), which doesn't seem
unreasonable. However, I could picture ordering the query by group name and
subject id, and cycling through the results in order without having to bring
everything into memory.
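
A minimal sketch of that ordered, streaming comparison, written in Python.
It assumes both result sets are (group_name, subject_id) tuples with no
duplicates, and that both queries sort them the same way Python compares
tuples (database collations can differ, so that would need checking). The
function name is made up for illustration.

```python
def stream_diff(source_rows, registry_rows):
    """Walk two sorted row streams in step and yield only the differences.

    Yields ("insert", row) for rows only in the source and
    ("delete", row) for rows only in the registry.
    """
    source_iter = iter(source_rows)
    registry_iter = iter(registry_rows)
    s = next(source_iter, None)
    r = next(registry_iter, None)
    while s is not None or r is not None:
        if r is None or (s is not None and s < r):
            yield ("insert", s)            # source row has no registry match
            s = next(source_iter, None)
        elif s is None or r < s:
            yield ("delete", r)            # registry row has no source match
            r = next(registry_iter, None)
        else:
            s = next(source_iter, None)    # same row on both sides: skip it
            r = next(registry_iter, None)


source = [("g1", "a"), ("g1", "b"), ("g2", "a")]
registry = [("g1", "b"), ("g1", "c"), ("g2", "a")]
for action, row in stream_diff(source, registry):
    print(action, row)
```

Only one row from each side is held at a time, so memory use does not grow
with the number of memberships.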

Another optimization: if the db connection of the loader job is the same as
the Grouper registry (including if you use a dblink), I could picture
joining to the registry so that the query returns only the inserts and
deletes (or two queries, one for inserts and one for deletes, if a single
query doesn't work out).
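
Here is a sketch of the join idea, using Python's sqlite3 module with an
in-memory database so it can be run as-is. The table names loader_source
and registry_membership and their columns are hypothetical stand-ins; the
real Grouper registry schema is different, and this only shows the shape of
the query.

```python
import sqlite3

# One statement returning both kinds of change, via two anti-joins.
DIFF_SQL = """
SELECT 'insert' AS action, s.group_name, s.subject_id
  FROM loader_source s
  LEFT JOIN registry_membership r
    ON r.group_name = s.group_name AND r.subject_id = s.subject_id
 WHERE r.subject_id IS NULL
UNION ALL
SELECT 'delete' AS action, r.group_name, r.subject_id
  FROM registry_membership r
  LEFT JOIN loader_source s
    ON s.group_name = r.group_name AND s.subject_id = r.subject_id
 WHERE s.subject_id IS NULL
 ORDER BY 1, 2, 3
"""


def membership_diff(conn):
    """Return (action, group_name, subject_id) rows for the needed changes."""
    return conn.execute(DIFF_SQL).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loader_source (group_name TEXT, subject_id TEXT);
CREATE TABLE registry_membership (group_name TEXT, subject_id TEXT);
INSERT INTO loader_source VALUES ('org:a', 's1'), ('org:a', 's2');
INSERT INTO registry_membership VALUES ('org:a', 's2'), ('org:a', 's3');
""")
print(membership_diff(conn))
```

If the combined statement turns out to be awkward on a given database, the
two halves of the UNION ALL can be run as separate insert and delete
queries.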

We should profile the loader to make sure we target the right bottlenecks
though... :) If anyone has thoughts or concerns, please let me know.

Thanks,
Chris

> -----Original Message-----
> From: Niels van Dijk
> [mailto:]
> Sent: Friday, May 22, 2009 3:36 AM
> To: Chris Hyzer
> Cc: Grouper Users Mailing List;
>
> Subject: Re: [grouper-users] Penn's organizational hierarchy
>
> Hello Chris,
>
> Thanks for the interesting document. Are you able to tell something
> about the performance of grouper in the setup you describe?
>
> thanks in advance,
> Regards,
>
> Niels
>
> Chris Hyzer wrote:
> > Hey,
> >
> > I implemented the Grouper org hook at Penn. I have permission to share
> > my experience outside of Penn so I made a document here:
> >
> > https://wiki.internet2.edu/confluence/display/GrouperWG/Penn+organizational+hierarchy
> >
> > This is the size of our org implementation:
> >
> > 27,000 people in orgs at Penn
> > 2,200 orgs
> > 3,000 org groups (more due to include/exclude lists)
> > 500,000 org memberships (there are a lot due to the rollups, and
> > include/exclude lists)
> >
> > Note that the bulk of the work was in figuring out how our org data is
> > structured, and how I can expose it in views to something that the
> > loader can process. If you want to get this working at your institution
> > this document should be helpful.
> >
> > Btw, Loris, does this help with your org issues?
> >
> > Thanks,
> > Chris
>
> --
> Niels van Dijk
> Advanced Services
>
> T: +31 302 305 337 / M: +31 651 347 657
> SURFnet - PO Box 19035 - NL-3501 DA Utrecht - The Netherlands -
>
> http://www.surfnet.nl
> SURFnet grensverleggend netwerk voor hoger onderwijs en onderzoek


