grouper-users - Re: [grouper-users] ldap source vs jdbc source

Subject: Grouper Users - Open Discussion List

  • From: "Michael R. Gettes"
  • To: Lynn Garrison
  • Cc: Chris Hyzer
  • Subject: Re: [grouper-users] ldap source vs jdbc source
  • Date: Fri, 11 Mar 2011 14:22:08 -0500

Lynn,

I'm curious... what LDAP server is involved here? It may not be relevant to
the issues you're having, but I am curious.

Otherwise, I do find the single-threaded nature of the provisioning
interesting. LDAP servers are inherently multi-threaded and quite capable so
this would be an interesting area to explore in speeding things up. I'd also
add that Chris' note was indicating the problems of initial provisioning
versus ongoing. Initial always takes longer no matter what the technology
used. Would be interesting to note your timings on subsequent updates where
only a few changes are involved.

/mrg

On Mar 11, 2011, at 14:08, Lynn Garrison wrote:

> Chris,
>
> On Mar 11, 2011, at 12:55 PM, Chris Hyzer wrote:
>
>> You could try Jim Fox's vt-ldap source which has better performance. I
>> think the link for it is here:
>>
>> http://staff.washington.edu/fox/grouper/dist/
>>
> I need to take a look at that. I know that you are planning on
> incorporating it into the standard grouper api package. Will that be 1.7
> or 2.0?
>> How are you loading your group into Grouper? Grouper loader, WS, GSH, API?
>> 15 minutes for 30k members is 30 members/sec which is pretty good I think.
>>
>
> I am using gsh. I wanted to do a quick test. The next step is to
> try other ways to load the data.
>> I can't really comment on ldappcng performance.
>>
>> For loading into grouper though, it has to do a bunch of work, so even if
>> it does a subject query for each row, the order of magnitude of the
>> operation overall will still be the same if you remove the subject query.
>> I would assume it would be similar for ldappcng as well. However, you
>> will be better off if you can load by specifying the subjectId and
>> sourceId so it goes to one source with one query to resolve. If you only
>> specify a netId, and call idOrIdentifier, and not sourceId, that will
>> query each source (~4?), with two queries (8 total). The loader is pretty
>> efficient about this.
>>
>> Generally Grouper is designed assuming there is not a lot of membership
>> churn, right? I.e. you will have your 30k students in a group, and each
>> day, you might add or remove a couple hundred, and then a few days a year,
>> you have +- 5k. Those loads will take a while as you see for the few days
>> where the new students are entered.
>>
>> Do you think these performance numbers will be a problem for you?
>>
>
> The provisioning to ldap timing will be a problem for us. We use GPFS
> for all of our shared file systems, and we control group access to those
> file systems with groups in ldap. So all of our groups have to be provisioned
> to ldap. Currently we support three types of groups - standing
> (department), course and user managed groups. The standing groups are
> created once and changes to the groups are made once a day. Course groups
> are created once a semester and changes are made once a day. The user
> managed groups are maintained only in ldap and changes appear as soon as
> they occur. One of the reasons that we are looking at grouper is to make
> the changes to standing and course groups in real time.
> During our testing we ran into several areas of concern. We were
> running ldappcng with a bulksync and interval of 180 seconds. On the
> first interval, the sync took about 45 minutes because the large group had
> to be provisioned. During that interval, I added several more small
> groups (2 members) to grouper. The groups didn't appear in ldap until the
> large group was provisioned. Once the large group was in sync,
> provisioning of new groups and members happened very quickly. I added
> one member to the large group, and the sync was back up to taking 45
> minutes. It appeared that it modified all members of the group instead of
> just the one added. At least that is what I believe that the SPML was
> telling me. The large group is our faculty staff group and could
> potentially change multiple times a day.
>
> Concerns
> 1. Minor change to membership appears to re-provision the group
> 2. Provisioning appears to be single threaded
>
>> Btw, I don't think it will change your performance that much, but I think
>> that a JDBC source has opportunities in the future to have some
>> performance improvements and feature improvements since it could bulk load
>> subjects and page/sort better, but I'm not an ldap person, so maybe we
>> could do a similar thing there too. At Penn we just use the jdbc2 source...
>>
>> Thanks,
>> Chris
>>
>> -----Original Message-----
>> On Behalf Of Lynn Garrison
>> Sent: Friday, March 11, 2011 11:02 AM
>> Subject: [grouper-users] ldap source vs jdbc source
>>
>> Our test environment at Penn State is configured with an ldap source;
>> oracle for the gsa. We recently executed a test load with our largest
>> group - ~32k faculty/staff members. We loaded all the members to grouper
>> and then provisioned them to ldap using ldappcng. We executed the test
>> several times.
>>
>> The load into grouper executed in 15 to 18 minutes.
>> The provision to ldap with ldappcng executed in 45 to 50 minutes.
>>
>> We are using the source api and ldappcng version 1.6.3.
>> Questions:
>>
>> 1. Are these reasonable times?
>> 2. Would we see an improvement in the ldappcng execution time if we
>> were using a jdbc source?
>>
>>
>> We are looking at replacing the current mechanism for managing groups
>> - 66000+ groups, ~32k members in the largest group. One of the
>> requirements is that all groups be provisioned to ldap and available for use
>> as soon as they are created.
>>
>
>
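
Lynn's first concern (adding one member appeared to re-provision the whole group) can be pictured with a small sketch of a delta-based sync. This is only an illustration of the general idea, not a description of how ldappcng works internally; the function name `membership_delta` and the sample member identifiers are hypothetical.

```python
def membership_delta(current_members, desired_members):
    """Return (to_add, to_remove) so that only changed members are written.

    current_members: members already present in the target directory.
    desired_members: members the group registry says should be present.
    Both names are hypothetical; this is not ldappcng code.
    """
    current = set(current_members)
    desired = set(desired_members)
    to_add = sorted(desired - current)      # in registry, missing from target
    to_remove = sorted(current - desired)   # in target, no longer in registry
    return to_add, to_remove


# A large group with a single new member, similar to the case in the thread.
target_members = [f"user{i}" for i in range(32000)]
registry_members = target_members + ["newperson"]

to_add, to_remove = membership_delta(target_members, registry_members)
print(len(to_add), len(to_remove))
```

With a comparison like this, a one-member change to a ~32k-member group would produce one add and no removes, so the write to the directory stays small even though both member lists still have to be read.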

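The figures quoted in the thread can be checked with a few lines of arithmetic. The sketch below uses only numbers stated above (30k members in 15 minutes, ~32k members in 45 minutes, roughly 4 sources with two queries each); the function names are made up for illustration, and the source count is Chris's own guess.

```python
def members_per_second(members, minutes):
    """Average throughput for a load or provisioning run."""
    return members / (minutes * 60)


def lookup_queries(rows, sources, queries_per_source):
    """Total subject queries needed to resolve every row."""
    return rows * sources * queries_per_source


# Load into Grouper: 30k members in 15 minutes (about 33 members/sec).
load_rate = members_per_second(30_000, 15)

# Provisioning to ldap: ~32k members in 45 minutes (about 12 members/sec).
provision_rate = members_per_second(32_000, 45)

# Resolving by subjectId + sourceId: one source, one query per row.
by_source_id = lookup_queries(32_000, sources=1, queries_per_source=1)

# Resolving by netId only: each of ~4 sources is asked twice per row.
by_identifier = lookup_queries(32_000, sources=4, queries_per_source=2)

print(round(load_rate, 1), round(provision_rate, 1))
print(by_source_id, by_identifier)
```

On these assumptions the identifier-only lookup issues eight times as many subject queries as the lookup that names the source, which is the difference Chris describes.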


