
grouper-users - Re: [grouper-users] ldap source vs jdbc source

Subject: Grouper Users - Open Discussion List

  • From: James Vuccolo
  • To: "Michael R. Gettes"
  • Cc: Lynn Garrison, Chris Hyzer, " Users"
  • Subject: Re: [grouper-users] ldap source vs jdbc source
  • Date: Fri, 11 Mar 2011 15:03:52 -0500

We are using IBM's directory server, v6.2.

James "Jimmy" Vuccolo
Penn State University
215B Computer Building
University Park, PA 16802

On Mar 11, 2011, at 2:37 PM, "Michael R. Gettes" wrote:

> Lynn, I am terribly sorry for not properly reading your email... I see you
> did the test properly. I'm still curious about which LDAP server you are
> using. I agree that re-provisioning the entire group is not desirable.
>
> (thanks Chris for noting my inability to read properly :-) )
>
> /mrg
>
> On Mar 11, 2011, at 14:22, Michael R. Gettes wrote:
>
>> Lynn,
>>
>> I'm curious... what LDAP server is involved here? It may not be relevant
>> to the issues you're having, but I am curious.
>>
>> Otherwise, I do find the single-threaded nature of the provisioning
>> interesting. LDAP servers are inherently multi-threaded and quite capable
>> so this would be an interesting area to explore in speeding things up.
>> I'd also add that Chris' note was indicating the problems of initial
>> provisioning versus ongoing. Initial always takes longer no matter what
>> the technology used. Would be interesting to note your timings on
>> subsequent updates where only a few changes are involved.
>>
>> /mrg
>>
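To make the single-threaded point above concrete, here is a minimal sketch of provisioning groups concurrently so that small groups are not stuck behind one large one. It is not how ldappcng works: `provision_group` is a hypothetical stand-in that only sleeps in proportion to group size, and the group names and sizes are made up for illustration.

```python
from __future__ import annotations

import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def provision_group(name: str, member_count: int) -> str:
    """Hypothetical stand-in for writing one group to LDAP."""
    time.sleep(member_count * 0.0001)  # simulated per-member cost, not a real LDAP call
    return name


def provision_all(groups: dict[str, int], workers: int = 4) -> list[str]:
    """Provision groups on a thread pool; return names in completion order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(provision_group, n, c) for n, c in groups.items()]
        return [f.result() for f in as_completed(futures)]


if __name__ == "__main__":
    # Illustrative sizes only: one large group and two 2-member groups.
    order = provision_all({"faculty-staff": 3000, "small-a": 2, "small-b": 2})
    print(order)
```

With more than one worker, the two small groups finish while the large one is still in progress, which is the behavior Lynn's test did not see.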
>> On Mar 11, 2011, at 14:08, Lynn Garrison wrote:
>>
>>> Chris,
>>>
>>> On Mar 11, 2011, at 12:55 PM, Chris Hyzer wrote:
>>>
>>>> You could try Jim Fox's vt-ldap source which has better performance. I
>>>> think the link for it is here:
>>>>
>>>> http://staff.washington.edu/fox/grouper/dist/
>>>>
>>> I need to take a look at that. I know that you are planning on
>>> incorporating it into the standard grouper api package. Will that be 1.7
>>> or 2.0?
>>>> How are you loading your group into Grouper? Grouper loader, WS, GSH,
>>>> API?
>>>> 15 minutes for 30k members is 30 members/sec which is pretty good I
>>>> think.
>>>>
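The throughput figure above is simple arithmetic; the sketch below applies the same calculation to the other timings quoted in this thread. The function name is illustrative, and the numbers are the round figures from the emails, not new measurements.

```python
def members_per_second(members: int, minutes: float) -> float:
    """Average throughput of a bulk operation over its wall-clock time."""
    return members / (minutes * 60)


if __name__ == "__main__":
    # Round figures quoted in the thread.
    print(f"load, 30k in 15 min:      {members_per_second(30_000, 15):.1f}/s")
    print(f"provision, 32k in 45 min: {members_per_second(32_000, 45):.1f}/s")
    print(f"provision, 32k in 50 min: {members_per_second(32_000, 50):.1f}/s")
```

On these numbers the load into Grouper runs at roughly 33 members/sec, and the ldappcng provisioning at roughly 11 to 12 members/sec.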
>>>
>>> I am using gsh. I wanted to do a quick test. The next step is to try
>>> other ways to load the data.
>>>> I can't really comment on ldappcng performance.
>>>>
>>>> For loading into grouper though, it has to do a bunch of work, so even
>>>> if it does a subject query for each row, the order of magnitude of the
>>>> operation overall will still be the same if you remove the subject
>>>> query. I would assume it would be similar for ldappcng as well.
>>>> However, you will be better off if you can load by specifying the
>>>> subjectId and sourceId so it goes to one source with one query to
>>>> resolve. If you only specify a netId, and call idOrIdentifier, and not
>>>> sourceId, that will query each source (~4?), with two queries (8 total).
>>>> The loader is pretty efficient about this.
>>>>
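The query counts described above can be tallied with a small sketch. It takes Chris's rough figures at face value (about 4 sources, 2 queries per source for an idOrIdentifier lookup) and models no actual Grouper behavior; the function and parameter names are illustrative.

```python
def subject_queries(members: int, sources: int = 1, queries_per_source: int = 1) -> int:
    """Total subject-resolution queries for a load of `members` rows."""
    return members * sources * queries_per_source


if __name__ == "__main__":
    # subjectId + sourceId: one source, one query per member.
    by_id = subject_queries(32_000)
    # netId via idOrIdentifier without sourceId: ~4 sources x 2 queries each (assumed).
    by_identifier = subject_queries(32_000, sources=4, queries_per_source=2)
    print(by_id, by_identifier)
```

For a 32k-member load that is 32,000 queries versus 256,000, an eightfold difference in subject lookups alone.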
>>>> Generally Grouper is designed assuming there is not a lot of membership
>>>> churn, right? I.e. you will have your 30k students in a group, and each
>>>> day, you might add or remove a couple hundred, and then a few days a
>>>> year, you have +- 5k. Those loads will take a while as you see for the
>>>> few days where the new students are entered.
>>>>
>>>> Do you think these performance numbers will be a problem for you?
>>>>
>>>
>>> The timing of provisioning to ldap will be a problem for us. We use GPFS
>>> for all our shared file systems, and we control group access to the shared
>>> file systems with groups in ldap. So all of our groups have to be provisioned
>>> to ldap. Currently we support three types of groups - standing
>>> (department), course and user managed groups. The standing groups are
>>> created once and changes to the groups are made once a day. Course
>>> groups are created once a semester and changes are made once a day. The
>>> user managed groups are maintained only in ldap and changes appear as
>>> soon as they occur. One of the reasons that we are looking at grouper
>>> is to make the changes to standing and course groups in real time.
>>> During our testing we ran into several areas of concern. We were
>>> running ldappcng with a bulksync and interval of 180 seconds. On the
>>> first interval, the sync took about 45 minutes because the large group
>>> had to be provisioned. During that interval, I added several more small
>>> groups (2 members) to grouper. The groups didn't appear in ldap until
>>> the large group was provisioned. Once the large group was in sync,
>>> provisioning of new groups and members happened very quickly. I added
>>> one member to the large group, and the sync was back up to taking 45
>>> minutes. It appeared that it modified all members of the group instead
>>> of just the one added. At least that is what I believe the SPML
>>> was telling me. The large group is our faculty staff group and could
>>> potentially change multiple times a day.
>>>
>>> Concerns
>>> 1. Minor change to membership appears to re-provision the group
>>> 2. Provisioning appears to be single threaded
>>>
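For the first concern, the alternative to re-provisioning a whole group is to compute the membership delta and send only that. The sketch below shows the idea with plain sets. It is an assumption about what an incremental sync could look like, not a description of ldappcng; the `member` attribute name, the DN format and the tuple layout are all hypothetical.

```python
from __future__ import annotations


def membership_delta(current: list[str], desired: list[str]) -> tuple[list[str], list[str]]:
    """Return (to_add, to_remove) needed to turn `current` into `desired`."""
    cur, des = set(current), set(desired)
    return sorted(des - cur), sorted(cur - des)


def to_modifications(to_add: list[str], to_remove: list[str], attr: str = "member"):
    """Build a minimal list of (operation, attribute, values) changes.

    The tuple layout is illustrative and not tied to any LDAP library.
    """
    mods = []
    if to_add:
        mods.append(("add", attr, to_add))
    if to_remove:
        mods.append(("delete", attr, to_remove))
    return mods


if __name__ == "__main__":
    # Hypothetical DNs: a 32k-member group gaining one member.
    in_ldap = [f"uid=u{i},dc=example,dc=edu" for i in range(32_000)]
    in_grouper = in_ldap + ["uid=new,dc=example,dc=edu"]
    print(to_modifications(*membership_delta(in_ldap, in_grouper)))
```

Adding one member to a 32k-member group then yields a single change with one value, rather than a rewrite of every member.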
>>>> Btw, I don't think it will change your performance that much, but I
>>>> think that a JDBC source has opportunities in the future to have some
>>>> performance improvements and feature improvements since it could bulk
>>>> load subjects and page/sort better, but I'm not an ldap person, so maybe
>>>> we could do a similar thing there too. At Penn we just use the jdbc2
>>>> source...
>>>>
>>>> Thanks,
>>>> Chris
>>>>
>>>> -----Original Message-----
>>>> On Behalf Of Lynn Garrison
>>>> Sent: Friday, March 11, 2011 11:02 AM
>>>> Subject: [grouper-users] ldap source vs jdbc source
>>>>
>>>> Our test environment at Penn State is configured with an ldap source;
>>>> oracle for the gsa. We recently executed a test load with our largest
>>>> group - ~32k faculty/staff members. We loaded all the members to
>>>> grouper and then provisioned them to ldap using ldappcng. We executed
>>>> the test several times.
>>>>
>>>> The load into grouper executed in 15 to 18 minutes.
>>>> The provision to ldap with ldappcng executed in 45 to 50 minutes.
>>>>
>>>> We are using the source api and ldappcng version 1.6.3.
>>>> Questions:
>>>>
>>>> 1. Are these reasonable times?
>>>> 2. Would we see an improvement in the ldappcng execution time if we
>>>> were using a jdbc source?
>>>>
>>>>
>>>> We are looking at replacing the current mechanism for managing groups
>>>> - 66000+ groups, ~32k members in the largest group. One of the
>>>> requirements is that all groups be provisioned to ldap and available for
>>>> use as soon as they are created.
>>>>
>>>
>>>
>>
>>
>


