grouper-users - Re: [grouper-users] ldap source vs jdbc source

Subject: Grouper Users - Open Discussion List

  • From: Lynn Garrison <>
  • To: Rob Hebron <>
  • Cc:
  • Subject: Re: [grouper-users] ldap source vs jdbc source
  • Date: Tue, 15 Mar 2011 13:39:53 -0400

Rob,
Thanks for the information. We will probably take a look at using
the changelog.
Lynn

On Mar 14, 2011, at 3:08 PM, Rob Hebron wrote:

> Lynn,
>
> We had a similar problem at Cardiff University a few years ago: we needed
> Group provisioning to be faster than seemed to be possible using LDAPPC.
> The solution was to track groups for changes using the changelog process,
> and send these through to the directory.
>
> We used our existing Identity Management infrastructure to process events
> sent as JSON objects (code for this is now in Grouper, and documentation
> is on the Wiki). Grouper runs a thread per changelog consumer, so it would
> be possible to run multiple threads for any operation (with care). Separate
> threads for STEM, GROUP and MEMBERSHIP events were what we went for, and we
> get changes sent through to our LDAP directories within a couple of minutes.
>
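The per-event-type threading Rob describes can be sketched roughly as follows. This is a minimal illustration, not the Grouper changelog or ESB consumer code: the function `run_consumers`, the `send` callback, and the event fields (`category`, `action`, and so on) are all hypothetical. It assumes each event carries one of the three categories named above and is forwarded as a JSON string.

```python
import json
import queue
import threading

# Event categories named in the message above. The event fields used
# here ("category", "action", ...) are hypothetical.
CATEGORIES = ("STEM", "GROUP", "MEMBERSHIP")


def run_consumers(events, send):
    """Route each event to a per-category worker that calls send(category, json_text)."""
    queues = {category: queue.Queue() for category in CATEGORIES}

    def worker(category):
        while True:
            event = queues[category].get()
            if event is None:  # sentinel: no more events for this category
                return
            send(category, json.dumps(event, sort_keys=True))

    threads = [threading.Thread(target=worker, args=(c,)) for c in CATEGORIES]
    for thread in threads:
        thread.start()
    for event in events:
        queues[event["category"]].put(event)
    for category in CATEGORIES:
        queues[category].put(None)
    for thread in threads:
        thread.join()


if __name__ == "__main__":
    demo = [
        {"category": "GROUP", "action": "add", "name": "test:a"},
        {"category": "MEMBERSHIP", "action": "add", "group": "test:a", "subject": "s1"},
    ]
    run_consumers(demo, lambda category, payload: print(category, payload))
```

Order is preserved within a category but not across categories, which is one reading of the "with care" caveat: a membership event could reach the directory before the event that creates its group.
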
> Cardiff doesn't use the ESB consumer exclusively though - in practice
> Grouper Loader, LDAPPC-NG and the ESB changelog consumer are now used in
> combination.
>
> Rob
>
> On 11/03/11 19:08, Lynn Garrison wrote:
>> Chris,
>>
>> On Mar 11, 2011, at 12:55 PM, Chris Hyzer wrote:
>>
>>> You could try Jim Fox's vt-ldap source which has better performance. I
>>> think the link for it is here:
>>>
>>> http://staff.washington.edu/fox/grouper/dist/
>>>
>> I need to take a look at that. I know that you are planning on
>> incorporating it into the standard grouper api package. Will that be 1.7
>> or 2.0?
>>> How are you loading your group into Grouper? Grouper loader, WS, GSH,
>>> API?
>>> 15 minutes for 30k members is 30 members/sec which is pretty good I think.
>>>
>>
>> I am using gsh. I wanted to do a quick test. The next step is to
>> try other ways to load the data.
>>> I can't really comment on ldappcng performance.
>>>
>>> For loading into grouper though, it has to do a bunch of work, so even if
>>> it does a subject query for each row, the order of magnitude of the
>>> operation overall will still be the same if you remove the subject query.
>>> I would assume it would be similar for ldappcng as well. However, you
>>> will be better off if you can load by specifying the subjectId and
>>> sourceId so it goes to one source with one query to resolve. If you only
>>> specify a netId, and call idOrIdentifier, and not sourceId, that will
>>> query each source (~4?), with two queries (8 total). The loader is
>>> pretty efficient about this.
>>>
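The query counts Chris mentions (one query when both subjectId and sourceId are given, about eight when only an identifier is given) can be illustrated with a toy model. This is a sketch under stated assumptions, not the Grouper subject API: the `Source` class, both resolve functions, and the source names are hypothetical. It assumes four sources and that an unqualified lookup asks every source twice, once by id and once by identifier.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical subject source that counts the queries it receives."""
    source_id: str
    by_id: dict = field(default_factory=dict)          # subjectId -> display name
    by_identifier: dict = field(default_factory=dict)  # netId -> subjectId
    queries: int = 0

    def find_by_id(self, subject_id):
        self.queries += 1
        return self.by_id.get(subject_id)

    def find_by_identifier(self, identifier):
        self.queries += 1
        return self.by_identifier.get(identifier)


def resolve_with_source(sources, source_id, subject_id):
    """subjectId plus sourceId: one source, one query."""
    return sources[source_id].find_by_id(subject_id)


def resolve_any_source(sources, id_or_identifier):
    """Identifier only: every source is asked by id and by identifier."""
    matches = []
    for src in sources.values():
        found_by_id = src.find_by_id(id_or_identifier)
        found_by_identifier = src.find_by_identifier(id_or_identifier)
        if found_by_id is not None:
            matches.append((src.source_id, id_or_identifier))
        elif found_by_identifier is not None:
            matches.append((src.source_id, found_by_identifier))
    return matches


def make_sources():
    """Four hypothetical sources; only the first one knows the subject."""
    sources = {f"source{i}": Source(f"source{i}") for i in range(1, 5)}
    sources["source1"].by_id["12345"] = "Example Person"
    sources["source1"].by_identifier["abc123"] = "12345"
    return sources


def total_queries(sources):
    return sum(src.queries for src in sources.values())


if __name__ == "__main__":
    direct = make_sources()
    resolve_with_source(direct, "source1", "12345")
    broad = make_sources()
    resolve_any_source(broad, "abc123")
    print(total_queries(direct), total_queries(broad))
```

Over a 30k-member load the difference is one query per row versus eight, though as noted above the overall order of magnitude of the load stays the same.
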
>>> Generally Grouper is designed assuming there is not a lot of membership
>>> churn, right? I.e. you will have your 30k students in a group, and each
>>> day, you might add or remove a couple hundred, and then a few days a
>>> year, you have +- 5k. Those loads will take a while as you see for the
>>> few days where the new students are entered.
>>>
>>> Do you think these performance numbers will be a problem for you?
>>>
>>
>> The time it takes to provision to ldap will be a problem for us. We use GPFS
>> for all of our shared file systems, and we control access to those file
>> systems with groups in ldap. So all of our groups have to be provisioned
>> to ldap. Currently we support three types of groups - standing
>> (department), course and user managed groups. The standing groups are
>> created once and changes to the groups are made once a day. Course groups
>> are created once a semester and changes are made once a day. The user
>> managed groups are maintained only in ldap and changes appear as soon as
>> they occur. One of the reasons that we are looking at grouper is to make
>> the changes to standing and course groups in real time.
>> During our testing we ran into several areas of concern. We were
>> running ldappcng with a bulksync and interval of 180 seconds. On the
>> first interval, the sync took about 45 minutes because the large group had
>> to be provisioned. During that interval, I added several more small
>> groups (2 members) to grouper. The groups didn't appear in ldap until the
>> large group was provisioned. Once the large group was in sync,
>> provisioning of new groups and members happened very quickly. I added
>> one member to the large group, and the sync was back up to taking 45
>> minutes. It appeared that it modified all members of the group instead of
>> just the one added. At least that is what I believe the SPML was
>> telling me. The large group is our faculty/staff group and could
>> potentially change multiple times a day.
>>
>> Concerns
>> 1. Minor change to membership appears to re-provision the group
>> 2. Provisioning appears to be single threaded
>>
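The first concern is the difference between rewriting a whole membership list and sending only what changed. The sketch below shows the second approach as a plain set difference. It makes no claim about how ldappcng computes its updates: `membership_delta` and the member names are hypothetical, and it assumes both the provisioned and desired member lists can be read as sets of ids.

```python
def membership_delta(provisioned, desired):
    """Return (to_add, to_remove) so that only changed members are written."""
    provisioned, desired = set(provisioned), set(desired)
    return sorted(desired - provisioned), sorted(provisioned - desired)


if __name__ == "__main__":
    # Hypothetical 32,000-member group with one member added in Grouper.
    in_ldap = {f"user{i}" for i in range(32000)}
    in_grouper = in_ldap | {"newuser"}
    to_add, to_remove = membership_delta(in_ldap, in_grouper)
    print(len(to_add), len(to_remove))
```

An LDAP modify request can add or delete individual attribute values, so a delta like this could in principle be sent as one small change instead of rewriting the full member attribute.
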
>>> Btw, I don't think it will change your performance that much, but I think
>>> that a JDBC source has opportunities in the future to have some
>>> performance improvements and feature improvements since it could bulk
>>> load subjects and page/sort better, but I'm not an ldap person, so maybe
>>> we could do a similar thing there too. At Penn we just use the jdbc2
>>> source...
>>>
>>> Thanks,
>>> Chris
>>>
>>> -----Original Message-----
>>> From: [mailto:] On Behalf Of Lynn Garrison
>>> Sent: Friday, March 11, 2011 11:02 AM
>>> To:
>>> Subject: [grouper-users] ldap source vs jdbc source
>>>
>>> Our test environment at Penn State is configured with an ldap source;
>>> oracle for the gsa. We recently executed a test load with our largest
>>> group - ~32k faculty/staff members. We loaded all the members to grouper
>>> and then provisioned them to ldap using ldappcng. We executed the test
>>> several times.
>>>
>>> The load into grouper executed in 15 to 18 minutes.
>>> The provision to ldap with ldappcng executed in 45 to 50 minutes.
>>>
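As a rough check on the timings above, the rates work out to about 30 to 36 members per second for the load into grouper and about 11 to 12 members per second for provisioning to ldap. The sketch assumes exactly 32,000 members (the message says ~32k); `members_per_second` is a hypothetical helper.

```python
def members_per_second(members, minutes):
    """Average throughput for a run that handled `members` in `minutes`."""
    return members / (minutes * 60)


if __name__ == "__main__":
    members = 32_000  # assumption: "~32k" taken as exactly 32,000
    for label, minutes in [("load", 15), ("load", 18),
                           ("provision", 45), ("provision", 50)]:
        rate = members_per_second(members, minutes)
        print(f"{label:9s} {minutes:2d} min: {rate:.1f} members/sec")
```
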
>>> We are using the source api and ldappcng version 1.6.3.
>>> Questions:
>>>
>>> 1. Are these reasonable times?
>>> 2. Would we see an improvement in the ldappcng execution time if we
>>> were using a jdbc source?
>>>
>>>
>>> We are looking at replacing the current mechanism for managing groups
>>> - 66000+ groups, ~32k members in the largest group. One of the
>>> requirements is that all groups be provisioned to ldap and available for
>>> use as soon as they are created.
>>>
>>
>



