grouper-dev - Re: [grouper-dev] Hello from Duke

Subject: Grouper Developers Forum

  • From: Shilen Patel
  • To: Tom Barton
  • Subject: Re: [grouper-dev] Hello from Duke
  • Date: Sun, 08 Jul 2007 12:14:47 -0400

Tom Barton wrote:
Thanks for detailing these issues - it really helps to get this down to specifics.

I'll start with a few follow-up questions/remarks (below), and expect that Gary, blair, or others will chime in.

Shilen Patel wrote:
Hi all,

It was great to meet everyone. I just want to mention some of the issues we discussed in our meeting last week, so that they're documented here.

We have two main types of groups in Grouper.

1. Dynamic Groups - These are groups determined by an LDAP filter.

Because people ask a lot about grouper support for dynamic groups, can you clarify? Ie, do you mean that you run a periodic process that evaluates a canned set of ldap queries and uses the results to maintain memberships in some corresponding grouper groups?

We use Novell IdM to maintain memberships in Grouper groups in real-time. So if a person's attributes in LDAP are changed so that the person should be a member of a dynamic group, Novell IdM will trigger an event that will cause some custom code to run that uses the Grouper API to update the memberships.
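The event-driven update described above can be sketched roughly as follows. This is only an illustration of the idea, not Duke's code and not the Grouper API: the function `sync_on_change`, the group name, and the use of Python callables in place of LDAP filters are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for an LDAP filter: a predicate over a person's attributes.
Filter = Callable[[Dict[str, str]], bool]


def sync_on_change(
    subject_id: str,
    attrs: Dict[str, str],
    dynamic_groups: Dict[str, Filter],
    memberships: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Re-evaluate every dynamic group for one changed person.

    Returns the list of ("add" | "remove", group_name) changes applied.
    """
    changes = []
    for group, matches in dynamic_groups.items():
        members = memberships.setdefault(group, set())
        should_be_member = matches(attrs)
        if should_be_member and subject_id not in members:
            members.add(subject_id)
            changes.append(("add", group))
        elif not should_be_member and subject_id in members:
            members.discard(subject_id)
            changes.append(("remove", group))
    return changes


# Made-up group name and attribute, for illustration only.
groups = {"duke:dynamic:staff": lambda a: a.get("affiliation") == "staff"}
state: Dict[str, Set[str]] = {}
print(sync_on_change("u1", {"affiliation": "staff"}, groups, state))
```

Only the groups whose result actually changes produce a write, which is what keeps a per-event update cheap compared with a periodic full re-evaluation.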


2. Class Roster Groups - We have 3 groups per class (instructors, students, and TAs).

About how many classes per term, and how many terms do you keep in the grouper database? I'm guessing 3 groups per class x 2500 classes per term x 6 terms (2 yrs?), or 45000 class groups?

We have about 17,000 classes per year and two years of classes in Grouper right now. So that's about 100,000 class groups.



There is also a TAAdmins group that has an admin privilege to all of the TA groups.

The import file is 230 MB (from the 1.1 Grouper export). There are
846,102 memberships, 102,093 groups, and 89,514 stems.

I'm curious about the nearly 1-1 ratio of stems to groups. What does your stem structure look like, and when do you decide to create a stem vs create a group?

About 99% of our stems and groups and 50% of our memberships are from class data. The rest is almost all from the dynamic groups.

Our class data groups are formatted as follows: duke:siss:courses:<SUBJECT>:<CATALOG NUMBER>:<SECTION NUMBER>:<CLASS NUMBER>:<TERM NUMBER>:<instructors or TAs or students>. So the subject, catalog number, section number, class number, and term number are stems. Most of the stems were probably created with the first year of classes, and as the number of years of classes increases, the stem-to-group ratio will no longer be around 1-1.
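To make the naming scheme concrete, here is a small sketch that builds the three group names for a class and counts the distinct stems they imply. The helper names and the sample subject, catalog, section, class, and term values are invented for illustration; only the name layout comes from the description above.

```python
from typing import Iterable, List, Tuple

PREFIX = ("duke", "siss", "courses")
ROLES = ("instructors", "TAs", "students")

# (subject, catalog number, section number, class number, term number)
ClassKey = Tuple[str, str, str, str, str]


def class_group_names(key: ClassKey) -> List[str]:
    """Return the three group names for one class."""
    stem = ":".join(PREFIX + key)
    return [f"{stem}:{role}" for role in ROLES]


def stems_of(group_name: str) -> List[str]:
    """Return every ancestor stem of a group name, shortest first."""
    parts = group_name.split(":")[:-1]  # drop the group's own extension
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]


def count_stems_and_groups(classes: Iterable[ClassKey]) -> Tuple[int, int]:
    """Count distinct stems and groups implied by a set of classes."""
    stems, groups = set(), set()
    for key in classes:
        for name in class_group_names(key):
            groups.add(name)
            stems.update(stems_of(name))
    return len(stems), len(groups)


# Made-up sample: the same class offered in two terms.
one_term = [("CPS", "100", "01", "1234", "1100")]
two_terms = one_term + [("CPS", "100", "01", "1234", "1140")]
print(count_stems_and_groups(one_term))   # (8, 3)
print(count_stems_and_groups(two_terms))  # (9, 6)
```

In this toy case a second term adds three groups but only one new stem, because every stem above the term number is shared. That is the effect expected to pull the ratio away from 1-1 over time, assuming classes keep the same identifiers across terms.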


The issues we raised are the following.

1. Navigating through the stems viewable in the "My Memberships" section in the UI takes about 2-3 seconds per page.
2. If you are a member of the TAAdmins group, the "Manage Groups" section takes about 10 minutes to load.
3. If you're not a member of the TAAdmins group and have no memberships and no privileges, the "Manage Groups" section still takes about 30 seconds to load.
4. Performing a group search takes about 20-30 seconds even if no results are returned.

I'll wait for Gary's insight before commenting on these.

5. Listing group members also takes a long time. If a group has X members and you want to see the first Y members, listing the group members causes at least X database queries and 2Y LDAP queries.

Are you using the JNDISourceAdapter? Are those 2Y queries each in their own ldap connection? Are the queried attributes suitably indexed? Is the ldap server operated in a way that you expect should produce quick response for the presented query load, ie, cache tuning, ram, I/O for logging, etc?

With grouper 1.2.0, the UI sorts membership and other types of lists by default, which itself can increase display time. If X is large-ish, enough to notice the time it takes to display, is it more useful or less useful for the displayed list to be sorted?

Do you have a sense for the threshold value of X at which this becomes problematic, say, takes more than 1 sec to display?

Yes, we're using the JNDISourceAdapter. I'll get you some more information on where the time is spent later. We're currently moving to a new test database server.
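The cost pattern in issue 5 (work proportional to X even when only Y members are shown) can be illustrated with a toy comparison. This is not how Grouper or the JNDISourceAdapter is implemented; `CountingLookup` and both page functions are hypothetical stand-ins that only count how many per-member lookups each strategy performs.

```python
from typing import Dict, List


class CountingLookup:
    """Hypothetical directory lookup that counts how often it is called."""

    def __init__(self, directory: Dict[str, str]) -> None:
        self.directory = directory
        self.calls = 0

    def __call__(self, member_id: str) -> str:
        self.calls += 1
        return self.directory[member_id]


def first_page_naive(member_ids: List[str], lookup, page_size: int) -> List[str]:
    """Resolve every member, then keep the first page: X lookups."""
    resolved = [lookup(m) for m in member_ids]
    return resolved[:page_size]


def first_page_sliced(member_ids: List[str], lookup, page_size: int) -> List[str]:
    """Slice the ID list first, then resolve only that page: Y lookups."""
    return [lookup(m) for m in member_ids[:page_size]]


directory = {f"u{i}": f"User {i}" for i in range(1000)}
ids = list(directory)

naive, sliced = CountingLookup(directory), CountingLookup(directory)
assert first_page_naive(ids, naive, 50) == first_page_sliced(ids, sliced, 50)
print(naive.calls, sliced.calls)  # 1000 50
```

Slicing first only works when the page order does not depend on the looked-up data. If the list must be sorted by a directory attribute such as display name, every member has to be resolved before the first page can be chosen, which is why the default sorting mentioned above matters for large X.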


6. Adding a member to the TAAdmins group takes hours.

If my guess above is right, that amounts to ~15000 indirect memberships. To take 2 hrs would imply ~500ms per membership, which seems an order of magnitude too long. Can you confirm the average time it takes to add a (direct or indirect) membership (or priv assignment) from the grouper logs?

But, whatever that time, N ms x 15000 will take a while.
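The estimate above is easy to rerun with other inputs. The sketch below reproduces the ~500 ms figure and then repeats it with 34,000 TA groups, which is what the reported 17,000 classes per year over two years would give if each class has exactly one TA group and each contributes one indirect assignment (an assumption, not a measured number).

```python
def per_membership_ms(total_hours: float, memberships: int) -> float:
    """Average milliseconds per membership implied by a total run time."""
    return total_hours * 3600 * 1000 / memberships


def total_hours(per_ms: float, memberships: int) -> float:
    """Total run time in hours implied by an average per-membership cost."""
    return per_ms * memberships / 3_600_000


print(per_membership_ms(2, 15_000))         # 480.0 ms, the "~500ms" above
print(round(per_membership_ms(2, 34_000)))  # 212 ms with 17,000 x 2 TA groups
print(round(total_hours(50, 15_000), 2))    # 0.21 h if each one took 50 ms
```

Either way the total scales linearly with the number of indirect memberships, so the per-membership time from the logs is the number that decides whether hours are expected or a sign of a problem.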

7. The xml-import takes about 3-4 days for us and requires 4 GB of memory allocated to the Java process.

It appears that the current DOM-based xml import/export approach does not scale. We (ie, the grouper-dev community) should settle on an alternative. Other JAXB-supported xml processing modes? A gsh-based approach?
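For comparison with a DOM-based load, here is a minimal sketch of the general streaming idea, written with Python's standard library rather than JAXB or gsh. The element and attribute names (`export`, `group`, `member`, `name`, `id`) are invented for the example and are not the Grouper export schema; the sketch also assumes each `group` element is a direct child of the root.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Iterator, List, Tuple


def stream_groups(source: BinaryIO) -> Iterator[Tuple[str, List[str]]]:
    """Yield (group name, member ids) one group at a time.

    Only the group currently being parsed is held in memory; finished
    groups are dropped from the tree as soon as they have been yielded.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "group":
            name = elem.get("name")
            members = [m.get("id") for m in elem.findall("member")]
            yield name, members
            root.clear()  # discard already-processed children of the root


# Hypothetical sample document, not real export data.
SAMPLE = b"""<export>
  <group name="a:b:students"><member id="u1"/><member id="u2"/></group>
  <group name="a:b:TAs"><member id="u3"/></group>
</export>"""

for name, members in stream_groups(io.BytesIO(SAMPLE)):
    print(name, members)
```

With this shape, memory use depends on the largest single group rather than on the size of the whole file, which is the property a 230 MB import needs regardless of which toolkit ends up being chosen.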

8. The only way to restrict FERPA protected data is to prevent users from having read access to specific groups and stems that may have FERPA protected data. It would be nice if Grouper had a way of honoring FERPA protected data in a way such that if a user has a FERPA flag set in LDAP, people viewing members of the group would see something like "Anonymous User" instead of the user's name.

Are there users that *should* be able to see FERPA protected data as well? Ie, is it just a function of the data, or of the combination of the data and the user viewing the data? Or other context in which the data is read, like during a provisioning run vs. in a UI session?

It would be helpful if it was a function of the data and the user viewing the data. Right now when we send data to services (either with a feed or an ldap account), we usually determine if the services can view FERPA protected data on a per service basis. So one service may be able to view the private data while another service may not. It seems reasonable for Grouper to support similar functionality. For instance there can be a group of subjects in Grouper that have access to the FERPA protected data. In our case, the subjects would likely be service accounts. If this functionality is built into the API instead of just in the UI, then this would also work if we want to have services access Grouper data via a web service. What thoughts do other people have?
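As a rough sketch of what "a function of the data and the user viewing the data" could look like at the API level: the names below (`Member`, `display_names`, `ferpa_readers`) are hypothetical and do not correspond to anything in Grouper; the set of permitted viewers stands in for the proposed group of subjects with access to FERPA protected data.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set

ANONYMOUS = "Anonymous User"


@dataclass(frozen=True)
class Member:
    subject_id: str
    name: str
    ferpa: bool = False  # stands in for the FERPA flag held in LDAP


def display_names(
    members: Iterable[Member], viewer_id: str, ferpa_readers: Set[str]
) -> List[str]:
    """Return member names, masking flagged members for unprivileged viewers."""
    can_see_protected = viewer_id in ferpa_readers
    return [
        m.name if (can_see_protected or not m.ferpa) else ANONYMOUS
        for m in members
    ]


# Made-up data: one flagged member, one service account allowed to see it.
roster = [Member("u1", "Pat Jones"), Member("u2", "Sam Lee", ferpa=True)]
readers = {"svc-registrar"}

print(display_names(roster, "svc-registrar", readers))  # ['Pat Jones', 'Sam Lee']
print(display_names(roster, "svc-portal", readers))     # ['Pat Jones', 'Anonymous User']
```

Because the masking happens where names are read rather than where they are rendered, the same rule would apply to a UI session, a feed, or a web service call.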

Thanks,

-- Shilen




