
grouper-dev - Re: [grouper-dev] group membership singleton pairs suggestion

Subject: Grouper Developers Forum

  • From: Steve Edgar <>
  • To: Grouper Dev <>
  • Subject: Re: [grouper-dev] group membership singleton pairs suggestion
  • Date: Thu, 3 Apr 2008 15:12:57 -0400

Here is some more background. I took the data in our current AuthZ system, which is a home grown solution circa 1996 named Permit Server, and imported it into Grouper. In Grouper, we are storing our IDs in eduPersonPrincipalName form, as in ...



I then ran LDAPpc to sync Grouper to our directory. We used static groups, to get EZ private group capability.

One group would not sync. It was our monster big group called "cu.alumni", which contains about 190K members.

Investigation showed that there is a limit to the size of the group which can be created on the directory server. (We use Sun One.) Through experimentation, I found that limit was somewhere around 85K members. Attempting to load the group through LDIF hit the same restriction.

Out of curiosity, I threw together a Java class which would do an initial write of 85K members, and then append 5K members at a time, until all 190K members were in the group. This took a long time, but worked. I'd have to check with the sys admin, but I think there were replication issues.
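The original class was written in Java and is not shown here. As a rough illustration only, the batching it describes (one initial write of 85K members, then appends of 5K at a time) could be sketched as below. The function name and defaults are hypothetical, and the actual LDAP add/modify calls are left out.

```python
from typing import Iterator, List, Sequence


def member_batches(
    members: Sequence[str],
    initial: int = 85_000,  # size of the first write, per the figure quoted above
    step: int = 5_000,      # size of each later append
) -> Iterator[List[str]]:
    """Yield one initial batch, then fixed-size append batches.

    Hypothetical helper: the caller would create the group entry with the
    first batch and issue one modify/add per following batch.
    """
    if initial <= 0 or step <= 0:
        raise ValueError("initial and step must be positive")
    if not members:
        return
    yield list(members[:initial])
    for start in range(initial, len(members), step):
        yield list(members[start:start + step])
```

With 190K members this gives one initial write followed by 21 appends, which is consistent with the provisioning taking a long time.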

Tests showed query speed depended on the number of hasMember attributes in a group. The larger the number of hasMember attributes, the slower the response time.

For example, for the "is member in group" query, using a BindID and no SSL, a small group of 30 members handled about 35 requests/sec, a group of about 20,000 handled about 20 requests/sec, and a group of 30,000 handled about 15 requests/sec. cu.alumni came in at about 2 requests/sec.

The requests/second figures decrease by about 50% if SSL and SASL/GSS authentication are used.

We also found this in Sun's docs ...

http://developers.sun.com/identity/reference/techart/bestpractices.html

... this note ...

"Use multivalue attributes sparingly. If you must use them, for example, in the case of static groups, minimize the number of values in a multivalued attribute."

... which seemed to agree with what the tests showed.

Our target for all "is member in group" queries, regardless of the member or group being queried, is 30 requests/sec. Our Permit Server is faster than this, but this is what we'd like to get to handle peak loads.

Another performance problem was "list all groups for member". This was always slower than 2 requests/sec, because cu.alumni gets searched every time.

So that is what drove us to consider singleton pairs, which give us a throughput for "is member in group" of about 45 requests/sec, regardless of the group or member being queried (using a BindID and no SSL). "List groups for member" does about 15 requests/sec. But, as previously noted, there are drawbacks to this approach.
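For readers who have not seen the layout, here is a minimal sketch of what the two queries might look like as LDAP search filters against singleton pair entries. It assumes each entry carries one hasMember value and one isMemberOf value, as in the examples elsewhere in this thread. The function names are hypothetical, and the exact filters we use are not reproduced here.

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters RFC 4515 reserves in filter assertion values."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def is_member_filter(member: str, group: str) -> str:
    """Filter for "is member in group": matches at most one singleton pair."""
    return (
        f"(&(hasMember={escape_filter_value(member)})"
        f"(isMemberOf={escape_filter_value(group)}))"
    )


def groups_for_member_filter(member: str) -> str:
    """Filter for "list groups for member": one matching entry per group."""
    return f"(hasMember={escape_filter_value(member)})"
```

Under this assumption each search touches only small single-valued entries, so the cost should not depend on how many members a group has.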

We are not sure what approach we'll use, but we have to decide soon. We plan to do some additional testing with static groups, to look into the load on the directory server.

And... as before, we are open to any new ideas.

-- Steve.

On Apr 3, 2008, at 10:39 AM, Michael R. Gettes wrote:

This is still not clear to me. Exactly what kinds of queries don't
respond well? I was pointed to your previous posts and I didn't see
any examples of exactly what you were trying to do and the problems seen.
I remain curious. Especially with statements like:

"but we found the directory server does not like large numbers of multi-valued attributes"

/mrg

On Apr 3, 2008, at 9:44, Steve Edgar wrote:
We form the cn a bit differently, in that we do not use the source name, and use ID and group name. If ...



... was a member of ...

example:admin:it:staff

We'd get ...

cn:
:admin:it:staff
hasMember:

isMemberOf: example:admin:it:staff
cornelledugroupreadpriv=GrouperAll

I also write a small static group entry, which does not contain any group members, in a separate branch for each group. This allows a quick response to the query "list all groups". There, I also write an LDAP URL for each group, in hopes of supporting dynamic groups for vendor apps.

An unknown is compatibility with vendor apps. If the vendor app allows the specification of a search, singleton pairs should work. We've hooked up Bedework with success. If the vendor app supports dynamic groups, I think that will work, but it is untested. If the vendor app only supports static groups, you are hosed.

Another trade off with singleton pairs is provisioning time. It is slower than static groups, because you have to touch a lot of entries.

I've got a few classes whipped together which provision singleton pairs, using the same algorithm used by LDAPpc. It's crude, but works, and is enough for us to test with.

We looked into singleton pairs because of limitations we found with existing directory schema when trying to get fast query response times, scalable support for private groups, and support for very large groups. Singleton pairs is the only thing we've found so far which does all 3 of these.

Static groups easily allow scalable private groups, but we found the directory server does not like large numbers of multi-valued attributes. Query throughput decreases as static group size increases. Very large groups (over about 85K members if you are using EPPN entries) will not load, or are super slow.

Using isMemberOf under uid entries allows for fast query response times, but we do not have a scalable way to allow for private groups. If someone knows a good way to do this, we are quite interested.

We are not all that keen on singleton pairs, but it may be our best alternative for providing a fast, secure, simple and consistent AuthZ solution.

-- Steve.

On Apr 2, 2008, at 11:31 PM, Michael R. Gettes wrote:

Frankly, I don't get it.

/mrg

On Apr 2, 2008, at 3:00 PM, Kathryn Huxtable wrote:

We didn't get a chance to talk about this during the conference call today, but I wanted to get some more input.

Cornell has suggested an option for storing group memberships in LDAP wherein the memberships would be stored as two-tuples, rather as they are in the database. Cornell has their own attributes for this, but I think we could just use eduMember.

So, e.g. if there's a user in the example source, gmettes, who is a member of the group example:admin:it:staff, there would be an entry in a branch of LDAP of the form:

cn: example-gmettes-example:admin:it:staff
hasMember: uid=gmettes, ou=people, dc=example, dc=edu
isMemberOf: example:admin:it:staff

This would create as many objects in this branch of LDAP as there are membership tuples, which could get large, but the individual objects stay very small. It shouldn't complicate or increase the size of the indices at all. The "cn" attribute needs to be constructed of the unique attributes of the tuple to avoid collisions.
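As an illustration only, the entry above could be built from a membership tuple roughly as follows. The function name and the people_base default are hypothetical; the attribute names and the cn layout (source, subject, group joined with hyphens) follow the example entry.

```python
from typing import Dict


def singleton_pair_entry(
    source: str,
    subject_id: str,
    group: str,
    people_base: str = "ou=people, dc=example, dc=edu",  # assumed people branch
) -> Dict[str, str]:
    """Build the attributes of one singleton pair entry for one membership."""
    return {
        # cn combines the unique parts of the tuple to avoid collisions.
        "cn": f"{source}-{subject_id}-{group}",
        "hasMember": f"uid={subject_id}, {people_base}",
        "isMemberOf": group,
    }


entry = singleton_pair_entry("example", "gmettes", "example:admin:it:staff")
```

One such entry is written per membership tuple, so a 100,000-member group becomes 100,000 small entries rather than one entry with 100,000 attribute values.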

This would be an advantage if, as Cornell does, you have a group with more than 100,000 members. It scales very well provided that your LDAP implementation has no silly object limit. (I think Sun Directory server's licensing does have such a limit, but it's not a software limit. Does anyone know about Oracle's Sleepycat implementation, which is used by both Sun and OpenLDAP? Does it perform badly if there are Avogadro's Number of objects (not literally, of course)?)

On the other hand, it's possible that off-the-shelf applications won't deal with this.

Personally, I prefer memberships to groups, but which one was used at my former institution depended on the application being used and its requirements.

What do people think?

-K






