
grouper-dev - Brown LDAPpc 1.0 Stats

Subject: Grouper Developers Forum




  • From: "Cramton, James" <>
  • To: <>
  • Subject: Brown LDAPpc 1.0 Stats
  • Date: Fri, 20 Jul 2007 15:53:46 -0400

Apologies in advance for the long posting, but I have some decent
details of our recent LDAPpc work that should be of interest to anyone
implementing Grouper or LDAPpc, and to the ongoing LDAPpc development
efforts.

As he mentioned in his posting last week, Steve Carmody and I just
played battle of the LDAPpcs, and as expected, his version of LDAPpc
using a SQL person source outperformed my version with the LDAP person
source. The SQL person source version provisioned about 10,500 groups
from MACE Grouper in about 5 hours, but the LDAP version never completed
successfully. As a result, our initial LDAPpc implementation will use a
SQL person source, provisioned separately from our LDAP registry.
Because we are under time pressure to get our LDAP
groups infrastructure supporting several applications before the start
of the fall semester, we will be rolling with this version next month,
and looking to improve performance and functionality during the fall.

The results of the 5 hours spent provisioning are mildly interesting,
because they shed light on the relative performance characteristics of
LDAPpc with 2 profoundly different group profiles. We provisioned roughly
700 Community groups and 9783 Course groups (9 groups for each of 1087
courses). Each stem provisioned in about 2.5 hours. We have aggregate
data on the number of members per group in each of the stems:

EAB:
704 groups
215,635 memberships
18,000 members/group max
300 memberships/group, on average
4.7 groups/minute
1438 memberships/minute

COURSE:
9783 groups
64,122 memberships
~400 memberships/group max
6.6 memberships/group, on average
65 groups/minute
427 memberships/minute
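
The per-minute rates above can be reproduced from the group and membership counts. This is only a rough sketch: it assumes each stem took exactly 2.5 hours (150 minutes), which is an approximation, and the function and variable names are made up for illustration.

```python
# Assumption: each stem took about 2.5 hours (150 minutes) to provision.
MINUTES_PER_STEM = 2.5 * 60

# Counts taken from the EAB and COURSE figures above.
STEMS = {
    "EAB": {"groups": 704, "memberships": 215_635},
    "COURSE": {"groups": 9_783, "memberships": 64_122},
}


def rates(groups, memberships, minutes=MINUTES_PER_STEM):
    """Return throughput and density figures for one stem."""
    return {
        "groups_per_minute": groups / minutes,
        "memberships_per_minute": memberships / minutes,
        "memberships_per_group": memberships / groups,
    }


for name, counts in STEMS.items():
    r = rates(**counts)
    print(
        f"{name}: {r['groups_per_minute']:.1f} groups/min, "
        f"{r['memberships_per_minute']:.0f} memberships/min, "
        f"{r['memberships_per_group']:.1f} memberships/group"
    )
```

Read this way, EAB moves roughly three times as many memberships per minute as COURSE, while COURSE moves roughly fourteen times as many groups per minute.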

But...

The course groups are sparsely populated, because of our course group
schema. Only Instructor and Student are provisioned out of our SIS
system. The rest, for now, are empty, or inherit members from a child
group. As the semester progresses, many of the courses will have
empty groups manually provisioned with a small number of members. Our
course groups are organized in stems as
COURSE:SUBJECT:NUMBER:YEAR-TERM:SECTION, with 9 groups per section:

Role               Course groups  Members per group
All                1087           60, on average (inherited from Administrator and Learner)
Administrator      1087           1 (inherited from Instructor)
Instructor         1087           1
TeachingAssistant  1087           0
ContentDeveloper   1087           0
Learner            1087           29, on average (inherited from Student)
Student            1087           29, on average
Auditor            1087           0
Vagabond           1087           0

You may recognize some of these groups as eduCourse roles, but our
Learner group consists of brownEduCourse roles.
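
To make the naming scheme concrete, here is a small sketch that builds the 9 group names for one section from the COURSE:SUBJECT:NUMBER:YEAR-TERM:SECTION layout. The function name is hypothetical, and the sample section (AFRI 1050E, 2007Fall, S01) is borrowed from the log entry quoted later in this message; it is not how Grouper itself creates the groups.

```python
# The nine roles provisioned under every course section.
ROLES = [
    "All",
    "Administrator",
    "Instructor",
    "TeachingAssistant",
    "ContentDeveloper",
    "Learner",
    "Student",
    "Auditor",
    "Vagabond",
]


def section_group_names(subject, number, year_term, section):
    """Build the full group names for one course section (illustrative helper)."""
    stem = ":".join(["COURSE", subject, number, year_term, section])
    return [f"{stem}:{role}" for role in ROLES]


names = section_group_names("AFRI", "1050E", "2007Fall", "S01")
for name in names:
    print(name)

# 9 roles for each of 1087 courses gives the 9783 course groups reported above.
print(len(ROLES) * 1087)
```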

This performance is acceptable for now, although we would of course like
to reduce the runtime, since there will be high turnover in our course
groups several times per year, as students rush to register, and then
change registrations during the shopping period at the beginning of the
term. We don't expect the runtime to equal our initial runtime, but it
could be several hours, depending on the degree of change in the EAB and
course groups. Add that to the other provisioning tasks we have
running at night, and we could use a performance boost. Also, for
performance and functional reasons, we're very interested in an
event-driven model, such as Leif's upcoming JMS work. We would like to
be able to replicate group changes to LDAP in more or less real time, so
we can distribute the load on our servers and better meet our users'
expectations.


If you've bothered to read this far, you may be interested in our QA
results, performance bottlenecks and our log output puzzles, since there
are still a couple of issues we have yet to iron out. Any help in
identifying solutions to the following issues would be much appreciated:

1.
The only oddity we see in the quality of the data output in LDAP is that
some groups--and we haven't identified the pattern yet--get an
isMemberOf attribute that includes their parent groups. Also, groups
with child groups list the children as hasMember values. Neither of
these is necessarily bad for us, but they are not necessary, and I
suspect they are due to a config file setting we haven't understood
correctly. Any suggestions?

2.
Performance-wise, we're running Grouper and LDAPpc on Red Hat Enterprise
Server 3 on quad processor 64-bit SunFire X4100 intel boxes with 4GB RAM
and a nice quick SAN disk. The app server runs Apache 2.2.4, Tomcat 5.x
and Sun JDK 5.x. The DB server runs Oracle 10g. Our SunOne LDAP server
runs under Solaris on a substantially less beefy box--a SunFire 280R
dual UltraSPARC III with 2GB RAM, without a SAN connection. All servers
are on the same subnet of a Gigabit Ethernet network. The app server and DB
server tend to clip right along, rarely topping 10% or 20% of total CPU
time. Typically, the ldappc java process runs 50% - 70% of 1 CPU, so
we're not taking full advantage of the multiprocessor systems. If Oracle
bothers to notice the load, it distributes the load nicely across the 4
CPUs, but I have yet to see it break a sweat--typically in the low
single digit percentages of total CPU time. The bottleneck is i/o on the
LDAP server. SunOne tends to chug along at 20% - 30% of CPU, with 40% -
50% of process time waiting on i/o. Reducing access logging on the LDAP
server may help mitigate this issue, as we're producing 250 - 300MB of
LDAP access logs per provisioning run. We're looking into mounting the
LDAP logs on a separate disk or maybe the SAN as well, so we can hold
onto some of our access stats.

3.
Our current LDAPpc log output is much cleaner than it once was. We had
to edit the Grouper StemFinder to prevent thousands of internal
isChild:null log entries. But all we did was disable the logging of the
exception. Tom and Blair have identified the issue, though. Even though
our LDAP provisioning is performing correctly as far as we've been able
to determine, we still produce 8MB of error logs out of LDAPpc due to
entries like these:

2007-07-20 04:07:00,816:
[edu.internet2.middleware.ldappc.synchronize.GroupEntrySynchronizer]
SUBJECT[[ NAME = COURSE:AFRI:1050E:2007Fall:S01:TeachingAssistant ][ ID
= 552adac8-92c1-472d-92c1-31a2ca8f2334 ]] Subject not found using [
subject id = COURSE:AFRI:1050E:2007Fall:S01:TeachingAssistant ][ source
= g:gsa ][ filter =
[base=ou=groups,dc=brown,dc=edu][scope=2][filter=(brownGroupRDN={0})] ]

The "Subject not found" errors elude us at the moment. It strikes me as
odd to be looking in the g:gsa source adapter with an LDAP filter.

4.
Finally, LDAPpc isn't satisfied that we've completed the run
successfully. We think this is because it gets a null dto in
GrouperSession. I wonder if the GrouperSession has expired by the time
the process finishes?

2007-07-20 06:29:07,230:
[edu.internet2.middleware.ldappc.LdappcGrouperProvisioner] Grouper
Provision Failed: null dto in class
edu.internet2.middleware.grouper.GrouperSession
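
One way to look for a pattern in the "Subject not found" entries from item 3 is to tally them by group role and by source. The sketch below is only that, a sketch: it assumes every entry has the same shape as the single example quoted above, and the regular expression and function name are mine, not anything that ships with LDAPpc.

```python
import re
from collections import Counter

# Assumption: every entry looks like the one quoted in item 3.
SUBJECT_NOT_FOUND = re.compile(
    r"Subject not found using "
    r"\[\s*subject id\s*=\s*(?P<subject_id>\S+)\s*\]"
    r"\s*\[\s*source\s*=\s*(?P<source>\S+)\s*\]"
)

SAMPLE = """\
2007-07-20 04:07:00,816:
[edu.internet2.middleware.ldappc.synchronize.GroupEntrySynchronizer]
SUBJECT[[ NAME = COURSE:AFRI:1050E:2007Fall:S01:TeachingAssistant ][ ID
= 552adac8-92c1-472d-92c1-31a2ca8f2334 ]] Subject not found using [
subject id = COURSE:AFRI:1050E:2007Fall:S01:TeachingAssistant ][ source
= g:gsa ][ filter =
[base=ou=groups,dc=brown,dc=edu][scope=2][filter=(brownGroupRDN={0})] ]
"""


def summarize(log_text):
    """Count 'Subject not found' entries by trailing role name and by source."""
    # Entries may wrap across lines, so collapse whitespace before matching.
    flat = " ".join(log_text.split())
    by_role = Counter()
    by_source = Counter()
    for match in SUBJECT_NOT_FOUND.finditer(flat):
        role = match.group("subject_id").rsplit(":", 1)[-1]
        by_role[role] += 1
        by_source[match.group("source")] += 1
    return by_role, by_source


by_role, by_source = summarize(SAMPLE)
print(by_role)
print(by_source)
```

Run against the full error log, the two counters would show whether the misses cluster in the empty roles (TeachingAssistant, Auditor and so on) and whether they all come from the g:gsa source.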


James Cramton
Lead Programmer/Analyst
Brown University

401 863-7324


