grouper-users - Re: [grouper-users] moving groups and stems

Subject: Grouper Users - Open Discussion List

  • From: Monica Crawford <>
  • To: Kathryn Huxtable <>
  • Cc:
  • Subject: Re: [grouper-users] moving groups and stems
  • Date: Mon, 14 Aug 2006 11:24:50 -0500

Hi Kathryn,

Was the initial/max heap size set explicitly for the JVM when the script was kicked off? How many MB is the XML file being processed?

Setting the heap explicitly may help a bit. It looks like a DOM parser is used, and with DOM the time spent processing can increase significantly with the size of the file. That may shave the time down, but there could still be additional bottlenecks on the DB side.
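
As a quick way to check what heap the JVM actually received, here is a minimal sketch. The class name HeapReport and the helper toMegabytes are made up for illustration and are not part of Grouper; only the standard java.lang.Runtime calls are used.

```java
// Hypothetical helper, not part of Grouper: reports the heap limits
// the running JVM was actually started with.
final class HeapReport {

    // Converts a byte count to whole megabytes (rounds down).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory() is the upper bound the heap may grow to (governed by -Xmx).
        System.out.println("max heap (MB):   " + toMegabytes(rt.maxMemory()));
        // totalMemory() is the heap currently reserved by the JVM.
        System.out.println("total heap (MB): " + toMegabytes(rt.totalMemory()));
        // freeMemory() is the unused part of the currently reserved heap.
        System.out.println("free heap (MB):  " + toMegabytes(rt.freeMemory()));
    }
}
```

Running it as, for example, `java -Xms256m -Xmx1024m HeapReport` shows whether the flags took effect. The 256m and 1024m values are placeholders only; a suitable size depends on the XML file and the machine.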

Kathryn Huxtable wrote:
Sure. First of all, I miscounted. There were 50,000 subjects. There were
240,000 memberships, all JNDI subjects added as members to 638 groups.

I generated the XML import file using a Perl script that ran against my
Oracle database in about 100 seconds.

I'm afraid I don't really know the hardware specs for my test server, but
it's the same as my production server. It's a Dell PowerEdge running Red Hat
Linux. I'm using Java 1.5.

The subject connector is JNDI running against an extremely oversized Solaris
box running SunJava 5.2. Would JDBC against Oracle be faster? I can do that,
but I'll have to worry about FERPA suppressing the name and description, so
I'll have to make my own JDBC connector to add some code. I get that for
free with my ACIs in LDAP.

The grouper database is Oracle 9i running on some sort of SunFire box.

I made the mistake of running this in a shell inside XEmacs, so I had to
leave my laptop on for all that time.

Anyway, does adding 240,000 memberships take 47.5 hours normally?
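
To put those figures in per-item terms, 47.5 hours is 171,000 seconds, so 240,000 memberships works out to roughly 0.71 seconds per membership, or about 1.4 memberships per second. A small sketch of the arithmetic follows; the class and method names are hypothetical.

```java
// Hypothetical helper: turns the totals quoted above into per-item rates.
final class ImportThroughput {

    // Average wall-clock seconds spent on each item.
    static double secondsPerItem(long items, double hours) {
        return hours * 3600.0 / items;
    }

    // Average number of items completed per second.
    static double itemsPerSecond(long items, double hours) {
        return items / (hours * 3600.0);
    }

    public static void main(String[] args) {
        long memberships = 240000L; // figure from the message above
        double hours = 47.5;        // figure from the message above
        System.out.println("seconds per membership: " + secondsPerItem(memberships, hours));
        System.out.println("memberships per second: " + itemsPerSecond(memberships, hours));
    }
}
```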

I've rewritten _processMembershipLists in XmlImporter so that instead of
removing the old memberships and adding the new ones, it gets the old
membership IDs into a HashSet, does the same with the new membership IDs and
then only removes or adds those that aren't present in both. This does
updates faster, since from day to day most of these groups' memberships
won't change that much (except for three times per year at semester breaks).

But that was after running this.
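
The set-difference idea described above can be sketched roughly as follows. This is not the actual XmlImporter code; the class MembershipDiff, the method minus, and the sample IDs are invented for illustration, and memberships are assumed to be identified by plain strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: work out which memberships to remove and which to add,
// leaving memberships that appear in both lists untouched.
final class MembershipDiff {

    // Returns a new set holding the elements of left that are not in right.
    static <T> Set<T> minus(Set<T> left, Set<T> right) {
        Set<T> result = new HashSet<T>(left);
        result.removeAll(right);
        return result;
    }

    public static void main(String[] args) {
        // Sample IDs only; a real import would read these from the
        // database and from the XML file respectively.
        Set<String> oldIds = new HashSet<String>(Arrays.asList("alice", "bob", "carol"));
        Set<String> newIds = new HashSet<String>(Arrays.asList("bob", "carol", "dave"));

        Set<String> toRemove = minus(oldIds, newIds); // present before, absent now
        Set<String> toAdd = minus(newIds, oldIds);    // absent before, present now

        System.out.println("remove: " + toRemove);
        System.out.println("add:    " + toAdd);
    }
}
```

With mostly unchanged groups, both result sets stay small, so the number of database writes tracks the number of actual changes rather than the size of the group.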


On 8/12/06 8:56 AM, "Tom Barton" wrote:

Can you supply further details of the operation, the execution
environment, and exactly how it was conducted? That does sound absurdly
slow, and far off from my own experience.


Kathryn Huxtable wrote:
I just added external provisioning for all the base groups we might want to
use in composite operations. It took almost 48 hours. This is ridiculous. My
home-grown group management can populate from empty in about four hours. My
nightly update only does adds/removes and so runs very quickly.

I can obviously write similar update code, but really, what all is going on
that adding 20,000 subjects takes so long? Is a table not indexed on some
attribute, or are you creating new subject records from the JNDI data in my
subject configs?

Inquiring minds want to know (fnord).


On 8/4/06 3:43 PM, "Tom Barton" wrote:

You should be able to use the XML import/export tool for this purpose.
It'd be good to include an example of how to "prune & graft", like
you're wanting to do, in that wiki page.


Will Norris wrote:
Can groups or stems be moved to somewhere else in the group hierarchy?
Looking at the database I can't think of any technical reason why this
wouldn't be possible, unless I'm overlooking something.


Monica Crawford
University of Wisconsin-Madison
Division of Information Technology
1210 West Dayton Street, Rm 3159
Madison, Wisconsin 53706
