
grouper-users - Re: [grouper-users] Grouper database capacity estimation

Subject: Grouper Users - Open Discussion List

  • From: Shilen Patel <>
  • To: Nathan Kopp <>, "" <>
  • Subject: Re: [grouper-users] Grouper database capacity estimation
  • Date: Fri, 14 Oct 2011 14:43:13 +0000
  • Accept-language: en-US



From: Nathan Kopp <>
Date: Mon, 10 Oct 2011 15:10:10 -0400
To: "" <>
Subject: [grouper-users] Grouper database capacity estimation

We are preparing to go live with our first deployment of Grouper (1.6.3) on an Oracle 11g database with OID LDAP.

 

Does anyone have experience with estimating the database capacity required for Grouper?  I’m looking for a formula something like this:

Usage = A * #groups + B * #users + C * #memberships + D * #operations + baseline

 

#groups = number of groups stored in Grouper

#users = number of users that are involved in memberships

#memberships = total memberships across all groups

#operations = operations (due to the change log)



So this is a very rough estimate. :)

10 * #groups + 3 * #stems + 1 * #members + 1 * #memberships + 1 * #privileges + baseline (relatively small) + temporary data + audit logs.

This is mainly based on rows inserted into tables.  It's not completely accurate because it doesn't take into account that some rows contain a lot more data than other rows.  It also doesn't take into account index sizes.  The membership and privilege sizes are probably higher than 1 since they involve more (and larger) indexes than most other tables.  Also, not all memberships are the same.  A membership where the member is a group costs more than a membership where the member is a person.  And composite groups are generally more expensive since Grouper stores a flattened membership list for them.
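
As a minimal sketch, the rough formula above can be written as a small Python function. The function name and the sample deployment sizes are hypothetical and only for illustration; the weights are the ones from the formula. The result is an approximate row count, not bytes, and it leaves out temporary data, audit logs, and index overhead, which would have to be measured separately.

```python
def estimate_rows(groups, stems, members, memberships, privileges, baseline=0):
    """Approximate row count using the rough weights from the formula above.

    Excludes temporary data (change log, daemon log), audit logs, and indexes.
    """
    return (10 * groups
            + 3 * stems
            + 1 * members
            + 1 * memberships
            + 1 * privileges
            + baseline)


# Hypothetical deployment sizes, chosen only to show the arithmetic.
rows = estimate_rows(
    groups=5_000,
    stems=500,
    members=40_000,
    memberships=200_000,
    privileges=20_000,
)
print(rows)  # 311500
```

Since memberships and privileges are probably heavier than a weight of 1 suggests, treating the result as a lower bound and calibrating the weights against a test load seems safer than relying on the number directly.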

Temporary data includes the change log and the Grouper daemon log (both of which are in the database).  By default, they're deleted after a number of days so I'm considering them temporary.

The audit logs include user audit and point in time audit (v2.0+).  These generally keep growing until you delete them.

This also doesn't take into account attributes and permissions (if they are used).


 

I assume that Grouper stores an internal “subject” for each user that is assigned to a group, but if a user is never assigned to a group they do not get a row.



They get a row in the grouper_members table for other reasons as well: for example, if they are given privileges or permissions, or if they access the UI.

 

My assumption is that required capacity will always increase over time as the #operations increases due to the changelog.



True if you don't trim the change log (trimming is done by default).  But it is also true because of the auditing.


 

Also, does Grouper actually delete groups, memberships, and subjects from the database or does it only mark such items as deleted?



Members are never deleted from Grouper.  Groups and memberships that are deleted are still kept around (in different tables) for point in time auditing, but that data can be deleted.

 

Thanks much for any help anyone can offer in this area.  I can, of course, run some tests of my own to try to produce these estimates.  However, I decided to ask just in case someone else has already done this work and could share their results.  If nobody has, I will do this and then let you know what I find.

 


It would be useful to know what you find based on real tests.  

Thanks!

-- Shilen





