
grouper-dev - Re: [grouper-dev] highly available web services

Subject: Grouper Developers Forum

  • From: Tom Barton <>
  • To: "" <>
  • Subject: Re: [grouper-dev] highly available web services
  • Date: Mon, 15 Jun 2009 08:09:43 -0500

Chris knows that I agree with the perspectives offered by Jim and Niels, and more generally I aim for applications like grouper to take as little as possible into themselves, relying instead on other infrastructure layers in which a problem has been solved.

Of course, that approach has its downsides: deployers need greater skill, sites need to have more services ready to go, and the application itself usually must compromise on what it could otherwise deliver if it instead implemented internally all that it needs.

It's hard to find the right balance between what should be in-sourced and what should be out-sourced from grouper. So I appreciate Chris continuing to challenge us to think freshly about it.


Niels van Dijk wrote:
Hi Chris,

I really admire your wish to make the life of the local admin easy :)
However, I feel that your life will become *very* complicated by this :(

We may not be a typical setup, but for us, redundant MySQL was rather
the logical way to go. This is also a very well documented solution, so
I feel grouper should not have to worry about this. Perhaps the grouper
community would also be served well by just documenting very well how to
set up such an HA environment with grouper?

If you require a good in-memory cache, take a look at memcache. This is
a very good, highly scalable solution.

just my 2 cents,


Chris Hyzer wrote:
Yes, that is always what is mentioned as what to do for this.

Here are some more advantages, though I agree with you that replication might
be the way to go...

I think it would add value if Grouper (assuming you need highly available web
services) were shipped as a product which did not require database
replication, since it might not be something the institution has experience
with. I think it would be nice if the only server process Grouper needs to
function in read queries is the servlet container (no DB process to worry
about, even if there are multiple).

Another reason to have an in-memory cache of the data is to have
high-performance attribute read queries for privilege management (which can be
complex and high volume)... e.g. does ID 12345 have rights to read english110
data? It could be a direct assignment, an assignment through a role, an
assignment through a privilege SET (e.g. an assignment to all of the English
department), an assignment of a role to a SET, an assignment to a role which
is inherited from another role which the user is in, etc. Of course you
could also benefit from high volume group membership queries, though these
have simpler SQL/logic...
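
To make the layered lookup above concrete, here is a minimal sketch in Python (chosen for brevity; Grouper itself is Java). Every name in it (`PrivilegeIndex`, `role_parents`, the `read:english110` strings) is hypothetical and not part of Grouper. It assumes one level of privilege sets and that a role inherits the grants of its parent roles.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PrivilegeIndex:
    """Hypothetical in-memory structures; none of these names come from Grouper."""
    # subject id -> privileges (or privilege sets) assigned directly
    direct: dict[str, set[str]] = field(default_factory=dict)
    # subject id -> roles the subject holds
    subject_roles: dict[str, set[str]] = field(default_factory=dict)
    # role -> privileges (or privilege sets) assigned to the role
    role_grants: dict[str, set[str]] = field(default_factory=dict)
    # role -> roles whose grants it inherits (assumed direction)
    role_parents: dict[str, set[str]] = field(default_factory=dict)
    # privilege set -> member privileges (one level only in this sketch)
    set_members: dict[str, set[str]] = field(default_factory=dict)

    def _all_roles(self, subject: str) -> set[str]:
        """Roles held directly plus every role reachable through inheritance."""
        seen: set[str] = set()
        stack = list(self.subject_roles.get(subject, ()))
        while stack:
            role = stack.pop()
            if role not in seen:
                seen.add(role)
                stack.extend(self.role_parents.get(role, ()))
        return seen

    def _expand(self, grants: set[str]) -> set[str]:
        """Add the member privileges of any privilege set found in grants."""
        expanded = set(grants)
        for grant in grants:
            expanded |= self.set_members.get(grant, set())
        return expanded

    def has_privilege(self, subject: str, privilege: str) -> bool:
        grants = set(self.direct.get(subject, ()))
        for role in self._all_roles(subject):
            grants |= self.role_grants.get(role, set())
        return privilege in self._expand(grants)


# ID 12345 is a "ta"; "ta" inherits from "student"; "student" is granted
# the whole English department privilege set.
idx = PrivilegeIndex(
    direct={"12345": {"read:math101"}},
    subject_roles={"12345": {"ta"}},
    role_grants={"student": {"english-dept"}},
    role_parents={"ta": {"student"}},
    set_members={"english-dept": {"read:english110", "read:english220"}},
)
can_read = idx.has_privilege("12345", "read:english110")
```

With everything held in dictionaries, the decision is a handful of set lookups, whereas the same question asked in SQL would need joins across assignments, roles, and sets.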


-----Original Message-----
From: Jim Fox
Sent: Sunday, June 14, 2009 11:09 PM
To: Chris Hyzer

Subject: Re: [grouper-dev] highly available web services

These steps you outline are a lot more complicated than you let on.
Why not just replicate the database?


On Jun 14, 2009, at 2:25 PM, Chris Hyzer wrote:

On the call and on the list we have been discussing highly available
web services (I'm thinking about reads here not writes). Some use cases:

1. Queries that aren't conducive to LDAP: e.g. if the limits are
exceeded of LDAP, or maybe complex queries

2. If a deployment of grouper doesn't have LDAP, it doesn't need it

3. Queries that require logic, e.g. the attribute framework will
need privilege decision point queries which require logic to get the

So I think web services should:

1. Slurp all data from the database into objects, and serialize them
to disk.

2. For read operations, use those structures to make the decision

3. If web services are shut down, deserialize from disk (and run
recent change logs from the time it was last serialized)

4. If a certain amount of time has passed since the last slurp,
slurp again. E.g. 24 hours

5. Have a thread which reads the change log every so often (e.g.
every 2 minutes, configurable), update the object structures. Every
so often (30 minutes), serialize to disk

6. There could be an option of configuration for if you want
readonly to do this, or just hit the db, or this if the DB is

7. Yes, if you have this mode on, it will require a lot of memory
(proportional to the size of the registry)... we could swap this
out for a local database or some other storage if we like, this part
could even be pluggable

This way you should be able to run multiple web services instances,
with a load balancer, and be highly available for readonly
operations (when DB goes down)...
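
The steps above can be sketched roughly as follows, in Python for brevity (Grouper itself is Java). This is only an illustration under stated assumptions: `ReadCache`, `FakeRegistry`, `load_all`, and `changes_since` are hypothetical names, the "database" is an in-process stand-in, and `pickle` stands in for whatever serialization would really be used (pickle files should only be loaded from a trusted location).

```python
import pickle
import tempfile
import time
from pathlib import Path


class FakeRegistry:
    """Stand-in for the database: full rows plus an ordered change log."""
    def __init__(self):
        self.rows = {"english110": {"12345"}}
        self.log = []          # (change_id, action, group, subject)
        self.down = False

    def load_all(self):
        if self.down:
            raise ConnectionError("database unreachable")
        last_id = self.log[-1][0] if self.log else 0
        return {g: set(m) for g, m in self.rows.items()}, last_id

    def changes_since(self, change_id):
        if self.down:
            raise ConnectionError("database unreachable")
        return [c for c in self.log if c[0] > change_id]


class ReadCache:
    def __init__(self, source, snapshot_path, max_age_seconds=24 * 3600):
        self.source = source
        self.snapshot_path = Path(snapshot_path)
        self.max_age_seconds = max_age_seconds
        self.memberships = {}      # group name -> set of subject ids
        self.last_change_id = 0
        self.loaded_at = 0.0

    def slurp(self):
        """Steps 1 and 4: load everything, then serialize to disk."""
        self.memberships, self.last_change_id = self.source.load_all()
        self.loaded_at = time.time()
        self.save()

    def save(self):
        state = (self.memberships, self.last_change_id, self.loaded_at)
        self.snapshot_path.write_bytes(pickle.dumps(state))

    def restore(self):
        """Step 3: after a restart, start from the last snapshot if one exists."""
        if not self.snapshot_path.exists():
            return False
        state = pickle.loads(self.snapshot_path.read_bytes())
        self.memberships, self.last_change_id, self.loaded_at = state
        return True

    def refresh(self):
        """Step 5: one pass of the polling thread. Returns False if the DB is down."""
        try:
            if time.time() - self.loaded_at > self.max_age_seconds:
                self.slurp()
                return True
            for change_id, action, group, subject in self.source.changes_since(
                self.last_change_id
            ):
                members = self.memberships.setdefault(group, set())
                if action == "add":
                    members.add(subject)
                else:
                    members.discard(subject)
                self.last_change_id = change_id
            return True
        except ConnectionError:
            return False           # keep serving reads from memory

    def is_member(self, group, subject):
        """Step 2: reads never touch the database."""
        return subject in self.memberships.get(group, set())


with tempfile.TemporaryDirectory() as tmp:
    registry = FakeRegistry()
    cache = ReadCache(registry, Path(tmp) / "snapshot.pickle")
    cache.slurp()
    registry.log.append((1, "add", "english110", "67890"))
    cache.refresh()                       # picks up the change log entry
    registry.down = True
    refreshed = cache.refresh()           # DB is down, cache is untouched
    answer = cache.is_member("english110", "67890")
    restarted = ReadCache(registry, Path(tmp) / "snapshot.pickle")
    restored = restarted.restore()        # snapshot from the last slurp
```

In a real deployment `refresh` would run on a timer thread and `save` on its own, longer interval. The restarted instance above only knows what was in the last snapshot until it can replay the change log, which is the staleness trade-off the proposal accepts.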

Anyways, I don't think any of this is all that complicated or hard
or time consuming to build... also, I'm not going to work on this
soon, just throwing out ideas...

