
grouper-users - Re: [grouper-users] Grouper 1.6.1 UI membership list sorting

Subject: Grouper Users - Open Discussion List

  • From: Rob Hebron <>
  • To:
  • Subject: Re: [grouper-users] Grouper 1.6.1 UI membership list sorting
  • Date: Mon, 04 Oct 2010 08:09:51 +0100

[resent from correct account, apologies if this hits the list twice]

At Cardiff we have 2 major potential subject sources (HR and Student records) and dozens of smaller ones. Our approach was to consolidate them all into a single source since we saw some real problems ahead if we did not, including:

* Duplicate records across data sources. This would mean that one person could be multiple subjects, making a single view of a person's memberships impossible. There is a significant subset of people who fall into this category (people who are both staff and students).
* Timely and accurate de-provisioning where duplicates existed since some data sources are better administered than others
* How to deal with cases where duplicates exist in a single data source

The consolidation was the first step in an Identity Management project, which later included a Grouper implementation. I can clearly see that Grouper would have been far, far harder to implement had we not had the consolidated subject source. A side-effect of the consolidation was that other services can also use it.

So, while I can see that storing a string for sorting would be of immediate benefit, with few risks, I see icebergs ahead if Grouper were to try to cache all data about all subjects from all sources. These include:

* Consolidation and de-duplication - Grouper would be caching data from multiple sources, but presenting a consolidated set. Should subjects from different sources which represent the same person be consolidated into a single subject? Now? Ever?
* Timeliness - how up-to-date would the local Grouper subject store need to be? In our case it would be unacceptable if it were out of date by more than a few seconds.


Rob

On 04/10/10 04:41, Chris Hyzer wrote:
I think we are agreeing... the only difference is whether grouper does the
caching/aggregation or not. And I think there are more advantages for
Grouper doing it than for institutions to have to do it. Right?

Thanks,
Chris

-----Original Message-----
From: Jim Fox
[mailto:]
Sent: Sunday, October 03, 2010 11:35 PM
To: Chris Hyzer
Cc: Jim Fox; GW Brown, Information Systems and Computing; Peter DiCamillo;

Subject: Re: [grouper-users] Grouper 1.6.1 UI membership list sorting


My point is this: There are certain operations in grouper that are
intolerably slow. This is a common and valid complaint. Most of these
situations would be solved, or at least mitigated, by a local subject source.
I think this is a fundamental flaw of grouper. I think it is a necessary
correction.

From an external point of view there can be several subject sources. At UW
we have at least four. However, when cached they can very easily be
represented as a single source --- at least as a single table, obviating a
lot of the database unions you speak of. This is our own experience.

Imagine how much more efficient grouper might be if you could assume a local
subject source.

Jim


On Oct 3, 2010, at 7:48 PM, Chris Hyzer wrote:

I think another common use case is finding a subject in a group. i.e. we
have a subject picker that wants to do a subject search (in one source) where
the subject is an employee. This takes a long time since it has to do the
search of 500k subjects, then for the results, see which ones are employees
(which is batched, but still takes a while). Since I am a jdbc source guy, I
would agree that adding more capabilities into the subject source would be
nice (could solve this problem). However, when searching/sorting over
multiple sources, it seems more difficult even if all the sources are
jdbc ones... so that is why keeping a sort string in the member table (for
all subjects in all sources which are members of groups) seems like a good
thing to do. If we kept a sort string in the member table, then we could
also keep a lowered search string; then we only need to do that instead of
giving the JDBC sources extra features over JNDI. And if we do that, we might
as well keep a name/description so if they become unresolvable, we can still show who
they are (were). Though I don't see keeping other attributes in there (full
replication) :)

Thanks,
Chris
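
To make the idea of a cached sort string and lowered search string concrete, here is a minimal sketch. The helper names, the whitespace normalization, and the choice to build the search string from name plus description are all assumptions for illustration; nothing here is taken from Grouper's code.

```python
def build_member_strings(name, description=""):
    """Derive the two cached strings for one subject (hypothetical helper)."""
    # Collapse runs of whitespace so spacing differences do not affect order.
    normalized_name = " ".join(name.split())
    # Case-folded name: suitable for a case-insensitive ORDER BY.
    sort_string = normalized_name.casefold()
    # Lower-cased name plus description: suitable for substring searches.
    search_string = " ".join((name + " " + description).split()).lower()
    return sort_string, search_string


def matches(search_string, query):
    """True if every whitespace-separated term of query occurs in search_string."""
    return all(term in search_string for term in query.lower().split())
```

Both strings would have to be refreshed whenever the subject changes in its source, which is the refresh mechanism the thread mentions as an open question.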

-----Original Message-----
From: Jim Fox
[mailto:]
Sent: Sunday, October 03, 2010 12:20 PM
To: GW Brown, Information Systems and Computing
Cc: Chris Hyzer; Peter DiCamillo;

Subject: Re: [grouper-users] Grouper 1.6.1 UI membership list sorting



It is for this reason more than any other that we cache all our subjects in
tables in grouper's database. We can use simple database joins to allow the
DBMS to do all the selecting and sorting, by subject id or name or whatever.

It seems obvious enough that caching a subject sort string is not much easier
than caching the entire subject record. Possibly version 2.0 could require,
or at least optimize for, local caches of subjects.

Jim
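
The join-based approach described above can be sketched roughly as follows, using Python's built-in sqlite3 module. The table names (`cached_subjects`, `memberships`), the column names, and the sample rows are hypothetical and do not reflect Grouper's or UW's actual schema; the point is only that sorting and paging happen in one query once subjects are stored locally.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE cached_subjects (
            subject_id  TEXT PRIMARY KEY,
            name        TEXT NOT NULL,
            sort_string TEXT NOT NULL
        );
        CREATE TABLE memberships (
            group_id   TEXT NOT NULL,
            subject_id TEXT NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO cached_subjects VALUES (?, ?, ?)",
        [
            ("s1", "Zoe Young", "young zoe"),
            ("s2", "Adam Brown", "brown adam"),
            ("s3", "Mia Clark", "clark mia"),
            ("s4", "Sam Adams", "adams sam"),
        ],
    )
    conn.executemany(
        "INSERT INTO memberships VALUES (?, ?)",
        [("g1", "s1"), ("g1", "s2"), ("g1", "s3"), ("g2", "s4")],
    )
    return conn


def sorted_member_page(conn, group_id, page_size, page_number):
    """Return one page of member names, sorted and paged by the database."""
    offset = page_size * (page_number - 1)
    rows = conn.execute(
        """
        SELECT s.name
        FROM memberships m
        JOIN cached_subjects s ON s.subject_id = m.subject_id
        WHERE m.group_id = ?
        ORDER BY s.sort_string, s.subject_id
        LIMIT ? OFFSET ?
        """,
        (group_id, page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the ORDER BY is applied before LIMIT and OFFSET, every page is a slice of one globally sorted list, and no per-member lookup against an external source is needed.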

On Oct 3, 2010, at 3:49 AM, GW Brown, Information Systems and Computing wrote:

The problem is that the API is now doing the paging - but it has no means
to sort the data because, as Chris says, we don't store subject attributes.
API paging was added because even without sorting, processing very large
memberships would bring the UI to a grinding halt. So, unfortunately the
comparator.sort.limit has become redundant.

Going forward I think there are a couple of options.

1) The UI checks how many results there will be and if under
comparator.sort.limit sets this as the pagesize for the API.
2) The API stores a sort String for subjects so it can add an order by
clause to its query. Some mechanism would be needed to refresh sort Strings.

I can have a look at option 1, but we have talked about option 2 before so
I'll wait and see what the consensus is. Either way it is likely to be 2.0
before we resolve it.

Gary
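
One possible reading of option 1 can be sketched as below. The function names and the exact boundary handling are assumptions, not Grouper UI code: if the total result count is under comparator.sort.limit, the UI asks the API for everything in one page, sorts it, and then does its own paging for display.

```python
def choose_api_page_size(total_results, sort_limit, ui_page_size):
    """Return (api_page_size, sort_in_ui) for a membership listing.

    total_results: number of members the query will return
    sort_limit:    the value of comparator.sort.limit
    ui_page_size:  the page size the user selected
    """
    if total_results < sort_limit:
        # Small enough to sort: fetch everything in a single API page.
        # max(..., 1) avoids requesting a page size of zero for empty groups.
        return max(total_results, 1), True
    # Too large to sort: fall back to unsorted API paging.
    return ui_page_size, False


def ui_page(sorted_names, ui_page_size, page_number):
    """Slice one display page (1-based) out of an already sorted list."""
    start = ui_page_size * (page_number - 1)
    return sorted_names[start:start + ui_page_size]
```

With the numbers from the original report (356 members, a limit of 50000, a page size of 250), this would fetch all 356 members at once, so both display pages would come from a single sorted list.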

--On 03 October 2010 01:59 -0400 Chris Hyzer wrote:

I can reproduce that. The problem is that subject data is stored
externally to Grouper (in sources), so if there are 50000 subjects in a
group, that is 50000 subject lookups (assuming none in cache), which
doesn't scale. Granted if you had 356 with 250 on a page, that should
sort. :) What use case do you have? Trying to see if someone is in a
group? Just curious, what is the workaround? Anyway, I'm sure Gary will
weigh in here as well...

Thanks,
Chris

-----Original Message-----
From:

[mailto:]
On Behalf Of Peter DiCamillo
Sent: Saturday, October 02, 2010 11:14 PM
To:

Subject: [grouper-users] Grouper 1.6.1 UI membership list sorting

I'm having a problem where membership lists I view in the full Grouper
UI are not sorted as I would expect. When the size of the membership
list does not exceed the page size the list is sorted. But when more
than one page is needed, the list on each page is not sorted. The value
of comparator.sort.limit in media.properties is 50000, larger than the
size of any of our groups, because even if it's slow, we always want
sorted lists.

For example, I viewed a list with 2 direct members and 356 indirect
members. When I viewed the indirect members or all members with a page
size of 500 the list was sorted. But with a page size of 250 the lists
on the pages were not sorted.

I didn't install Grouper 1.3, but we may have had a local fix for this.
But in any case, it appears that since 1.3 the code for handling
pages and sorting has changed.

Peter




----------------------
GW Brown, Information Systems and Computing





