grouper-dev - RE: [grouper-dev] Change Log consumer with respect to PIT data
Subject: Grouper Developers Forum
- From: "Black, Carey M." <>
- To: Shilen Patel <>
- Cc: "" <>
- Subject: RE: [grouper-dev] Change Log consumer with respect to PIT data
- Date: Thu, 20 Dec 2018 18:36:25 +0000
RE: "Also the PIT data doesn’t keep track of group name changes when renames
are done. "
Great to know. So my "key into PIT" being the name (String form of
path) is generally a bad idea. And I bet that is likely most of my issue.
I will move to using the ID and see how that works out. ( I am
guessing this will solve my issue. Time will tell.)
As a further comment on the topic:
Having N "names" for an object is good, and also confusing to
users as well as developers.
Id() = GUID for the object
name() = fully qualified folder structure back to the root of the repo
         (also referred to as "path" in some doc/UI)
AlternateName() = previous "name()" value
DisplayName() = English alias for the object
extentionDisplayName() = terminal portion (node?) of
Knowing that the rule is:
"Luke, use the ID. Use the ID!"
can keep others from falling into this trap (the dark side of using one of
the other forms of identifiers for Grouper objects in the API layer) too.
And maybe that is why I had to write that method for
myself. Maybe there already is one based on the "ID" values, and it looked
to me like "This is not the method you are looking for." ( I will have to
RE: " It’ll (PIT data) just know about the current name or the name at the
time of deletion. "
Uh... "current name" changes over time and PIT generally does not
track those changes.
So I think you are saying that the PIT "name data" is not managed in
a PIT way?
AKA: PIT names are limited to a single value at all times?
That name will be the value "at create" and/or "at delete" times.
Is that correct? ( If so, very good to know that
feature/design detail of PIT data.)
From: Shilen Patel
Sent: Thursday, December 20, 2018 11:27 AM
To: Black, Carey M.
Subject: Re: [grouper-dev] Change Log consumer with respect to PIT data
How are you looking for the group in the PIT data? You should be able to
find the group there even before it’s deleted. So I’m not sure why you’d
only be finding the deleted group sometimes. I wonder if you’re running into
a caching issue?
The PIT data is populated from the temp change log. So the data is there
before the change log consumers would be seeing the changes.
Also the PIT data doesn’t keep track of group name changes when renames are
done. It’ll just know about the current name or the name at the time of
deletion. All the membership and attribute history are tied to the group id
(or stem/attribute def id). And that doesn’t change when you rename a group.
So it might help if your group and attribute assignment lookups happen using
the id instead of the name?
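To make the point above concrete, here is a minimal sketch of a store that behaves the way the PIT data is described here: one name per group that is overwritten on rename and frozen at deletion, with history tied to the id. The classes PitGroupRecord and PitStoreSketch are hypothetical stand-ins written for illustration; they are not the Grouper PIT tables or API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a PIT group row: one id, one (mutable) name.
class PitGroupRecord {
    final String id;
    String name;       // single value, overwritten on rename
    boolean deleted;

    PitGroupRecord(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical in-memory model; NOT the Grouper PIT tables or API.
class PitStoreSketch {
    private final Map<String, PitGroupRecord> byId = new HashMap<>();
    private final Map<String, Set<String>> memberHistoryByGroupId = new HashMap<>();

    void create(String id, String name) {
        byId.put(id, new PitGroupRecord(id, name));
        memberHistoryByGroupId.put(id, new HashSet<String>());
    }

    // Membership history is keyed by group id, so renames do not affect it.
    void addMember(String groupId, String subjectId) {
        memberHistoryByGroupId.get(groupId).add(subjectId);
    }

    // A rename overwrites the name; the previous name is not kept.
    void rename(String id, String newName) {
        byId.get(id).name = newName;
    }

    // Deletion keeps the record, with whatever name it had at that moment.
    void delete(String id) {
        byId.get(id).deleted = true;
    }

    PitGroupRecord findById(String id) {
        return byId.get(id);
    }

    // A name lookup can only match the single stored name.
    PitGroupRecord findByName(String name) {
        for (PitGroupRecord r : byId.values()) {
            if (r.name.equals(name)) {
                return r;
            }
        }
        return null;
    }

    Set<String> memberHistory(String groupId) {
        return memberHistoryByGroupId.get(groupId);
    }
}
```

In this model, a group created as "app:foo", renamed to "app:bar", and then deleted can no longer be found under "app:foo", while a lookup by id still returns the record and its membership history.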
On 12/20/18, 10:54 AM, "Black, Carey M." wrote:
I have started trying to deal with a change log consumer and
"[group/stem] name changes". I have taken a stab at this in the past, but did
not get very far. This time I kind of have it working, but I don't like what
I am seeing. So I am guessing that either, or both, of these things are true
at this point.
1) I am the only one doing this
2) I am doing something wrong
I will point out that:
I am working in the v2.3 universe.
I had to write some code ( AKA: isStemMarkedForSync(PITStem
parentFolder, PITAttributeDefName pitSyncAttribute) ) that uses
"GrouperDAOFactory.getFactory().getPITAttributeAssign()" to try to resolve
attribute assignments in the ancestors of a group/stem. Needing to dive that
deep into the Grouper API seems... wrong... but I found no other way to do
it. So clearly I could be doing things incorrectly here. :)
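For readers following along, the ancestor check described above can be sketched roughly as below. This is only an illustration of the idea (walk from a folder up to the root and stop at the first folder that carries the marker attribute). FolderNode and SyncMarkerSketch are hypothetical types invented for this sketch; they are not PITStem, PITAttributeDefName, or any Grouper DAO, and the actual isStemMarkedForSync code is not shown in this thread.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical folder node; stands in for a stem with a parent pointer.
class FolderNode {
    final String id;
    final FolderNode parent;   // null for the root folder
    final Set<String> assignedAttributeIds = new HashSet<>();

    FolderNode(String id, FolderNode parent) {
        this.id = id;
        this.parent = parent;
    }
}

class SyncMarkerSketch {
    // Returns true if the folder itself or any of its ancestors carries
    // the marker attribute, identified here by attribute id (not name).
    static boolean isMarkedForSync(FolderNode folder, String markerAttributeId) {
        for (FolderNode f = folder; f != null; f = f.parent) {
            if (f.assignedAttributeIds.contains(markerAttributeId)) {
                return true;
            }
        }
        return false;
    }
}
```

Keying the check on ids rather than names means a rename of any folder in the chain does not change the answer.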
What I appear to be seeing is this.
Order of events:
1) UI user deletes a group.
2) Change Log Consumer (CLC) runs before "temp change log" daemon job
runs --> no events to process, no problem.
3) "temp change log" daemon job runs ( now there are queued CLC events
to be processed )
4) Change Log Consumer (CLC) runs (again) after "temp change log" daemon
job runs --> events to process so processing starts.
Processing looks in PIT data for the "deleted group" and it finds it
but only "sometimes".
NOTE: If I throw an exception when the PIT group is not found then
the next time the CLC runs it likely finds the PIT group.
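The "fail now, succeed on the next run" behavior in the NOTE above can be illustrated with a small sketch of a batch loop that stops at the first entry it cannot handle and reports how far it got, so the remaining entries are offered again on the next run. The assumption (not confirmed in this thread) is that the consumer framework resumes after the last sequence number reported as processed. EntrySketch, EntryHandler, and BatchProcessorSketch are hypothetical names, not Grouper classes.

```java
import java.util.List;

// Hypothetical change log entry: a sequence number plus the group id it refers to.
class EntrySketch {
    final long sequenceNumber;
    final String groupId;

    EntrySketch(long sequenceNumber, String groupId) {
        this.sequenceNumber = sequenceNumber;
        this.groupId = groupId;
    }
}

// Hypothetical per-entry handler; may throw if, for example, a lookup fails.
interface EntryHandler {
    void handle(EntrySketch entry) throws Exception;
}

class BatchProcessorSketch {
    // Processes entries in order and returns the sequence number of the last
    // entry handled successfully, or lastProcessed if the first entry fails.
    static long processBatch(List<EntrySketch> entries, long lastProcessed,
                             EntryHandler handler) {
        long last = lastProcessed;
        for (EntrySketch entry : entries) {
            try {
                handler.handle(entry);
            } catch (Exception e) {
                // Stop here; a real consumer would also log the failure.
                break;
            }
            last = entry.sequenceNumber;
        }
        return last;
    }
}
```

Note that a loop like this retries forever if an entry can never be handled, which matches the "stuck" behavior described later in this message.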
So it appears that I don't understand how the state of the PIT data
integrity ( with respect to the main repository integrity ) works from a
Also, I think the following sequence of events leads to other issues (like
the intermediate states not making it into the PIT data at all):
1) UI user creates a group named "foo".
2) UI user renames group "foo" to "bar".
3) UI user renames group "bar" to "newFoo".
If those events happen "fast enough" then the CLC would see events
for all three, but the PIT data may never have the "foo" and/or "bar" group.
( At least I have not been able to find a way to find those objects. )
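One way a consumer could cope with a rename chain like the one above, without looking up the intermediate names at all, is to fold the rename events by group id and keep only the newest name per id. The sketch below assumes each rename event carries the group id plus the old and new names, and that events arrive in order; RenameEventSketch and RenameCoalescerSketch are hypothetical types for illustration, not Grouper change log classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical rename event: which group (by id) went from which name to which.
class RenameEventSketch {
    final String groupId;
    final String oldName;
    final String newName;

    RenameEventSketch(String groupId, String oldName, String newName) {
        this.groupId = groupId;
        this.oldName = oldName;
        this.newName = newName;
    }
}

class RenameCoalescerSketch {
    // Maps each group id to the newest name seen in the batch.
    // Events are assumed to be in the order they happened.
    static Map<String, String> latestNameById(List<RenameEventSketch> events) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (RenameEventSketch event : events) {
            latest.put(event.groupId, event.newName);
        }
        return latest;
    }
}
```

With the "foo", "bar", "newFoo" example, both rename events collapse to a single entry for that group id with the value "newFoo", so nothing depends on "foo" or "bar" still being findable by name.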
Even if the CLC throws several times the PIT data seems to never "get
all of the state changes".
Which can leave the CLC "throwing exceptions" and becoming
"stuck" on a given CLC event that it appears to never recover from.
So a few direct questions:
How is the PIT data constructed?
From the CLC or directly from the main repository? ( and at
what time/interval/process ? )
If it is from the main repository, that seems bad and
likely to have "gaps" if changes are made quickly.
If it is from the CLC then I don't understand how
what I am doing does not eventually find the data in the PIT. ( Maybe/likely
it is my code trying to deal with PIT data.)
I suspect the answer is that the PIT data is constructed on some "time
interval/batch" from the main repository and if enough changes happen in that
repository fast enough then the PIT becomes blind to the intermediate states.
Yet the CLC events have all of those intermediate states.