Skip to Content.
Sympa Menu

grouper-users - Re: [grouper-users] PSPNG: syncing fails with ENTRY_ALREADY_EXISTS when group has more than 1500 members

Subject: Grouper Users - Open Discussion List

List archive

Re: [grouper-users] PSPNG: syncing fails with ENTRY_ALREADY_EXISTS when group has more than 1500 members


Chronological Thread 
  • From: Dominique Petitpierre <>
  • To: "Coleman, Erik C" <>, Grouper-Users <>
  • Subject: Re: [grouper-users] PSPNG: syncing fails with ENTRY_ALREADY_EXISTS when group has more than 1500 members
  • Date: Thu, 24 Sep 2020 04:57:21 +0200
  • Organization: University of Geneva

Hello,

thanks for your message:

On 22.09.20 23:25, Coleman, Erik C wrote:

Dominique,

 

I can confirm this behavior in our environment, running Grouper 2.5.33.  Looks like about 35 of these events last Saturday, the day we run a full sync.

If it is the same symptom, this would mean that there are 35 groups with more than 1500 members (or your value of MaxValRange).
Here is a one liner I used to calculate how many groups have more than MaxValRange members:

ldapsearch ... -b OU=Grouper,OU=Groups,OU=_UNIGE,DC=isis-klif,DC=unige,DC=ch 'objectclass=group' member| sed -e 's/;.*/;/'|uniq|grep -c '^member;'
65

and a one liner for counting the groups getting the ENTRY_ALREADY_EXISTS warning:

grep ENTRY_ALREADY_EXISTS grouper_error.log|sed -e 's/.*FullSyncer(\([^)]*\)).*data already on server:/\1/'|sort -u|wc -l
65

Group membership successfully propagated to AD without any performance impact or missing entries, so we never really “paid attention” to these log entries. 

As written in https://lists.internet2.edu/sympa/arc/grouper-users/2020-09/msg00048.html :
"So that might be why people have not noticed: incremental synchronization was doing the job, not the full sync."

I.e. I am convinced that in the described context (Grouper 2.4.0, PSPNG patch >=8 or Grouper 2.5.x), PSPNG full sync does not do its job to resolve members discrepancies between Grouper groups and Active Directory groups that have more than MaxValRange members.
But since discrepancies are probably very rare (they might be caused by a bad timed stop of the loader, or if the database or directory are unreachable at the wrong moment) it might not have ever happened (PSPNG incremental sync seems quite resilient), or maybe in large groups it is not noticed. Moreover initial provisioning of large groups is handled by full sync without problem and people who have tested full sync are likely to have done it on small groups.
So, many people did not pay attention.

Honestly, there are dozens of WARN level logs that appear to be benign, have always put off combing the logs for a rainy weekend project!

 

Have you provided this information as a comment to the JIRA entry?

Currently I am testing a fix: it looks like it corrects the problem and makes the full sync job finish quicker.

I suspect that they aren’t prioritizing this with the work on the PSPNG replacement, which will completely change the architecture of group provisioning to targets.

It seems to me that a sentence like  "members never get deleted from target system" in GRP2343 should have raised a red flag back in September 2019.
It is a matter of data integrity: if full sync does not do its job fully, fixing it should be high priority.

Regards.

 

--

Erik Coleman

University of Illinois at Urbana-Champaign

 

 

 

 

From: On Behalf Of Dominique Petitpierre
Sent: Monday, September 21, 2020 7:52 PM
To: Grouper-Users
Cc: Jeffrey Williams
Subject: Re: [grouper-users] PSPNG: syncing fails with ENTRY_ALREADY_EXISTS when group has more than 1500 members

 

Hello,

 

here is a example showing that PSPNG full sync does not work in the case I was describing:

 

On 21.09.20 19:41, Dominique Petitpierre wrote:

So when the full sync has nothing to do there is no bad consequences.
But if you stop the loader process, remove a member from the group in Grouper, and start "manually" a full sync, then, because of the bug that member will not be removed from the target directory group.

0) the Grouper group and target directory group are synchronized and have 1502 members, and user bello6 is not one of them.

ldapsearch ... gidnumber=5016234 'member'|grep -c bello6
0
 
ldapsearch ...  gidnumber=5016234 'member;range=1500-2999'|grep -c ^member
2

1) add a new member bello6 in the target directory group with

ldapmodify ...  <<EOF
dn: CN=testgroup,OU=workflow1,OU=bpm-oracle,OU=bpm-poc,OU=application,OU=student,OU=Grouper,OU=Groups,OU=_UNIGE,DC=isis-klif,DC=unige,DC=ch
changetype: modify
add: member
member: CN=bello6,OU=LETTRES,OU=ETU,OU=UsersUnige,OU=_UNIGE,DC=isis-klif,DC=unige,DC=ch
 
EOF

2) check the new member and the new number of members (1503) in the target directory group:

ldapsearch ... gidnumber=5016234 'member'|grep bello6
member;range=0-1499: CN=bello6,OU=LETTRES,OU=ETU,OU=UsersUnige,OU=_UNIGE,DC=isis-klif,DC=unige,DC=ch
 
ldapsearch ...  gidnumber=5016234 'member;range=1500-2999'|grep -c ^member
3

3) start a full sync for that group (or wait for the next global full sync launched by the daemon "loader" process).
Here is a relevant extract of the log (log level INFO, to compare to the one in https://lists.internet2.edu/sympa/arc/grouper-users/2020-09/msg00039.html):

sed -n -e '/^2020-09-21 23:50/p' grouper_error.log.2020-09-21| egrep -a 'EXIST|\b(1502|1503)\b'|sed -e 's/.* -  - //'
Fetching subjects Completed (Success): 1502 items in 09.869s (9131.6 items/minute)
activedirectory_student-full/application:bpm-poc:bpm-oracle:workflow1:testgroup/#16234(Existing): 1502 correct member subjects. Sample: [''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch', ''/'person'/'people-test.unige.ch']...
activedirectory_student-full: Full-sync comparison for application:bpm-poc:bpm-oracle:workflow1:testgroup/#16234(Existing): Target-subject count: Correct/Actual: 1502/0
activedirectory_student-full: Group application:bpm-poc:bpm-oracle:workflow1:testgroup/#16234(Existing) has 1502 missing values
groupLdap1: Performing ldap mod (with retry) Mod Summary: dn=cn=testgroup,ou=workflow1,ou=bpm-oracle,ou=bpm-poc,ou=application,ou=student,ou=grouper,ou=groups,ou=_unige,dc=isis-klif,dc=unige,dc=ch|ADD 1502 member values|
groupLdap1: Problem while modifying ldap system based on grouper expectations. Starting to perform adaptive modifications based on data already on server: cn=testgroup,ou=workflow1,ou=bpm-oracle,ou=bpm-poc,ou=application,ou=student,ou=grouper,ou=groups,ou=_unige,dc=isis-klif,dc=unige,dc=ch: ENTRY_ALREADY_EXISTS
groupLdap1: Data already on ldap server Entry Summary: dn=cn=testgroup,ou=workflow1,ou=bpm-oracle,ou=bpm-poc,ou=application,ou=student,ou=grouper,ou=groups,ou=_unige,dc=isis-klif,dc=unige,dc=ch|1503 member values|
groupLdap1: Summary: Comparing modification of member to what is already in LDAP: ADD/1502 Values
groupLdap1: Comparing Attribute member. #Values on server already 1503. #Values in mod/ADD: 1502
FullSync Item done (Successful: Success). Stats: ins=1502|del=0|upd=0|tot=1502|t=41 secs/ProcTime: 41.384s Timing breakdown: init=0%/ReadGroupFromGrouper=0%/ClearingGroupCache=0%/StartCoordination=0%/StartProvisioning(get group & subject info)=1%/doFullSync=98%/FinishProvisioning=0%/FinishCoordination=0%/: Action="application:bpm-poc:bpm-oracle:workflow1:testgroup|qid=1|Trigger=Scheduled" full sync|ExternalRef=null|AsOf=21 sept. 2020 23:50:02|QTime=00s|Age=41.384s
FullSyncer(activedirectory_student): Full Sync of all groups progress: 1 groups of 1 (100%): ins=1502|del=0|upd=0|tot=1502|t=41 secs
FullSyncer(activedirectory_student): Full Sync of all groups: Finished. Stats: ins=1502|del=0|upd=0|tot=1502|t=0 secs
 

The first log lines are the same: because of the bug, the provisioner deduces that  the group is empty (Correct/Actual: 1502/0) and tries to add all the members again (ADD 1502 member values) and that operation fails (ENTRY_ALREADY_EXISTS). Since that failure is not fatal it continues and checks the content of the group again, with another method that is not bugged, and finds that there are 1503 members (Values on server already 1503) and eventually concludes that the supposedly missing members have been inserted (Stats: ins=1502), but no deletion occurred (del=0) and syncing is finished with 1502 members in the group.

4) check that the extra member in the target directory is still there and there are still 1503 members in the target group:

ldapsearch ... gidnumber=5016234 'member'|grep bello6
member;range=0-1499: CN=bello6,OU=LETTRES,OU=ETU,OU=UsersUnige,OU=_UNIGE,DC=isis-klif,DC=unige,DC=ch
 
ldapsearch ...  gidnumber=5016234 'member;range=1500-2999'|grep -c ^member
3

5) Conclusion: full sync does not work in that case, at least in my context (cf. Grouper version, PSPNG patches and settings in https://lists.internet2.edu/sympa/arc/grouper-users/2020-09/msg00039.html).

 

If you have Grouper 2.4.0 or greater with PSPNG patch 8 or greater, syncing to a target Active Directory, could you please try something similar for a group with at least 1501 members (more than AD lDAPAdminLimits: MaxValRange=1500), with the following log level in conf/log4j.properties:

log4j.logger.edu.internet2.middleware.grouper.pspng=INFO

or, if you want to see if there is a bad searchEntryHandlers and the resulting empty list of values for the member attribute (cf. https://lists.internet2.edu/sympa/arc/grouper-users/2020-09/msg00042.html):

log4j.logger.edu.internet2.middleware.grouper.pspng=DEBUG
log4j.logger.org.ldaptive=DEBUG

 

I am curious of your findings.
Thanks in advance!

 

Regards.


-- 
Mr Dominique Petitpierre, user=Dominique.Petitpierre domain=unige.ch
IT Division, University of Geneva, Switzerland
 



Archive powered by MHonArc 2.6.19.

Top of Page