
grouper-users - Re: [grouper-users] Problem with PSP and Active Directory replication

Subject: Grouper Users - Open Discussion List

  • From: David Langenberg <>
  • To: Gagné Sébastien <>
  • Cc: Tom Zeller <>, "<>" <>
  • Subject: Re: [grouper-users] Problem with PSP and Active Directory replication
  • Date: Mon, 29 Apr 2013 15:12:38 +0000

This is a known issue with Active Directory replication.  We have the same problem here at Chicago (outside of the PSP) and get around it by pinning all IAM operations to one DC (and closely coordinating outages).  The problem is that round-robin DNS gives you multiple connections to different DCs.  The PSP comes by, sees the group missing during the calculation, and issues a create-group command.  Then we get to the memberships, and a different connection to a different DC winds up being used; the group hasn't replicated there yet, so the lookup fails and the PSP issues another create.  Now you have two groups.  Then AD replication fires and there you go: two groups, same name, and one gets renamed.
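The sequence described above can be reproduced with a small in-memory model. This is only a sketch: the `Dc` class and the `createIfMissing` and `createsIssued` helpers are hypothetical names invented for illustration, not part of Grouper, the PSP, or Active Directory. Each DC keeps its own set of groups and nothing replicates between the two steps.

```java
// Hypothetical in-memory model: none of these classes come from Grouper or AD.
import java.util.HashSet;
import java.util.Set;

class ReplicationRaceSketch {

    /** One domain controller with its own, not-yet-replicated view of the OU. */
    static class Dc {
        final Set<String> groups = new HashSet<>();
    }

    /** Look the group up on the given DC and create it there if the lookup fails. */
    static boolean createIfMissing(Dc dc, String cn) {
        if (dc.groups.contains(cn)) {
            return false;      // lookup succeeded, nothing to create
        }
        dc.groups.add(cn);     // lookup failed on THIS DC, so a create is issued
        return true;
    }

    /** Returns how many create operations two consecutive sync steps issue. */
    static int createsIssued(boolean pinToOneDc) {
        Dc dc1 = new Dc();
        Dc dc2 = new Dc();
        String cn = "A13_4602-ETU";
        int creates = 0;

        // Step 1: the group calculation always lands on dc1 in this model.
        if (createIfMissing(dc1, cn)) creates++;

        // Step 2: the membership pass reuses dc1 when pinned,
        // or lands on dc2 when round-robin hands out another DC.
        Dc next = pinToOneDc ? dc1 : dc2;
        if (createIfMissing(next, cn)) creates++;

        return creates;
    }

    public static void main(String[] args) {
        System.out.println("round-robin: " + createsIssued(false) + " creates");
        System.out.println("pinned:      " + createsIssued(true) + " creates");
    }
}
```

In this model the round-robin case issues two creates and the pinned case issues one, which is the duplicate that replication later resolves by renaming one of the objects.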

Your options are basically:

A) the easy way: stick to a single DC.  Yes, it takes out HA and really annoys the AD Admins, but it ensures a 100% consistent domain controller.
B) the hard way: try to tune AD replication so that changes to a DC are replicated in near real time.  The trouble is that if the PSP is still faster than your replication, you'll hit this error again.
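For a client written directly against JNDI, option A amounts to putting one specific DC host name in the provider URL instead of the round-robin name. The sketch below only builds the environment table and does not open a connection. The host `dc1.example.edu` and the `PinnedDcEnv` class are hypothetical, and the PSP's own connection settings live in its configuration files, which are not shown in this thread.

```java
import java.util.Hashtable;
import javax.naming.Context;

class PinnedDcEnv {

    /** Builds a JNDI environment that targets exactly one DC (hypothetical host). */
    static Hashtable<String, String> envFor(String dcHost) {
        Hashtable<String, String> env = new Hashtable<>();
        // Standard JDK LDAP provider.
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        // A single named DC rather than the round-robin DNS alias.
        env.put(Context.PROVIDER_URL, "ldaps://" + dcHost + ":636");
        return env;
    }

    public static void main(String[] args) {
        // Passing this table to new InitialDirContext(env) would open the connection.
        System.out.println(envFor("dc1.example.edu").get(Context.PROVIDER_URL));
    }
}
```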

Dave

--
David Langenberg
Identity & Access Management
The University of Chicago




On Apr 29, 2013, at 9:01 AM, Gagné Sébastien <> wrote:

Yes, I thought of using one specific DC, but then we would lose the high availability of our LDAP. The admins here are counting on the fact that users will use one of the DNS addresses.
 
But what would cause that problem? How can the bulkSync create a group twice? One wild guess would be that it tries to create the group, then another request is made to check whether it’s indeed there, but in the meantime a new DNS lookup is made and resolves to a different DC, so it doesn’t see the group and creates it again. Would that even be possible?
 
 
From:  [mailto:grouper-] On behalf of Tom Zeller
Sent: April 29, 2013 10:35
To: 
Subject: Re: [grouper-users] Problem with PSP and Active Directory replication
 
I suggest picking one domain controller to provision to, and not using round-robin DNS.
 
Not sure about the delete operation failing on the "special" DN, but I am not surprised. My guess is that the formatting is not preserved properly on deletes.

 

On Fri, Apr 26, 2013 at 10:32 AM, Gagné Sébastien <> wrote:
Hi,
I’m having a problem with the PSP and Active Directory replication. In fact, I have two problems:
First: It seems the PSP is creating groups on two different domain controllers. This causes a conflict that is then resolved by the DCs (the older one is renamed with CNF:<uid>).
Second: The PSP isn’t able to delete the “resolved” group from the conflict.
 
Here is a little background and details :
I have a Java application that automatically creates course groups. These groups are synced in real time by the PSP as well as in bulk by the PSP. Access to my domain controllers is through a round-robin DNS name that returns one of five IPs.
 
For an as-yet-unknown reason, it seems that the PSP is creating the same group on two different domain controllers. When they replicate, the conflict is found and resolved: the older group is renamed with the suffix “CNF:<uid of object>”. For example, I have:
CN=A13_4602-ETU,OU=acad,OU=Grouper,OU=People,DC=sim,DC=umontreal,DC=ca
CN=A13_4602-ETU
CNF:f4358f7b-1fc6-462f-bb56-d0f0c7ed36d4,OU=acad,OU=Grouper,OU=People,DC=sim,DC=umontreal,DC=ca
 
(Yes there’s actually a carriage return/line feed in the name…)
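A conflict-renamed CN like the one above can be recognized and split in plain string handling. The sketch below assumes the separator is a single line feed (0x0A) followed by `CNF:`, which is how AD conflict names are commonly described; the message above calls it a carriage return/line feed, so the exact byte is worth checking against your own directory. `ConflictName` and its methods are hypothetical helpers, not PSP code.

```java
class ConflictName {

    // Assumed separator: line feed followed by the literal "CNF:".
    static final String MARKER = "\nCNF:";

    /** True if the CN value looks like a replication-conflict rename. */
    static boolean isConflict(String cn) {
        return cn.contains(MARKER);
    }

    /** The name the object had before the conflict rename. */
    static String originalName(String cn) {
        int i = cn.indexOf(MARKER);
        return i < 0 ? cn : cn.substring(0, i);
    }

    /** The identifier appended after "CNF:", or null for an ordinary name. */
    static String conflictId(String cn) {
        int i = cn.indexOf(MARKER);
        return i < 0 ? null : cn.substring(i + MARKER.length());
    }

    public static void main(String[] args) {
        String cn = "A13_4602-ETU\nCNF:f4358f7b-1fc6-462f-bb56-d0f0c7ed36d4";
        System.out.println(originalName(cn) + " / " + conflictId(cn));
    }
}
```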
 
First problem: why is the PSP creating this group twice? The creation dates tell me that both groups were created during the bulkSync, and in the logs I see (there are many more for other timestamps):
2013-04-25 13:23:04,813: [main] ERROR BaseSpmlProvider.execute(320) -  - Target 'ldap' - Lookup LookupResponse[pso=<null>,status=failure,error=noSuchIdentifier,errorMessages={},requestID=2013/04/25-13:23:04.812]
2013-04-25 13:23:04,814: [main] ERROR BaseSpmlProvider.execute(320) -  - Target 'psp' - Lookup LookupResponse[pso=<null>,status=failure,error=noSuchIdentifier,errorMessages={},requestID=2013/04/25-13:23:04.812]
 
I waited until I was sure the real-time PSP was done before starting the bulkSync, so it’s not a concurrent creation problem. One thing I find odd is that there’s only one problem group: I created 1600 groups yesterday and only one had this problem. This isn’t an isolated case either. A few days ago I had the same problem with 3 other groups; I manually removed them, thinking it was a fluke that wouldn’t come back, but it did today.
 
That was the first problem. The second one comes in the following bulkSync. The PSP is the authoritative source for my OU, so it sees the renamed group with CNF and tries to delete it. The problem is that the delete fails (see logs below). The delete works properly on all the other groups. What is the problem here? Is it the odd character in the group’s name? Is the name too long (AD doesn’t seem to have a problem with it)?
 
2013-04-26 06:02:28,720: [main] ERROR BaseSpmlProvider.execute(254) -  - Target 'ldap' - Delete DeleteResponse[status=failure,error=customError,errorMessages={CN=A13_4602-ETU
CNF:f4358f7b-1fc6-462f-bb56-d0f0c7ed36d4,OU=acad,OU=Grouper,OU=People,DC=domain,DC=umontreal,DC=ca: [LDAP: error code 34 - 0000208F: NameErr: DSID-031001BA, problem 2006 (BAD_NAME), data 8349, best match of:
        'CN=A13_4602-ETU
CNF:f4358f7b-1fc6-462f-bb56-d0f0c7ed36d4,OU=acad,OU=Grouper,OU=People,DC=domain,DC=umontreal,DC=ca'
_]},requestID=2013/04/26-06:02:28.716]
2013-04-26 06:02:28,722: [main] ERROR BaseSpmlProvider.execute(254) -  - Target 'psp' - Delete DeleteResponse[status=failure,error=customError,errorMessages={CN=A13_4602-ETU
CNF:f4358f7b-1fc6-462f-bb56-d0f0c7ed36d4,OU=acad,OU=Grouper,OU=People,DC=domain,DC=umontreal,DC=ca: [LDAP: error code 34 - 0000208F: NameErr: DSID-031001BA, problem 2006 (BAD_NAME), data 8349, best match of:
        'CN=A13_4602-ETU
CNF:f4358f7b-1fc6-462f-bb56-d0f0c7ed36d4,OU=acad,OU=Grouper,OU=People,DC=domain,DC=umontreal,DC=ca'
_]},requestID=2013/04/26-06:02:28.716]
2013-04-26 06:02:28,735: [main] ERROR Psp.execute(811) -  - Psp 'psp' - BulkSync BulkSyncResponse[id=<null>,status=failure,error=<null>,errorMessages={},requestID=2013/04/26-05:45:33.395,responses=14256]
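LDAP error code 34 is invalidDNSyntax, and the DN in the failed delete contains a raw line break. One thing worth trying, as a sketch only, is to send the control character in escaped form: the LDAP string representation of DNs (RFC 4514) allows any character to be written as a backslash followed by two hex digits. Whether this is what makes the PSP delete fail is not confirmed in this thread, and `RdnEscapeSketch.escapeControlChars` is a hypothetical helper. It handles only control characters below 0x20 and leaves the other DN special characters (comma, plus, and so on) untouched.

```java
class RdnEscapeSketch {

    /**
     * Replaces each control character (below 0x20) in an RDN value with a
     * backslash and its two-digit hex code, for example line feed -> \0A.
     * Other DN special characters are NOT escaped by this sketch.
     */
    static String escapeControlChars(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < 0x20) {
                sb.append('\\').append(String.format("%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String cn = "A13_4602-ETU\nCNF:f4358f7b-1fc6-462f-bb56-d0f0c7ed36d4";
        // Build the full DN from the escaped CN and the container from the log above.
        System.out.println("CN=" + escapeControlChars(cn)
                + ",OU=acad,OU=Grouper,OU=People,DC=domain,DC=umontreal,DC=ca");
    }
}
```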
 
Thanks
 
Sébastien Gagné,     | Analyste en informatique
514-343-6111 x33844  | Université de Montréal,
                     | Pavillon Roger-Gaudry, local X-100-11
 




