
grouper-users - [grouper-users] RE: Grouper loader sync problem

Subject: Grouper Users - Open Discussion List

  • From: Chris Hyzer <>
  • To: "Omaraie, Brad" <>, "" <>
  • Subject: [grouper-users] RE: Grouper loader sync problem
  • Date: Sat, 22 Jun 2013 01:20:13 +0000

 

> Hi Chris,
> Thanks for getting back to me so fast. For the first issue, the
> root:nameOfStem1:nameOfStem2: is actually the root stem of a large
> tree that all the groups are under. I was trying to load about one
> third of the groups under this stem first, and then the second and third,
> because each chunk takes two to three days to load and we want to do
> it over weekends. So to answer your question, the like string is set
> to the root for the large batch of groups that these three chunks are part of.

 

Right, so set the like string to blank.

 

>

> For the second issue, we are currently using 2.1.2 version. The

 

I vaguely remember this issue. Can you try this with 2.1.4?

 

> logic behind the scenes of the sync process: I assume that on the first load Grouper
> gets the subject identifier from the source view, queries LDAP for the subject
> data, and stores it in its registry. The next time it syncs, it compares

 

Grouper doesn’t store subject identifiers. So if you can do a loader job that uses subject IDs, that is better. If not, it will do lookups.

 

> sure we set them correctly. This might explain why it tries to delete/add members
> instead of comparing them.

 

No, it still shouldn’t delete/add members, though it will do subject lookups.
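To illustrate the point above, here is a minimal Java sketch of a delta-style sync. It is not Grouper's loader code; the class and method names and the sample uid/ppid values are hypothetical. It assumes the registry holds only subject IDs, so each identifier from the source view is first resolved through a lookup (standing in for the LDAP call), and only then are the two sets compared. With unchanged source data, both the add and remove sets come out empty.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical illustration only; not part of the Grouper API.
class MembershipDeltaSketch {

    /** Resolves source identifiers to subject IDs; identifiers the lookup cannot resolve are skipped. */
    public static Set<String> resolve(Collection<String> identifiers, Function<String, String> lookup) {
        Set<String> subjectIds = new TreeSet<>();
        for (String identifier : identifiers) {
            String subjectId = lookup.apply(identifier); // one lookup per identifier
            if (subjectId != null) {
                subjectIds.add(subjectId);
            }
        }
        return subjectIds;
    }

    /** Subject IDs present in the source but not yet members of the group. */
    public static Set<String> toAdd(Set<String> current, Set<String> desired) {
        Set<String> result = new TreeSet<>(desired);
        result.removeAll(current);
        return result;
    }

    /** Subject IDs that are members of the group but no longer in the source. */
    public static Set<String> toRemove(Set<String> current, Set<String> desired) {
        Set<String> result = new TreeSet<>(current);
        result.removeAll(desired);
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the directory: uid (identifier) -> ppid (subject ID).
        Map<String, String> directory = new HashMap<>();
        directory.put("uid1", "ppid1");
        directory.put("uid2", "ppid2");

        Set<String> current = new TreeSet<>(Arrays.asList("ppid1", "ppid2"));
        Set<String> desired = resolve(Arrays.asList("uid1", "uid2"), directory::get);

        System.out.println("add:    " + toAdd(current, desired));
        System.out.println("remove: " + toRemove(current, desired));
    }
}
```

If the source view supplied subject IDs directly, the resolve step (and its per-member lookups) would drop out, which is why a loader query by subject ID is the cheaper option.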

 

>
> Thanks,
> Brad
>

 

From: Omaraie, Brad [mailto:]
Sent: Friday, June 21, 2013 2:06 PM
To: Chris Hyzer
Subject: Re: Grouper loader sync problem

 

Hi Chris,

Thanks for getting back to me so fast. For the first issue, the root:nameOfStem1:nameOfStem2: is actually the root stem of a large tree that all the groups are under. I was trying to load about one third of the groups under this stem first, and then the second and third, because each chunk takes two to three days to load and we want to do it over weekends. So to answer your question, the like string is set to the root for the large batch of groups that these three chunks are part of.

 

For the second issue, we are currently using version 2.1.2. The subject query uses university IDs (uid) as the subject identifier and gets the subject IDs (ppids) from LDAP for each member, so on the first load we make one call to LDAP per member. Just so I understand the logic behind the scenes of the sync process: I assume that on the first load Grouper gets the subject identifier from the source view, queries LDAP for the subject data, and stores it in its registry. The next time it syncs, it compares the stored subject identifiers (uids in this case) with the subject identifiers coming from the source view for that group, finds the delta between the two, and processes it. Is that a correct assumption? If so, it should have stored the subject identifiers somewhere in the registry to compare against incoming subject identifiers for fast delta detection; otherwise it has to call LDAP again for each incoming subject identifier from the source view and then compare the subject IDs.

 

I'm asking because when I look at the grouper_members table, I see the subject_id, name, and description columns populated with subject data, but there's no subject identifier field! I see these are set up in the sources.xml file under the LDAP section, but there is nothing there for the subject identifier or uid. These were set up before I took over this project, and I just want to make sure we set them correctly. This might explain why it tries to delete/add members instead of comparing them.

 

I attached the log file with grouper.app.loader set to debug. As you can see, there's a tree of about 50 groups there, but they're all empty except a group with id 55. This log covers about three minutes of the loader running once a minute. And yes, I'm a Java guy and can debug this if you want me to. Can you point me to the right classes to put breakpoints in?

 

Thanks,

Brad

 

 

From: "" <>
Date: Friday, June 21, 2013 5:46 AM
To: ucla <>, "" <>
Subject: RE: Grouper loader sync problem

 

 

> 1) When loading the data for the first time, if I break the dataset into
> three chunks and load the first chunk, then when I start to load the second chunk,
> Grouper starts to delete the first chunk (members of those groups) before loading
> the second one, even though I set loader.sqlTable.likeString.removeGroupIfNotUsed = false
> in the grouper-loader.properties file and also set
> grouperLoaderGroupsLike to root:nameOfStem1:nameOfStem2:%_systemOfRecord in the loader group
> attributes. What setting should I use to force Grouper not to delete other groups while
> updating with a new SQL query? (I guess that's what you call orphan groups.)

 

I think if you don’t set the likeString, then it shouldn’t affect other groups. Though how would orphans be deleted then? Are you sure you can't set a like string for the batch of groups, or just keep it as one large batch?
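As a rough illustration of why a broad like string can touch groups outside the current chunk, here is a small Java sketch. It assumes, as this thread suggests but does not confirm in detail, that groups whose names match the like string but are absent from the current query result are treated as orphan candidates. The class, the method names, and the chunk1/chunk2 group names are hypothetical and not Grouper's API. Note that in SQL LIKE syntax `%` matches any run of characters and `_` matches any single character, so the `_` in `%_systemOfRecord` is itself a wildcard.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Grouper API.
class LikeStringScopeSketch {

    /** Converts a SQL LIKE pattern (% = any run of characters, _ = any single character) to a regex. */
    public static Pattern likeToRegex(String like) {
        StringBuilder regex = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /**
     * Existing groups that match the like string but are missing from the current
     * query result. A blank like string puts no groups in scope.
     */
    public static Set<String> orphanCandidates(String like, Collection<String> existingGroups,
                                               Set<String> groupsInQuery) {
        Set<String> result = new TreeSet<>();
        if (like == null || like.trim().isEmpty()) {
            return result;
        }
        Pattern pattern = likeToRegex(like);
        for (String group : existingGroups) {
            if (pattern.matcher(group).matches() && !groupsInQuery.contains(group)) {
                result.add(group);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> existing = Arrays.asList(
                "root:nameOfStem1:nameOfStem2:chunk1:groupA_systemOfRecord",
                "root:nameOfStem1:nameOfStem2:chunk2:groupB_systemOfRecord");
        // The second load only returns the chunk2 groups.
        Set<String> secondLoad = Collections.singleton(
                "root:nameOfStem1:nameOfStem2:chunk2:groupB_systemOfRecord");

        String broadLike = "root:nameOfStem1:nameOfStem2:%_systemOfRecord";
        System.out.println(orphanCandidates(broadLike, existing, secondLoad));
        System.out.println(orphanCandidates("", existing, secondLoad));
    }
}
```

Under that assumption, the broad like string puts the chunk1 group in scope during the chunk2 load, while a blank like string leaves it alone.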

 

 

> 2) When I have a group loaded into Grouper, if I set the loader group's
> grouperLoaderQuartzCron attribute to one minute for testing, any
> time the loader runs it deletes most of the group members and adds
> them again, although no changes were made to the group. I say most
> of them because I noticed that for a group with 55 members, it deleted
> 19 of them and then added the same 19 back again. And it keeps
> doing this every minute, although no changes were made to the group
> through the Grouper UI or to the source view that the data is being
> read from! Shouldn't the correct behavior be that if the group is
> loaded and there are no changes to its members on the source
> view side, no changes happen on the Grouper side either? Why
> does it keep deleting the group members? This can be very problematic
> in our case, with some of our groups having two or three hundred
> thousand members: if they get deleted and re-populated on
> nightly syncs, the sync will never finish!

 

Yes, it should definitely not do that! :) Hmmm… are you using 2.1.4? Does the query look up subjects by subjectId or subjectIdentifier? subjectId is better, though it should work with either. Do you have a simple example that I could reproduce? Are you a Java person who can debug this and see why? :) Can you set logging to debug and let me know the output?

 

log4j.logger.edu.internet2.middleware.grouper.app.loader = DEBUG

 

Thanks,

Chris

 

From: Omaraie, Brad []
Sent: Thursday, June 20, 2013 8:54 PM
To: Chris Hyzer
Subject: Grouper loader sync problem

 

Hi Chris,

We are getting to the point where we want to go to production with our Grouper installation. For that to happen, we need to load three large sets of data into Grouper, and we want a nightly SQL loader sync with our data sources. We are using change log consumers for provisioning to LDAP, and that seems to be working fine. But when I try to load and sync our data, I have the following problems:

 

1) When loading the data for the first time, if I break the dataset into three chunks and load the first chunk, then when I start to load the second chunk, Grouper starts to delete the first chunk (members of those groups) before loading the second one, even though I set loader.sqlTable.likeString.removeGroupIfNotUsed = false in the grouper-loader.properties file and also set grouperLoaderGroupsLike to root:nameOfStem1:nameOfStem2:%_systemOfRecord in the loader group attributes. What setting should I use to force Grouper not to delete other groups while updating with a new SQL query? (I guess that's what you call orphan groups.)

 

2) When I have a group loaded into Grouper, if I set the loader group's grouperLoaderQuartzCron attribute to one minute for testing, any time the loader runs it deletes most of the group members and adds them again, although no changes were made to the group. I say most of them because I noticed that for a group with 55 members, it deleted 19 of them and then added the same 19 back again. And it keeps doing this every minute, although no changes were made to the group through the Grouper UI or to the source view that the data is being read from! Shouldn't the correct behavior be that if the group is loaded and there are no changes to its members on the source view side, no changes happen on the Grouper side either? Why does it keep deleting the group members? This can be very problematic in our case, with some of our groups having two or three hundred thousand members: if they get deleted and re-populated on nightly syncs, the sync will never finish!

 

I know these might be repeated questions, but I looked everywhere in the documentation and couldn't find a good solution. I really appreciate your help.

 

Thanks,

Brad

 

 

 

 

 

 



