
grouper-users - RE: [grouper-users] Large number of changes and provisioning

Subject: Grouper Users - Open Discussion List


  • From: Dave Churchley <>
  • To: Jeffrey Crawford <>, Grouper Users List <>
  • Subject: RE: [grouper-users] Large number of changes and provisioning
  • Date: Wed, 16 Aug 2017 15:52:49 +0000

Seems like a great idea to me. I hope you don’t mind me asking an off-topic (but definitely related) question…

 

We’ve suffered from similar backlogs with PSP when there have been large membership changes. The first thing that struck me about your solution, however, was that you said your bulkSync takes half an hour. For us it takes about 2 weeks! For this reason, we never do a bulkSync and rely on PSP incremental provisioning to AD so when we get a backlog we just have to live with it.

 

We’re on v2.2 using PSP to provision to AD. Since we started using PSP we’ve found it to be quite slow but, under normal load, it copes pretty well and updates AD in about a minute. We do occasionally suffer provisioning backlogs though. We know there’s a big one on the horizon at the end of the month, as our academic year switches over. Last year we had a backlog for about 2 weeks.

 

Does anyone have any suggestions that might help us out? We’ve recently been testing PSPNG on v2.3 but it doesn’t appear to be quite production-ready for our needs yet.

 

Thanks
Dave

Newcastle University

 

From: [mailto:] On Behalf Of Jeffrey Crawford
Sent: 11 August 2017 21:53
To: Grouper Users List <>
Subject: Re: [grouper-users] Large number of changes and provisioning

 

I just got back from vacation. Based on Carl's response, would it be possible to implement a change like this for only the psp (or pspng)?


Jeffrey E. Crawford
Enterprise Service Team

    ^         ^

   / \  ^    / \    ^

  /   \/ \  /   \  / \

 /        \/     \/   \

/                      \

 

You have been assigned this mountain to prove to others that it *can* be moved.

 

On Thu, Jul 20, 2017 at 9:19 AM, Jeffrey Crawford <> wrote:

I suppose we are kinda doing the same thing, except I'm modifying the internal data and kicking off a full sync as opposed to redirecting the psp via message queue.

 

Although I agree the composite change could be improved, I can also see a case where someone may need to populate a large group, say alumni to be able to retrieve transcripts. That would in one go create a group of over 200,000 records. This could be performed by someone we've delegated access to for one, so we may not be immediately aware of it until load and update times become delayed.

 

It sounds like you may have set up a better, more configurable intermediate queuing system, which we don't have and probably are not going to get at the moment. So some of our choices are more limited in that regard.

 

I suppose the config could be based on a particular provisioner, so for example a different "max changes before bulk sync" setting could be set per item in the grouper-loader config file. The bulk sync settings are already configurable per provisioner, so that would make sense to me. And you make a good point that the setting for LDAP may need to be different from the setting for Google groups, etc.
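
A minimal sketch of what a per-provisioner lookup might look like. The property name `maxChangesBeforeBulkSync` and the key layout are hypothetical, invented here for illustration; nothing like this exists in the grouper-loader config file as far as this thread establishes.

```python
# Hypothetical property name: this setting does not exist in Grouper today.
SETTING = "maxChangesBeforeBulkSync"
DEFAULT_MAX_CHANGES = 10_000  # fallback when a consumer has no override


def max_changes_before_bulk_sync(props, consumer, default=DEFAULT_MAX_CHANGES):
    """Return the bulk-sync threshold for one change log consumer."""
    key = f"changeLog.consumer.{consumer}.{SETTING}"
    return int(props.get(key, default))


# Example: LDAP tolerates a larger backlog than a (hypothetical) Google consumer.
props = {
    "changeLog.consumer.psp.maxChangesBeforeBulkSync": "30000",
    "changeLog.consumer.google.maxChangesBeforeBulkSync": "2000",
}

psp_limit = max_changes_before_bulk_sync(props, "psp")
google_limit = max_changes_before_bulk_sync(props, "google")
other_limit = max_changes_before_bulk_sync(props, "someOtherConsumer")
```

Consumers without an override fall back to the default, so only the targets that behave differently need their own entry.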

 

I don't know how easy or difficult the above would be since I'm not in the code pretty much at all. I will say we use grouper pretty much as a stand-alone product. The change log works pretty well for the most part in handling changes, but there is a point of diminishing returns when a large number of changes need to happen.


Jeffrey E. Crawford
Enterprise Service Team


 

On Thu, Jul 20, 2017 at 7:32 AM, Waldbieser, Carl <> wrote:

Jeffrey,

The issue is specific to a class of provisioners.  If I assume updates dominate the work performed by LDAP services, then work performed by incremental updates to a group is O(n).  If I perform an update where I know the end state, that is O(1).

Compare that to a target that is a database (without transactions, as that will just make the example complex).  Suppose the database represents each group member as a single row.  In that case, incremental updates and bulk updates actually must perform the same amount of work.
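
To make the comparison concrete, here is a rough sketch that counts write requests only (not bytes on the wire or server-side work). The function names are made up for illustration, and the model assumes an LDAP group whose membership lives in one multi-valued attribute that can be replaced in a single request.

```python
def incremental_writes(current, target):
    """One write request per membership change (each add or remove)."""
    return len(current ^ target)


def ldap_replace_writes(current, target):
    """One request that replaces the whole member attribute, if anything differs."""
    return 0 if current == target else 1


def row_per_member_writes(current, target):
    """A table with one row per member needs one insert/delete per differing row,
    whether the change arrives incrementally or as a bulk compare."""
    return len(current ^ target)


# 1,000 current members; 500 leave and 500 new ones join.
current = {f"user{i}" for i in range(1000)}
target = {f"user{i}" for i in range(500, 1500)}

n_incremental = incremental_writes(current, target)
n_replace = ldap_replace_writes(current, target)
n_rows = row_per_member_writes(current, target)
```

Under these assumptions the incremental path and the row-per-member table both scale with the number of changed members, while the known-end-state LDAP update stays at a single request.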

In your specific example, it would actually be ideal that the composite could be edited in place without causing every member to be removed and many re-added.  It would be useful if one could create a new composite and "replace into" an existing composite.  In that case, only the actual differences would be reported.  This would more accurately reflect the *intent* of the changes an operator wanted to make.

The Lafayette LDAP provisioner does optimize to some extent to handle this case.  The provisioner does not process LDAP changes immediately as it receives them.  Instead, it collects the incremental changes in a database and processes them in batches at some short, configurable, regular interval (~20s in production).  This allows a couple optimizations:
1) If a subject has multiple add/removes to a specific group, only the last operation needs to be processed.
2) If multiple subjects are added/removed to/from a group, the group only needs to be updated 1 time for that batch.  The wider the update interval, the more subjects you can process per batch.
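
The two optimizations above can be sketched roughly as follows. This is not the Lafayette provisioner's actual code; the event shape `(group, subject, op)` and the function name are assumptions for illustration.

```python
def coalesce(changes):
    """Collapse one batch of (group, subject, op) events, op being "add" or "remove".

    Returns {group: {"add": set, "remove": set}} holding only the final
    operation seen for each (group, subject) pair, so each group can be
    updated once per batch.
    """
    last_op = {}
    for group, subject, op in changes:
        # Optimization 1: a later event for the same pair overwrites earlier ones.
        last_op[(group, subject)] = op

    per_group = {}
    for (group, subject), op in last_op.items():
        # Optimization 2: gather all surviving changes under their group.
        bucket = per_group.setdefault(group, {"add": set(), "remove": set()})
        bucket[op].add(subject)
    return per_group


# Events collected during one interval, oldest first.
batch = [
    ("staff", "alice", "remove"),
    ("staff", "alice", "add"),
    ("staff", "bob", "add"),
    ("students", "carol", "remove"),
]
updates = coalesce(batch)
```

Here four incoming events become two group updates: one for `staff` (add alice and bob) and one for `students` (remove carol).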

I have run into the scenario you are talking about.  In general, since I know it is going to create a lot of churn, my approach is to temporarily route the changes to a null route (via our RabbitMQ exchange) so the messages are discarded.  Once I am finished with the change, I reinstate the original route and then fire off a bulk sync.  Your suggestion would make that happen automatically, and I agree it would be useful.  I am unsure, though, whether Grouper should *not* produce incremental changes for the change logger in this case.

Thanks,
Carl Waldbieser
ITS Systems Programmer
Lafayette College


----- Original Message -----
From: "Jeffrey Crawford" <>
To: "Grouper Users List" <>
Sent: Tuesday, July 18, 2017 2:26:52 PM
Subject: [grouper-users] Large number of changes and provisioning

We had an interesting case show up not so long ago. Basically, there was a
change in a group that in effect removed everyone and then added them all
back (composite group change). There were > 122,000 members of the group, so
it caused a huge backlog of changes that wound up taking quite a few hours.

Eventually I just stopped grouper, set the psp entry in the
grouper_change_log_consumer table to the same number as syncGroups, and
restarted, which performed a bulkSync. That only takes 30 min.

Additionally what I noticed is that our ldap servers were backed up quite a
bit as they were busy deleting records one at a time, and then adding them
again one at a time.

It got me thinking that perhaps there should be a setting that identifies
how many records are due to change from the change log. If that number is
over, say, 10,000, then instead of processing the change log it would sync
up the psp record to match syncGroups and perform a bulk sync, which is
also easier on the LDAP servers as it does a compare and just modifies
what needs to change.

This setting would be settable by the admin, since different environments
might find they should process 30,000 before the change log takes longer
than a bulk sync, for example.
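
A rough sketch of the idea, run against an in-memory SQLite stand-in rather than a real Grouper database. The table is a simplified two-column version of grouper_change_log_consumer (the column name last_sequence_processed is an assumption here), the threshold is the hypothetical admin setting, and the backlog is approximated as the gap between the syncGroups and psp positions.

```python
import sqlite3

BULK_SYNC_THRESHOLD = 10_000  # hypothetical admin-settable limit


def last_sequence(conn, name):
    """Read the change log position recorded for one consumer."""
    row = conn.execute(
        "SELECT last_sequence_processed FROM grouper_change_log_consumer"
        " WHERE name = ?",
        (name,),
    ).fetchone()
    return row[0]


def skip_to_bulk_sync_if_needed(conn, threshold=BULK_SYNC_THRESHOLD):
    """If psp is too far behind, fast-forward it to match syncGroups.

    Returns True when the caller should now run a bulk sync instead of
    letting psp work through the change log one entry at a time.
    """
    target = last_sequence(conn, "syncGroups")
    backlog = target - last_sequence(conn, "psp")
    if backlog <= threshold:
        return False
    conn.execute(
        "UPDATE grouper_change_log_consumer SET last_sequence_processed = ?"
        " WHERE name = 'psp'",
        (target,),
    )
    conn.commit()
    return True


# Demo: psp is roughly 244,000 entries behind syncGroups.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grouper_change_log_consumer"
    " (name TEXT PRIMARY KEY, last_sequence_processed INTEGER)"
)
conn.executemany(
    "INSERT INTO grouper_change_log_consumer VALUES (?, ?)",
    [("psp", 1_000), ("syncGroups", 245_000)],
)
needs_bulk_sync = skip_to_bulk_sync_if_needed(conn)
```

As in the manual procedure, the consumer would need to be stopped while its position is changed, and the bulk sync itself still has to be kicked off separately.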

Thoughts?

Jeffrey E. Crawford

Enterprise Service Team <>


 

 



