grouper-users - RE: [grouper-users] Maintaining Grouper database size
Subject: Grouper Users - Open Discussion List
- From: "Hyzer, Chris" <>
- To: "Black, Carey M." <>
- Cc: Shilen Patel <>, David Langenberg <>, Gail H Lift <>, "" <>, Rory Larson <>
- Subject: RE: [grouper-users] Maintaining Grouper database size
- Date: Fri, 16 Feb 2018 17:52:33 +0000
- Accept-language: en-US
Here's what I will do shortly. Please read carefully and let me know ASAP about these two default choices, which I believe are useful and conservative.

############################################
## audit entries with no logged in user aren’t really all that useful. There is point in time data still. So removing these shouldn’t be a big deal
## default is remove these that are 5 years old.
############################################
# number of days to retain db rows in grouper_audit_entry with no logged in user (loader, gsh, etc). -1 is forever. suggested is 365. default is five years: 1825
loader.retain.db.audit_entry_no_logged_in_user.days=1825

############################################
## I think it's ok to remove all audit entries over 10 years, but will default this to never since even at penn there aren’t that many records.
## These are audits for things people do on the UI or WS generally.
############################################
# number of days to retain db rows in grouper_audit_entry. -1 is forever. suggested is -1 or ten years: 3650
loader.retain.db.audit_entry.days=-1

############################################
## After you delete an object in grouper, it is still in point in time. So if you want to know who was in a group a year ago, you need this info
## However, I think after some time it's ok to let it go. So the default is 5 years
############################################
# number of days to retain db rows for point in time deleted objects. -1 is forever. suggested is 365. default is five years: 1825
loader.retain.db.point_in_time_deleted_object.days=1825

############################################
## This is optional. You can set limits on deleted objects in point in time on a folder level. So if you don’t need deleted course point in time
## you can get rid of that sooner…
############################################
# number of days to retain db rows for point in time deleted objects in a folder. "courses" or "someLabel" are variables you make up in these examples
#loader.retain.db.point_in_time_deleted_objects_in_folder.courses.days=180
#loader.retain.db.point_in_time_deleted_objects_in_folder.courses.folderName=my:folder:for:courses
#loader.retain.db.point_in_time_deleted_objects_in_folder.someLabel.days=365
#loader.retain.db.point_in_time_deleted_objects_in_folder.someLabel.folderName=my:folder:for:whatever

############################################
## This is optional. You can just automatically obliterate folders in a parent folder that are older than a certain age… e.g. courses.
## so you could delete a term of courses 4 years old if you like. Note, make sure the loader isn’t going to recreate or you will get churn…
############################################
# number of days after a subfolder is created that it will be obliterated (deleted) and point in time will be deleted too.
# "courses" or "anotherLabel" are variables you make up in these examples
#loader.retain.db.folder.courses.days=1825
#loader.retain.db.folder.courses.parentFolderName=my:folder:for:courses
#loader.retain.db.folder.anotherLabel.days=1825
#loader.retain.db.folder.anotherLabel.parentFolderName=my:folder:for:courses

From: Black, Carey M. [mailto:]
In general, I think there is a clear need here for operational tools/processes to manage the DB data growth. However, I also hate losing data. (Delete is a form of “loss”. Hopefully a willful choice, but still a loss.) Mostly because we lose the ability to ask a whole range of questions about “what really happened?” (While looking back instead of planning ahead. :) )

Maybe it would be better to have a model where this kind of audit data is moved from “Active” to “Archived” and then off to “delete”? Maybe a shadow table(s) where the “Archived data” can be held just out of sight of the operation of the UI/WS, but still around for other reporting? Your approach of a configuration that defines the duration of “Active” data (days/weeks/months; move from “Active” to “Archived” on that schedule) and of “Archived” data (days/weeks/months/years) sounds good. Then add a later schedule to move from Archived to delete.

I also think there is the possibility for some to want to treat any membership change (regardless of source [UI/WS/Loader/etc…]) as equally valuable, and others might see “non-human” processes as less necessary to have in their active audit trail. So maybe the definition of that should be a separate config item? (AKA: “has a subject id” vs “no subject id” for the change.) Maybe even special groups that need more monitoring, or carve-outs for extra (or reduced) retention too.

I also wonder if there are some reports/summaries/monitoring that should be done before the delete that would preserve some details/trends while still letting go of the volume of data? Maybe there are some groups where it would be nice to monitor the count of members once a day, month, etc. across the cycles of the academic/financial calendar? Maybe seeing spikes/dips in Loader loaded data by group/job? Maybe seeing growth/shrinkage of basis, ref, and access control policy groups in the system over time? Etc…

So I think it may be harder than just “archive/delete every N days”. Might even be an opportunity to tag with attributes to signal what to do for each group? (Maybe with a system config default if not tagged?) Thinking like Attestation, but for the definition of things like: “ArchiveAfter”, “DeleteAfter”, “CollectStatsEvery”….

--
Carey Matthew
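
[Editorial sketch, not part of the original thread.] The staged model Carey describes (Active, then Archived, then delete, with a per-group tag falling back to a system default) could be expressed roughly as below. This is only an illustration under stated assumptions: RetentionPolicy, stage, and policy_for are hypothetical names, the 365/1825 defaults are placeholders, and nothing here is an existing Grouper feature or API. The "-1 means never" convention is borrowed from the loader.retain.db.* properties earlier in the thread.

```python
from dataclasses import dataclass
from typing import Dict

NEVER = -1  # same "-1 is forever" convention as the loader.retain.db.* properties


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical per-group policy. The field names echo the proposed
    "ArchiveAfter" / "DeleteAfter" labels; they are not Grouper attributes."""
    archive_after_days: int = 365   # placeholder default, an assumption
    delete_after_days: int = 1825   # placeholder default, an assumption


def stage(age_days: int, policy: RetentionPolicy) -> str:
    """Return "active", "archived", or "delete" for a record of the given age."""
    # Deletion is checked first so that a record past both thresholds is deleted.
    if policy.delete_after_days != NEVER and age_days >= policy.delete_after_days:
        return "delete"
    if policy.archive_after_days != NEVER and age_days >= policy.archive_after_days:
        return "archived"
    return "active"


def policy_for(group: str,
               tagged: Dict[str, RetentionPolicy],
               default: RetentionPolicy) -> RetentionPolicy:
    """Use the group's own tag if it has one, else the system-wide default."""
    return tagged.get(group, default)
```

With the placeholder defaults, a 400-day-old record would be classified as "archived", and a policy with both values set to -1 keeps records "active" indefinitely. Whether the age is measured from create-date or last-mod-date is left open here, as it is in the thread.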
From: [] On Behalf Of Rory Larson

Agreed. That would be a very nice feature. Would time-based deletes be based on create-date or last-mod-date? There seems to be a difference between these in the grouper_audit_entry table, though I'm not sure why a log record or point-in-time record would ever be modified.

Thanks,
Rory

From: Gail H Lift []

Sounds good here too. The configurable time intervals will make it easy to adjust to local needs.

On Wed, Jan 31, 2018 at 11:55 AM, David Langenberg <> wrote: