Skip to Content.
Sympa Menu

grouper-users - RE: [grouper-users] Maintaining Grouper database size

Subject: Grouper Users - Open Discussion List

List archive

RE: [grouper-users] Maintaining Grouper database size


Chronological Thread 
  • From: Rory Larson <>
  • To: "Hyzer, Chris" <>, Shilen Patel <>
  • Cc: "" <>
  • Subject: RE: [grouper-users] Maintaining Grouper database size
  • Date: Mon, 29 Jan 2018 20:42:29 +0000
  • Accept-language: en-US
  • Authentication-results: spf=none (sender IP is ) ;
  • Ironport-phdr: 9a23:Fy39WhVhu9afbWNK5Ta8tnqnAPjV8LGtZVwlr6E/grcLSJyIuqrYbRKFt8tkgFKBZ4jH8fUM07OQ7/i5HzRYqb+681k6OKRWUBEEjchE1ycBO+WiTXPBEfjxciYhF95DXlI2t1uyMExSBdqsLwaK+i764jEdAAjwOhRoLerpBIHSk9631+ev8JHPfglEnjWwba9vIBmssQndqtQdjJd/JKo21hbHuGZDdf5MxWNvK1KTnhL86dm18ZV+7SleuO8v+tBZX6nicKs2UbJXDDI9M2Ao/8LrrgXMTRGO5nQHTGoblAdDDhXf4xH7WpfxtTb6tvZ41SKHM8D6Uaw4VDK/5KpwVhTmlDkIOCI48GHPi8x/kqRboA66pxdix4LYeZyZOOZicq/Ye94RWGhPUdtLVyFZAo2ycZYBD/YPM+hboYnypVoOogexCwajH+7v1iRHimPq0aEmz+gsEwfL1xEgEdIUt3TUqc34OrsVUeCvzKjD0DPNYOlM2Tfm74jIdA0qrPaWXbJ0a8XR01IvFwTDjlmNsoHlIjWV1uAXv2eF8uVgSPuihmg6oA9/pTivw90jiojPho8N11/E6CZ5z5gsKdGmU0F0fN+kEIBTty2DKYd5WdgiQ3lytykm1LIKoIC3czMXyJQ83RHfZOKIf5CQ4hL5U+aRJjl4hGpieLK+gRay7VKsxfH7Vsmx1ltBsylLksHUu3wTyxDe6NKLR/R880u7xDqC1g7e5vtLLE03jabXNoAtz7oqmpcXv0nPBCD7lF/ogKKVcEgv5/Km5P79Yrr8o5+RL490hR/6MqQpgsGxGfg1PA8SU2SF4Oixyr/s8VDgTLVNlfI5jLPVsJfHJcQHvaG5BBJV0oA+5BqlFzemytMYnWUZI11ZZBKHjo/pO1fULPD/EPe/n1CskDBsx/DFJLHuHpLNLn3bnLfge7Zy9VJcxRItwdxD5Z9YEL4MLfPpVkPstdHVAAU1PgO3zur/DdVyzIIeWWaBAq+DN6PStEeF5vgrI+aWfo8VozH9K/446/P1kH85n1sdcbO30pQKdXC0Bu5mLFmBYXrwntcBFn8HvgUkTOztlV2CSSBcZ2yrU6Ih/TE0FpimDZzYS4CpgbyBxzu7HoZIamxcC1CMF2voeJueW/cKdi2SPtFtniYaWre8Vo9ynS2p4UXa2qhqNK6c0S0CtImpnIxw7O3Chxwo3T1vBIKAy2yLSSd5kn5eA3cN3KF7rEo18l6f3Kww1+dYFNhe4dtFTwQ3c5PQ0ropJcr1X1f7ec2YAHKvWM6hB3llVNY42cQJZR9VBtOmyB3Pwnz5UPcui7WXCclsoern1H/rKpM4ki6e2Q==
  • Spamdiagnosticmetadata: NSPM
  • Spamdiagnosticoutput: 1:99

Thanks, all, for the suggestions. I think I'm hearing that I don't want to
purge data from large point-in-time tables manually through SQL. Instead, I
should run an edu.internet2.middleware.grouper... command under gsh, such as:


edu.internet2.middleware.grouper.pit.PITUtils.deleteInactiveRecords(new
Date(), true);
or

edu.internet2.middleware.grouper.pit.PITUtils.deleteInactiveObjectsInStem("my:stem",
true);

This will eliminate inactive records or objects only, meaning that if the
people are still around, they will still be taking up space with
point-in-time data going back perhaps for many years, correct? And
additionally, we would still have to perform an OPTIMIZE TABLE on each PIT
table to (possibly) realize any gains?

I will plan to do this, but I'm unsure of how much space, if any, I'm likely
to recover.

Another suggestion was made off-line, that I should perform a TRUNCATE on the
grouper_audit_entry table. In fact, this table has grown to about 40 GB, and
is apparently the biggest source of our problem. I understand that this is
simply a transaction log table, and that nothing else depends on it.
Truncation means that the table would essentially be dropped and then
re-created, minus all the data. That would be done in SQL. Is there any
reason not to do this? If so, is there a better way to reduce its size?

Thanks,
Rory


-----Original Message-----
From: Hyzer, Chris
[mailto:]

Sent: Monday, January 29, 2018 1:03 PM
To: Shilen Patel
<>;
Rory Larson
<>
Cc:

Subject: RE: [grouper-users] Maintaining Grouper database size

Im worried about the future state of using direct SQL, maybe something else
in future will be needed that we put in the GSH command... you should
definitely use that.



Thanks

Chris



-----Original Message-----

From:


[mailto:]
On Behalf Of Shilen Patel

Sent: Monday, January 29, 2018 1:34 PM

To: Rory Larson
<>

Cc:


Subject: Re: [grouper-users] Maintaining Grouper database size



If you’re going to trim the audit based on time, I’d suggest using the gsh
command since it’ll delete the records from the tables in the right order
(taking into account foreign keys). But as long as you delete them in the
right order via sql directly, that should be fine as well. Also, recently, I
documented how much space was being taken up at Duke (we use Oracle).
https://urldefense.proofpoint.com/v2/url?u=https-3A__spaces.internet2.edu_display_Grouper_Duke-2BDisk-2BSpace-2BUsage&d=DwIGaQ&c=Cu5g146wZdoqVuKpTNsYHVKLgTZS09MUACeOXHWmTvE&r=8UxP4WgXNK7VFhDT-iCutw&m=i5pbjVCyOSrgzJa3qHir_ok5n_Zw8MCNGbVSHFXVgZs&s=1qaggpDGSz3DUP03rlkM40RUuYd4lBXJ6ylGMs76vQQ&e=
I was surprised to see so much of the space being used by indexes. We
could possibly look at reducing the number of indexes to help.



- Shilen



On 1/29/18, 9:13 AM,
"
on behalf of Rory Larson"
<
on behalf of
>
wrote:



Hello,



We are encountering an issue of our Grouper database growing to a size
that threatens to use up the entire hard drive space available. The database
server is a physical machine with no room to expand, running a MySQL-type
database (5.5.56-MariaDB MariaDB Server).



The problem is not so much the size of the group data itself, but with
point-in-time tables that record what that data was back to the beginning of
the Grouper installation. Some of these have grown to huge sizes, on the
order of 20 GB or so. A suggestion has been made that we could delete a lot
of this prior to some reasonable point in time, especially groups or
memberships that no longer exist. This would be fine, but I'm wondering if
this can be done from the database command line on a per-table basis, or
whether there are dependencies that require doing this through gsh function
calls?



Also, because this is a MySQL-type database, shrinking the database
physical size after deletion of unnecessary data is not trivial. OPTIMIZE
TABLE ... might work, or it might make a table bigger. ALTER TABLE ... ROW
FORMAT=COMPRESSED is promising as tried on our test database, but takes about
four times as long as OPTIMIZE TABLE. In either case, updating has to be
shut down while running them, and they may require duplicating the table
during the operation. We are at 90% now, and can't afford any more big
tables. The ultimate MySQL solution seems to be to dump the database, delete
the whole thing, and reload it, which would mean complete downtime for
Grouper.



Does anyone else using a MySQL-type database have a system for handling
this problem and maintaining a reasonable database size?



Thanks,

Rory












Archive powered by MHonArc 2.6.19.

Top of Page