Skip to Content.
Sympa Menu

grouper-users - Re: [grouper-users] RE: How big should your temp change log be? ( Loader questions )

Subject: Grouper Users - Open Discussion List

List archive

Re: [grouper-users] RE: How big should your temp change log be? ( Loader questions )


Chronological Thread 
  • From: Shilen Patel <>
  • To: "Hyzer, Chris" <>, "Gettes, Michael" <>
  • Cc: "Black, Carey M." <>, "" <>
  • Subject: Re: [grouper-users] RE: How big should your temp change log be? ( Loader questions )
  • Date: Fri, 8 Sep 2017 12:59:50 +0000
  • Accept-language: en-US
  • Authentication-results: oit.duke.edu; spf=none
  • Ironport-phdr: 9a23:UV1R7R2VAh/HigBDsmDT+DRfVm0co7zxezQtwd8ZseISLvad9pjvdHbS+e9qxAeQG96Ku7Qc06L/iOPJYSQ4+5GPsXQPItRndiQuroEopTEmG9OPEkbhLfTnPGQQFcVGU0J5rTngaRAGUMnxaEfPrXKs8DUcBgvwNRZvJuTyB4Xek9m72/q89pDXYAhEniaxba9vJxiqsAvdsdUbj5F/Iagr0BvJpXVIe+VSxWx2IF+Yggjx6MSt8pN96ipco/0u+dJOXqX8ZKQ4UKdXDC86PGAv5c3krgfMQA2S7XYBSGoWkx5IAw/Y7BHmW5r6ryX3uvZh1CScIMb7Vq4/Vyi84Kh3SR/okCYHOCA/8GHLkcx7kaZXrAu8qxBj34LYZYeYP+d8cKzAZ9MXXWpPUNhMWSxdDI2ybIUPAOgPMuZZs4byqEADogGiCQS2Hu7j1iVFi3/w0KYn0+ohCwbG3Ak4EtwTsHTUrdH1O7kUUeuo1qfH1ynDYO1S2Tzg74XIaRAhofCIXb1ud8rc0lQgFx7GjlqOs4DqIS6a1vkUvmWd8uFuVvqvhnY6pwxxvDSj3MkhhpPUio8Wzl3I7zt1zYgrKdGgVUJ2YcCoHIFeui2AOYZ7TdkuTm9wtCok1rELu4S3cScLxZg92hLSavKKf5KV7h/nUuuaPC12i2h/eL2lgha/6UigxfP4VsmzyFtKqzBKktjItnwU2BzS5NGLRud880u73jaP1hzT5vtZLkAplqrUNYQtzaAtmZoOqUjDHyn2l1vqjKKOa0kp+vak5/7pb7n8u5ORNIt5hhvgPqkghMCzGeE4PRIPX2if9+S8zrrj/UjhTbVRjv02k63Zv47cJckAu6G5HgxV3Zg55xmhEjimzcwUnWMbI1JdZBKHk4/pNknBIPDiCve/nkysnylxx/zfI73hGYvCLmLYkLr6ebZ96lVcyBYozdxB/Z5UC7cBIO7tVU/rstzXEAM5PxKuz+n5Fdp9y9BWZWXaSI+IIq7I9Rej5vguOKPEMIoeuCfvJuIN5uXly2IhlFkbO6SlwM1TIDqXE+h6Ll7dKV/tidcIHGNA9l4xQfD2hUaqTDBXIXu+QvRvyCs8DdedDYbHS4bluLWd0SPzSoZWZ2RLDHiADHTlMYiIRqFfO2qpPsZ9n2lcBvCaQIg72ETr7Veixg==
  • Spamdiagnosticmetadata: NSPM
  • Spamdiagnosticoutput: 1:99

The index is in the codebase (for 2.4+).  

The following can be added now and it won't conflict when 2.4 is out or if there is a 2.3 patch that handles ddl changes.

CREATE INDEX change_log_temp_created_on_idx ON GROUPER_CHANGE_LOG_ENTRY_TEMP (CREATED_ON);

Thanks!

- Shilen

From: <Hyzer>, Chris <>
Date: Tuesday, September 5, 2017 at 4:18 PM
To: "Gettes, Michael" <>
Cc: "Black, Carey M." <>, "" <>
Subject: RE: [grouper-users] RE: How big should your temp change log be? ( Loader questions )

When I look at it, I run that query in oracle and the explain plan has a full table scan.  When I add the index it does not (cost 0).

 

I will discuss with Shilen to see if he agrees to add it.

 

FYI, this is the script I ran in oracle:

 

CREATE INDEX change_log_temp_created_on_idx ON GROUPER_CHANGE_LOG_ENTRY_TEMP (CREATED_ON);

 

Note, patches cant run DDL at this point, but maybe we can add an enhancement for that…

 

Thanks

Chris

 

From: Gettes, Michael []
Sent: Tuesday, September 05, 2017 3:47 PM
To: Hyzer, Chris <>
Cc: Black, Carey M. <>;
Subject: Re: [grouper-users] RE: How big should your temp change log be? ( Loader questions )

 

I had this same problem at a previous place of employment.  Never got resolved but we were heading down the path of putting the grouper DB on SSD to speed up the read/write/delete operation.  The DBAs never indicated there were lack of indices problems but we definitely had the temp->main changelog as choke-point problem.  When the “disk” could keep up - it went fast.  It really presented as an I/O problem.

 

I hope this helps.


/mrg

 

On Sep 5, 2017, at 3:42 PM, Hyzer, Chris <> wrote:

 

Reading a record, and writing a record, and deleting a record, should be fast.  Has nothing to do with the loader jobs.  The theads inserting into the temp table are equal to any thread using the Java Grouper API.  So it depends on the number of WS, UI, and loader threads....

As far as local SQL changes, I don't think that is that common.  Local admins just need to run that after upgrading.

Thanks
Chris

-----Original Message-----
From: Black, Carey M. [
] 
Sent: Tuesday, September 05, 2017 3:06 PM
To: Hyzer, Chris <
>
Cc: 

Subject: RE: How big should your temp change log be? ( Loader questions )

Chris,

Thank you for the details.

FWIW: 
I stopped running all of the loader jobs.
The  grouper_change_log_entry_temp table has cleared itself  over a few days.

So I think it is reasonable to believe that the jobs were submitting more records into the temp table than the loader process was able to convert to the "grouper_change_log_entry" table.
Assuming that the select from the temp table was a "slow point", then maybe the index will help the loader to keep up with the queue. :)
If the index does not resolve it, then I would ask if the "thread counts" match between the loader processes that generate the "_temp" entries vs the ones moving the rows to the "grouper_change_log_entry"?

For now, I will add the index and start up a few of the jobs and see if I can find the "max rate" point for our system. 


Just a thought:
Since there may be a local need for "DB customizations", maybe supporting some kind of "local sql script" to run after an upgrade would be helpful?

If there were a standard way(config file?) to register/track local DB changes then the upgrade could be "pre-wired" for the upgrade to auto apply the local changes too.
I do recognize that some SQL might not work after the upgrade. However "safe things" (like adding indexes and adding custom Tables) should work. Views based on the grouper tables might fail due to grouper RDBMS structure changes. But that would be a good thing to find in the upgrade process anyways.
Maybe the SQL file could be conditionally run at startup of the loader too? ( Kind of an "All Sync" type option?)

-- 
Carey Matthew 



-----Original Message-----
From: Hyzer, Chris [
] 
Sent: Tuesday, September 5, 2017 2:07 PM
To: Black, Carey M. <
>; 
Subject: RE: How big should your temp change log be? ( Loader questions )

Carey, can you add an index on that column and see if it improves things?  You can explain plan that query to see if it works :)  If so, Shilen, do you think we should add that index?

Carey, have you analyzed all your tables recently?  Not sure if rebuilding indexes helps in mysql, but it helped us at penn...


This "temp table" is auto dropped/created by the system?


No, it's a permanent temp table :)


When is the temp table "cleaned up"? (or should it never has "stuck rows" after the job finishes?)


The loader will take rows, slap another timestamp on them, and put them in grouper_change_log_entry and remove from grouper_change_log_entry_temp


Can I just stop the loader then remove all rows from this temp table?


Change log entries wont be sent out, but yes, you can do that


There is some reason to NOT add an index to the created_on column?


I don't think so


Is it "safe" to do things like adding indexes to grouper's "core tables"?


Sure.  Document it.  make sure on major upgrades you reapply the index if it is removed.  Also let us know if we should consider adding it to core.


Are there any other indexes that I should be manually adding? ( or verifying that they were created properly?)


Not that I know of.


Why is the default 14 days before the change log is trimmed? ( Why not 0, 1 or 2? )


They should be removed as loader copies them over.  So it should never go up to 14.  But if your loader is turned off for a while they should still progress through the change log notifications.  If two weeks has gone by, then all bets off.

Thanks
Chris

-----Original Message-----
From: 
 [] On Behalf Of Black, Carey M.
Sent: Wednesday, August 30, 2017 11:08 PM
To: 

Subject: [grouper-users] How big should your temp change log be? ( Loader questions )

All,

I suspect that I have been pushing the loader process a bit harder than I should. (Or I am finding yet more configuration that I need to "tune" to keep things humming along.)

My DBA reported a "slow query" and I think this is related to loader jobs not "completing".

-- Connection Id: NNNNN
-- User: DB_USER
-- Host: host:port
-- DB: grouper
-- Command: Query
-- Time: 291
-- State: Creating sort index
select changeloge0_.id as id1_14_, changeloge0_.change_log_type_id as change_l2_14_, changeloge0_.context_id as context_3_14_, changeloge0_.created_on as created_4_14_, changeloge0_.string01 as string5_14_, changeloge0_.string02 as string6_14_, changeloge0_.string03 as string7_14_, changeloge0_.string04 as string8_14_, changeloge0_.string05 as string9_14_, changeloge0_.string06 as string10_14_, changeloge0_.string07 as string11_14_, changeloge0_.string08 as string12_14_, changeloge0_.string09 as string13_14_, changeloge0_.string10 as string14_14_, changeloge0_.string11 as string15_14_, changeloge0_.string12 as string16_14_ from grouper_change_log_entry_temp changeloge0_ order by changeloge0_.created_on limit 1000

While that is a bit odd to my eye... it basically is a:
 Select a, b, c ... from grouper_change_log_entry_temp changeloge0_ order by changeloge0_.created_on limit 1000

Some insight into the data in the table....

select count(*) from grouper_change_log_entry_temp -- '6,745,965'   That is "only" 6.7 M rows, in a "temp" table. ( Uh...? )
select min(created_on) from grouper_change_log_entry_temp; -- '1503703639031000' --? Friday, August 25, 2017 11:27:19 PM GMT
select max(created_on) from grouper_change_log_entry_temp; -- '1504121955115000' --? Wednesday, August 30, 2017 7:39:15 PM GMT 

-- Assuming I am converting those dates properly... 
-- That is just under 5 day's worth of data. ( And I think it will not start to trim it till 14 days. ( default value for: loader.retain.db.change_log_entry.days   Or does that apply to the "non-temp" loader table?  )


When I look at the grouper_change_log_entry_temp table, I do not see an index on the created_on column.

Based on reading of mariadb v10.2.8:
  REF: 
https://urldefense.proofpoint.com/v2/url?u=https-3A__mariadb.com_kb_en_the-2Dmariadb-2Dlibrary_improvements-2Dto-2Dorder-2Dby_&d=DwIFAg&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=EUBHI54mtQDlbcqo5rUTdQ&m=w8JE-eXRCDxqQ4YtKvCq9IOMSrJvy-hCPczf4w1jRHY&s=WvlD0A3kZdwVIoAPwUBnJ4QNo8xm3bBzfNTKWif9ty4&e= 
  It seems like there really should be an index on the created_on column to avoid performance issues and speed up the "limit" function of that select statement.

Can anyone tell me If:
This "temp table" is auto dropped/created by the system?
When is the temp table "cleaned up"? (or should it never has "stuck rows" after the job finishes?)
Can I just stop the loader then remove all rows from this temp table?
There is some reason to NOT add an index to the created_on column?
Is it "safe" to do things like adding indexes to grouper's "core tables"?
Are there any other indexes that I should be manually adding? ( or verifying that they were created properly?)
Why is the default 14 days before the change log is trimmed? ( Why not 0, 1 or 2? )


On maybe a more general "DB Config" topic...
   select count(*) from grouper_memberships_all_v;   -- produces a 30 sec timeout result. No count returned. DB connection dead. Uh.... Is there some overarching DB/DB client configuration issue that is causing me pain?
Did I miss a section on Grouper's required DB configuration settings somewhere?

Is there any additional information/ configuration details that could help any of these questions be answered? ( Let me know.)

Thanks in advance.

-- 
Carey Matthew

 




Archive powered by MHonArc 2.6.19.

Top of Page