grouper-users - Re: [grouper-users] Error on job: CHANGE_LOG_consumer_psp
Subject: Grouper Users - Open Discussion List
List archive
- From: "Bee-Lindgren, Bert" <>
- To: Scott Koranda <>
- Cc: grouper-users <>
- Subject: Re: [grouper-users] Error on job: CHANGE_LOG_consumer_psp
- Date: Tue, 19 Feb 2019 14:51:51 +0000
Scott,
I know this is stale, but I just to let you know that I created Jira GRP-2041 to augment logging surrounding problems like the one you observed.
Thanks,
Bert
From: <> on behalf of Scott Koranda <>
Sent: Thursday, January 12, 2017 12:55 PM
To: Bee-Lindgren, Bert
Cc: grouper-users
Subject: Re: [grouper-users] Error on job: CHANGE_LOG_consumer_psp
Sent: Thursday, January 12, 2017 12:55 PM
To: Bee-Lindgren, Bert
Cc: grouper-users
Subject: Re: [grouper-users] Error on job: CHANGE_LOG_consumer_psp
> > ... Did not get all the way through the batch! 88720 != 88721 ...
>
> Changelog entry 88721 failed to be processed. It might help to know what that
> change was, and it might help try to duplicate the problem. Can you query the
> changelog for some details about it?
mysql> SELECT * FROM grouper_change_log_entry WHERE sequence_number = 88721\G
*************************** 1. row ***************************
change_log_type_id: f677ebad71ab4c0c9cb63f14ad9928c2
context_id: 1cf47f5cb1da428b84759f5b01bfa112
created_on: 1484182869667000
sequence_number: 88721
string01: 0660320dd7144056842f3b1be2f1856d
string02: members
string03: I46500
string04: RegistryLDAP
string05: flattened
string06: 686e09db47e14726bb2b976dc25c1c19
string07: My Stem:MyPrefix_TM Foo, Bar, Blarf
string08: 0028e6b1edbd46a4b8285d4c1c4ef31d
string09: 17123756633e40daab179ad09dd89bf6
string10: NULL
string11: NULL
string12: NULL
(I obfuscated the name of the group of course...)
> Of course, PSP should have logged something about its failure to process that
> change; I would have expected an error message to be there before the log lines
> you sent.
There are no ERRORs before that in grouper_error.log, nor are there any
WARN entries.
> Also, it's a long shot, but might there be some information in the
> grouper_event log?
There are no ERRORs or WARNs in grouper_event.log, and an inspection of
the time around 2017-01-12 01:01:15 in the grouper_event.log does
now show anything unusual--just "normal" INFO event log entries.
> Or search for 88720 and 88721 in the log directory,
> obviously ignoring the "Did not get all the way through the batch!" messages.
/var/log/grouperApi# grep 88720 grouper_error.log | grep -v "Did not get all the way through the batch"
2017-01-12 05:45:30,014: [DefaultQuartzScheduler_Worker-7] INFO EventLog.info(156) - - [74202005c5234594b3288720d1205685,'GrouperSystem','application'] session: start (3ms)
So, nothing in grouper_error.log other than the ERRORs about
the batch.
> The "Did not get all the way through the batch! 88720 != 88721" repeated over
> and over, right?
Correct, until the restart of the grouper_loader process.
> If not, then retryOnError might not have been working.
>
>
> Finally, if there aren't any other error messages, then we'll probably have to
> try to duplicate the problem with more verbose logging.
We will see if we can reproduce on dev, though since the problem is moving
from the temp change log to the change log I am not sure how we can
cause the issue easily other than repeatedly making changes to
that particular group. We will try that.
If we can reproduce it we will turn up debugging on the dev tier and send
you some more information.
Since a restart of the Grouper loader process made the error state
disappear, I am assuming it cannot be a problem in the table(s), ie. the
data in that row is not problematic.
I should add that the loader had been up for 2 days previously, but before
that it had been up for about 30 days. So in general the loader is
running as expected.
Thanks,
Scott K
>
> Changelog entry 88721 failed to be processed. It might help to know what that
> change was, and it might help try to duplicate the problem. Can you query the
> changelog for some details about it?
mysql> SELECT * FROM grouper_change_log_entry WHERE sequence_number = 88721\G
*************************** 1. row ***************************
change_log_type_id: f677ebad71ab4c0c9cb63f14ad9928c2
context_id: 1cf47f5cb1da428b84759f5b01bfa112
created_on: 1484182869667000
sequence_number: 88721
string01: 0660320dd7144056842f3b1be2f1856d
string02: members
string03: I46500
string04: RegistryLDAP
string05: flattened
string06: 686e09db47e14726bb2b976dc25c1c19
string07: My Stem:MyPrefix_TM Foo, Bar, Blarf
string08: 0028e6b1edbd46a4b8285d4c1c4ef31d
string09: 17123756633e40daab179ad09dd89bf6
string10: NULL
string11: NULL
string12: NULL
(I obfuscated the name of the group of course...)
> Of course, PSP should have logged something about its failure to process that
> change; I would have expected an error message to be there before the log lines
> you sent.
There are no ERRORs before that in grouper_error.log, nor are there any
WARN entries.
> Also, it's a long shot, but might there be some information in the
> grouper_event log?
There are no ERRORs or WARNs in grouper_event.log, and an inspection of
the time around 2017-01-12 01:01:15 in the grouper_event.log does
now show anything unusual--just "normal" INFO event log entries.
> Or search for 88720 and 88721 in the log directory,
> obviously ignoring the "Did not get all the way through the batch!" messages.
/var/log/grouperApi# grep 88720 grouper_error.log | grep -v "Did not get all the way through the batch"
2017-01-12 05:45:30,014: [DefaultQuartzScheduler_Worker-7] INFO EventLog.info(156) - - [74202005c5234594b3288720d1205685,'GrouperSystem','application'] session: start (3ms)
So, nothing in grouper_error.log other than the ERRORs about
the batch.
> The "Did not get all the way through the batch! 88720 != 88721" repeated over
> and over, right?
Correct, until the restart of the grouper_loader process.
> If not, then retryOnError might not have been working.
>
>
> Finally, if there aren't any other error messages, then we'll probably have to
> try to duplicate the problem with more verbose logging.
We will see if we can reproduce on dev, though since the problem is moving
from the temp change log to the change log I am not sure how we can
cause the issue easily other than repeatedly making changes to
that particular group. We will try that.
If we can reproduce it we will turn up debugging on the dev tier and send
you some more information.
Since a restart of the Grouper loader process made the error state
disappear, I am assuming it cannot be a problem in the table(s), ie. the
data in that row is not problematic.
I should add that the loader had been up for 2 days previously, but before
that it had been up for about 30 days. So in general the loader is
running as expected.
Thanks,
Scott K
- Re: [grouper-users] Error on job: CHANGE_LOG_consumer_psp, Bee-Lindgren, Bert, 02/19/2019
Archive powered by MHonArc 2.6.19.