perfsonar-user - [perfsonar-user] Modest sized grid has agent failure to archive once or twice a day
Subject: perfSONAR User Q&A and Other Discussion
- From: "Garnizov, Ivan" <>
- To: Phil Reese <>, "" <>
- Subject: [perfsonar-user] Modest sized grid has agent failure to archive once or twice a day
- Date: Tue, 29 Oct 2019 15:08:12 +0000
Hi Phil,
Yes, we should be heading in the right direction, especially if the “a full slate of workers” message has disappeared. Still, having only 2 attempts for archival is too few: you are still dropping measurement results quite quickly. I would suggest having attempts within 1 day, with 2 attempts at an interval of 1-2h in addition to the ones you have.
Once you reduce the rate of the “full slate of workers” failure, you should also be able to spot another failure more easily, which should be the real cause of the problem. Obviously there is more to it than the exhaustion of pScheduler archiver workers. It might be that not all of the attempts fail, but some still do. Perhaps your Esmond server is exhausted or overloaded, if the failure is a timeout.
Regards, Ivan Garnizov
GEANT WP6T3: pS development team GEANT WP7T1: pS deployments GN Operations GEANT WP9T2: Software governance in GEANT
From: Phil Reese [mailto:]
Hi Ivan, Note, I've been focused on the MaDDash part of the project, so I didn't pay too much attention to my colleague who wanted the PS data in order to graph it with Grafana. Together we looked at the perfSONAR docs for archiver options. The RabbitMQ section (http://docs.perfsonar.net/pscheduler_ref_archivers.html) offered the stanza we used, including the retry-policy, which does seem too aggressive.
From: Garnizov, Ivan <>
Hello Phil,
Thanks for the info. It appears your mesh configuration for the archival of data is causing you trouble.

    "archiver_data": {
        "retry-policy": [
            { "attempts": 5, "wait": "PT1S" },
            { "attempts": 5, "wait": "PT3S" }
        ],
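To make concrete how quickly the quoted retry-policy gives up, here is a minimal sketch that adds up the waits in a policy. It assumes each `wait` precedes exactly one retry and ignores the time each attempt itself takes, so the result is only a rough lower bound. The helper names (`duration_seconds`, `retry_window_seconds`) and the extra `PT1H` entry are illustrative assumptions, not part of pScheduler or of the configuration discussed in this thread.

```python
import re

# Parses only the simple ISO 8601 durations used here (e.g. PT1S, PT90M, PT2H, P1D).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_seconds(text):
    """Convert a simple ISO 8601 duration string to seconds."""
    match = _DURATION.match(text)
    if not match or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {key: int(value) if value else 0
             for key, value in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])


def retry_window_seconds(policy):
    """Sum attempts * wait over all entries (assumes one wait per retry)."""
    return sum(step["attempts"] * duration_seconds(step["wait"])
               for step in policy)


# The retry-policy as quoted in the thread.
quoted_policy = [
    {"attempts": 5, "wait": "PT1S"},
    {"attempts": 5, "wait": "PT3S"},
]

# Hypothetical extension: two further attempts one hour apart (illustrative value).
extended_policy = quoted_policy + [{"attempts": 2, "wait": "PT1H"}]

print(retry_window_seconds(quoted_policy))    # 20
print(retry_window_seconds(extended_policy))  # 7220
```

Under these assumptions the quoted policy stops retrying after about 20 seconds, so any archiver outage longer than that loses the result, while two added hourly attempts stretch the window to roughly two hours.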