
perfsonar-user - Re: [perfsonar-user] Modest sized grid has agent failure to archive once or twice a day

Subject: perfSONAR User Q&A and Other Discussion




  • From: Phil Reese <>
  • To: "Garnizov, Ivan" <>, "" <>
  • Subject: Re: [perfsonar-user] Modest sized grid has agent failure to archive once or twice a day
  • Date: Thu, 31 Oct 2019 20:14:43 -0700

Hi Ivan,

Good news continues today.  No hung systems and we've re-enabled the RabbitMQ archiver.  So far so good.

The only retries in the .json were in the RabbitMQ archive spec.  The original retry config was way too aggressive.  When we added the RabbitMQ archiver back today, we gave it a single retry after 120 seconds.  We'll see if that works.
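
To make the retry change concrete, here is a minimal sketch of what such an archive spec could look like, built in Python so the total retry window can be checked. It assumes the usual pScheduler archiver layout with a "retry-policy" list of "attempts"/"wait" entries; the broker URL and routing key are placeholders, not our real values, and the helper functions are hypothetical conveniences for illustration only.

```python
import json
import re


def iso8601_seconds(duration: str) -> int:
    """Convert a simple ISO 8601 duration (e.g. PT120S, PT2M) to seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", duration)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {duration!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def retry_window(policy) -> int:
    """Total time (seconds) spent waiting across all retry attempts."""
    return sum(step["attempts"] * iso8601_seconds(step["wait"]) for step in policy)


# Assumed archive spec: one retry, 120 seconds after the first failure.
archive = {
    "archiver": "rabbitmq",
    "data": {
        "_url": "amqp://broker.example.org/",  # placeholder broker URL
        "routing-key": "perfsonar",            # placeholder routing key
        "retry-policy": [{"attempts": 1, "wait": "PT120S"}],
    },
}

print(json.dumps(archive, indent=2))
print(retry_window(archive["data"]["retry-policy"]))
```

The printed JSON is the fragment that would go into the archives section of the .json; a policy that retries every few seconds would show a much shorter window with many more attempts, which is the pattern we moved away from.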

The main test we've been watching, which runs every 5 minutes, is the OWAMP Loss test.  From some additional research today, it seems the new tool for Loss testing is 'twamp'.  I think the party line is that it isn't ready for production use, but I wonder if we could use 'twamp' instead of OWAMP owping?

Do you know if using twamp would lower the load on the archiver workers?  If so, I'd be interested in at least temporarily shifting to twamp on our isolated setup.  If this seems rational, might you have example .json lines we could use?

Thanks!
Phil

PS - If you happen to be coming to SC19, I'll be giving a few presentations on the project at the Stanford booth, #1255. Please stop by.



On 10/31/19 8:27 AM, Garnizov, Ivan wrote:

Interestingly enough, you had initially configured retries in your config, but these were a matter of seconds apart, which did more harm than good to the pscheduler operation.




