
perfsonar-user - Re: [perfsonar-user] Broken Maddash dashboard after updates yesterday.

Subject: perfSONAR User Q&A and Other Discussion

  • From: Casey Russell
  • To: Humberto Galiza
  • Subject: Re: [perfsonar-user] Broken Maddash dashboard after updates yesterday.
  • Date: Fri, 2 Oct 2015 09:22:37 -0500

I think you're probably onto something, in the sense that what's broken is most likely my Maddash config files; I'm just not sure what broke or why.  Let me throw this out there, because I know just enough to suspect I have something configured uncleanly here, but not enough to know how to do it better.

Here are the test specs from my mesh config file for the bandwidth tests.  We run a TCP test every two hours and a UDP test every 24 hours within the mesh (and the same for my testing with an external host).

<test_spec bwctl_2h_tcp_test>
  type              perfsonarbuoy/bwctl
  tool              bwctl/iperf
  protocol          tcp
  interval          7200
  duration          20
  force_bidirectional 0
</test_spec>

<test_spec bwctl_24h_udp_test>
  type              perfsonarbuoy/bwctl
  tool              bwctl/iperf
  protocol          udp
  interval          86400
  duration          10
  udp_bandwidth     1000000000
  force_bidirectional 0
</test_spec>

<test_spec bwctl_2h_tcp_test_external>
  type              perfsonarbuoy/bwctl
  tool              bwctl/iperf
  protocol          tcp
  interval          7200
  duration          20
  force_bidirectional 1
</test_spec>

<test_spec bwctl_24h_udp_test_external>
  type              perfsonarbuoy/bwctl
  tool              bwctl/iperf
  protocol          udp
  interval          86400
  duration          10
  udp_bandwidth     1000000000
  force_bidirectional 1
</test_spec>

My gui_agent_configuration.conf is pretty vanilla except for the two blocks below, which I added at the bottom.  They change the threshold values for owamp and bandwidth testing so that the green/yellow/red thresholds better match our needs.  I'm not sure I properly understood the documentation's description of these fields and their values, so please check whether they make sense.  Also, should I have added a third block to separate the UDP bwctl tests from the TCP ones?

<maddash_options>
    <perfsonarbuoy/owamp>
        check_command            /opt/perfsonar_ps/nagios/bin/check_owdelay.pl
        check_interval           1800
        check_time_range         900
        acceptable_loss_rate     0.01
        critical_loss_rate       0.02
    </owamp>
    <perfsonarbuoy/bwctl>
        check_command            /opt/perfsonar_ps/nagios/bin/check_throughput.pl
        check_interval           28800
        check_time_range         86400
        acceptable_throughput    750
        critical_throughput      500
    </bwctl>
</maddash_options>
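
One way to sanity-check these values is to compare each test's interval with the bwctl check window. The sketch below is only arithmetic on the numbers quoted above, and it assumes that check_time_range is the number of seconds the check looks back for results; that reading of the field is an assumption, not something confirmed in this thread. The helper name expected_results is made up for illustration.

```python
# Values copied from the test specs and <maddash_options> in this message.
TEST_INTERVALS = {
    "bwctl_2h_tcp_test": 7200,
    "bwctl_24h_udp_test": 86400,
}
# Assumption: seconds the throughput check looks back for results.
BWCTL_CHECK_TIME_RANGE = 86400


def expected_results(interval_s: int, window_s: int) -> float:
    """Average number of test results that fall inside one look-back window."""
    return window_s / interval_s


for name, interval in TEST_INTERVALS.items():
    n = expected_results(interval, BWCTL_CHECK_TIME_RANGE)
    note = "no slack if a run is late or skipped" if n <= 1 else "ok"
    print(f"{name}: ~{n:g} result(s) per window ({note})")
```

With a 24-hour interval and a 24-hour window there is at most about one UDP result per check, so a single delayed run could leave the window empty. That would not by itself explain missing historical data, so treat it as one thing to rule out rather than a diagnosis.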


Casey Russell
Network Engineer
Kansas Research and Education Network

2029 Becker Drive, Suite 282

Lawrence, KS  66047

(785)856-9820  ext 9809

On Fri, Oct 2, 2015 at 9:12 AM, Casey Russell wrote:
Humberto,

     Thank you.  I had high hopes when I saw your message; however, I don't have that stanza, or anything else relating to an ma_filter, in my gui_agent_configuration.conf file.

     So unfortunately, this fix won't work for me.  I'm still open to suggestions.

Casey Russell

On Thu, Oct 1, 2015 at 7:36 PM, Humberto Galiza wrote:
Hi Casey,

I had the same issue after upgrading. I then commented out the lines below in /opt/perfsonar_ps/mesh_config/etc/gui_agent_configuration.conf, regenerated my mesh config (/opt/perfsonar_ps/mesh_config/bin/generate_gui_configuration), and got my Maddash back.
   <ma_filter>
            ma_filter_name  bw-ignore-first-seconds
            mesh_parameter_name  omit_interval
   </ma_filter>

To be honest, I don't understand why these lines caused the issue, but commenting them out solved it. I hope this helps you.
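
For anyone who wants to script the workaround described above, here is a minimal sketch that comments out a named block in config text. It assumes the file format treats a leading `#` as a comment, which should be verified before editing a real file. The function name comment_out_block and the trailing `check_interval 1800` line in the sample are hypothetical, added only for illustration.

```python
def comment_out_block(text: str, block_name: str) -> str:
    """Prefix every line of <block_name> ... </block_name> with '#'."""
    out, inside = [], False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == f"<{block_name}>":
            inside = True
        # Lines inside the block (including both tags) get a '#' prefix.
        out.append("#" + line if inside else line)
        if stripped == f"</{block_name}>":
            inside = False
    return "\n".join(out)


# Sample text: the stanza from this message plus one hypothetical extra line.
conf = """\
<ma_filter>
    ma_filter_name  bw-ignore-first-seconds
    mesh_parameter_name  omit_interval
</ma_filter>
check_interval 1800
"""
print(comment_out_block(conf, "ma_filter"))
```

The sketch works on a string only; keep a backup copy of the real file and regenerate the GUI configuration afterwards, as described above.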

Thanks,


Humberto Galiza ..::.. AmLight - Americas Lightpaths
P:+1 (786) 288-3367
M:+55 (19) 971-445-570
Skype:humbertogaliza


From: "Casey Russell"
Sent: Thursday, October 1, 2015 16:02:06
Subject: [perfsonar-user] Broken Maddash dashboard after updates yesterday.

Group,

     I updated my systems to 3.5 early on Monday and worked through a couple of early issues to get everything stable by Tuesday.  However, yesterday morning I came in to find that my Maddash dashboard had gone yellow again overnight.  It may (or may not) be related to the updates released on the evening of the 29th (I do have automatic updates enabled).

     In short, the latency grid and the TCP bandwidth testing grids still work fine for most hosts.  On my internal hosts, I've lost (in the Maddash dashboard) all UDP bandwidth testing.  The dashboard shows yellow with the warning text:  " Unable to find any tests with data in the given time range where source is ps-bryant-bw.perfsonar.kanren.net and destination is ps-wsu-bw.perfsonar.kanren.net"

     I've verified that I can manually run these tests between the hosts.  I've also gone back and verified that the bugs reported on the list over the last couple of days don't seem to be the cause of my issue.  However, I'm stumped as to just what IS going on.  I don't know how to query esmond well enough to see whether the UDP test data is there, but this feels to me like a situation where the data is still there and Maddash has lost its ability to find it.  When a test was running but then stops, you can typically still click on that Maddash square and see the old data, even if there's a recent blank spot.  In this case, there is not even any historical data.
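
As a starting point for checking whether the UDP data exists, the sketch below builds a metadata query URL for the esmond perfSONAR archive. The archive host is a placeholder, and the filter names (in particular `ip-transport-protocol`) are assumptions based on the esmond REST interface that should be checked against the esmond documentation for the installed version.

```python
from urllib.parse import urlencode

# Hypothetical archive host: substitute the host that stores your results.
ARCHIVE_HOST = "ps-archive.example.net"


def esmond_query_url(source: str, destination: str, protocol: str,
                     time_range_s: int = 86400) -> str:
    """Build a metadata query URL for throughput tests between two hosts."""
    params = {
        "source": source,
        "destination": destination,
        "event-type": "throughput",
        "ip-transport-protocol": protocol,  # assumed filter name; verify
        "time-range": time_range_s,         # look-back window in seconds
        "format": "json",
    }
    return f"http://{ARCHIVE_HOST}/esmond/perfsonar/archive/?{urlencode(params)}"


url = esmond_query_url(
    "ps-bryant-bw.perfsonar.kanren.net",
    "ps-wsu-bw.perfsonar.kanren.net",
    "udp",
)
print(url)
```

The code only builds the URL; it can then be opened in a browser or fetched with curl. If the same query returns entries for "tcp" but an empty list for "udp", that would suggest the UDP results are not being stored or are stored under different metadata, rather than Maddash failing to display them.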

     Anyone have any thoughts on where to point me next?  http://ps-dashboard.kanren.net/maddash-webui

Maddash log data:
level=INFO ts=2015-10-01T13:43:47.066342Z event=maddash.RunCheckJob.execute.runCheck.end guid=4159a7e8-c611-42b7-80d3-977a971b2abe resultMsg=" Unable to find any tests with data in the given time range where source is ps-bryant-bw.perfsonar.kanren.net and destination is ps-wsu-bw.perfsonar.kanren.net" col=ps-wsu-bw.perfsonar.kanren.net status=0 resultCode=3 grid="KanREN Mesh - KanREN iPerf Bandwidth UDP testing" row=ps-bryant-bw.perfsonar.kanren.net
level=INFO ts=2015-10-01T13:43:47.071295Z event=maddash.RunCheckJob.execute.runCheck.end guid=c78228e2-32c0-4a24-a463-51c15c4dc067 resultMsg=" Unable to find any tests with data in the given time range where source is ps-bryant-bw.perfsonar.kanren.net and destination is ps-ku-bw.perfsonar.kanren.net" col=ps-ku-bw.perfsonar.kanren.net status=0 resultCode=3 grid="KanREN Mesh - KanREN iPerf Bandwidth UDP testing" row=ps-bryant-bw.perfsonar.kanren.net
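
Because the log is long, a small script can summarize it instead of posting it. The sketch below parses key=value lines shaped like the two above and counts result codes per grid. The function names are made up for illustration, and the sample line shortens the resultMsg text; lines containing unbalanced quotes or apostrophes would need extra handling, since shlex raises an error on them.

```python
import shlex
from collections import Counter


def parse_log_line(line: str) -> dict:
    """Split one key=value log line into a dict, honoring quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value.strip()
    return fields


def count_results(lines):
    """Count (grid, resultCode) pairs across runCheck.end events."""
    counts = Counter()
    for line in lines:
        rec = parse_log_line(line)
        if rec.get("event", "").endswith("runCheck.end"):
            counts[(rec.get("grid"), rec.get("resultCode"))] += 1
    return counts


# Sample shaped like the log lines above, with resultMsg abbreviated.
sample = (
    'level=INFO ts=2015-10-01T13:43:47.066342Z '
    'event=maddash.RunCheckJob.execute.runCheck.end '
    'resultMsg=" Unable to find any tests" col=ps-wsu-bw.perfsonar.kanren.net '
    'status=0 resultCode=3 grid="KanREN Mesh - KanREN iPerf Bandwidth UDP testing" '
    'row=ps-bryant-bw.perfsonar.kanren.net'
)
for (grid, code), n in count_results([sample]).items():
    print(f"{grid}: resultCode={code} x{n}")
```

Running count_results over the lines of the real log file would show at a glance whether every cell in the UDP grid returns the same result code or only some source/destination pairs do.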


Lots more log data where that came from, but I won't pollute the list with it.  Ask if there's something you need to see to help and I'll provide.

Thank you in advance

Casey Russell







