
perfsonar-user - Re: [perfsonar-user] meshconfig problems

Subject: perfSONAR User Q&A and Other Discussion




  • From: Casey Russell <>
  • To: Michael Johnson <>
  • Cc: Pete Siemsen <>,
  • Subject: Re: [perfsonar-user] meshconfig problems
  • Date: Tue, 14 Aug 2018 09:47:47 -0500

Group,

If I might piggyback on here to make suggestions to Pete, and also to Regional Mesh operators and participants.

For operators of nodes in the regional meshes: participating in a regional mesh is likely the first time you've needed to modify your pscheduler "limits" file. The limits file either allows non-local hosts to test to your node or (by default) does not. There are a couple of different ways to allow tests through your limits file. KanREN has been pretty open, and we use the Research and Education published filter list described here: http://docs.perfsonar.net/manage_limits.html. However, this is not the most restrictive way to do it, and your needs may be different.
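To make the allow/deny idea concrete, here is a toy Python model of the decision a limits file encodes. It is only a sketch and not pscheduler's implementation or file format: the network list, the test-type list, and the function names are all made up for illustration. It mirrors the "Classified as default" and "Test type not in list" steps visible in the log output quoted further down the thread.

```python
import ipaddress

# Hypothetical policy: networks treated as "friendly" and the test types
# that everyone else may run. Neither list comes from a real limits file.
FRIENDLY_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
INNOCUOUS_TESTS = {"rtt", "latency", "simplestream"}


def classify(requester: str) -> str:
    """Return 'friendly' if the requester is in a friendly network, else 'default'."""
    addr = ipaddress.ip_address(requester)
    for net in FRIENDLY_NETWORKS:
        if addr.version == net.version and addr in net:
            return "friendly"
    return "default"


def is_allowed(requester: str, test_type: str) -> bool:
    """Friendly hosts may run anything; others only the innocuous tests."""
    if classify(requester) == "friendly":
        return True
    return test_type in INNOCUOUS_TESTS


if __name__ == "__main__":
    # A requester outside the friendly list asking for a trace test is refused.
    print(is_allowed("129.19.165.2", "trace"))
```

Opening up a node for a mesh then amounts to either widening the friendly list or widening the set of test types allowed to everyone; which one fits depends on how restrictive you need to be.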

For operators of the meshes: if you're the person building the mesh config file, you most likely have a mix of IPv4 and IPv6 hosts in the mesh. If you do not specify (in the test_spec area) which IP version the tests should use, your hosts will just do a lookup at scheduling time, and the hostname/IP combination that gets looked up will be all over the board between your IPv4 and IPv6 hosts. Since the IP address (I think) is the key for matching test results/pairs in your dashboard and graphs, you will have a mess of a dashboard. I would suggest that you always specify ipv4_only in your test specification when you don't have full control over the stack on every host. You can always build a (smaller) mesh of IPv6-enabled hosts if you want to see IPv6 performance for comparison.

Maybe the developers would like to correct me on the second point; perhaps there is a default (ipv4_only or ipv6_only)? But my experience has been that hosts will prefer v6 if they can get a AAAA record. Everything else will default to v4, and you end up with a mess.
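The mixed-family effect can be illustrated with a small Python sketch. This is not perfSONAR code: the hostnames, the documentation-range addresses, and the "prefer IPv6 when a AAAA record exists" rule are assumptions standing in for the behaviour described above.

```python
import ipaddress

# Hypothetical resolver results: hostname -> addresses found in DNS.
# Documentation-range addresses stand in for real ones.
DNS = {
    "host-a.example.net": ["192.0.2.10", "2001:db8::10"],  # dual stack
    "host-b.example.net": ["192.0.2.20"],                   # IPv4 only
}


def pick_address(host, ipv4_only=False):
    """Pick the address a test would use for host.

    Without ipv4_only, prefer IPv6 when a AAAA record exists (an
    assumption modelling the behaviour described in the message).
    Every host in this sketch is assumed to have at least one A record.
    """
    addrs = [ipaddress.ip_address(a) for a in DNS[host]]
    v4 = [a for a in addrs if a.version == 4]
    v6 = [a for a in addrs if a.version == 6]
    if ipv4_only or not v6:
        return str(v4[0])
    return str(v6[0])


def families(ipv4_only):
    """Return the set of IP versions used across all hosts."""
    return {ipaddress.ip_address(pick_address(h, ipv4_only)).version for h in DNS}


if __name__ == "__main__":
    print("default lookup:", sorted(families(ipv4_only=False)))
    print("ipv4_only:     ", sorted(families(ipv4_only=True)))
```

With the default lookup the mesh ends up keyed on a mix of v4 and v6 addresses; forcing IPv4 gives every host the same address family, which is the point of the suggestion.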




Sincerely,
Casey Russell
Network Engineer
KanREN
phone: 785-856-9809
2029 Becker Drive, Suite 282
Lawrence, Kansas 66047


On Tue, Aug 14, 2018 at 9:15 AM, Michael Johnson <> wrote:
To add on to what Ivan mentioned, you only have a local archive configured, even for mesh tests; this is one of your problems:

https://localhost/esmond/perfsonar/archive/

See:
https://perfsonar-1850.frgp.net/pscheduler/tasks/5721a884-a038-4cda-9cfb-e0c5f5d79134

Your archive should include the central MA for the quilt. In your /etc/perfsonar/meshconfig_agent.conf, do you have this set up?
configure_archives  1

You have other issues as well, though. For instance, when I try to run a latency test from your host to perfsonar.illinois.net, it works. In the other direction, I see this:
$ pscheduler task latency --dest perfsonar.illinois.net --source perfsonar-1850.frgp.net
Submitting task...
Task URL:
https://perfsonar-1850.frgp.net/pscheduler/tasks/7227ea2e-8eca-4cf2-83c4-4aebd69e2424
Running with tool 'owping'
Fetching first run...
perfsonar-1850.frgp.net never scheduled a run for the task.

I am not sure why pscheduler is not scheduling the run. I see this error, but I'm not sure what to make of it:

"iperf3 requires exactly 2 participants, got 1"

from here:
https://perfsonar-1850.frgp.net/pscheduler/tasks/5721a884-a038-4cda-9cfb-e0c5f5d79134/runs/2b7a1375-7229-46b6-ba4b-dfc8dc88de10

I hope this helps.

- Michael


On Mon, Aug 13, 2018 at 06:20:50PM -0600, Pete Siemsen wrote:
Ever since I upgraded to 4.0.2, my (the FRGP's) participation in the Quilt
mesh at http://quiltmesh.onenet.net/maddash-webui/ has been problematic. My
whole row is orange. Embarrassing. I've appended the last 25 lines from
/var/log/perfsonar/meshconfig-agent.log file. These messages mystify me
because I can ping and/or traceroute to most of the hosts that appear in
these error messages, like these

ps-grand-bw.perfsonar.kanren.net
kc-core-psr.mo.more.net

perfsonar-1850$ tail --lines=25 meshconfig-agent.log
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(ps-svl-10g.cenic.net->perfsonar-1850.frgp.net),
continuing with rest of config: 403 FORBIDDEN: Task forbidden by limits:
Hints:
 requester: 129.19.165.2
 server: 137.164.28.121
Identified as everybody
Classified as default
Application: Defaults applied to non-friendly hosts
 Group 1: Limit 'innocuous-tests' failed: Test type not in list
 Group 1: Want any, 0/1 passed, 1/1 failed: FAIL
 Group 1: Failed; stopping here.
 Application FAILS
Proposal does not meet limits
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(ps-grand-bw.perfsonar.kanren.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(ps-grand-lt.perfsonar.kanren.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(nmon-aa.mich.net->perfsonar-1850.frgp.net),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(kc-core-psr.mo.more.net->perfsonar-1850.frgp.net),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(latency.eugn-perfsonar.nero.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(bandwidth.eugn-perfsonar.nero.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar-1850.frgp.net->web100.pnw-gigapop.net),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(web100.pnw-gigapop.net->perfsonar-1850.frgp.net),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar.unl.edu->perfsonar-1850.frgp.net),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar-msn.wiscnet.net->perfsonar-1850.frgp.net),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar-1850.frgp.net->noctuidae.cns.vt.edu),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(noctuidae.cns.vt.edu->perfsonar-1850.frgp.net),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(psonar.arc.vt.edu->perfsonar-1850.frgp.net),
continuing with rest of config: 500 timeout: timeout

Any clue appreciated :-)

-- Pete

--
To unsubscribe from this list: https://lists.internet2.edu/sympa/signoff/perfsonar-user


--
Michael Johnson
GlobalNOC DevOps Engineer




