perfsonar-user - Re: [perfsonar-user] meshconfig problems
Subject: perfSONAR User Q&A and Other Discussion
- From: Casey Russell <>
- To: Michael Johnson <>
- Cc: Pete Siemsen <>,
- Subject: Re: [perfsonar-user] meshconfig problems
- Date: Tue, 14 Aug 2018 09:47:47 -0500
Group,
If I might piggyback here, I have some suggestions for Pete, and also for regional mesh operators and participants.
For operators of nodes in the regional meshes: participating in a regional mesh is likely the first time you've ever needed to modify your pscheduler "limits" file. The limits file determines whether non-local hosts are allowed to test to your node; by default, they are not. There are a couple of different ways to go about allowing tests through your limits file. KanREN has been pretty open, and we use the published Research and Education filter list described here: http://docs.perfsonar.net/manage_limits.html. However, this is not the most restrictive way to do it, and your needs may be different.
For operators of the meshes: if you're the person building the mesh config file, you most likely have a mix of IPv4 and IPv6 hosts in the mesh. If you do not specify (in the test_spec area) which IP version you want the tests to use, your hosts will just do a lookup at scheduling time, and the hostname/IP combination that gets looked up will be all over the board between your IPv4 and IPv6 hosts. Since the IP address (I think) is the key for matching test results/pairs in your dashboard and graphs, you will have a mess of a dashboard. When you don't have full control over the stack on every host, I would suggest always specifying ipv4_only in your test specification. You can always build a (smaller) mesh of IPv6-enabled hosts if you want to see IPv6 performance for comparison.
Maybe the developers would like to correct me on the second point; perhaps there is a default (ipv4_only or ipv6_only)? But my experience has been that hosts will prefer v6 whenever they can get a AAAA record. Everything else will default to v4, and you end up with a mess.
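To make the mixed-address problem concrete, here is a minimal Python sketch. It is a toy model of the behavior described above, not pscheduler's actual resolution logic: it assumes a pair of hosts tests over IPv6 when both ends have an IPv6 address and falls back to IPv4 otherwise. The host names and addresses are hypothetical (documentation ranges), and `pair_addresses` and `addresses_seen` are names made up for this illustration.

```python
import ipaddress
from itertools import permutations

# Hypothetical DNS answers; these are documentation-range addresses, not real hosts.
RECORDS = {
    "host-a.example.net": ["192.0.2.10", "2001:db8::10"],
    "host-b.example.net": ["192.0.2.20"],
    "host-c.example.net": ["192.0.2.30", "2001:db8::30"],
}


def by_version(addresses):
    """Group address strings by IP version (4 or 6)."""
    groups = {4: [], 6: []}
    for text in addresses:
        groups[ipaddress.ip_address(text).version].append(text)
    return groups


def pair_addresses(src_addrs, dst_addrs, ipv4_only=False):
    """Pick the (source, destination) addresses one test would use.

    Assumed model: prefer IPv6 when both ends have it, unless ipv4_only is set.
    """
    src, dst = by_version(src_addrs), by_version(dst_addrs)
    if not ipv4_only and src[6] and dst[6]:
        return src[6][0], dst[6][0]
    if src[4] and dst[4]:
        return src[4][0], dst[4][0]
    return None  # no common address family


def addresses_seen(records, ipv4_only=False):
    """For each host, collect every address it shows up under across all pairs."""
    seen = {host: set() for host in records}
    for src, dst in permutations(records, 2):
        pair = pair_addresses(records[src], records[dst], ipv4_only)
        if pair is not None:
            seen[src].add(pair[0])
            seen[dst].add(pair[1])
    return seen


if __name__ == "__main__":
    for label, flag in (("default", False), ("ipv4_only", True)):
        print(label)
        for host, addrs in sorted(addresses_seen(RECORDS, flag).items()):
            print("  ", host, sorted(addrs))
```

Under this model, a dual-stack host appears under two different addresses as soon as one of its peers is IPv4-only, which is the kind of split that makes a dashboard keyed on addresses hard to read. With `ipv4_only=True`, every host appears under exactly one address.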
On Tue, Aug 14, 2018 at 9:15 AM, Michael Johnson <> wrote:
To add on to what Ivan mentioned, you only have a local archive configured, even for mesh tests; this is one of your problems:
https://localhost/esmond/perfsonar/archive/
See:
https://perfsonar-1850.frgp.net/pscheduler/tasks/5721a884-a038-4cda-9cfb-e0c5f5d79134
Your archive should include the central MA for the Quilt. Do you have this set in your /etc/perfsonar/meshconfig_agent.conf?
configure_archives 1
You have other issues as well, though. For instance, when I try to run a latency test from your host to perfsonar.illinois.net, it works. In the other direction, I see this:
$ pscheduler task latency --dest perfsonar.illinois.net --source perfsonar-1850.frgp.net
Submitting task...
Task URL:
https://perfsonar-1850.frgp.net/pscheduler/tasks/7227ea2e-8eca-4cf2-83c4-4aebd69e2424
Running with tool 'owping'
Fetching first run...
perfsonar-1850.frgp.net never scheduled a run for the task.
I am not sure why pscheduler is not scheduling the run. I see this error, but I'm not sure what to make of it:
"iperf3 requires exactly 2 participants, got 1"
from here:
https://perfsonar-1850.frgp.net/pscheduler/tasks/5721a884-a038-4cda-9cfb-e0c5f5d79134/runs/2b7a1375-7229-46b6-ba4b-dfc8dc88de10
I hope this helps.
- Michael
On Mon, Aug 13, 2018 at 06:20:50PM -0600, Pete Siemsen wrote:
Ever since I upgraded to 4.0.2, my (the FRGP's) participation in the Quilt
mesh at http://quiltmesh.onenet.net/maddash-webui/ has been problematic. My
whole row is orange. Embarrassing. I've appended the last 25 lines from
/var/log/perfsonar/meshconfig-agent.log file. These messages mystify me
because I can ping and/or traceroute to most of the hosts that appear in
these error messages, like these
ps-grand-bw.perfsonar.kanren.net
kc-core-psr.mo.more.net
perfsonar-1850$ tail --lines=25 meshconfig-agent.log
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(ps-svl-10g.cenic.net->perfsonar-1850.frgp.net ),
continuing with rest of config: 403 FORBIDDEN: Task forbidden by limits:
Hints:
requester: 129.19.165.2
server: 137.164.28.121
Identified as everybody
Classified as default
Application: Defaults applied to non-friendly hosts
Group 1: Limit 'innocuous-tests' failed: Test type not in list
Group 1: Want any, 0/1 passed, 1/1 failed: FAIL
Group 1: Failed; stopping here.
Application FAILS
Proposal does not meet limits
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(ps-grand-bw.perfsonar.kanren.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(ps-grand-lt.perfsonar.kanren.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(nmon-aa.mich.net->perfsonar-1850.frgp.net ),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(kc-core-psr.mo.more.net->perfsonar-1850.frgp.net),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(latency.eugn-perfsonar.nero.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(bandwidth.eugn-perfsonar.nero.net->
perfsonar-1850.frgp.net), continuing with rest of config: 500 timeout:
timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar-1850.frgp.net->web100.pnw-gigapop.net),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(web100.pnw-gigapop.net->perfsonar-1850.frgp.net ),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar.unl.edu->perfsonar-1850.frgp.net ),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar-msn.wiscnet.net->perfsonar-1850.frgp.net),
continuing with rest of config: 500 timeout: timeout
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(perfsonar-1850.frgp.net->noctuidae.cns.vt.edu),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(noctuidae.cns.vt.edu->perfsonar-1850.frgp.net ),
continuing with rest of config: 500 INTERNAL SERVER ERROR: Unable to
determine participants: Process took too long to run.
2018/08/13 18:06:47 (26630) WARN> perfsonar_meshconfig_agent:430 main:: -
Problem adding test trace(psonar.arc.vt.edu->perfsonar-1850.frgp.net ),
continuing with rest of config: 500 timeout: timeout
Any clue appreciated :-)
-- Pete
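The warnings in the log excerpt above are of three kinds: a 403 rejection by the remote host's limits, plain timeouts, and "Unable to determine participants" errors. A small Python sketch like the one below could tally them per kind when the log is long. It is only an illustration: the regular expressions are inferred from the lines quoted in this thread rather than from any documented log format, and the function names and category labels are made up here.

```python
import re
from collections import Counter

# Each entry is assumed to start with "YYYY/MM/DD HH:MM:SS (pid) ", as in the excerpt.
ENTRY_START = re.compile(
    r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \(\d+\) ", re.MULTILINE
)
# Pattern inferred from the quoted warnings; whitespace is normalized before matching.
PROBLEM = re.compile(
    r"Problem adding test (?P<test>\w+)"
    r"\((?P<src>\S+?)\s*->\s*(?P<dst>\S+?)\s*\),"
    r"\s*continuing with rest of config: (?P<status>\d{3}) (?P<reason>.*)"
)


def split_entries(log_text):
    """Split raw log text into entries, collapsing wrapped lines into one line each."""
    starts = [m.start() for m in ENTRY_START.finditer(log_text)]
    ends = starts[1:] + [len(log_text)]
    return [" ".join(log_text[s:e].split()) for s, e in zip(starts, ends)]


def classify(status, reason):
    """Map an HTTP-style status and reason text to a coarse, illustrative category."""
    if status == "403":
        return "forbidden by limits"
    if "Unable to determine participants" in reason:
        return "participants not determined"
    if "timeout" in reason:
        return "timeout"
    return "other"


def summarize(log_text):
    """Count 'Problem adding test' warnings per category."""
    counts = Counter()
    for entry in split_entries(log_text):
        match = PROBLEM.search(entry)
        if match:
            counts[classify(match["status"], match["reason"])] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_log.py /var/log/perfsonar/meshconfig-agent.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for kind, count in summarize(handle.read()).most_common():
            print(f"{count:5d}  {kind}")
```

Grouping this way separates the one failure that needs a change to the far end's limits file (the 403) from failures that point at reachability or slow responses between pscheduler instances.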
--
Michael Johnson
GlobalNOC DevOps Engineer