perfsonar-user - Re: [perfsonar-user] mesh config, bidirectional tests and storing results locally - can't wrap my head around what changed

Subject: perfSONAR User Q&A and Other Discussion

  • From: Dan Doyle <>
  • To: Casey Russell <>
  • Cc: "" <>
  • Subject: Re: [perfsonar-user] mesh config, bidirectional tests and storing results locally - can't wrap my head around what changed
  • Date: Thu, 3 Aug 2017 10:04:49 -0400

Casey,

It actually just dawned on me while looking at this - I don't see the usual "created by meshconfig" information in the task spec. Are you / someone else running tests by hand to verify things? If so, it may be entirely possible that "perfsonar-meshconfig" isn't running at all or is failing for whatever reason to set up the tasks in your pscheduler instance. You may want to verify that the meshconfig is actually succeeding - /etc/perfsonar/meshconfig-agent.conf has the right <mesh> blocks, and the logs aren't full of errors.

Here's an example of a task created by meshconfig for reference: https://ps-bryant-lt.perfsonar.kanren.net/pscheduler/tasks/287657f6-8707-45e9-8d0a-2aa6a491607c

That would suggest that meshconfig is running. If meshconfig agent is working, you might want to verify that the information in the actual .json file references your host correctly.

Dan Doyle
GlobalNOC Software Developer
1-812-856-3892

On Aug 3, 2017, at 9:58 AM, Dan Doyle <> wrote:

Casey,

It looks like there may be several issues at play here in the mesh, but I'll try to just focus on the host you identified. For reference, pScheduler has a very rich API from which you can pull all the raw data for runs and such. Here are all of the scheduled tasks it has for "latencybg" tasks, i.e. the long-lived powstreams typical in a latency mesh. Note: you mentioned ps-bryant-bw, but the linked mesh shows ps-bryant-lt. I'm not sure if this is the same host with a different interface, but I'm going to continue with the -lt example.


Clicking any of those will show all the parameters for a task. You can add /runs to the end of any of them to see the actual results of each run (a given task may have multiple runs inside of it), which typically include the raw output and archiving information.
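As a rough illustration of the "/runs" tip above, here is a minimal Python sketch. The helper names runs_url and fetch_json are made up for this example; the only API detail relied on is the one described above, that appending /runs to a task URL lists its runs. The fetch helper needs network access and assumes the host presents a certificate your system trusts.

```python
import json
import urllib.request


def runs_url(task_url: str) -> str:
    """Return the URL that lists the runs of a pScheduler task.

    Hypothetical helper: it only appends "/runs" to the task URL.
    """
    return task_url.rstrip("/") + "/runs"


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode its JSON body (hypothetical helper, needs network)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder host and task id, not a real task.
    task = "https://ps.example.net/pscheduler/tasks/TASK-ID"
    print(runs_url(task))
```

Calling fetch_json(runs_url(task)) on a real task URL should return whatever the server lists for that task's runs; the exact shape of that response is not assumed here.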

Here's the task from "ps-bryant-lt.perfsonar.kanren.net" to "56m-ps.sox.net" which is a host that seems to be working in a number of spots, so probably is a decent example. 


It looks like it worked fine, but I don't see any "archiving" information, which would suggest that it finished the test and then did nothing with the results.

I would probably start by looking at /etc/perfsonar/meshconfig-agent-tasks.conf and verify that the <measurement_archive> blocks in there are correct. If it's supposed to be learning the MAs from the mesh, ensure that "configure_archives 1" is set in /etc/perfsonar/meshconfig-agent.conf, otherwise ensure that it's either commented out or set to 0.
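If it helps, the "configure_archives" check can be scripted. This is only a sketch: it assumes the option appears as a plain "configure_archives <value>" line and that "#" starts a comment, which matches the "commented out or set to 0" wording above but is not a full parser for the agent's config format. The function name is made up for this example.

```python
def configure_archives_enabled(conf_text: str) -> bool:
    """Return True if an uncommented 'configure_archives' line is set to 1.

    Assumption: simple 'key value' lines with '#' comments; if the option
    appears more than once, the last occurrence wins.
    """
    enabled = False
    for raw in conf_text.splitlines():
        # Drop anything after a '#' so commented-out settings are ignored.
        line = raw.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) == 2 and parts[0] == "configure_archives":
            enabled = parts[1] == "1"
    return enabled
```

On the host itself you would pass in the contents of /etc/perfsonar/meshconfig-agent.conf, e.g. configure_archives_enabled(open("/etc/perfsonar/meshconfig-agent.conf").read()).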

Hope that helps point things in a useful direction - if not feel free to re-engage. I can take a look at your conf files as well, though I'd also say if you're using API keys for any MAs please be sure to strip any of that information out first.

Dan Doyle
GlobalNOC Software Developer
1-812-856-3892

On Aug 2, 2017, at 6:38 PM, Casey Russell <> wrote:

Group,

     One of our hosts participates in a larger regional mesh. About the time of the upgrades to PS 4.0, many hosts in the mesh "went yellow", meaning no results are found for tests in that grid square.  I've just recently begun looking into why that is.  While I can tell that it's either an inability to store results on my local host or an inability of the dashboard host to read them, I can't wrap my head around what changed in 4.0 and what I need to do to fix it.  A number of the hosts in the mesh seemed to work through the conversion just fine, although according to the JSON they're running the same tests and storing the same (flipped) results from remote hosts.

     I suspect it's related to the thread I've copied in below between George Uhl and Andrew Lake from back in April, but even having read it, I can't quite grasp why some hosts in the mesh are working and some aren't.  The mesh is at: http://ps.onenet.net/maddash-webui/index.cgi?grid=Quilt%20Latency  My host is ps-bryant-bw.perfsonar.kanren.net.  You can see the entire horizontal row (save one host) is yellow.  That's the row where all the results should be stored in and retrieved from the local MA on my machine.

I've verified that iptables isn't blocking access to esmond from off-network (ports 443/80), and I've tried adding IP-based authentication for the remote hosts.  Neither had any effect.

Any suggestions appreciated.  

Sincerely,
Casey Russell
Network Engineer
KanREN
phone: 785-856-9809
2029 Becker Drive, Suite 282
Lawrence, Kansas 66047

On Thu, Apr 27, 2017 at 9:01 AM, Andrew Lake <> wrote:




On April 26, 2017 at 7:00:48 PM, Uhl, George D. (GSFC-423.0)[SGT INC] () wrote:

Thanks for the clarification, Andy.  So back in the 3.5.1 days I had identified pS nodes that I don’t manage as no_agent hosts with the intention of having my managed pS nodes initiate the bi-directional tests and send the bi-directional results to the central MA.  Is that feature no longer available in pS 4.0?

You can still do no_agent with force bidirectional and your host will still initiate the test (i.e. be responsible for creating the pscheduler task), but in the case of throughput, traceroute and ping tests the source is always the one that sends it to the archiver, regardless of the initiator. Your OWAMP tests (latency and latencybg in pscheduler terminology) still work the same way since we can use the --flip option to have the local side be the only pscheduler participant and thus responsible for the archiving even when it is not the source.
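The rule just described can be written down as a small Python sketch. This is only an illustration of the explanation above, not code from perfSONAR; the function name and the test-type labels (taken from the wording in this message) are assumptions for the example.

```python
# Test types whose results are always archived by the source of the test.
SOURCE_ARCHIVED = {"throughput", "traceroute", "ping"}
# OWAMP-style tests where the flip option lets the local side run alone.
FLIP_CAPABLE = {"latency", "latencybg"}


def archiving_host(test_type: str, source: str, local_host: str) -> str:
    """Return the host expected to send results to the archiver in 4.0.

    local_host is the managed host that creates the pscheduler task.
    """
    if test_type in SOURCE_ARCHIVED:
        # The source archives, regardless of who initiated the test.
        return source
    if test_type in FLIP_CAPABLE:
        # The local side is the only participant, so it archives
        # even when it is not the source.
        return local_host
    raise ValueError(f"unknown test type: {test_type}")
```

For a reverse-direction throughput test between a managed host and a no_agent host, this returns the no_agent host, which is why that host must be able to reach the archive.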





Thanks,
George

From: Andrew Lake <>
Date: Wednesday, April 26, 2017 at 4:27 PM
To: George Uhl <>, "" <>
Subject: Re: [perfsonar-user] mesh tests fail to archive results from reverse path tests

Hi George,

Sorry for the delay. The source of the test is always responsible for archiving and is the side that has all the info about whether it succeeded or not. If you swap mcln-ps.maxgigapop.net into the URL you should see what you are after:

2017-04-25T03:53:08-05:00 on mcln-ps.maxgigapop.net and enpl-pt2-10g.eos.nasa.gov with iperf3:

throughput --duration PT30S --source mcln-ps.maxgigapop.net --ip-version 4 --dest enpl-pt2-10g.eos.nasa.gov --window-size 1310720 --parallel 1

* Stream ID 4
Interval       Throughput     Retransmits    Current Window 
0.0 - 1.0      9.03 Gbps      0              2.03 MBytes    
1.0 - 2.0      8.87 Gbps      0              2.03 MBytes    
2.0 - 3.0      8.70 Gbps      0              2.03 MBytes    
3.0 - 4.0      8.92 Gbps      0              2.03 MBytes    
4.0 - 5.0      8.81 Gbps      0              2.03 MBytes    
5.0 - 6.0      8.50 Gbps      0              2.03 MBytes    
6.0 - 7.0      8.05 Gbps      0              2.03 MBytes    
7.0 - 8.0      7.79 Gbps      0              2.03 MBytes    
8.0 - 9.0      7.49 Gbps      0              2.03 MBytes    
9.0 - 10.0     8.43 Gbps      0              2.03 MBytes    
10.0 - 11.0    8.64 Gbps      0              2.03 MBytes    
11.0 - 12.0    8.46 Gbps      0              2.03 MBytes    
12.0 - 13.0    8.10 Gbps      0              2.03 MBytes    
13.0 - 14.0    7.80 Gbps      0              2.03 MBytes    
14.0 - 15.0    7.23 Gbps      0              2.03 MBytes    
15.0 - 16.0    7.04 Gbps      0              2.03 MBytes    
16.0 - 17.0    7.10 Gbps      0              2.03 MBytes    
17.0 - 18.0    6.99 Gbps      0              2.03 MBytes    
18.0 - 19.0    7.26 Gbps      0              2.03 MBytes    
19.0 - 20.0    7.32 Gbps      0              2.03 MBytes    
20.0 - 21.0    7.38 Gbps      0              2.03 MBytes    
21.0 - 22.0    7.37 Gbps      0              2.03 MBytes    
22.0 - 23.0    7.29 Gbps      0              2.03 MBytes    
23.0 - 24.0    7.21 Gbps      0              2.03 MBytes    
24.0 - 25.0    7.11 Gbps      0              2.03 MBytes    
25.0 - 26.0    7.07 Gbps      0              2.03 MBytes    
26.0 - 27.0    7.06 Gbps      0              2.03 MBytes    
27.0 - 28.0    6.99 Gbps      0              2.03 MBytes    
28.0 - 29.0    7.07 Gbps      0              2.03 MBytes    
29.0 - 30.0    7.07 Gbps      0              2.03 MBytes    

Summary
Interval       Throughput     Retransmits    
0.0 - 30.0     7.74 Gbps      0

Archivings:

  To esmond, Finished
    2017-04-25T03:53:53-05:00 400: Invalid JSON returned
    2017-04-25T03:54:58-05:00 400: Invalid JSON returned
    2017-04-25T04:04:05-05:00 400: Invalid JSON returned
    2017-04-25T05:06:55-05:00 400: Invalid JSON returned
    2017-04-25T06:07:27-05:00 400: Invalid JSON returned
    2017-04-25T07:07:32-05:00 400: Invalid JSON returned
    2017-04-25T08:12:35-05:00 400: Invalid JSON returned
    2017-04-25T09:13:20-05:00 400: Invalid JSON returned
    2017-04-25T10:16:57-05:00 400: Invalid JSON returned
    2017-04-25T11:17:02-05:00 400: Invalid JSON returned
    2017-04-25T12:17:08-05:00 400: Invalid JSON returned
    2017-04-25T13:17:13-05:00 400: Invalid JSON returned
    2017-04-25T14:18:37-05:00 400: Invalid JSON returned
    2017-04-25T15:22:30-05:00 Archiver permanently abandoned registering test after 14 attempt(s): 400: Invalid JSON returned


Does archive.eos.nasa.gov allow mcln-ps.maxgigapop.net to connect to it on port 443? Having the source be responsible for the archiving is a change from 3.5 and a result of some of the architectural changes.
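One quick way to answer the port 443 question is a TCP connect test run from the source host. The sketch below is only that: the function name is made up, and a successful connect shows the port is reachable, not that the archive will accept the result.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Run on mcln-ps.maxgigapop.net, can_connect("archive.eos.nasa.gov") would show whether the archive is reachable from the side that now does the archiving.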

Thanks,
Andy




On April 25, 2017 at 3:09:01 PM, Uhl, George D. (GSFC-423.0)[SGT INC] () wrote:

Since the pS 4.0 upgrade I’ve noticed that some test results are not getting archived to my central archive.  In these cases I manage one of the hosts and I test to a no_agent host.  It’s the test results sourced from the no_agent host that fail to be archived.  Drilling down into the pscheduler results on my managed host shows tests in both directions run successfully, but only the managed->no_agent test results get archived.  Nothing obvious in my meshconfig-agent-tasks.conf file stands out to me that would indicate a cause.

From my managed host:

2017-04-25T06:53:59-04:00 on enpl-pt2-10g.eos.nasa.gov and mcln-ps.maxgigapop.net with iperf3:

throughput --duration PT30S --source enpl-pt2-10g.eos.nasa.gov --ip-version 4 --dest mcln-ps.maxgigapop.net --window-size 1310720 --parallel 1

* Stream ID 4
Interval       Throughput     Retransmits    Current Window 
0.0 - 1.0      6.19 Gbps      0              2.03 MBytes    
1.0 - 2.0      6.29 Gbps      0              2.03 MBytes    
2.0 - 3.0      6.28 Gbps      0              2.03 MBytes    
3.0 - 4.0      6.22 Gbps      0              2.03 MBytes    
4.0 - 5.0      6.15 Gbps      0              2.03 MBytes    
5.0 - 6.0      6.08 Gbps      0              2.03 MBytes    
6.0 - 7.0      5.89 Gbps      0              2.03 MBytes    
7.0 - 8.0      5.56 Gbps      0              2.03 MBytes    
8.0 - 9.0      5.06 Gbps      0              2.03 MBytes    
9.0 - 10.0     4.65 Gbps      0              2.03 MBytes    
10.0 - 11.0    4.38 Gbps      0              2.03 MBytes    
11.0 - 12.0    4.23 Gbps      0              2.03 MBytes    
12.0 - 13.0    4.25 Gbps      0              2.03 MBytes    
13.0 - 14.0    4.39 Gbps      0              2.03 MBytes    
14.0 - 15.0    6.29 Gbps      0              2.03 MBytes    
15.0 - 16.0    6.67 Gbps      0              2.03 MBytes    
16.0 - 17.0    6.64 Gbps      0              2.03 MBytes    
17.0 - 18.0    6.64 Gbps      0              2.03 MBytes    
18.0 - 19.0    6.68 Gbps      0              2.03 MBytes    
19.0 - 20.0    6.67 Gbps      0              2.03 MBytes    
20.0 - 21.0    6.66 Gbps      0              2.03 MBytes    
21.0 - 22.0    6.63 Gbps      0              2.03 MBytes    
22.0 - 23.0    6.63 Gbps      0              2.03 MBytes    
23.0 - 24.0    6.65 Gbps      0              2.03 MBytes    
24.0 - 25.0    6.68 Gbps      0              2.03 MBytes    
25.0 - 26.0    6.65 Gbps      0              2.03 MBytes    
26.0 - 27.0    6.66 Gbps      0              2.03 MBytes    
27.0 - 28.0    6.68 Gbps      0              2.03 MBytes    
28.0 - 29.0    6.66 Gbps      0              2.03 MBytes    
29.0 - 30.0    6.66 Gbps      0              2.03 MBytes    

Summary
Interval       Throughput     Retransmits    
0.0 - 30.0     6.06 Gbps      0

Archivings:

  To esmond, Finished
    2017-04-25T06:54:40-04:00 Succeeded



From the no_agent host:

2017-04-25T04:53:08-04:00 on mcln-ps.maxgigapop.net and enpl-pt2-10g.eos.nasa.gov with iperf3:

throughput --duration PT30S --source mcln-ps.maxgigapop.net --ip-version 4 --dest enpl-pt2-10g.eos.nasa.gov --window-size 1310720 --parallel 1

* Stream ID 4
Interval       Throughput     Retransmits    Current Window 
0.0 - 1.0      9.03 Gbps      0              2.03 MBytes    
1.0 - 2.0      8.87 Gbps      0              2.03 MBytes    
2.0 - 3.0      8.70 Gbps      0              2.03 MBytes    
3.0 - 4.0      8.92 Gbps      0              2.03 MBytes    
4.0 - 5.0      8.81 Gbps      0              2.03 MBytes    
5.0 - 6.0      8.50 Gbps      0              2.03 MBytes    
6.0 - 7.0      8.05 Gbps      0              2.03 MBytes    
7.0 - 8.0      7.79 Gbps      0              2.03 MBytes    
8.0 - 9.0      7.49 Gbps      0              2.03 MBytes    
9.0 - 10.0     8.43 Gbps      0              2.03 MBytes    
10.0 - 11.0    8.64 Gbps      0              2.03 MBytes    
11.0 - 12.0    8.46 Gbps      0              2.03 MBytes    
12.0 - 13.0    8.10 Gbps      0              2.03 MBytes    
13.0 - 14.0    7.80 Gbps      0              2.03 MBytes    
14.0 - 15.0    7.23 Gbps      0              2.03 MBytes    
15.0 - 16.0    7.04 Gbps      0              2.03 MBytes    
16.0 - 17.0    7.10 Gbps      0              2.03 MBytes    
17.0 - 18.0    6.99 Gbps      0              2.03 MBytes    
18.0 - 19.0    7.26 Gbps      0              2.03 MBytes    
19.0 - 20.0    7.32 Gbps      0              2.03 MBytes    
20.0 - 21.0    7.38 Gbps      0              2.03 MBytes    
21.0 - 22.0    7.37 Gbps      0              2.03 MBytes    
22.0 - 23.0    7.29 Gbps      0              2.03 MBytes    
23.0 - 24.0    7.21 Gbps      0              2.03 MBytes    
24.0 - 25.0    7.11 Gbps      0              2.03 MBytes    
25.0 - 26.0    7.07 Gbps      0              2.03 MBytes    
26.0 - 27.0    7.06 Gbps      0              2.03 MBytes    
27.0 - 28.0    6.99 Gbps      0              2.03 MBytes    
28.0 - 29.0    7.07 Gbps      0              2.03 MBytes    
29.0 - 30.0    7.07 Gbps      0              2.03 MBytes    

Summary
Interval       Throughput     Retransmits    
0.0 - 30.0     7.74 Gbps      0

Archivings:

    This task had no archivings.










