
perfsonar-user - Re: [perfsonar-user] Unable to retrieve data

Subject: perfSONAR User Q&A and Other Discussion

  • From: Joachim <>
  • To: "Garnizov, Ivan" <>
  • Cc: "" <>
  • Subject: Re: [perfsonar-user] Unable to retrieve data
  • Date: Fri, 16 Sep 2022 14:49:30 +0200

Hi Ivan 

I've been looking back over the tests
and ran a test with 600 packets from the 1G port on the server, which worked fine.


Then I stumbled over
perfsonar-configure_nic_parameters, and it looks like it fails on one of the interfaces.
I can't see where that script logs errors or info.
So I think it is either the interface with or the one without the parameters that works; I am not sure at this point.

Do you happen to have the newest version of the script at hand,
so I could compare it with the script on the server and get the right coalesce parameters on the interface?
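
A minimal sketch of the comparison asked for above, assuming both versions of the script are available as text. The names `diff_scripts` and `changed_lines` and the sample `ethtool -C` lines in the usage note are hypothetical placeholders; they are not the contents of the real perfsonar-configure_nic_parameters script.

```python
import difflib


def diff_scripts(old_text, new_text,
                 old_name="server_version", new_name="reference_version"):
    """Return a unified diff between two versions of a script as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile=old_name, tofile=new_name, lineterm=""))


def changed_lines(diff):
    """Keep only added/removed lines, dropping the '---'/'+++' file headers."""
    return [line for line in diff
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]
```

For example, with two hypothetical one-line differences such as `ethtool -C eth0 rx-usecs 0` on the server and `ethtool -C eth0 rx-usecs 100` in the reference copy, `changed_lines(diff_scripts(old, new))` would list just those two lines, which makes a changed coalesce setting easy to spot.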



Kind regards 
Joachim Hunosøe

+45 5374 7719

On 15 Sep 2022, at 17.30, Garnizov, Ivan <> wrote:

Hi Joachim,
 
Well it depends on the point of view.
As you can see from the APAN dashboard the GEANT hosts can cope with the parameters applied by the mesh configuration in measurements with some remote hosts.
 
“A bit” does not seem to be the right expression, though. Please note that the mesh configuration specifies 600 packets, while your server is already choking at 60. Also, it appears that it is not the UDP packets that get lost; for whatever reason the TCP control session appears to be dropped. You would need to capture the network communication to see the difference.
Dumping the network traffic did not reveal much more than that. All I could see is that the last TCP exchange is simply dropped / missing, which leaves the owping tool hanging / waiting.
Another example / clue is the statistics output. For the stalled tests, when you interrupt the owping process you do get results, but only for one direction rather than for both. (Check the shared measurements again.)
 
It would be interesting to see some result output from your side.
 
 
“it would start to run again and not be stalled or blocked, is that right?”
I would expect that, BUT then you get very short OWD latency test sessions, which might also not be very useful.
 
 
Regards,
Ivan Garnizov
 
GEANT WP6T3: pS development team
GEANT WP7T1: pS deployments GN Operations
GEANT WP9T2: Software governance in GEANT
 
 
 
From: Joachim [] 
Sent: Thursday, September 15, 2022 8:34 AM
To: Garnizov, Ivan (RRZE) <>
Cc: ; Shogo Yoshioka <>
Subject: Re: [perfsonar-user] Unable to retrieve data
 
Hi Ivan 
 
Thank you very much for your help.
 
So if the packet count were set down a bit,
it would start to run again and not be stalled or blocked, is that right?
Just to be sure.
 
Best regards 
Joachim Hunosøe

+45 5374 7719 


On 14 Sep 2022, at 16.07, Garnizov, Ivan <> wrote:
 
Hi Joachim,
 
From the graphs and the mesh configuration I have seen, I can conclude that the server is able to submit data to the measurement archive and that the GNA MaDDash is properly configured.
According to your measurement archive, the last collected latency measurement between your server and the GEANT London server is from:
GMT: Thursday, February 10, 2022 8:15:54 AM
Your time zone: Thursday, February 10, 2022 9:15:54 AM 
GMT+01:00
Relative: 7 months ago
 
Still, as the results I sent show, this server is submitting current data to the measurement archive. This means the service responsible for data submission is operational.
 
Since you mention the London server, I decided to run the reverse test and found a strange pattern.
In the output below you’ll notice that latency tests complete gracefully only with fewer than 60 packets.
Any test with more than 50 packets had to be interrupted and therefore failed. In the perfSONAR case it actually gets stalled and blocks any further runs.
 
I am not sure what conclusions can be drawn from this, but the pS implementation is definitely operational and IMO not part of the issue.
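
The manual probing described above (the same owping command with different packet counts) could be automated roughly as follows. This is only a sketch: the function names, the `slack` allowance and the assumption that a zero exit status means success are not taken from the thread; the owping flags are the ones used in the runs below.

```python
import subprocess


def build_owping_cmd(source, target, count, interval=0.1):
    """Build the owping command line used in this thread for one packet count."""
    return ["owping", "-S", source, target,
            "-c", str(count), "-i", str(interval), "-s", "0", "-b", "0.0001"]


def run_with_timeout(cmd, timeout):
    """Run one command; classify it as 'ok', 'failed' or 'stalled'."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "stalled"  # the tool hung, as seen with the larger packet counts
    # Assumption: a non-zero exit status indicates a failed run.
    return "ok" if proc.returncode == 0 else "failed"


def sweep(source, target, counts, interval=0.1, slack=30.0, runner=run_with_timeout):
    """Try each packet count, allowing the send time plus `slack` seconds per run."""
    results = {}
    for count in counts:
        cmd = build_owping_cmd(source, target, count, interval)
        results[count] = runner(cmd, count * interval + slack)
    return results
```

Calling `sweep("psmp-gn-owd-lon-uk.geant.org", "fi-csc-pstp01-mi1.nordu.net", [50, 55, 60])` on a host with owping installed would narrow down the packet count at which the runs start to stall. The `runner` parameter exists only so the logic can be exercised without a live owping.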
 
Regards,
Ivan Garnizov
 
GEANT WP6T3: pS development team
GEANT WP7T1: pS deployments GN Operations
GEANT WP9T2: Software governance in GEANT
 
 
 
@psmp-gn-mgmt-lon-uk ~]$ owping -S psmp-gn-owd-lon-uk.geant.org fi-csc-pstp01-mi1.nordu.net  -c 500 -i 0.1 -s 0 -b 0.0001
owping: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.
Approximately 52.8 seconds until results available
^Cowping: owp_fetch_sid:Server denied request for to session data - is your clock synchronized via NTP properly?
owping: Unable to fetch data for sid(6d69635ee6cc5ab74d1be403df481b98)
 
--- owping statistics from [fi-csc-pstp01-mi1.nordu.net]:8822 to [psmp-gn-owd-lon-uk.geant.org]:9933 ---
SID:    3e286a81e6cc5ab7581248d7448dfd50
first:  2022-09-14T13:50:48.508
last:   2022-09-14T13:51:40.243
0 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = nan/nan/nan ms, (err=0 ms)
one-way jitter = nan ms (P95-P50)
TTL not reported
no reordering
 
@psmp-gn-mgmt-lon-uk ~]$ owping -S psmp-gn-owd-lon-uk.geant.org fi-csc-pstp01-mi1.nordu.net  -c 50 -i 0.1 -s 0 -b 0.0001
owping: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.
Approximately 7.9 seconds until results available
 
--- owping statistics from [psmp-gn-owd-lon-uk.geant.org]:9682 to [fi-csc-pstp01-mi1.nordu.net]:8802 ---
SID:    6d69635ee6cc5abf645ba8c331b8e6aa
first:  2022-09-14T13:50:56.535
last:   2022-09-14T13:51:02.112
50 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 18.2/18.3/18.3 ms, (err=0.382 ms)
one-way jitter = 0 ms (P95-P50)
hops = 5 (consistently)
no reordering
 
 
--- owping statistics from [fi-csc-pstp01-mi1.nordu.net]:9663 to [psmp-gn-owd-lon-uk.geant.org]:8934 ---
SID:    3e286a81e6cc5abf7066c2acc09bc700
first:  2022-09-14T13:50:56.399
last:   2022-09-14T13:51:01.590
50 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 16.5/16.6/16.6 ms, (err=0.382 ms)
one-way jitter = 0 ms (P95-P50)
hops = 5 (consistently)
no reordering
 
@psmp-gn-mgmt-lon-uk ~]$ owping -S psmp-gn-owd-lon-uk.geant.org fi-csc-pstp01-mi1.nordu.net  -c 60 -i 0.1 -s 0 -b 0.0001
owping: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.
Approximately 8.9 seconds until results available
^Cowping: _OWPReadTestRequest: Unable to read from socket.
owping: Unable to fetch data for sid(6d69635ee6cc5ad10395c2042067b987)
 
--- owping statistics from [fi-csc-pstp01-mi1.nordu.net]:9518 to [psmp-gn-owd-lon-uk.geant.org]:9356 ---
SID:    3e286a81e6cc5ad10e98bb4dc6ab0699
first:  2022-09-14T13:51:13.996
last:   2022-09-14T13:51:19.013
60 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 16.4/16.6/16.6 ms, (err=0.382 ms)
one-way jitter = 0 ms (P95-P50)
hops = 5 (consistently)
no reordering
 
@psmp-gn-mgmt-lon-uk ~]$ owping -S psmp-gn-owd-lon-uk.geant.org fi-csc-pstp01-mi1.nordu.net  -c 50 -i 0.1 -s 0 -b 0.0001
owping: NTP: STA_NANO should be set. Make sure ntpd is running, and your NTP configuration is good.
Approximately 7.9 seconds until results available
 
--- owping statistics from [psmp-gn-owd-lon-uk.geant.org]:9333 to [fi-csc-pstp01-mi1.nordu.net]:9543 ---
SID:    6d69635ee6cc5be197ca9637b9dcd45f
first:  2022-09-14T13:55:46.756
last:   2022-09-14T13:55:51.465
50 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 14.4/14.5/14.5 ms, (err=0.391 ms)
one-way jitter = 0.1 ms (P95-P50)
hops = 5 (consistently)
no reordering
 
 
--- owping statistics from [fi-csc-pstp01-mi1.nordu.net]:8813 to [psmp-gn-owd-lon-uk.geant.org]:9419 ---
SID:    3e286a81e6cc5be1a3c4199f5bb7edcd
first:  2022-09-14T13:55:46.574
last:   2022-09-14T13:55:51.800
50 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 16.4/16.5/16.5 ms, (err=0.391 ms)
one-way jitter = 0 ms (P95-P50)
hops = 5 (consistently)
no reordering
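
To make the one-direction-only pattern in the runs above easier to spot, the statistics blocks can be parsed mechanically. The following is a rough sketch; the function names are made up for illustration, the regular expressions assume the output layout shown above, and `SAMPLE` is the 60-packet run copied from this message.

```python
import re

HEADER = re.compile(
    r"^--- owping statistics from \[(?P<src>[^\]]+)\]:\d+ to \[(?P<dst>[^\]]+)\]:\d+ ---")
COUNTS = re.compile(r"^(?P<sent>\d+) sent, (?P<lost>\d+) lost")

LON = "psmp-gn-owd-lon-uk.geant.org"
FI = "fi-csc-pstp01-mi1.nordu.net"

SAMPLE = """\
^Cowping: _OWPReadTestRequest: Unable to read from socket.
owping: Unable to fetch data for sid(6d69635ee6cc5ad10395c2042067b987)

--- owping statistics from [fi-csc-pstp01-mi1.nordu.net]:9518 to [psmp-gn-owd-lon-uk.geant.org]:9356 ---
SID:    3e286a81e6cc5ad10e98bb4dc6ab0699
first:  2022-09-14T13:51:13.996
last:   2022-09-14T13:51:19.013
60 sent, 0 lost (0.000%), 0 duplicates
"""


def parse_owping(output):
    """Return one dict per statistics block: src, dst, sent, lost."""
    sessions = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            current = {"src": header["src"], "dst": header["dst"],
                       "sent": None, "lost": None}
            sessions.append(current)
            continue
        counts = COUNTS.match(line)
        if counts and current is not None:
            current["sent"] = int(counts["sent"])
            current["lost"] = int(counts["lost"])
    return sessions


def missing_directions(sessions, host_a, host_b):
    """Return the (src, dst) directions for which no sent packets were reported."""
    seen = {(s["src"], s["dst"]) for s in sessions if s["sent"]}
    return [d for d in [(host_a, host_b), (host_b, host_a)] if d not in seen]
```

For the sample, `missing_directions(parse_owping(SAMPLE), LON, FI)` reports the London-to-Finland direction as missing, which matches the interrupted 60-packet run. A block with "0 sent", as in the 500-packet run, is treated as missing as well.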
 
 
 
 
 
 
 
From: Joachim [] 
Sent: Wednesday, September 14, 2022 2:46 PM
To: Garnizov, Ivan (RRZE) <>
Cc: ; Shogo Yoshioka <>
Subject: Re: [perfsonar-user] Unable to retrieve data
 
Hi Ivan 
 
That's one of those that looks to be okay,
besides the report that says the site can’t test.
 
But the following are the ones I'm troubleshooting and investigating:
 
Shown in pscheduler monitor as finished with no error logs, but not in the perfSONAR web interface or MaDDash
 
Shown in pscheduler monitor as missed or failed, with error logs
 
 
Best regards 
Joachim Hunosøe

+45 5374 7719 



On 14 Sep 2022, at 14.36, Garnizov, Ivan <> wrote:
 
Hi Joachim,
 
The APAN GNA mesh is a somewhat special case, since the dashboard is managed by APAN and you provide the measurement results from your organisation's measurement store.
 
Does this match one of the measurements you are looking for?
 
<image001.png>
 
Regards,
Ivan Garnizov
 
GEANT WP6T3: pS development team
GEANT WP7T1: pS deployments GN Operations
GEANT WP9T2: Software governance in GEANT
 
 
 
 
 
From: Joachim [] 
Sent: Wednesday, September 14, 2022 9:02 AM
To: Garnizov, Ivan (RRZE) <>
Cc: ; Shogo Yoshioka <>
Subject: Re: [perfsonar-user] Unable to retrieve data
 
Hi Ivan
 
 
I fixed one of my issues: yesterday the server could not cope any more, so most of the perfSONAR services had exited and were not active. They are running again after a reboot and after manually starting the services one by one.
But then the next issue showed up.
 
Some of the tests that the host NOX-HEL runs to GEANT Open London, GEANT Open Paris and Cernet GXP-BJ (210.25.186.62) show as finished in pscheduler monitor and have no errors in pscheduler.log,
but their data still cannot be retrieved, either on the host itself or in MaDDash.
 
As for the following:
MANLAN, Pacific Wave Seattle and WIX are unable to run the tests.
There are also reports that the WIX site is down, so for those three it makes more sense that there are no results in MaDDash:
Sep 14 06:48:54 fi-csc-pstp01.nordu.net powstream[24534]: OWPControlOpen([web100.pnw-gigapop.net]:861): Couldn't open 'control' connection to server: Connection timed out
Sep 14 06:48:54 fi-csc-pstp01.nordu.net powstream[24540]: OWPControlOpen([ps.test.wix.internet2.edu]:861): Couldn't open 'control' connection to server: Connection timed out
Sep 14 06:48:55 fi-csc-pstp01.nordu.net powstream[24532]: OWPControlOpen([ps.test.manlan.internet2.edu]:861): Couldn't open 'control' connection to server: Connection timed out
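
The "Connection timed out" messages above concern the TCP control connection on port 861. A quick reachability check from the measurement host could look like the sketch below; the function name and the 5-second timeout are arbitrary choices, and a successful connect only shows that the port is open, not that the remote service accepts tests.

```python
import socket


def control_port_reachable(host, port=861, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections and DNS failures
        return False


# Hosts taken from the log lines above.
HOSTS = [
    "web100.pnw-gigapop.net",
    "ps.test.wix.internet2.edu",
    "ps.test.manlan.internet2.edu",
]
```

Looping over `HOSTS` and printing `control_port_reachable(host)` for each would show whether the timeouts are still occurring, independently of powstream.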
 
But the first three are a little strange, because they are shown in the pscheduler monitor and I can run them manually with results, yet those results don’t show up in the web interface or MaDDash.
 
About the schedule: that is actually a good idea, thanks.
 
 
 
Best regards 
Joachim Hunosøe

+45 5374 7719 




On 13 Sep 2022, at 16.57, Garnizov, Ivan <> wrote:
 
Hi Joachim,

I am not sure why you insist that the results are sent to MaDDash. MaDDash is only a visualisation of archived data.
In order to understand where the results are sent, you should look in the diagnostic information about the tasks. The pScheduler monitor is not the right place to look for this.

Please send the URLs of some tasks registered on the host in question. I would suggest at least one task that succeeds and one that fails, as you understand it.
If your host is not publicly reachable, then you should try to read through the information yourself.

You can get many URLs of pScheduler runs from this command: pscheduler schedule
But in order to get some info about completed runs, I would suggest using: pscheduler schedule -PT2H

Please provide some results so that we can elaborate on them.
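
The `-PT2H` argument above is an ISO 8601 duration with a leading minus sign, used here to look two hours into the past. If this needs to be scripted, a thin wrapper might look like the sketch below; the helper names are made up, and only the two command forms quoted above are assumed.

```python
import subprocess


def past_window(hours):
    """Format a look-back window as a negative ISO 8601 duration, e.g. 2 -> '-PT2H'."""
    if not isinstance(hours, int) or hours <= 0:
        raise ValueError("hours must be a positive integer")
    return f"-PT{hours}H"


def schedule_cmd(hours=None):
    """Build 'pscheduler schedule', optionally with a look-back window."""
    cmd = ["pscheduler", "schedule"]
    if hours is not None:
        cmd.append(past_window(hours))
    return cmd


def recent_runs(hours=2):
    """Run the command and return its output lines (needs pscheduler on the host)."""
    proc = subprocess.run(schedule_cmd(hours), capture_output=True, text=True, check=True)
    return proc.stdout.splitlines()
```

`schedule_cmd(2)` yields the same command line as suggested above; `recent_runs()` can then be filtered for the run URLs of the tests in question.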

Regards,
Ivan

-----Original Message-----
From: Joachim [] 
Sent: Tuesday, September 13, 2022 8:16 AM
To: Garnizov, Ivan (RRZE) <>
Cc: 
Subject: Re: [perfsonar-user] Unable to retrieve data

Hi Ivan 

Sure.
When I run pscheduler monitor and look at the pscheduler log, the tasks I'm looking for, which are orange in MaDDash,
are completed by the pscheduler host.

But it does not look like the data is sent to the MaDDash host.

Best regards 
Joachim Hunosøe

+45 5374 7719




On 12 Sep 2022, at 15.38, Garnizov, Ivan <> wrote:

Hi Joachim,

In the general case there are no issues with MaDDash reporting/plotting results from measurements.

Please describe in more detail what appears successful and what does not.
"some of the tasks that my host is running succeed": it is not clear whether one and the same host is involved in both the successful and the failed measurements (GUI visualisation).
"plus running the test manually gives results to the other perfSONAR host": this doesn't tell much, since you might not be setting the correct measurement archive as the destination for the results.
"but the results of these tasks are not sent to the remote MaDDash GUI": results are never sent to a MaDDash GUI. The results are retrieved by the GUI from the MA.


Regards,
Ivan Garnizov

GEANT WP6T3: pS development team
GEANT WP7T1: pS deployments GN Operations
GEANT WP9T2: Software governance in GEANT




-----Original Message-----
From:  [] On Behalf Of Joachim
Sent: Monday, September 12, 2022 9:23 AM
To: 
Subject: [perfsonar-user] Unable to retrieve data

Hey everyone 

I'm troubleshooting a MaDDash grid,
and from what I can see

some of the tasks that my host is running succeed,
plus running the test manually gives results to the other perfSONAR host,

but the results of these tasks are not sent to the remote MaDDash GUI.

Has any of you had this issue in the past, and what resolved it for you?

Best regards 
Joachim Hunosøe

+45 5374 7719

 
--
To unsubscribe from this list: https://lists.internet2.edu/sympa/signoff/perfsonar-user

Attachment: smime.p7s
Description: S/MIME cryptographic signature



