perfsonar-user - Re: [perfsonar-user] [EXTERNAL] RE: Cannot query non-lead server for run result
Subject: perfSONAR User Q&A and Other Discussion
- From: "Uhl, George D. (GSFC-423.0)[Arctic Slope Technical Services, Inc.]" <>
- To: "Garnizov, Ivan" <>, "" <>
- Cc: "Jackson, Wayne" <>, "Germain, Andrew M. (GSFC-423.0)[Arctic Slope Technical Services, Inc.]" <>, "Butler, Todd F" <>
- Subject: Re: [perfsonar-user] [EXTERNAL] RE: Cannot query non-lead server for run result
- Date: Thu, 5 Aug 2021 13:12:00 +0000
Hi Ivan,
I’m sorry, but I sent incorrect test information in my last email. “HostA” is actually a no-agent host. Both the managed host and the no-agent host are running 4.4.0. I believe the no-agent host is running the perfSONAR toolkit bundle.
To fill in more of the blanks, when I run “psconfig pscheduler-tasks” I get this entry:
2021-08-05T12:43:31Z - 2021-08-05T12:43:44Z (Finished) throughput --source <Non-Agent-Host IP> --source-node <Non-Agent-Host IP> --dest <Managed-Host IP> --dest-node <Managed-Host IP> --duration PT10S --ip-version 4 (Run with tool 'iperf3') https://localhost/pscheduler/tasks/716ae323-0af1-4a4d-9028-ccf35ffc5c02/runs/9ab28d4a-adc4-4605-8267-3cab5b441206
Running
# pscheduler result https://localhost/pscheduler/tasks/716ae323-0af1-4a4d-9028-ccf35ffc5c02/runs/9ab28d4a-adc4-4605-8267-3cab5b441206
produces “Cannot query non-lead server for run result.”
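The message suggests that the run URL points at a host that is not the task's lead participant. Below is a minimal sketch of rewriting a run URL so that it targets the lead. It assumes the task record exposes `detail.participants` (lead listed first) and `detail.participant` (the index of the host that served the record), and that task and run UUIDs are the same on every participant. `lead_run_url` and the `example.net` host names are hypothetical and not part of pScheduler.

```python
from urllib.parse import urlsplit, urlunsplit


def lead_run_url(run_url, task_detail):
    """Return run_url rewritten to point at the task's lead participant.

    Assumes task_detail has 'participants' (lead listed first) and
    'participant' (index of the host that served the record).
    """
    if task_detail.get("participant", 0) == 0:
        return run_url  # Already on the lead; nothing to change.
    lead = task_detail["participants"][0]
    # Bracket bare IPv6 literals so the network location stays valid.
    if ":" in lead and not lead.startswith("["):
        lead = f"[{lead}]"
    parts = urlsplit(run_url)
    return urlunsplit(parts._replace(netloc=lead))


# Hypothetical host names standing in for the redacted addresses.
detail = {
    "participants": ["lead.example.net", "other.example.net"],
    "participant": 1,
}
url = (
    "https://localhost/pscheduler/tasks/716ae323-0af1-4a4d-9028-ccf35ffc5c02"
    "/runs/9ab28d4a-adc4-4605-8267-3cab5b441206"
)
print(lead_run_url(url, detail))
```

The rewritten URL could then be passed to `pscheduler result`, provided the lead host is reachable from where the command is run.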
When I check /var/log/perfsonar/psconfig-pscheduler-agent-tasks.log, I see records for this task every hour. The latest is here:
2021/08/05 12:51:01 INFO guid=8524DC64-F5EB-11EB-BBAB-A4BBA32715F6 task_name=CONUS_IPv4_Throughput config_url=https://<Central MA IP>/psconfig/conus.json config_src=remote task={"archives":[{"data":{"measurement-agent":"<Managed-Host IP>","url":"https://<Managed-Host IP> /esmond/perfsonar/archive","_auth-token":"31d371ae39cdad21a5d5d1c2317004ed8e756ea0"},"archiver":"esmond"}],"test":{"spec":{"source":"< Non-Agent-Host IP >","dest":"< Managed-Host IP >","dest-node":"< Managed-Host IP >","duration":"PT10S","schema":1,"source-node":"< Non-Agent-Host IP >","ip-version":4},"type":"throughput"},"reference":{"psconfig":{"created-by":{"user-agent":"psconfig-pscheduler-agent","uuid":"2ED6048E-F04B-11E9-A9F6-63FAA52715F6"}}},"tools":["iperf3"],"schedule":{"sliprand":true,"slip":"PT7200S","repeat":"PT7200S"}}
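The `schedule` values in this record are ISO 8601 durations, so `PT7200S` is a two-hour repeat interval with a slip window of the same length. Here is a small sketch for converting such values to seconds when comparing schedules. It handles only the day/hour/minute/second forms that appear in this thread, and `duration_seconds` is a hypothetical helper.

```python
import re

# Matches forms such as PT7200S, PT2H, PT10S and P0D (no years, months or weeks).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_seconds(text):
    """Convert a simple ISO 8601 duration string to a number of seconds."""
    match = _DURATION.match(text)
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    hours = parts["days"] * 24 + parts["hours"]
    return (hours * 60 + parts["minutes"]) * 60 + parts["seconds"]


# The repeat interval from the log and the slip reported as PT2H are the same length.
print(duration_seconds("PT7200S") == duration_seconds("PT2H"))
```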
When I check /var/log/perfsonar/psconfig-pscheduler-agent-transactions.log, I’m getting that FORBIDDEN error message:
2021/08/05 12:54:11 INFO guid=8524DC64-F5EB-11EB-BBAB-A4BBA32715F6 action="delete" checksum=mGQYoVPx0FVGetppWabMbA test_type=throughput lead_url=https://< Non-Agent-Host IP >/pscheduler task={"test":{"spec":{"source":"< Non-Agent-Host IP >","dest":"< Managed-Host IP >","dest-node":"< Managed-Host IP >","duration":"PT10S","source-node":"< Non-Agent-Host IP >","schema":1,"ip-version":4},"type":"throughput"},"tools":["iperf3"],"_key":null,"schema":1,"schedule":{"until":"2021-08-06T18:28:20Z","sliprand":true,"repeat":"PT7200S","slip":"PT7200S","start":"2021-08-05T18:28:20Z"},"detail":{"spec-limits-passed":[[{"udp":{"match":false},"duration":{"range":{"upper":"PT60S","lower":"PT5S"}}}]],"runs-started":0,"cli":["--source","< Non-Agent-Host IP >","--source-node","< Non-Agent-Host IP >","--dest","< Managed-Host IP >","--dest-node","< Managed-Host IP >","--duration","PT10S","--ip-version","4"],"first-run-href":"https://<localhost IP>/pscheduler/tasks/ed995a76-5cbf-4710-b484-6c3461ad71ba/runs/first","runs-href":"https://< localhost IP>/pscheduler/tasks/ed995a76-5cbf-4710-b484-6c3461ad71ba/runs","added":"2021-08-05T11:59:05Z","post":"P0D","slip":"PT2H","multi-result":false,"participants":["< Non-Agent-Host IP >","< Managed-Host IP >"],"href":"https://< localhost IP>/pscheduler/tasks/ed995a76-5cbf-4710-b484-6c3461ad71ba","anytime":false,"exclusive":true,"enabled":true,"next-run-href":"https://< localhost IP>/pscheduler/tasks/ed995a76-5cbf-4710-b484-6c3461ad71ba/runs/next","hints":{"requester":"< Non-Agent-Host IP >","server":"< Managed-Host IP >"},"diags":"H\ni\nn\nt\ns\n:\n\n\n \n \nr\ne\nq\nu\ne\ns\nt\ne\nr\n:\n \n2\n0\n6\n.\n1\n9\n6\n.\n1\n7\n6\n.\n2\n1\n2\n\n\n \n \ns\ne\nr\nv\ne\nr\n:\n \n1\n6\n9\n.\n1\n5\n4\n.\n1\n9\n7\n.\n1\n8\n\n\nI\nd\ne\nn\nt\ni\nf\ni\ne\nd\n \na\ns\n \ne\nv\ne\nr\ny\nb\no\nd\ny\n\n\nC\nl\na\ns\ns\ni\nf\ni\ne\nd\n \na\ns\n \nd\ne\nf\na\nu\nl\nt\n\n\nA\np\np\nl\ni\nc\na\nt\ni\no\nn\n:\n \nD\ne\nf\na\nu\nl\nt\ns\n \na\np\np\nl\ni\ne\nd\n \nt\no\n 
\nn\no\nn\n-\nf\nr\ni\ne\nn\nd\nl\ny\n \nh\no\ns\nt\ns\n\n\n \n \nG\nr\no\nu\np\n \n1\n:\n \nL\ni\nm\ni\nt\n \n'\ni\nn\nn\no\nc\nu\no\nu\ns\n-\nt\ne\ns\nt\ns\n'\n \nf\na\ni\nl\ne\nd\n:\n \nP\na\ns\ns\ne\nd\n \nb\nu\nt\n \ni\nn\nv\ne\nr\nt\ne\nd\n\n\n \n \nG\nr\no\nu\np\n \n1\n:\n \nL\ni\nm\ni\nt\n \n'\nt\nh\nr\no\nu\ng\nh\np\nu\nt\n-\nd\ne\nf\na\nu\nl\nt\n-\nt\ni\nm\ne\n'\n \np\na\ns\ns\ne\nd\n\n\n \n \nG\nr\no\nu\np\n \n1\n:\n \nL\ni\nm\ni\nt\n \n'\nt\nh\nr\no\nu\ng\nh\np\nu\nt\n-\nd\ne\nf\na\nu\nl\nt\n-\nu\nd\np\n'\n \nf\na\ni\nl\ne\nd\n:\n \nU\nD\nP\n \nt\ne\ns\nt\ni\nn\ng\n \nn\no\nt\n \na\nl\nl\no\nw\ne\nd\n\n\n \n \nG\nr\no\nu\np\n \n1\n:\n \nL\ni\nm\ni\nt\n \n'\ni\nd\nl\ne\ne\nx\n-\nd\ne\nf\na\nu\nl\nt\n'\n \nf\na\ni\nl\ne\nd\n:\n \nT\ne\ns\nt\n \ni\ns\n \nn\no\nt\n \n'\ni\nd\nl\ne\ne\nx\n'\n\n\n \n \nG\nr\no\nu\np\n \n1\n:\n \nW\na\nn\nt\n \na\nn\ny\n,\n \n1\n/\n4\n \np\na\ns\ns\ne\nd\n,\n \n3\n/\n4\n \nf\na\ni\nl\ne\nd\n:\n \nP\nA\nS\nS\n\n\n \n \nA\np\np\nl\ni\nc\na\nt\ni\no\nn\n \nP\nA\nS\nS\nE\nS\n\n\nP\nr\no\np\no\ns\na\nl\n \nm\ne\ne\nt\ns\n \nl\ni\nm\ni\nt\ns\n\n\nP\nr\ni\no\nr\ni\nt\ny\n \ns\ne\nt\n \nt\no\n \nd\ne\nf\na\nu\nl\nt\n \no\nf\n \n0","duration":"PT19S","runs":8,"participant":1,"start":"2021-08-05T18:28:20Z"},"archives":[{"data":{"measurement-agent":"< Managed-Host IP >","url":"https://<Central MA IP>/esmond/perfsonar/archive","_auth-token":null},"archiver":"esmond"}],"reference":{"psconfig":{"created-by":{"user-agent":"psconfig-pscheduler-agent","uuid":"2ED6048E-F04B-11E9-A9F6-63FAA52715F6"}}},"href":"https://< localhost >/pscheduler/tasks/ed995a76-5cbf-4710-b484-6c3461ad71ba","tool":"iperf3"}
2021/08/05 12:54:11 ERROR guid=8524DC64-F5EB-11EB-BBAB-A4BBA32715F6 action="delete" msg=Problem deleting test throughput/iperf3( Non-Agent-Host IP -> Managed-Host IP), continuing with rest of config: FORBIDDEN: Forbidden.
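The `diags` value in the record above is hard to read because a newline follows every character, as if the original text had been joined one character at a time. A minimal sketch for recovering the text, assuming that is what happened, is shown here. `undo_char_join` is a hypothetical helper, and the sample is a short excerpt of the logged value.

```python
import json


def undo_char_join(text):
    """Undo "\\n".join(original) by keeping every other character."""
    if any(ch != "\n" for ch in text[1::2]):
        return text  # Not in the expected form; leave it untouched.
    return text[::2]


# Excerpt of the JSON-encoded value, so each backslash-n is a JSON escape.
raw = r'"H\ni\nn\nt\ns\n:\n\n\n \n \nr\ne\nq\nu\ne\ns\nt\ne\nr\n:"'
print(undo_char_join(json.loads(raw)))
```

Applied to the full value, this yields the limit-check diagnostics, which include the lines `Application PASSES` and `Proposal meets limits`.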
Thanks again for your help, George
From: "Garnizov, Ivan" <>
Hello George,
I doubt there is a direct relation between the pSConfig event and the frequency of throughput failures. I would suggest checking /var/log/perfsonar/psconfig-pscheduler-agent-tasks.log for missing records for tests between the hosts in question ;)
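One way to look for missing records is to pull the timestamp and the source/dest pair out of each line of the tasks log and flag gaps longer than the expected interval. This is a rough sketch, assuming every line has the `YYYY/MM/DD HH:MM:SS LEVEL ... task={json}` layout of the records quoted in this thread. `find_gaps`, its default interval, and the sample lines are illustrative only.

```python
import json
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Timestamp, log level, then a trailing task={...} JSON object.
LINE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \w+ .*?\btask=(\{.*\})\s*$"
)


def find_gaps(lines, expected=timedelta(hours=1), tolerance=timedelta(minutes=5)):
    """Return (pair, earlier, later) for each gap longer than expected + tolerance."""
    seen = defaultdict(list)
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue
        try:
            spec = json.loads(match.group(2))["test"]["spec"]
        except (ValueError, KeyError, TypeError):
            continue  # Skip records whose task JSON is missing or malformed.
        when = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        seen[(spec.get("source"), spec.get("dest"))].append(when)
    gaps = []
    for pair, times in seen.items():
        times.sort()
        for earlier, later in zip(times, times[1:]):
            if later - earlier > expected + tolerance:
                gaps.append((pair, earlier, later))
    return gaps


# Made-up sample lines: the 11:51 record is missing.
task = '{"test":{"spec":{"source":"a","dest":"b"},"type":"throughput"}}'
lines = [
    f"2021/08/05 09:51:01 INFO guid=X task_name=T task={task}",
    f"2021/08/05 10:51:01 INFO guid=X task_name=T task={task}",
    f"2021/08/05 12:51:01 INFO guid=X task_name=T task={task}",
]
gaps = find_gaps(lines)
for pair, earlier, later in gaps:
    print(pair, earlier, later)
```

In practice the lines would come from reading the log file, and `expected` would be set to however often the agent is observed to write records for the task.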
Please also check the pScheduler schedule for Non-Starters of runs on this test specification.
Please tell me whether these servers are all on the pS 4.4 release or whether there is a mixture of 4.4 and 4.3.x.
Please also note that the test you shared doesn’t match the direction of the test failure from above.
Failure is: HostB -> HostA
Success is: HostA -> HostB
The error you shared tells me only that pSConfig wasn’t able to remove a stalled record. This by itself might indeed lead to problems. I’ll leave it to others to comment.
Regards, Ivan Garnizov
GEANT WP6T3: pS development team GEANT WP7T1: pS deployments GN Operations GEANT WP9T2: Software governance in GEANT
From: [mailto:] On Behalf Of "Uhl, George D. (GSFC-423.0)[Arctic Slope Technical Services, Inc.]"
Hello,
We’ve been experiencing multiple test failures since upgrading from 4.3.4 to 4.4.0. These failures have impacted the regular testing within our mesh.
Our mesh throughput tests are scheduled to run in a 4-hour, 2-hour, or 1-hour window. Slip time for the tests is allotted the same time windows, and slip time is randomized. We are getting the following recurring error in tests scheduled through the mesh:
Cannot query non-lead server for run result.
We think the following error in psconfig-pscheduler-agent-transactions.log might be a clue but we don’t understand its intent.
2021/08/03 12:27:45 ERROR guid=8B4895F0-F455-11EB-BBAB-A4BBA32715F6 action="delete" msg=Problem deleting test throughput/iperf3([Test-Host-B IP]->[Test-Host-A IP]), continuing with rest of config: FORBIDDEN: Forbidden.
However, we’re able to run ad hoc throughput tests with these same hosts on the CLI with no problem. See below.
Thanks, George Uhl NASA GSFC
# pscheduler task throughput --source <managed-host-A IP> --source-node <managed-host-A IP> --dest <managed-host-B IP> --dest-node <managed-host-B IP> --duration PT30S --ip-version 4
Submitting task...
Task URL:
https:// <managed-host-A IP>/pscheduler/tasks/7bb99a7f-e682-48c3-8fad-83ee5f1c776e
Running with tool 'iperf3'
Fetching first run...

Next scheduled run:
https:// <managed-host-A IP>/pscheduler/tasks/7bb99a7f-e682-48c3-8fad-83ee5f1c776e/runs/c0853f4a-ace3-4e71-8005-def380e2f626
Starts 2021-08-02T10:38:26-04 (~5 seconds)
Ends   2021-08-02T10:39:05-04 (~38 seconds)
Waiting for result...

* Stream ID 5
Interval       Throughput     Retransmits    Current Window
0.0 - 1.0      22.75 Mbps     0              441.64 KBytes
1.0 - 2.0      125.81 Mbps    0              3.75 MBytes
2.0 - 3.0      157.28 Mbps    0              3.99 MBytes
3.0 - 4.0      157.30 Mbps    0              3.99 MBytes
4.0 - 5.0      157.28 Mbps    0              3.99 MBytes
5.0 - 6.0      157.28 Mbps    0              3.99 MBytes
6.0 - 7.0      157.29 Mbps    0              3.99 MBytes
7.0 - 8.0      157.29 Mbps    0              3.99 MBytes
8.0 - 9.0      157.29 Mbps    0              3.99 MBytes
9.0 - 10.0     157.28 Mbps    0              3.99 MBytes
10.0 - 11.0    157.29 Mbps    0              3.99 MBytes
11.0 - 12.0    157.29 Mbps    0              3.99 MBytes
12.0 - 13.0    157.30 Mbps    0              3.99 MBytes
13.0 - 14.0    157.27 Mbps    0              3.99 MBytes
14.0 - 15.0    157.29 Mbps    0              3.99 MBytes
15.0 - 16.0    157.28 Mbps    0              3.99 MBytes
16.0 - 17.0    157.28 Mbps    0              3.99 MBytes
17.0 - 18.0    157.28 Mbps    0              3.99 MBytes
18.0 - 19.0    157.28 Mbps    0              3.99 MBytes
19.0 - 20.0    157.29 Mbps    0              3.99 MBytes
20.0 - 21.0    157.28 Mbps    0              3.99 MBytes
21.0 - 22.0    157.28 Mbps    0              3.99 MBytes
22.0 - 23.0    157.31 Mbps    0              3.99 MBytes
23.0 - 24.0    157.26 Mbps    0              3.99 MBytes
24.0 - 25.0    167.81 Mbps    0              3.99 MBytes
25.0 - 26.0    157.28 Mbps    0              3.99 MBytes
26.0 - 27.0    157.28 Mbps    0              3.99 MBytes
27.0 - 28.0    157.29 Mbps    0              3.99 MBytes
28.0 - 29.0    157.29 Mbps    0              3.99 MBytes
29.0 - 30.0    157.28 Mbps    0              3.99 MBytes

Summary
Interval       Throughput     Retransmits    Receiver Throughput
0.0 - 30.0     152.10 Mbps    0              151.17 Mbps

No further runs scheduled.
- [perfsonar-user] Cannot query non-lead server for run result, Uhl, George D. (GSFC-423.0)[Arctic Slope Technical Services, Inc.], 08/03/2021
- RE: [perfsonar-user] Cannot query non-lead server for run result, Garnizov, Ivan, 08/04/2021
- Re: [perfsonar-user] [EXTERNAL] RE: Cannot query non-lead server for run result, Uhl, George D. (GSFC-423.0)[Arctic Slope Technical Services, Inc.], 08/05/2021