perfsonar-user - Re: [perfsonar-user] maddash-01.nordu.net
Subject: perfSONAR User Q&A and Other Discussion
- From: Raul Lopes <>
- To: Robert Lageano <>, Christopher Walker <>, Joachim Hunosøe <>, Tristan Sullivan <>
- Cc: "" <>
- Subject: Re: [perfsonar-user] maddash-01.nordu.net
- Date: Mon, 25 Sep 2023 09:00:51 +0000
Hi,
I have seen this happen on a large number of perfSONAR hosts: the hosts stop running tests. I've even opened a GitHub issue about it.
Summary: you have to stop all perfSONAR-related services. Using systemctl to find the services and stop them may not be enough; you may have to send them SIGKILL. Then restart the services or reboot the host.
Regards, Raul
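The stop-then-kill procedure described above could be sketched roughly as follows. This is a minimal illustration rather than a tested recipe: the unit name patterns in UNIT_PATTERNS are assumptions (only pscheduler-archiver is named in this thread), and the helper names are hypothetical. The functions only build the systemctl command lines; nothing is executed until you run them yourself.

```python
import subprocess

# Assumption: perfSONAR-related units match these glob patterns.
# Check the output of list_units() on your own host before acting on it.
UNIT_PATTERNS = ["pscheduler-*", "perfsonar-*"]


def parse_units(text):
    """Extract unit names from 'systemctl list-units --plain --no-legend' output."""
    return [line.split()[0] for line in text.splitlines() if line.strip()]


def list_units(patterns=UNIT_PATTERNS):
    """Ask systemd for all service units (any state) matching the patterns."""
    cmd = ["systemctl", "list-units", "--type=service", "--all",
           "--plain", "--no-legend", *patterns]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_units(out)


def stop_commands(units):
    """For each unit: a normal stop first, then SIGKILL for any leftover processes."""
    cmds = []
    for unit in units:
        cmds.append(["systemctl", "stop", unit])
        cmds.append(["systemctl", "kill", "--signal=SIGKILL", unit])
    return cmds
```

To apply it, one might loop over stop_commands(list_units()) and pass each command to subprocess.run(cmd, check=False) as root, then restart the services or reboot. Failures are ignored on purpose in that loop, since the kill step may report an error for a unit that has already stopped cleanly.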
From: <> on behalf of Robert Lageano <>
Hi Tristan,
I've just tried increasing the number of shards as per your instructions. However, that still hasn't solved our issues here. Pscheduler-archiver continues to crash even with the increased number of shards. Would you have any other solution for that?
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING 5681: Failed to archive https://perfsonar/pscheduler/tasks/e047c1db-b3bb-4c62-bf2c-e5>
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING <html><head>
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING <title>503 Service Unavailable</title>
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING </head><body>
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING <h1>Service Unavailable</h1>
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING <p>The server is temporarily unable to service your
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING request due to maintenance downtime or capacity
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING problems. Please try again later.</p>
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING </body></html>
Sep 25 00:42:45 perfsonar python3[38575]: archiver WARNING 5681: Gave up archiving https://perfsonar/pscheduler/tasks/e047c1db-b3bb-4c62-bf2c-e5>
Whenever I restart pscheduler, it seems to be happy again but it crashes after a few minutes.
Cheers,
Rob
-----------------------------------------------------------------------------
Robert Lageano
nci.org.au
From: <> on behalf of Tristan Sullivan <>
Hi,
In case anyone with the shards problem wants to revive their host before the next release, a workaround is to increase the maximum number of shards:
curl -k --netrc-file pw -X PUT https://localhost:9200/_cluster/settings -H 'Content-Type: application/json' -d '{"persistent": {"cluster.max_shards_per_node": "2000"}}'
Where pw is a file in the directory from which you run the command, in the following format:
machine localhost login admin password {password}
The password can be found in /etc/perfsonar/opensearch/auth_setup.out.
Tristan
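For anyone who would rather script the workaround than type the curl command, the same request could be built in Python roughly as follows. This is a sketch under the same assumptions as the command above (OpenSearch on localhost:9200, a netrc-format file holding the admin credentials); the function names build_request and send are hypothetical, and send() disables certificate verification to mirror curl's -k flag.

```python
import base64
import json
import netrc
import ssl
import urllib.request


def build_request(netrc_path, host="localhost", max_shards=2000):
    """Build the PUT request that raises cluster.max_shards_per_node."""
    # The netrc file uses the format shown above:
    #   machine localhost login admin password {password}
    login, _account, password = netrc.netrc(netrc_path).authenticators(host)
    body = json.dumps(
        {"persistent": {"cluster.max_shards_per_node": str(max_shards)}}
    ).encode()
    req = urllib.request.Request(
        f"https://{host}:9200/_cluster/settings", data=body, method="PUT"
    )
    req.add_header("Content-Type", "application/json")
    token = base64.b64encode(f"{login}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    return req


def send(req):
    """Send the request without verifying the TLS certificate (like curl -k)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(req, context=ctx) as resp:
        return json.load(resp)
```

Calling send(build_request("pw")) from the directory containing the pw file should be equivalent to the curl invocation. Skipping certificate verification is only reasonable here because the request goes to localhost.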
From: <> on behalf of Joachim Hunosøe <>
Hi Chris
Alright, ah yeah, holy moly, the opensciencegrid ain't looking good. Thank you so much for the info; it helps me calm down again after many frustrating hours.
Then I'm just looking forward to the next release :-)
Best regards
System Engineer
+45 5374 7719
- [perfsonar-user] maddash-01.nordu.net, Joachim Hunosøe, 09/19/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Christopher Walker, 09/22/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Joachim Hunosøe, 09/22/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Tristan Sullivan, 09/22/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Robert Lageano, 09/25/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Raul Lopes, 09/25/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Weed, Adam, 09/25/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Phil Reese, 09/25/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Phil Reese, 09/25/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Christopher Walker, 09/27/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Raul Lopes, 09/27/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Phil Reese, 09/27/2023
- Re: [perfsonar-user] maddash-01.nordu.net, Christopher Walker, 09/27/2023