perfsonar-user - Re: [perfsonar-user] Maximum number of shards reached
- From: Andrew Lake <>
- To: Tristan Sullivan <>
- Cc: "" <>
- Subject: Re: [perfsonar-user] Maximum number of shards reached
- Date: Wed, 30 Aug 2023 07:00:50 -0700
Hi Tristan,
This is a known issue and we have fixes committed and in testing for a 5.0.5 release. Bumping the shard limit was the right way to mitigate it in the meantime. The actual fix will be two-fold:
- The indices are currently defaulting to one primary shard and one replica shard, which makes no sense in the Toolkit context since it is a single node. On update, existing indices and all new indices will have the replicas properly set to 0. This will cut the shard count in half.
- The current ILM policy is properly deleting data over time, but it is rolling over the indices daily, which creates a shard per day per test type. These shards are all very small and it really does not need to roll over that often. We'll be updating this policy as well so it rolls over weekly, possibly with a size-based component too (there is a bit of a balancing act we have to play given the wide range of toolkit specs in the wild). See the sketch after this list for roughly what these two changes look like applied by hand.
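For anyone who wants to apply roughly these two changes by hand while waiting for 5.0.5, here is a minimal sketch against the toolkit's local OpenSearch REST API. The endpoint, credentials, index pattern, policy id, and threshold values below are assumptions for illustration, not the official perfSONAR fix:

import requests

BASE = "https://localhost:9200"              # assumed local OpenSearch endpoint
AUTH = ("admin", "admin-password-here")      # assumed credentials
VERIFY = False                               # assumed self-signed toolkit certificate

# 1) On a single-node toolkit, replica shards can never be allocated, so they
#    only consume shard budget. Drop replicas to 0 on the existing result indices.
requests.put(
    f"{BASE}/pscheduler*/_settings",         # assumed index pattern for result indices
    auth=AUTH, verify=VERIFY,
    json={"index": {"number_of_replicas": 0}},
).raise_for_status()

# 2) Slow rollover from daily to weekly in the ISM policy so each test type
#    stops producing a tiny new shard every day.
policy_id = "pscheduler_default_policy"      # assumed policy id
current = requests.get(
    f"{BASE}/_plugins/_ism/policies/{policy_id}", auth=AUTH, verify=VERIFY
).json()

policy = current["policy"]
for state in policy["states"]:
    for action in state.get("actions", []):
        if "rollover" in action:
            # Weekly age trigger plus an assumed size trigger as a safety valve.
            action["rollover"] = {"min_index_age": "7d", "min_size": "5gb"}

requests.put(
    f"{BASE}/_plugins/_ism/policies/{policy_id}",
    params={"if_seq_no": current["_seq_no"], "if_primary_term": current["_primary_term"]},
    auth=AUTH, verify=VERIFY,
    json={"policy": policy},
).raise_for_status()

Note that OpenSearch's Index State Management keeps managed indices on the policy version they were initialized with, so existing indices may also need the updated policy re-applied (for example via the ISM change-policy API) before the new rollover conditions take effect.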
I think the TL;DR is: you did the right thing, and wait for 5.0.5 for a fix to the root of the problem.
Thanks,
Andy
On Tue, Aug 29, 2023 at 9:56 AM Tristan Sullivan <> wrote:
Hello,
I have three perfSONAR boxes running the latest version of perfSONAR on CentOS 7. A few days ago, all the results disappeared from the web interfaces on all three boxes. On all of them, I found many entries in /var/log/logstash/logstash-plain.log containing this: "this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;"
I listed the shards in the DB, and indeed there were 1,000 of them. I increased the maximum number of shards to 2,000, and test results are appearing again, but I'm not sure that's really the best solution. In particular, I was wondering what the default is for deleting old data now; I found the cronjob that used to do it for esmond, but I couldn't find one for opensearch. I only have about four months of data, so it seems like other people should be having this problem too, unless they changed the default maximum number of shards.
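For anyone hitting the same wall, the stop-gap described above comes down to two calls against the toolkit's local OpenSearch. A minimal sketch, assuming a localhost endpoint, admin credentials, and a new limit of 2000:

import requests

BASE = "https://localhost:9200"              # assumed local OpenSearch endpoint
AUTH = ("admin", "admin-password-here")      # assumed credentials
VERIFY = False                               # assumed self-signed certificate

# Count the open shards to confirm the cluster really is at the limit.
shards = requests.get(f"{BASE}/_cat/shards?format=json",
                      auth=AUTH, verify=VERIFY).json()
print(f"shards currently open: {len(shards)}")

# Raise cluster.max_shards_per_node (default 1000) as a persistent setting.
requests.put(
    f"{BASE}/_cluster/settings", auth=AUTH, verify=VERIFY,
    json={"persistent": {"cluster.max_shards_per_node": 2000}},
).raise_for_status()

As the reply above explains, this only buys headroom; the shard growth itself is addressed by the replica and rollover changes coming in 5.0.5.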
Any insight is appreciated.
Thank you and regards,
Tristan
To unsubscribe from this list: https://lists.internet2.edu/sympa/signoff/perfsonar-user