Request for pscheduler optimisation due to constant load even for small mesh #1502
Comments
Any thoughts on this?
We've seen memory growth in the runner before. I suspect that it comes from a combination of having processes in the runner that take on long-running jobs (e.g., latencybg) and grow from taking on smaller jobs. As pSConfig refreshes things, the older processes drop off and new ones start, which is where the 24-hour cycle comes from. Fixing it will require a re-think of some of that and some rework of the runner code to avoid it as much as possible.
It may indeed require additional work, but given that even a testpoint now struggles to run a simple mesh, this seems urgent. My 4-node mesh is completely saturated while doing almost nothing, with multiple failed and missed tests.
Here is my production mesh in my metro network:
3 nodes (small-form-factor hardware: BRIX GB-BASE-3160, Intel(R) Celeron(R) CPU J3160 @ 1.60GHz with 8 GB RAM). Two run Debian 12, one runs Ubuntu 20.
All nodes run the perfsonar-testpoint bundle, version 5.1.4. All have the same default config. All send results to a central archive (default installation).
The mesh runs a very light set of tests: throughput every 3 hours, latency, and some DNS, HTTP, and RTT tests. See the attached JSON.
There were no significant changes to the mesh definition during this timeframe.
Issues observed (all graphs attached are from the built-in Prometheus monitoring, for the last 14 days):
pozman.json