Job Manager: avoid gap between finishing job and starting new one #633

Open
jdries opened this issue Sep 25, 2024 · 1 comment

@jdries
Collaborator

jdries commented Sep 25, 2024

The job manager needs to keep the cluster as busy as possible, to ensure that X jobs finish within the estimated time.
Currently, we see that it can take a (relatively) long time between a job finishing, its output being handled, and a new job being created and started.

One trick could be to also keep a queue of already-created jobs that can be started as soon as another job finishes. Job creation then happens outside of the critical path between one job finishing and the next one starting.

Even better would be to start the next job first, and only then start handling the results of the finished job.
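
To make the idea concrete, here is a minimal sketch of such a buffer of pre-created jobs, with the "start next job first, handle results second" ordering. It is not the job manager's actual code: `PreCreatedJobPool`, `on_job_finished`, and the `create_job`, `start_job` and `handle_results` callables are hypothetical stand-ins for whatever the job manager uses to create, start and post-process a job.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Optional


class PreCreatedJobPool:
    """Buffer of created-but-not-started jobs (hypothetical helper, not an openEO API)."""

    def __init__(
        self,
        create_job: Callable[[dict], object],
        work_items: Iterable[dict],
        buffer_size: int = 2,
    ):
        self._create_job = create_job  # assumed: creates a job on the backend
        self._work_items = iter(work_items)
        self._buffer_size = buffer_size
        self._created: Deque[object] = deque()

    def refill(self) -> None:
        """Create jobs until the buffer is full; meant to run off the critical path."""
        while len(self._created) < self._buffer_size:
            try:
                item = next(self._work_items)
            except StopIteration:
                return  # no work left to create jobs for
            self._created.append(self._create_job(item))

    def pop_ready(self) -> Optional[object]:
        """Return a created job that can be started right away, or None."""
        return self._created.popleft() if self._created else None


def on_job_finished(pool, start_job, handle_results, finished_job):
    """Start the replacement job first, then handle the finished job's output."""
    next_job = pool.pop_ready()
    if next_job is not None:
        start_job(next_job)  # cluster slot is reused immediately
    handle_results(finished_job)  # slow part (e.g. downloads) comes second
    pool.refill()  # top the buffer up again for the next finish
    return next_job
```

With this ordering the only work between a job finishing and the next one starting is a `popleft()` and the start call; creation cost is paid earlier, in `refill()`.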

Note: if the openEO backend had better support for job queues, like YARN, it could be possible to already start the job as well, but this is not the case on the current CDSE.

@VictorVerhaert
Contributor

Received feedback from a user that the job manager can spend a lot of time downloading the results of a successful job before starting the next one. One way to solve this is to add multithreading and queues, so that the on-job-finished tasks run without blocking the job manager.
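
A rough sketch of what that could look like, using a standard-library thread pool so the main loop only submits the post-processing work and moves on. `BackgroundResultHandler` and the `handle_results` callable are hypothetical names for illustration; the actual download logic and how it hooks into the job manager are assumptions here.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List


class BackgroundResultHandler:
    """Runs on-job-finished work in worker threads (hypothetical helper)."""

    def __init__(self, handle_results: Callable[[str], None], max_workers: int = 2):
        self._handle_results = handle_results  # assumed: downloads/post-processes one job
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: List[Future] = []

    def submit(self, job_id: str) -> Future:
        """Queue result handling and return immediately."""
        future = self._executor.submit(self._handle_results, job_id)
        self._pending.append(future)
        return future

    def collect_finished(self) -> int:
        """Drop completed tasks, re-raising any worker exception; return how many completed."""
        done, running = [], []
        for future in self._pending:
            (done if future.done() else running).append(future)
        self._pending = running
        for future in done:
            future.result()  # surfaces errors from the worker thread
        return len(done)

    def shutdown(self) -> None:
        """Wait for all queued result handling to finish."""
        self._executor.shutdown(wait=True)
```

The job manager's polling loop would call `submit(job_id)` when a job finishes, start the next job right away, and call `collect_finished()` once per iteration so that download failures are not silently lost.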
