WIP: Concurrent tasks #101

Draft
wants to merge 5 commits into
base: main
from

Conversation

andrewMacmurray

@andrewMacmurray andrewMacmurray commented May 12, 2025

This adds concurrent tasks to gren. The existing Task api has been modified to:

  • Add Task.concurrent, which runs an array of tasks concurrently.
  • Make map2 run concurrently.
  • Remove map4, map5 in favour of andMap.

The existing Scheduler kernel implementation is largely unchanged but with the addition of a _Scheduler_concurrent helper that spawns multiple processes and handles collecting results and signalling errors.
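As a rough illustration of the "spawn, collect results, signal errors" idea, here is a minimal sketch in plain JavaScript. It is not the `_Scheduler_concurrent` code from this PR: the task model (a function taking `resolve` and `reject` and returning a `kill` function) and the names `concurrent` and `manualTask` are assumptions made up for the example.

```javascript
// Hypothetical task model (not Gren's kernel representation):
// a task is a function (resolve, reject) => kill, where kill() cancels it.
function concurrent(tasks) {
  return (resolve, reject) => {
    const results = new Array(tasks.length);
    const kills = [];
    let remaining = tasks.length;
    let settled = false;

    const killAll = () => kills.forEach((kill) => kill());

    if (remaining === 0) {
      resolve(results);
      return killAll;
    }

    tasks.forEach((task, i) => {
      if (settled) return; // an earlier task already failed; do not start more
      const kill = task(
        (value) => {
          if (settled) return;
          results[i] = value; // keep results in input order
          remaining -= 1;
          if (remaining === 0) {
            settled = true;
            resolve(results);
          }
        },
        (error) => {
          if (settled) return;
          settled = true;
          killAll(); // clean up everything that was started so far
          reject(error);
        }
      );
      kills.push(kill);
    });

    // Killing the combined task kills every task it started.
    return () => {
      settled = true;
      killAll();
    };
  };
}

// Demo helper: a task that settles only when `fire` or `fail` is called by hand.
function manualTask() {
  const t = { killed: false, fire: null, fail: null };
  t.run = (resolve, reject) => {
    t.fire = resolve;
    t.fail = reject;
    return () => {
      t.killed = true;
    };
  };
  return t;
}

// Usage: the second task finishes first, but results keep the input order.
const first = manualTask();
const second = manualTask();
let collected = null;
concurrent([first.run, second.run])(
  (values) => { collected = values; },
  () => {}
);
second.fire("b");
first.fire("a");
```

In this sketch a failure marks the combined task as settled before calling `kill` on the others, so late results from already-running tasks are ignored.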

Member

@robinheghan robinheghan left a comment

First of all: it works! This actually sped up compiling-the-compiler a tiny amount in my very limited test :) Great work!

That said, I think there are still a few tweaks that need to be made.

@andrewMacmurray andrewMacmurray force-pushed the concurrent-tasks branch 9 times, most recently from 5a0a78d to dbf4acb Compare May 13, 2025 07:26
@andrewMacmurray andrewMacmurray force-pushed the concurrent-tasks branch 2 times, most recently from 45a945c to 880f5f9 Compare May 13, 2025 07:56
The scheduler queue starts to become very slow when a large number of tasks are in flight (Array.shift is O(n) as it needs to reindex the array every time an element is removed).

This modifies `_Scheduler_enqueue` to loop through the active procs and reset the queue when done. Larger arrays (1M+) are now handled more efficiently.
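The difference between the two draining strategies can be sketched as follows. This is a simplified illustration rather than the actual `_Scheduler_enqueue` code; the function names and the `step` callback are hypothetical.

```javascript
// Variant A: remove the head on every iteration. Array.prototype.shift may
// need to reindex the remaining elements, which is the cost described above.
function drainWithShift(queue, step) {
  while (queue.length > 0) {
    const item = queue.shift();
    step(item, queue);
  }
}

// Variant B: walk the array by index and reset it once at the end.
// Work pushed by `step` during the loop is still picked up, because
// `queue.length` is re-read on every iteration.
function drainWithIndex(queue, step) {
  for (let i = 0; i < queue.length; i++) {
    step(queue[i], queue);
  }
  queue.length = 0; // single reset instead of one removal per item
}
```

Both variants visit items in the same order; the indexed version trades a temporarily longer array for avoiding a removal per item.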
const handled = A2(_Scheduler_onError, onError, success);
return handled;
});
procs = batch.map(_Scheduler_rawSpawn);
Member

nitpick: maybe perform the rawSpawn inside of tasks.map to save one iteration through the array of tasks?

Author

I found an edge case with this in the experiments repo, though I can't quite remember the details (something where the behaviour with an immediate failure was different). I'll dig it up and write a test for it.

This runs an array of tasks concurrently. If any of them fail the rest of the running tasks are cleaned up.
Because `map2` is now concurrent, `Task.sequence` needs to be implemented using `andThen`.
`map4` and above become less useful, as they are just chains of `andMap`. `andMap` runs concurrently because it is defined using `map2`.
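To make the ordering difference concrete, here is a small JavaScript sketch. It is an analogy under a made-up task model (a task is a function that takes a success callback), not the Gren `Task` implementation, and it leaves out error handling for brevity. The helper `logged` is hypothetical and only records when a task is started.

```javascript
// Hypothetical task model: a task is a function (done) => void.
const succeed = (value) => (done) => done(value);

// andThen starts the next task only after the previous one has finished.
const andThen = (f, task) => (done) => task((value) => f(value)(done));

// map2 starts both tasks up front and combines the results when both finish.
const map2 = (f, taskA, taskB) => (done) => {
  let a;
  let b;
  let pending = 2;
  const check = () => {
    pending -= 1;
    if (pending === 0) done(f(a, b));
  };
  taskA((value) => { a = value; check(); });
  taskB((value) => { b = value; check(); });
};

// sequence built on andThen: tasks run strictly one after another.
const sequence = (tasks) =>
  tasks.reduce(
    (acc, task) =>
      andThen(
        (values) => andThen((value) => succeed([...values, value]), task),
        acc
      ),
    succeed([])
  );

// Demo helper: logs when it is started and finishes only when told to.
function logged(name, log) {
  const t = { finish: null };
  t.run = (done) => {
    log.push("start " + name);
    t.finish = done;
  };
  return t;
}
```

With `sequence`, the second task is not started until the first has finished. With `map2` (and so with anything built on it, such as `andMap`), both are started immediately, which is why a `map2`-based `sequence` would no longer run tasks in order.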
2 participants