Group tasks from a generator #5
Comments
Thanks for the suggestion. I'm not sure I see the benefit of building an API like that, though. I think you can already do the same thing using the native API. With an array of data, you can use `Array#map`:

```js
const data = ['a', 'b', 'c'];

await task.group(
	task => data.map(
		item => task(`Working on ${item}`, async ({ setTitle }) => {
			await someAsyncTask();
		})
	),
	{
		concurrency: 2
	}
);
```

In the case of generators, since the generator doesn't send all the values at once, I don't think a task group can be used. Task groups are better for when you know all the tasks but they're queued. With generators, you can do:

```js
const data = function* () {
	yield 'a';
	yield 'b';
	yield 'c';
};

for (const item of data()) {
	task(`Working on ${item}`, async ({ setTitle }) => {
		await someAsyncTask();
	});
}
```
@privatenumber what you propose is the problem I explained at the beginning of the issue: I have to create a new callback for each task, instead of reusing the same one for each record. Additionally, I don't know in advance how many records I will generate; in my case I have thousands of them, and they all run through the same function.

If we can rely on the task system's concurrency to spawn the function as slots are freed, then we can write simple code that performs well, without building our own concurrency system, since that's something tasuku already manages. Do you get the idea?

My current code with a big array is a bit slow, and that causes issues in the terminal, where tasuku spawns thousands of pending lines (one per task). What I want is for it to spawn one line per task currently running in the map function; when a task completes, it invokes the next one, and so on.
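The behaviour described here (a fixed number of in-flight tasks, with the next record started only when a slot frees up) can be sketched without tasuku. This is a minimal, hypothetical helper, not part of tasuku's API; `mapConcurrent` and `records` are names invented for the example:

```javascript
// Minimal sketch: run `fn` over an iterable with at most `limit` calls
// in flight, pulling the next record only when a slot frees.
// Hypothetical helper, not part of tasuku's API.
async function mapConcurrent(iterable, fn, limit) {
	const iterator = iterable[Symbol.iterator]();
	const results = [];
	let index = 0;

	// Each worker pulls the next item as soon as its previous one settles.
	async function worker() {
		for (;;) {
			const { value, done } = iterator.next();
			if (done) {
				return;
			}
			const i = index++;
			results[i] = await fn(value);
		}
	}

	await Promise.all(Array.from({ length: limit }, () => worker()));
	return results;
}

// Usage: records come from a generator, processed two at a time.
function* records() {
	yield 1;
	yield 2;
	yield 3;
	yield 4;
}

mapConcurrent(records(), async n => n * 2, 2)
	.then(doubled => console.log(doubled)); // [ 2, 4, 6, 8 ]
```

Because each worker pulls from the shared iterator lazily, a generator of thousands of records never has more than `limit` items in flight at once.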
This is just a performance optimization, right? I'm curious whether this is actually a measurable difference. Behind the scenes, tasuku uses React via Ink, and I know they are quite memory-inefficient (high allocation/de-allocation rates) by design.
I prefer not to add async/concurrency control, to keep tasuku minimal and focused on rendering the tasks. I think that can be done with another module:

```js
import PQueue from 'p-queue';

const queue = new PQueue({ concurrency: 2 });

for (const item of generator()) {
	queue.add(
		() => task(`Working on ${item}`, async ({ setTitle }) => {
			await someAsyncTask();
		})
	);
}
```

Does that meet your needs?
And you want these tasks to run with a limited concurrency too, correct? It might make more sense to simply add a […]. Alternatively, you can pass your mapper directly into `p-map`:

```js
import pMap from 'p-map';

const sites = [
	getWebsiteFromUsername('sindresorhus'), //=> Promise
	'https://avajs.dev',
	'https://github.com'
];

const mapper = site => task(`Working on ${site}`, async ({ setTitle }) => {
	await someAsyncTask();
});

const result = await pMap(sites, mapper, { concurrency: 2 });

console.log(result);
```

It seems `p-map` also accepts an iterable, which might address your first problem with the generator.
@privatenumber thanks for the suggestions. I'll check them out shortly. I just think it would be nice to be able to reuse the […]

One more thing, just so you know: when I run a dataset of 500 records with

```js
task.group(task => data.map(item => task('title', async () => await doSomething(item))));
```

it takes around 50 seconds (including around 4~5 seconds of overhead from the rest of my code). Then I run the exact same code, with the only exception that I remove tasuku and map directly:

```js
await Promise.all(data.map(doSomething));
```

It takes less than 7 seconds (including the 4~5 seconds of prior overhead). So there is definitely a bottleneck somewhere here. I only did quick tests for the moment, as I have been short on time; I'll try to come back with a reproducible case as soon as I find a moment.
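For what it's worth, the baseline side of that comparison can be reproduced as a standalone harness. `doSomething` below is a hypothetical stand-in for the real per-record work (which is not shown in this thread), and no tasuku rendering is involved:

```javascript
// Hypothetical stand-in for the per-record async work; the real
// doSomething from the report is not shown in this thread.
const doSomething = item => new Promise(resolve => {
	setTimeout(() => resolve(item * 2), 1);
});

async function main() {
	const data = Array.from({ length: 500 }, (_, i) => i);

	// Baseline: plain Promise.all with no rendering layer, to compare
	// against the task.group version of the same workload.
	const start = Date.now();
	const results = await Promise.all(data.map(doSomething));
	console.log(`${results.length} records in ${Date.now() - start}ms`);

	return results;
}

main();
```

Swapping `data.map(doSomething)` for the `task.group` version while keeping `doSomething` fixed would isolate the rendering overhead from the work itself.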
The primary feature of […]

Interesting. I haven't had a case where I needed to run that many tasks, but that's definitely a problem. Thanks, looking forward to the reproduction!
**Is your feature request related to a problem?**

I have a large dataset that I want to process as a grouped set of tasks. `task.group` forces me to create one function per task to run; it is slow and not memory-efficient.

**Describe the solution you'd like**

It would be more efficient to have something like `task.map` that takes a generator or an array of data, as well as a function to execute for each record. Example: […]

`task.map` could also manage concurrency, invoking the next function only when a slot is freed: […]

I'm happy to send a PR for this.