feat(babel): add parallel processing via worker threads#1956
feat(babel): add parallel processing via worker threads#1956davidtaylorhq wants to merge 1 commit intorollup:masterfrom
Conversation
|
@Andarist @shellscape sorry for the ping, but just wanted to check if this is something you'd consider looking at? The performance benefits are pretty massive for large projects using Babel. |
|
I'm really looking forward to this landing! I have a local tarball that I've been copying into my projects to have the benefits of this PR - and it's really good. I can no longer imagine using babel in rollup/vite/rolldown without the parallel mode <3 (usage without parallel mode is painfully slow) |
packages/babel/src/index.js
Outdated
|
|
||
| const parallelWorkerCount = typeof parallel === 'number' ? parallel : 4; | ||
| workerPool = new WorkerPool( | ||
| new URL('./worker.js', import.meta.url).pathname, |
There was a problem hiding this comment.
is this compatible with repo-wide supported node.js version?
There was a problem hiding this comment.
Worker Threads and import.meta.url are both supported in Node 14 👍
packages/babel/src/workerPool.js
Outdated
There was a problem hiding this comment.
have you considered using some existing worker pool libraries?
There was a problem hiding this comment.
I aimed to match the implementation of the existing Terser plugin, since I wasn't sure if adding new dependencies would be welcome. I'm certainly happy to explore it though! Did you have any particular libraries in mind?
workerpool looks like it could be a good bet 👀
There was a problem hiding this comment.
I guess it's fine to keep it as-is if it matches the one in the terser plugin. It would be nice to extract this to a shared package but I don't know what's the proper drill for this. It probably could/should be extracted once this PR lands.
There was a problem hiding this comment.
I investigated this some more, and I do think that switching to the workerpool package is a good idea. It handles a lot more edge cases than the custom implementation, and is much less code.
I've pushed a refactor to use this. But if there are any concerns about introducing a new dependency, let me know and I can revert back to the custom implementation.
|
@CharlieHelps please do a thorough review of this PR to determine if it's ready to merge. Look at the diff, reviews, comments, and referenced issues/docs/links to build a complete understanding. Then respond with a comment that includes a list of MUST fix items that must be fixed before the the PR is merged, and a list of IMPROVEMENTS that shouldn't block the merge but may be quick improvements or things that could be addressed in a follow up PR. The MUST fix items should explain exactly what's wrong and what needs to change to resolve the problem. Both should be numbered lists and the IMPROVEMENTS list should continue counting up from the MUST list. It's OK if there are no IMPROVEMENTS or MUST fix items, you can just omit the lists. |
|
@davidtaylorhq looks like Github's buggy UX bit us on this and I can't see/approve the CI runs. please push a new commit to the branch to trigger that UX again. |
|
@shellscape you can close and re-open a PR to trigger that, usually -- no reason to make a contributor do a silly because GH is buggy <3 |
|
I pulled the branch and reviewed the diff + existing review comments. Local verification for MUST fix
IMPROVEMENTS
|
| transformOptions = | ||
| babelHelpers === BUNDLED | ||
| ? addBabelPlugin(transformOptions, bundledHelpersPlugin) | ||
| : transformOptions; |
There was a problem hiding this comment.
Hm, so this now relies on babelHelpers input to skip the addBabelPlugin call in the renderChunk's case, right?
There was a problem hiding this comment.
Yes that's right - the output plugin doesn't pass babelHelpers 👍
packages/babel/src/index.js
Outdated
| return workerPool.runTask({ | ||
| inputCode: code, | ||
| babelOptions: { ...babelOptions, filename }, | ||
| runPreflightCheck: !skipPreflightCheck, | ||
| babelHelpers | ||
| }); |
There was a problem hiding this comment.
q: I'm not sure what we actually get from this.error but maybe it's worth rechecking if we shouldn't "convert" the errors from this to this.error calls?
There was a problem hiding this comment.
Yeah I think that makes sense - that'll make things consistent in parallel/non-parallel mode. Done 👍
packages/babel/src/index.js
Outdated
| if (parallel) { | ||
| const parallelAllowed = | ||
| isSerializable(babelOptions) && !overrides?.config && !overrides?.result; |
There was a problem hiding this comment.
perhaps it would be nice to move this validation to unpackInputPluginOptions? It would also be quite great if the thrown error message could mention the offending option.
There was a problem hiding this comment.
It's tricky to move it entirely to unpackInputPluginOptions, because at that point the 'overrides/customCallback' functions haven't been evaluated, and so we don't have the final babelOptions.
I have improved the error message to give a better indication of which option has failed the serializability check 👍
| } | ||
|
|
||
| return transformCode(code, babelOptions, overrides, customOptions, this); | ||
| return transformCode({ |
There was a problem hiding this comment.
q: why we don't attempt to parallelize this one?
There was a problem hiding this comment.
Refactored so that the parallel option is available for the output plugin as well 👍
1a508bf to
9a92f61
Compare
|
@davidtaylorhq please ping me for a review explicitly when you wrap up the work here and undraft it. I'll try not to miss it |
b30d1aa to
333ecc8
Compare
|
@Andarist I think this is in a good spot now. I've worked through all of the human & machine comments above, and I've refactored it to use the I've updated the "@rollup/plugin-babel": "davidtaylorhq/rollup-plugins#babel-parallel-built&path:/packages/babel", |
packages/babel/src/index.js
Outdated
| ); | ||
| } | ||
|
|
||
| return workerPool.exec('transform', [taskOpts]).catch((err) => context.error(err.message)); |
There was a problem hiding this comment.
q: does context.error actually throws/fails the execution or does it only log the error? I hope it's the first one
There was a problem hiding this comment.
Correct - it fails the build. See https://rollupjs.org/plugin-development/#this-error
Structurally equivalent to this.warn, except that it will also abort the bundling process with an error.
packages/babel/src/index.js
Outdated
| }); | ||
| } | ||
|
|
||
| function executeWithWorkerPool(workerPool, context, taskOpts, babelOptions) { |
There was a problem hiding this comment.
the name of this suggests it's a generic helper for executing different tasks but actually it's only capable of calling transform. I think the name of this helper should reflect that. Please just rename this to transformWithWorkerPool and then taskOpts should become transformsOpts too
packages/babel/src/index.js
Outdated
| if (workerPool) { | ||
| await workerPool.terminate(); | ||
| } |
There was a problem hiding this comment.
nit: this could be await workerPool?.terminate()
packages/babel/src/index.js
Outdated
| if (workerPool && !this.meta.watchMode) { | ||
| await workerPool.terminate(); | ||
| } |
There was a problem hiding this comment.
nit: similarly here this could be simplified
| if (workerPool && !this.meta.watchMode) { | |
| await workerPool.terminate(); | |
| } | |
| if (!this.meta.watchMode) { | |
| await workerPool?.terminate(); | |
| } |
Add a `parallel` option that processes files concurrently using Node.js worker threads. This reduces build times for large projects where Babel transformation is a bottleneck. This is similar to the existing parallel behavior of `@rollup/plugin-terser`, but uses the `workerpool` package instead of a custom implementation. This required some fairly significant refactoring, because we can only pass serializable objects between the main thread and the worker threads. It also required changes to the plugin's own build config, so that we can generate a dedicated worker entrypoint. Validations are added to ensure that unserializable config (e.g. inline babel plugins) cannot be used alongside the new parallel mode. For people using dedicated babel config files, this isn't a problem, because they are loaded directly by babel in the worker thread itself. The worker threads do have a setup cost, so this only makes sense for large projects. In Discourse, enabling this parallel mode cuts our overall vite (rolldown) build time by about 45% (from ~11s to ~6s) on my machine.
333ecc8 to
7b775f8
Compare
Rollup Plugin Name:
@rollup/plugin-babelThis PR contains:
Are tests included?
Breaking Changes?
If yes, then include "BREAKING CHANGES:" in the first commit message body, followed by a description of what is breaking.
List any relevant issue numbers: n/a
Description
feat(babel): add parallel processing via worker threads
Add a
paralleloption that processes files concurrently using Node.js worker threads. This reduces build times for large projects where Babel transformation is a bottleneck. This is similar to the existing parallel behavior of@rollup/plugin-terser.This required some fairly significant refactoring, because we can only pass serializable objects between the main thread and the worker threads. It also required changes to the plugin's own build config, so that we can generate a dedicated worker entrypoint.
Validations are added to ensure that unserializable config (e.g. inline babel plugins) cannot be used alongside the new parallel mode. For people using dedicated babel config files, this isn't a problem, because they are loaded directly by babel in the worker thread itself.
The worker threads do have a setup cost, so this only makes sense for large projects. In Discourse, enabling this parallel mode cuts our overall vite (rolldown) build time by about 45% (from ~11s to ~6s) on my machine.
This PR is based on #1954, which should ideally be merged first. Happy to rebase this one on
mainif the other PR isn't mergable.For testing, I've pushed a built version of this branch to https://github.com/davidtaylorhq/plugins/tree/babel-parallel-built. In pnpm, it can be installed like: