Summary
I believe Rush's task scheduling capabilities are excellent, but there are still some flaws that I think are intolerable. Specifically, the problem is how to balance executing the minimum number of tasks against achieving the maximum level of parallel execution.
Details
Let's say our project's dependency relationships are as follows:
Usually, on CI we need to run the build, lint, and test tasks. To execute these three tasks with maximum parallelism, we need to define them together in command-line.json and run them on CI with a single command.
Thus, for the above project, when project B changes, we need to run the above command on projects A, B, E, C, D, and G.
In the end, we need to execute 6 * 3 = 18 tasks. However, in this case, we don't need to run lint and test for A and E, right?
At this point, we can split the above command-line.json into two commands, one for build and one for lint and test.
On CI, we will execute the following commands separately:
1. rush build --from git:origin/master
2. rush test --impacted-by git:origin/master
At this point, we only need to execute 6 + 2 * 4 = 14 tasks, because lint and test are no longer run for A and E.
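The two task counts above can be reproduced with a small sketch. The dependency graph below is hypothetical (the original diagram is not reproduced here); it is only chosen so that a change to B selects A, B, E, C, D, and G with `--from`, and B, C, D, and G with `--impacted-by`. The helper names are illustrative and are not part of Rush.

```python
# Hypothetical dependency graph: each project maps to the projects it depends on.
# It is an assumption chosen to match the selections described in the text.
DEPENDENCIES = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": [],
    "F": ["A"],  # unrelated to B, so it should never be selected
    "G": ["C", "D"],
}


def invert(edges):
    """Turn a 'depends on' map into a 'is depended on by' map."""
    dependents = {project: [] for project in edges}
    for project, deps in edges.items():
        for dep in deps:
            dependents[dep].append(project)
    return dependents


def closure(start, edges):
    """Return the start set plus everything reachable from it via edges."""
    seen = set(start)
    stack = list(start)
    while stack:
        node = stack.pop()
        for nxt in edges[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


DEPENDENTS = invert(DEPENDENCIES)


def impacted_by(changed):
    """Changed projects plus everything that depends on them."""
    return closure(changed, DEPENDENTS)


def select_from(changed):
    """The impacted projects plus all of their dependencies."""
    return closure(impacted_by(changed), DEPENDENCIES)


changed = {"B"}
from_set = select_from(changed)
impacted = impacted_by(changed)

# One command running build, lint, and test on every selected project.
single_command_tasks = len(from_set) * 3
# Build on the full selection, then lint and test on impacted projects only.
split_tasks = len(from_set) + 2 * len(impacted)

print(sorted(from_set), single_command_tasks)  # 6 projects, 18 tasks
print(sorted(impacted), split_tasks)           # 4 projects, 14 tasks
```

The four tasks saved are exactly the lint and test runs for A and E, which are only built because other selected projects depend on them.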
However, splitting the entire execution process into two separate runs means we cannot maximize the parallel execution of all tasks.
Therefore, we need a way to both execute tasks in parallel and minimize the number of tasks executed.
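To make the parallelism cost concrete, here is a rough sketch that compares the total wall-clock time of one combined run against two separate runs. Everything in it is an assumption for illustration: the graph is hypothetical, the durations are invented (B's tests are made deliberately slow), workers are unlimited, and lint and test are assumed to depend only on the build of their own project.

```python
from functools import lru_cache

# Hypothetical graph: each project maps to the projects it depends on.
DEPENDENCIES = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": [],
    "G": ["C", "D"],
}
IMPACTED = ["B", "C", "D", "G"]  # projects that need lint and test

# Invented durations in arbitrary time units.
DEFAULT_DURATION = {"build": 1, "lint": 1, "test": 1}
SLOW_TESTS = {"B": 5}  # assumption: B has a slow test suite


def task_duration(project, phase):
    if phase == "test":
        return SLOW_TESTS.get(project, DEFAULT_DURATION["test"])
    return DEFAULT_DURATION[phase]


@lru_cache(maxsize=None)
def build_finish(project):
    """Earliest time a build can finish, given unlimited workers."""
    start = max((build_finish(dep) for dep in DEPENDENCIES[project]), default=0)
    return start + task_duration(project, "build")


def makespan_single_run():
    """Lint and test start as soon as their own project's build is done."""
    finishes = [build_finish(project) for project in DEPENDENCIES]
    for project in IMPACTED:
        for phase in ("lint", "test"):
            finishes.append(build_finish(project) + task_duration(project, phase))
    return max(finishes)


def makespan_two_runs():
    """Lint and test cannot start until every build has finished."""
    builds_done = max(build_finish(project) for project in DEPENDENCIES)
    longest_check = max(
        task_duration(project, phase)
        for project in IMPACTED
        for phase in ("lint", "test")
    )
    return builds_done + longest_check


print(makespan_single_run())  # 7 with the durations above
print(makespan_two_runs())    # 9 with the durations above
```

In the combined run, B's slow tests overlap with the builds of C, D, and G; in the split run they have to wait for the last build to finish.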
Generally speaking, the model to date has been that we expect the unchanged tasks to replay from the build cache and therefore have a minimal impact on overall runtime.
Yes, caching is a huge optimization, but at the same time we also need a smarter scheduling strategy.
So, back to the beginning, let's assume the dependencies of the _phase script are as follows:
This means we need to execute the build before lint and test, so the command can now be simplified to:
In the background, Rush will execute --impacted-by in a safe manner, meaning it will execute the build of A and E as shown in the diagram above.
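One way to picture the proposed "safe" behavior is to start from the lint and test operations of the impacted projects and then pull in every operation they depend on. The sketch below is not Rush's scheduler. The graph is hypothetical, and the phase rule (lint and test depend on the build of their own project, build depends on the builds of upstream projects) is an assumption based on the statement that build must run before lint and test.

```python
# Hypothetical graph: each project maps to the projects it depends on.
DEPENDENCIES = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": [],
    "F": ["A"],
    "G": ["C", "D"],
}
IMPACTED = {"B", "C", "D", "G"}  # what --impacted-by selects when B changes


def operation_dependencies(project, phase):
    """Assumed phase rule: build waits on upstream builds, checks wait on own build."""
    if phase == "build":
        return [(dep, "build") for dep in DEPENDENCIES[project]]
    return [(project, "build")]


def required_operations(impacted):
    """Lint and test for impacted projects, plus every operation they need."""
    pending = [
        (project, phase)
        for project in sorted(impacted)
        for phase in ("lint", "test")
    ]
    required = set()
    while pending:
        operation = pending.pop()
        if operation in required:
            continue
        required.add(operation)
        pending.extend(operation_dependencies(*operation))
    return required


ops = required_operations(IMPACTED)
builds = sorted(project for project, phase in ops if phase == "build")

print(len(ops))  # 14 operations in a single graph
print(builds)    # builds include A and E, but no lint or test for them
```

Because all 14 operations live in one graph, a scheduler could run them in a single pass instead of waiting for a separate build command to finish.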