-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: change workflow evaluation to use tokio::spawn
.
#320
Conversation
This refactoring primarily affects `wdl-engine` and how workflows are evaluated. Previously, workflows were evaluated without spawning any tokio tasks. This meant that subgraph evaluation was not concurrent, although WDL tasks were spawned concurrently. With the upcoming support for integrating Crankshaft, the `Backend` trait was refactored such that the backend itself can maintain state and notify callers of when a WDL task is spawned and when it completes. This allows for the local execution backend to be rewritten in a way that enables it to respect the `cpu` and `memory` requirements of a WDL task. To support making the evaluator futures `Send`, we now utilize `rowan::AstPtr` to send AST nodes between threads performing evaluation. The evaluation graphs are mapped to be `Send` so that evaluation can keep a shared reference to the graph across concurrently executing tokio tasks. Also included with these changes are: * Use `tracing-indicatif` to prevent mixing of tracing output and progress bars; the progress bar now consistently shows at the end of tracing output. * Made `wdl_analysis::Document` trivially clonable. * Minor changes to internal names in `wdl-analysis`. * Fixed a missing constraint on the `basename` function definition for an overload that's only required for WDL 1.2. * Added a `Config` type to `wdl-engine` that controls how evaluation is performed; in the future this can be used to support evaluation configuration files. * Expanded the `ProgressKind` enumeration to allow for tracking how many tasks are "ready" (i.e. waiting to be spawned by the backend) vs. "executing" (i.e. currently running). * Removed the incorrect mapping of host paths to guest paths; this will soon be reimplemented as part of the initial support for a Docker backend. * Added retry logic to task execution.
7883586
to
f51246c
Compare
Co-authored-by: Andrew Frantz <[email protected]>
Changed: * Ensured a dependency from the command section to every input in a task evaluation graph; this will be necessary to properly map input files that aren't referenced by the command by otherwise treated as "used" (e.g. a `bai` file input). This is relevant for future backends that need to map files into containers. * Made the working directory attempt-specific so that if a task fails and a retry is made, a new working directory is used while preserving the previous attempt's working directory. * Expanded upon the comment for the scatter `concurrency` setting. * Changed the task `retries` configuration option to mean the "default" number of retries to use if a task does not specify its own max retries requirement, which is also now respected in task execution. * Added `attempt` to the `ProgressKind::TaskExecutionStarted` variant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: what about the temp dir on retries? And stderr/stdout files? Are those overwritten or preserved?
This commit fixes the retry logic in task evaluation to re-evaluate the command section before retrying. The change enables a command to see an updated `task` variable in 1.2 and can therefore know when it has failed in a previous attempt. Additionally, this makes it so that we also re-evaluate the hints and requirements sections upon retry; the hope is that in the near future the spec will be updated to allow those sections to have task scope. A test was added to retry a task.
Looks like a new nightly was released that changed formatting, so need to bump my local nightly and reformat. |
There are now two temp directories in the directory layout as described in a previous comment: one is where files are created before and after we evaluate the command and one is for the evaluation of the current attempt's command. So effectively they are preserved from attempt to attempt along with the command/stdout/stderr files. |
This refactoring primarily affects
wdl-engine
and how workflows are evaluated. Previously, workflows were evaluated without spawning any tokio tasks. This meant that subgraph evaluation was not concurrent, although WDL tasks were spawned concurrently.With the upcoming support for integrating Crankshaft, the
Backend
trait was refactored such that the backend itself can maintain state and notify callers of when a WDL task is spawned and when it completes. This allows for the local execution backend to be rewritten in a way that enables it to respect thecpu
andmemory
requirements of a WDL task.To support making the evaluator futures
Send
, we now utilizerowan::AstPtr
to send AST nodes between threads performing evaluation. The evaluation graphs are mapped to beSend
so that evaluation can keep a shared reference to the graph across concurrently executing tokio tasks.Also included with these changes:
tracing-indicatif
to prevent mixing of tracing output and progress bars; the progress bar now consistently shows at the end of tracing output and can also be quieted.wdl_analysis::Document
trivially clonable.wdl-analysis
.basename
function definition for an overload that's only required for WDL 1.2.Config
type towdl-engine
that controls how evaluation is performed; in the future this can be used to support evaluation configuration files.ProgressKind
enumeration to allow for tracking how many tasks are "ready" (i.e. waiting to be spawned by the backend) vs. "executing" (i.e. currently running).Before submitting this PR, please make sure:
changes (when appropriate).
CHANGELOG.md
(see"keep a changelog" for more information).