Organize outputs into structs where appropriate #44

Open
chrisamiller opened this issue May 6, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@chrisamiller
Member

Pipelines that are considered "terminal" should have their outputs organized into structs (which, coupled with our scripts for pulling results, makes for neatly nested directories). Right now immuno.cwl is organized in this way.

  1. Define the list of these (certainly somatic exome, rnaseq, immuno, etc.).

  2. Figure out how to pass results from WDLs that can be run either as subworkflows or stand-alone (e.g. rnaseq, which feeds into immuno). Can the output structs just be returned and their elements accessed from the higher-level WDLs? Or do we need to use a param/conditionals to say "do this struct stuff only if you're not being run as a subworkflow"?

@tmooney
Member

tmooney commented Aug 31, 2022

It should be fine to organize the outputs into arbitrary data structures in either CWL or WDL and pass them along as outputs from tools or subworkflows. The trade-off is having a bunch of custom types defined that you need to track down to know what to expect (which maybe doesn't matter if you're only looking at the output JSON and reading through it).

The current implementation of this in immuno.wdl uses the deprecated-in-WDL-1.1 object keyword. However, Cromwell doesn't support WDL 1.1 (just like it doesn't support CWL v1.2 🙂), so I guess it's fine for now?
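
For the "only looking at the output JSON" case mentioned above, here is a minimal sketch of how nested outputs could be listed without tracking down the custom type definitions. The `leaf_paths` helper and the sample keys (`immuno.rnaseq`, `alignments`, `fusion_calls`) are hypothetical and made up for illustration. It also assumes that struct outputs show up as nested JSON objects in the outputs file.

```python
import json


def leaf_paths(value, prefix=""):
    """Yield (path, leaf) pairs for every leaf in a nested outputs structure."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Join struct field names with dots
            yield from leaf_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # Mark array elements with their index
            yield from leaf_paths(child, f"{prefix}[{index}]")
    else:
        yield prefix, value


# Hypothetical outputs JSON; keys and paths are illustrative only
outputs = json.loads("""
{
  "immuno.rnaseq": {
    "alignments": {"bam": "/x/rna.bam"},
    "fusion_calls": ["/x/a.tsv", "/x/b.tsv"]
  }
}
""")

for path, leaf in leaf_paths(outputs):
    print(path, "->", leaf)
```

Printing the paths this way gives a flat view of what a struct-returning workflow actually produced, which may be enough when the type definitions themselves are not at hand.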

@chrisamiller
Member Author

We are open to ideas here - it's definitely clunky. The end goal is to have a gather step at the end that can assign specific names to certain output files and sort them into subdirectories as well. Since Cromwell/WDL doesn't natively support this, we thought it ideal to encode as much of that as possible into the WDL and have a simple script that parses the structs. The alternative would seem to be defining some kind of output mapping file alongside the WDL, to be parsed by another script, but that seems more prone to breaking as WDLs are changed.
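
As a rough illustration of the "simple script that parses the structs" idea, the sketch below turns a nested outputs structure into a list of planned copies. It is not the project's actual script. `plan_copies`, the sample keys, and the layout rule are all assumptions: nested structs and arrays become subdirectories, and file leaves keep their basename inside the enclosing struct's directory. It also assumes every string leaf is a file path, since File and String outputs look the same in JSON.

```python
from pathlib import PurePosixPath


def plan_copies(outputs, dest_root):
    """Return (source, destination) pairs for a nested outputs structure."""
    plan = []

    def walk(value, directory):
        if isinstance(value, dict):
            for key, child in value.items():
                # A plain file leaf stays in its struct's directory;
                # nested structs and arrays get a subdirectory named after the key
                walk(child, directory if isinstance(child, str) else directory / key)
        elif isinstance(value, list):
            for child in value:
                walk(child, directory)
        elif isinstance(value, str):
            # Assumption: every string leaf is a file path
            plan.append((value, str(directory / PurePosixPath(value).name)))
        # None (an optional output that was not produced) is skipped

    walk(outputs, PurePosixPath(dest_root))
    return plan


# Hypothetical struct-shaped outputs; names and paths are illustrative only
example = {
    "rnaseq": {
        "alignments": {"bam": "/cromwell/call-a/rna.bam", "bai": None},
        "fusion_calls": ["/cromwell/call-b/a.tsv"],
    }
}

for source, destination in plan_copies(example, "results"):
    print(source, "->", destination)
```

Because the layout is derived from the struct itself, changing the struct in the WDL changes the directory tree without a separate mapping file to keep in sync. The function only returns the plan; the actual copy step (local or cloud) is left out.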

Would love to hear your suggestions if you have other ways to accomplish this goal in a cleaner/better supported manner!

@malachig malachig added the enhancement New feature or request label Mar 24, 2023