Replies: 1 comment 1 reply
I think this example is a use case for the solution described here: #452 (comment). Basically, you need a mechanism that deletes intermediate files once they are no longer needed, but that also does not recompute them if their downstream outputs already exist.
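To make that requirement concrete, here is a minimal, language-agnostic sketch of the decision logic in Python. It is not the mechanism from #452 and it does not use any Nextflow API. The file layout (`<work_dir>/<sample>.raw` for the output of `A`, `<store_dir>/<sample>.stats.txt` for the output of `B`) and the function names `plan_sample` and `cleanup_raw` are assumptions made up for illustration.

```python
from pathlib import Path


def plan_sample(sample, store_dir, work_dir):
    """Return the list of steps a sample still needs.

    Hypothetical layout (an assumption, not a Nextflow convention):
    A writes <work_dir>/<sample>.raw, B writes <store_dir>/<sample>.stats.txt.
    """
    raw = Path(work_dir) / f"{sample}.raw"
    stats = Path(store_dir) / f"{sample}.stats.txt"
    if stats.exists():
        # The downstream output is present: skip A and B, even if
        # the output of A has already been deleted.
        return []
    if raw.exists():
        # The download is still on disk: only B has to run.
        return ["B"]
    return ["A", "B"]


def cleanup_raw(sample, store_dir, work_dir):
    """Delete the output of A once the result of B is stored.

    Returns True if a file was deleted.
    """
    raw = Path(work_dir) / f"{sample}.raw"
    stats = Path(store_dir) / f"{sample}.stats.txt"
    if stats.exists() and raw.exists():
        raw.unlink()
        return True
    return False
```

Under these assumptions, a later step such as `C` would read the stored output of `B` directly, so extending the workflow would not force `A` to run again for past samples.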
-
I've been reading through the docs and past discussions, but I cannot understand how to make my use case work.
To give an example, say I have two processes with the following DAG:
Process `A` downloads a giant file, and `B` computes some summary statistics on it and stores the result in a `storeDir`. The files downloaded in `A` are so large that I need to delete them once the workflow is done. The issue is that over time I need to run new samples through the workflow, but I don't want `A` to re-download the files I deleted, because if it does I'll run out of disk space.

Is there a way to have Nextflow not re-run `A` if output files for `B` exist? This assumes the outputs from past runs of `A` have been deleted. Furthermore, I want to be able to extend the workflow by adding a process `C` after `B`, so I'll need to run the workflow again for all past samples as well (but again, without running `A` again).