

To fetch the dataset used for this test, run the materialize-data.sh script:

./materialize-data.sh

Then run the following command to materialize the containers needed by the workflow:

./materialize-containers.sh
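The two steps above can be chained so that the containers are only materialized once the dataset has been fetched successfully. The following is a minimal sketch, not part of the repository: the function name `materialize_all` is hypothetical, and it assumes both scripts are executable and live in the directory passed as the first argument (the current directory by default).

```shell
# Hypothetical helper (not part of the repository): run both materialization
# steps in order, stopping at the first one that fails.
# $1 is the directory containing the two scripts (default: current directory).
materialize_all() {
    (
        # The subshell keeps the caller's working directory unchanged.
        cd "${1:-.}" &&
            ./materialize-data.sh &&
            ./materialize-containers.sh
    )
}

# Example call, assuming the current directory is tests/TCGA:
# materialize_all .
```

The function returns the exit status of the first failing step, so it can be used directly in a condition such as `materialize_all . || echo "materialization failed"`.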

The data in that remote resource was derived from the materials of the following manuscript:

Folder data contains the benchmarking metrics results from the 2018 TCGA-PanCancer benchmark for the 34 analyzed cancer types. Those files follow the structure of the 'aggregation' datasets from the ELIXIR Benchmarking Data Model. JSON schemas for those datasets can be found here.
Folder metrics_ref_datasets contains the gold standards defined by the community for each of the cancer types.
Folder public_ref contains the reference data used by the community for validation/predictions.
All_Together.txt is a gene predictions file which can be used as input to test the workflow.
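After materializing the data, it can be useful to confirm that the items listed above are present before launching the workflow. The sketch below is only an illustration: the function name `check_tcga_layout` is hypothetical, and it assumes that the three folders and All_Together.txt all sit directly under one root directory, which this README does not state.

```shell
# Hypothetical helper (not part of the repository): check that the folders
# and the input file described above exist under the directory given as $1.
# Assumption: all four items sit directly under that one directory.
check_tcga_layout() {
    root="${1:-.}"
    status=0
    for dir in data metrics_ref_datasets public_ref; do
        if [ ! -d "$root/$dir" ]; then
            echo "missing directory: $root/$dir" >&2
            status=1
        fi
    done
    if [ ! -f "$root/All_Together.txt" ]; then
        echo "missing file: $root/All_Together.txt" >&2
        status=1
    fi
    # Non-zero when at least one expected item is absent.
    return "$status"
}
```

Calling `check_tcga_layout path/to/dataset` prints one line per missing item to standard error and returns a non-zero status if anything is absent.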