
Commit c72d1d7

Successful run of PDA download, updated status evaluation
* Run is still ongoing. A fatal error was reported, as described in README.project.md, but it does not appear to have been fatal to the download.
1 parent 3980dc3 commit c72d1d7

12 files changed, 266 additions and 219 deletions

00_start_docker.compute1.sh

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+# Launch docker environment before running import, so that `parallel` is available
+
+# Map home directory (containing token) and storage directory
+#VOLUME_MAPPING="/storage1/fs1/home1/Active/home/m.wyczalkowski /storage1/fs1/m.wyczalkowski"
+VOLUME_MAPPING="/home/m.wyczalkowski /storage1/fs1/m.wyczalkowski"
+
+LSF_ARGS="-G compute-lding"
+IMAGE="mwyczalkowski/cromwell-runner"
+
+>&2 echo Launching $IMAGE on compute1
+CMD="bash src/start_docker.sh -I $IMAGE -M compute1 -g \"$LSF_ARGS\" $@ $VOLUME_MAPPING"
+echo Running: $CMD
+eval $CMD
+
+

00_start_docker.sh

Lines changed: 0 additions & 32 deletions
This file was deleted.

10_get_UUID.sh

Lines changed: 3 additions & 3 deletions
@@ -1,8 +1,8 @@
 # We want all PDA genomic RNA-Seq samples available at GDC
+source gdc-import.config.sh
 
-CATALOG="/storage1/fs1/home1/Active/home/m.wyczalkowski/Projects/CPTAC3/CPTAC3.catalog/CPTAC3.Catalog.dat"
-OUT="dat/UUID_download.dat"
+OUT=$UUID
 mkdir -p dat
 
-grep PDA $CATALOG | grep genomic | cut -f 11 | sort > $OUT
+grep PDA $CATALOG_MASTER | grep genomic | cut -f 11 | sort > $OUT
 echo Written to $OUT
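
Note on the filter above: a minimal sketch of the same grep/cut/sort pipeline run against a toy tab-separated file, to show what the selection does. The toy rows, the `CATALOG_TOY` and `UUIDS` names, and the column layout (UUID in field 11) are assumptions for illustration only; the real catalog layout is not shown in this commit.

```shell
#!/bin/bash
# Toy stand-in for the catalog: 11 tab-separated fields, UUID in field 11 (assumed layout)
CATALOG_TOY=$(mktemp)
printf 'S1\tPDA\tRNA-Seq\tgenomic\t5\t6\t7\t8\t9\t10\tuuid-b\n'       >  "$CATALOG_TOY"
printf 'S2\tPDA\tRNA-Seq\ttranscriptome\t5\t6\t7\t8\t9\t10\tuuid-c\n' >> "$CATALOG_TOY"
printf 'S3\tLUAD\tRNA-Seq\tgenomic\t5\t6\t7\t8\t9\t10\tuuid-d\n'      >> "$CATALOG_TOY"
printf 'S4\tPDA\tRNA-Seq\tgenomic\t5\t6\t7\t8\t9\t10\tuuid-a\n'       >> "$CATALOG_TOY"

# Same filter as the script: keep PDA rows, then genomic rows, take field 11, sort
UUIDS=$(grep PDA "$CATALOG_TOY" | grep genomic | cut -f 11 | sort)
echo "$UUIDS"
rm -f "$CATALOG_TOY"
```

Because `grep` matches anywhere on the line, a row containing "PDA" or "genomic" in any field is selected; matching on specific columns would be stricter if that ever becomes a problem.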

2_get_Catalog.sh renamed to 15_summarize_download.sh

Lines changed: 0 additions & 3 deletions
@@ -65,9 +65,6 @@ echo " Targeted Sequencing: $TARG_SIZE Tb in $TARG_COUNT files"
 echo " TOTAL: $TOT_SIZE Tb in $TOT_COUNT files"
 }
 
-mkdir -p dat
-UUID="dat/UUID-all.dat"
-
 >&2 echo Catalog: $CATALOG_MASTER
 
 head -n1 $CATALOG_MASTER > $CATALOG_H

20_start_download.sh

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+DATAD="/storage1/fs1/m.wyczalkowski/Active/Primary/CPTAC3.share/CPTAC3-GDC"
+TOKEN="/home/m.wyczalkowski/Projects/CPTAC3/import/token/gdc-user-token.2020-01-31T20_48_13.912Z.txt"
+CATALOG="/home/m.wyczalkowski/Projects/CPTAC3/CPTAC3.catalog/CPTAC3.Catalog.dat"
+UUID="dat/UUID-download.dat"
+
+#TESTARGS=-1ddd
+TESTARGS=$@
+
+src/start_downloads.sh -S $CATALOG -O $DATAD -t $TOKEN -g "-G compute-lding" -M -q general $TESTARGS - < $UUID

30_evaluate_download_status.sh

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+#!/bin/bash
+
+# author: Matthew Wyczalkowski [email protected]
+
+# This is a wrapper around src/evaluate_status.sh with parsing of configuration for convenience
+# All arguments passed to here will be passed to evaluate_status.sh
+
+# TODO: src/evaluate_status.sh can accept specific UUIDs as arguments, whereas here we just
+# go through all UUIDs. Might want to improve this to allow UUID lists to be passed
+
+source gdc-import.config.sh
+
+if [ $LSF == 1 ]; then
+ARG="-M"
+fi
+
+CMD="bash src/evaluate_status.sh $ARG -S $CATALOG_MASTER -O $DATA_ROOT $@ - < $UUID"
+eval $CMD
+

3_import.sh

Lines changed: 0 additions & 12 deletions
This file was deleted.

README.compute1-testing.md

Lines changed: 3 additions & 2 deletions
@@ -1,7 +1,8 @@
 DATAD="/storage1/fs1/m.wyczalkowski/Active/Primary/CPTAC3.share/CPTAC3-GDC"
-TOKEN="/storage1/fs1/home1/Active/home/m.wyczalkowski/Projects/CPTAC3/import/token/gdc-user-token.2020-01-31T20_48_13.912Z.txt"
+#TOKEN="/storage1/fs1/home1/Active/home/m.wyczalkowski/Projects/CPTAC3/import/token/gdc-user-token.2020-01-31T20_48_13.912Z.txt"
+TOKEN="/home/m.wyczalkowski/Projects/CPTAC3/import/token/gdc-user-token.2020-01-31T20_48_13.912Z.txt"
 CATALOG="/home/m.wyczalkowski/Projects/CPTAC3/CPTAC3.catalog/CPTAC3.Catalog.dat"
-UUID="UUID.dat"
+UUID="UUIDs.dat"
 
 Starting docker for direct run:
 ```

README.project.md

Lines changed: 27 additions & 5 deletions
@@ -1,5 +1,27 @@
-Development work to:
-1. Update data transfer tool to v1.5
-See https://gdc.cancer.gov/access-data/gdc-data-transfer-tool
-2. Confirm works on compute1
-3. Simplify model of downloading
+This project consists of development of Y3 version and download of PDA RNA-Seq data.
+See README.compute1-testing.md for details about development
+
+PDA download is defined by UUID list in dat/UUID-download.dat, and consists of the following:
+
+Total required disk space WGS: 0 Tb in 0 files
+WXS: 0 Tb in 0 files
+RNA-Seq: 1.69332 Tb in 205 files
+miRNA-Seq: 0 Tb in 0 files
+Methylation Array: 0 Tb in 0 files
+Targeted Sequencing: 0 Tb in 0 files
+TOTAL: 1.69332 Tb in 205 files
+
+# Downloading
+
+bash 20_start_download.sh -J5
+
+## Downloading errors
+
+```
+Processing 25 / 205 [ Mon Feb 24 21:55:52 UTC 2020 ]: 1da42751-ac08-44d3-be4a-079c3cad256d
+Running: parallel --semaphore -j5 --id 20200224210947 --joblog ./logs/parallel.1da42751-ac08-44d3-be4a-079c3cad256d.log --tmpdir ./logs "bash src/launch_download.sh -g \"-G compute-lding\" -M -q general -o /storage1/fs1/m.wyczalkowski/Active/Primary/CPTAC3.share/CPTAC3-GDC 1da42751-ac08-44d3-be4a-079c3cad256d /home/m.wyczalkowski/Projects/CPTAC3/import/token/gdc-user-token.2020-01-31T20_48_13.912Z.txt 2917532a-84e2-478f-8002-2cdc0933731a.rna_seq.genomic.gdc_realn.bam BAM"
+Fatal ERROR. Exiting.
+```
+
+However, the jobs still seem to be running and the download is going OK. Perhaps the fatal error has something to do with tmux and exiting?
+
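
Note on the "Fatal ERROR" observation above: one way to check whether jobs are actually completing is to summarize the per-UUID job logs that the `--joblog` flag points at. The sketch below is not part of this commit. It assumes those files are written in the standard GNU parallel job log format (tab-separated, exit value in column 7) even when `--semaphore` is used, which the README does not confirm. The toy logs, the `write_log` helper, and the `LOGD` and `SUMMARY` names are invented for illustration.

```shell
#!/bin/bash
# Toy job logs in GNU parallel --joblog format (tab-separated, Exitval in column 7).
# File names and contents are invented for illustration.
LOGD=$(mktemp -d)

write_log() {  # write_log FILE EXITVAL: one header line plus one job record
    printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$1"
    printf '1\t:\t1582581352.000\t120.500\t0\t0\t%s\t0\tbash src/launch_download.sh\n' "$2" >> "$1"
}
write_log "$LOGD/parallel.uuid-a.log" 0
write_log "$LOGD/parallel.uuid-b.log" 0
write_log "$LOGD/parallel.uuid-c.log" 1

# Count finished jobs and those with a non-zero exit value, skipping each file's header line
SUMMARY=$(awk -F'\t' 'FNR > 1 { n++; if ($7 != 0) bad++ }
                      END { printf "finished=%d failed=%d\n", n, bad }' "$LOGD"/parallel.*.log)
echo "$SUMMARY"
rm -rf "$LOGD"
```

Pointing the same `awk` command at `./logs/parallel.*.log` would give a quick count for the real run, under the format assumption stated above.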

evaluate_batch_status.sh

Lines changed: 0 additions & 29 deletions
This file was deleted.
