Commit 3980dc3

Updating docker_start to be general purpose
1 parent 8f62ad7 commit 3980dc3

9 files changed

+230
-2
lines changed

00_start_docker.sh

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
11
#source Project.config.sh
2-
# Launch docker environment at MGI before running cromwell.
2+
# Launch docker environment before running import, so that `parallel` is available
33

44
if [ "$#" -ne 1 ]; then
55
>&2 echo ERROR: pass SYSTEM argument

10_get_UUID.sh

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
# We want all PDA genomic RNA-Seq samples available at GDC
2+
3+
CATALOG="/storage1/fs1/home1/Active/home/m.wyczalkowski/Projects/CPTAC3/CPTAC3.catalog/CPTAC3.Catalog.dat"
4+
OUT="dat/UUID_download.dat"
5+
mkdir -p dat
6+
7+
grep PDA "$CATALOG" | grep genomic | cut -f 11 | sort > "$OUT"
8+
echo Written to $OUT
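
The filter in 10_get_UUID.sh can be tried on a small stand-in file before pointing it at the real catalog. The sketch below is illustrative only: the helper names `make_row` and `select_uuids` and the placement of the disease and data-type columns are assumptions. The only detail taken from the script is that the UUID is read from column 11 of a tab-separated file.

```shell
#!/bin/bash
# Hypothetical stand-in for the catalog: tab-separated, UUID in column 11.
# Only the use of column 11 comes from 10_get_UUID.sh; the other columns are filler.
make_row() {   # usage: make_row DISEASE DATA_TYPE UUID
    printf 'c1\tc2\t%s\tc4\t%s\tc6\tc7\tc8\tc9\tc10\t%s\n' "$1" "$2" "$3"
}

# Same pipeline as 10_get_UUID.sh, with the file name quoted
select_uuids() {   # usage: select_uuids CATALOG_FILE
    grep PDA "$1" | grep genomic | cut -f 11 | sort
}

CATALOG=$(mktemp)
{
    make_row PDA  genomic       uuid-bbb
    make_row PDA  transcriptome uuid-ccc
    make_row LUAD genomic       uuid-ddd
    make_row PDA  genomic       uuid-aaa
} > "$CATALOG"

select_uuids "$CATALOG"   # prints uuid-aaa then uuid-bbb
rm -f "$CATALOG"
```

Because `grep` matches anywhere on the line, a row is kept whenever "PDA" and "genomic" appear in any column, not only in the columns one might expect.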

README.compute1-testing.md

Lines changed: 2 additions & 0 deletions
@@ -66,3 +66,5 @@ With parallel (-J 5) turned on,
6666
-> This script has to be run in a container which has parallel installed
6767
-> using scripts from https://github.com/ding-lab/CromwellRunner to start cromwell runner
6868
- do this for now instead of trying to merge principal container
69+
70+
See docker/README.docker.sh

docker/README.docker.sh

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
TODO: implement general purpose docker launcher with the following features:
2+
* Aware of MGI, compute, docker environments
3+
- select queue, other defaults accordingly
4+
* Can map arbitrary paths through command line arguments of the form PATH_H:PATH_C
5+
- if form is PATH_H, implies PATH_C=PATH_H
6+
* Select through command line arguments
7+
- memory
8+
- image
9+
- dryrun
10+
- run bash or given command line
11+
- arbitrary LSF arguments
12+
* Idea is to use a common script for launching both cromwell runner and importGDC containers
13+
14+
Past work: TinDaisy start docker is a good one:
15+
/Users/mwyczalk/Projects/TinDaisy/TinDaisy-Core/src/start_docker.sh
16+
./start_docker.LSF.sh
17+
../src/launch_download.sh
18+
- does both LSF and docker
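
The PATH_H:PATH_C convention in the TODO list above could be parsed along the following lines. This is a minimal sketch, not the launcher's implementation: `split_data_path` is a hypothetical helper, and bash parameter expansion is used here in place of any external tool.

```shell
#!/bin/bash
# Hypothetical helper: split "PATH_H[:PATH_C]" into host and container paths.
split_data_path() {
    local dp="$1"
    local path_h="${dp%%:*}"    # text before the first colon (whole string if there is none)
    local path_c=""
    if [[ "$dp" == *:* ]]; then
        path_c="${dp#*:}"       # text after the first colon
    fi
    # An empty container path means "same as the host path"
    printf '%s -> %s\n' "$path_h" "${path_c:-$path_h}"
}

split_data_path /data/in:/import   # prints: /data/in -> /import
split_data_path /data/in           # prints: /data/in -> /data/in
```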

docker/launcher/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
Scripts for launching CromwellRunner environment, from
1+
Scripts for launching CromwellRunner environment, obtained from
22

33
/home/m.wyczalkowski/Projects/TinDaisy/CromwellRunner/CR.dev/docker
44
https://github.com/ding-lab/CromwellRunner
File renamed without changes.
File renamed without changes.

src/start_docker.sh

Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
1+
#!/bin/bash
2+
3+
# Matthew Wyczalkowski <[email protected]>
4+
# https://dinglab.wustl.edu/
5+
6+
# TODO: implement general purpose docker launcher with the following features:
7+
# * Aware of MGI, compute, docker environments
8+
# - select queue, other defaults accordingly
9+
# * Can map arbitrary paths through command line arguments of the form PATH_H:PATH_C
10+
# - if form is PATH_H, implies PATH_C=PATH_H
11+
# * Select through command line arguments
12+
# - memory
13+
# - image
14+
# - dryrun
15+
# - run bash or given command line
16+
# - arbitrary LSF arguments
17+
# * Idea is to use a common script for launching both cromwell runner and importGDC containers
18+
#
19+
# Past work: TinDaisy start docker is a good one:
20+
# /Users/mwyczalk/Projects/TinDaisy/TinDaisy-Core/src/start_docker.sh
21+
# ./start_docker.LSF.sh
22+
# ../src/launch_download.sh
23+
# - does both LSF and docker
24+
25+
read -r -d '' USAGE <<'EOF'
26+
Start docker container in standard docker or LSF environments with optional mounted volumes
27+
Usage: start_docker.sh [options] [ data_path_1 [ data_path_2 ...] ]
28+
29+
Required options:
30+
-I DOCKER_IMAGE: Specify docker image. Required.
31+
32+
Options:
33+
-h: show help
34+
-d: dry run. print out docker statement but do not execute
35+
-M SYSTEM: Available systems: MGI, compute1, docker. Default: docker
36+
-m MEM_GB: request given memory resources on launch
37+
-c DOCKER_CMD: run given command in non-interactive mode. Default is to run /bin/bash in interactive mode
38+
-L LOGD: Log directory on host. Logs are written to $LOGD/*.[err|out] for non-interactive mode only
39+
-g LSF_ARGS: optional arguments to pass verbatim to bsub. LSF mode only
40+
-q LSFQ: queue to use when launching LSF command. Defaults are research-hpc for SYSTEM = MGI,
41+
general-interactive for SYSTEM = compute1
42+
43+
One or more data_path arguments will map volumes on docker start. If data_path is PATH_H:PATH_C,
44+
then container path PATH_C maps to host path PATH_H. If only a single path is given, it is equivalent to PATH_C=PATH_H
45+
EOF
46+
47+
# We can launch in importGDC root dir or ./src. Test based on existence of utils.sh, and cd to root dir if necessary
48+
# utils.sh might live in . or ./src, depending on where this script runs
49+
if [ -e utils.sh ]; then
50+
cd ..
51+
elif [ ! -e src/utils.sh ]; then
52+
>&2 echo ERROR: cannot locate src/utils.sh
53+
exit 1
54+
fi
55+
source src/utils.sh
56+
57+
SCRIPT=$(basename $0)
58+
LSF_ARGS=""
59+
DOCKER_CMD="/bin/bash"
60+
INTERACTIVE=1
61+
WRITE_LOGS=0
62+
SYSTEM="docker"
63+
64+
BSUB="bsub"
65+
DOCKER="docker"
66+
67+
while getopts ":I:hdM:m:L:c:g:q:" opt; do
68+
case $opt in
69+
I)
70+
DOCKER_IMAGE="$OPTARG"
71+
;;
72+
h)
73+
echo "$USAGE"
74+
exit 0
75+
;;
76+
d)
77+
DRYRUN="d"
78+
;;
79+
M)
80+
SYSTEM="$OPTARG"
81+
;;
82+
m)
83+
MEM_GB="$OPTARG"
84+
;;
85+
L)
86+
LOGD=$OPTARG
87+
;;
88+
c)
89+
DOCKER_CMD="$OPTARG"
90+
INTERACTIVE=0
91+
WRITE_LOGS=1
92+
;;
93+
g)
94+
LSF_ARGS="$LSF_ARGS $OPTARG"
95+
;;
96+
q)
97+
LSFQ="$OPTARG"
98+
;;
99+
\?)
100+
>&2 echo "$SCRIPT: ERROR. Invalid option: -$OPTARG"
101+
>&2 echo "$USAGE"
102+
exit 1
103+
;;
104+
:)
105+
>&2 echo "$SCRIPT: ERROR. Option -$OPTARG requires an argument."
106+
>&2 echo "$USAGE"
107+
exit 1
108+
;;
109+
esac
110+
done
111+
shift $((OPTIND-1))
112+
113+
if [ -z "$DOCKER_IMAGE" ]; then
114+
>&2 echo Error: Docker image \(-I\) not specified
115+
>&2 echo "$USAGE"
116+
exit 1
117+
fi
118+
119+
if [ $SYSTEM == "docker" ]; then
120+
LSFQ_DEFAULT=""
121+
IS_LSF=0
122+
elif [ $SYSTEM == "MGI" ]; then
123+
LSFQ_DEFAULT="-q research-hpc"
124+
IS_LSF=1
125+
elif [ $SYSTEM == "compute1" ]; then
126+
LSFQ_DEFAULT="-q general-interactive"
127+
IS_LSF=1
128+
else
129+
>&2 echo ERROR: Unknown SYSTEM: $SYSTEM
130+
>&2 echo "$USAGE"
131+
exit 1
132+
fi
133+
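# A queue given with -q is a bare name, so add the bsub flag here; the defaults already include it
if [ -n "$LSFQ" ]; then LSFQ="-q $LSFQ"; fi
if [ $IS_LSF == 1 ] && [ -z "$LSFQ" ]; then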
134+
LSFQ=$LSFQ_DEFAULT
135+
fi
136+
137+
PATH_MAP=""
138+
# Loop over all arguments, host directories which will be mapped to container directories
139+
for DP in "$@"
140+
do
141+
# Each data path DP consists of one or two paths separated by :
142+
# If 2 paths, they are PATH_H:PATH_C
143+
# If 1 path, define PATH_C = PATH_H
144+
PATH_H=$(echo "$DP" | cut -f 1 -d :)
145+
# -s makes cut print nothing when DP has no colon; without it PATH_C would never be empty
PATH_C=$(echo "$DP" | cut -s -f 2 -d :)
146+
147+
if [ ! -d $PATH_H ]; then
148+
>&2 echo ERROR: $PATH_H is not an existing directory
149+
exit 1
150+
fi
151+
152+
# Using python to get absolute path of PATH_H. On Linux `readlink -f` works, but on Mac this is not always available
153+
# see https://stackoverflow.com/questions/1055671/how-can-i-get-the-behavior-of-gnus-readlink-f-on-a-mac
154+
ABS_PATH_H=$(python -c 'import os,sys;print(os.path.realpath(sys.argv[1]))' $PATH_H)
155+
156+
if [ -z $PATH_C ]; then
157+
PATH_C=$ABS_PATH_H
158+
fi
159+
160+
>&2 echo Mapping $PATH_C to $ABS_PATH_H
161+
if [ $IS_LSF == 1 ]; then
162+
PATH_MAP="$PATH_MAP $ABS_PATH_H:$PATH_C"
163+
else
164+
PATH_MAP="$PATH_MAP -v $ABS_PATH_H:$PATH_C"
165+
fi
166+
done
167+
168+
if [ $WRITE_LOGS == 1 ]; then
169+
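if [ -z "$LOGD" ]; then
>&2 echo ERROR: log directory \(-L\) must be specified when running with -c
exit 1
fi
mkdir -p "$LOGD"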
170+
test_exit_status
171+
ERRLOG="$LOGD/${UUID}.err"
172+
OUTLOG="$LOGD/${UUID}.out"
173+
>&2 echo Output logs written to: $OUTLOG and $ERRLOG
174+
rm -f $ERRLOG $OUTLOG
175+
176+
if [ $IS_LSF == 1 ]; then
177+
LSF_LOGS="-e $ERRLOG -o $OUTLOG"
178+
else
179+
LOG_REDIRECT="> $OUTLOG 2> $ERRLOG"
180+
fi
181+
fi
182+
183+
# Interactive sessions need a terminal: -Is for bsub, -it for docker run
184+
if [ $INTERACTIVE == 1 ]; then
185+
if [ $IS_LSF == 1 ]; then
186+
LSF_ARGS="$LSF_ARGS -Is"
187+
else
188+
DOCKER_ARGS="$DOCKER_ARGS -it"
189+
fi
190+
fi
191+
192+
if [ $IS_LSF == 1 ]; then
193+
ECMD="export LSF_DOCKER_VOLUMES=\"$PATH_MAP\" "
194+
run_cmd "$ECMD" $DRYRUN
195+
DCMD="$BSUB $LSFQ $LSF_ARGS $LSF_LOGS -a \"docker($DOCKER_IMAGE)\" $DOCKER_CMD "
196+
else
197+
DCMD="$DOCKER run $DOCKER_ARGS $PATH_MAP $DOCKER_IMAGE $DOCKER_CMD $LOG_REDIRECT"
198+
fi
199+
200+
run_cmd "$DCMD" $DRYRUN
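
To make the two launch modes easier to compare, here is a minimal dry-run sketch that only prints the command line it would assemble. `build_launch_cmd` is a hypothetical function, not part of start_docker.sh. It covers the interactive case only, omits queue, memory and logging options, and sets LSF_DOCKER_VOLUMES as a prefix on the bsub command rather than through a separate export as the script does.

```shell
#!/bin/bash
# Hypothetical sketch: print (never run) the launch command for docker or LSF mode.
build_launch_cmd() {   # usage: build_launch_cmd MODE IMAGE [PATH_H[:PATH_C] ...]
    local mode="$1" image="$2"
    shift 2
    local maps=() dp path_h path_c
    for dp in "$@"; do
        path_h="${dp%%:*}"
        path_c="$path_h"                      # single path: container path equals host path
        if [[ "$dp" == *:* ]]; then
            path_c="${dp#*:}"
        fi
        maps+=("$path_h:$path_c")
    done
    if [ "$mode" = "lsf" ]; then
        # LSF takes all volume mappings as one space-separated environment variable
        echo "LSF_DOCKER_VOLUMES=\"${maps[*]}\" bsub -Is -a \"docker($image)\" /bin/bash"
    else
        # Plain docker takes one -v flag per mapping
        local vols=() m
        for m in "${maps[@]}"; do
            vols+=(-v "$m")
        done
        echo "docker run -it ${vols[*]} $image /bin/bash"
    fi
}

build_launch_cmd docker ubuntu:20.04 /data/in:/import /data/out
build_launch_cmd lsf ubuntu:20.04 /data/in:/import /data/out
```

The image name and paths in the two calls are placeholders. In start_docker.sh itself, the `-d` flag is the documented way to print the command without executing it.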
