Try to use jgo for classpath instead of shaded fat jar
Needs:
 - mvn available on the `PATH`
 - jgo from the [resolve-only](scijava/jgo#37) pull request (Python >= 3.7)
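A minimal preflight sketch for these prerequisites (my addition, not part of the commit; `check_tools` is a hypothetical helper name), which prints every required tool that is not on the `PATH`:

```shell
# Hypothetical helper: report required tools missing from the PATH.
check_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
    done
}

check_tools mvn jgo
```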

Currently fails with:
```
Unable to convert into Paintera dataset: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 736, 10.36.110.15, executor 69): java.lang.NoSuchMethodError: org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream.<init>(Ljava/io/Ou
        at org.janelia.saalfeldlab.n5.GzipCompression.getOutputStream(GzipCompression.java:89)
        at org.janelia.saalfeldlab.n5.DefaultBlockWriter.write(DefaultBlockWriter.java:49)
        at org.janelia.saalfeldlab.n5.DefaultBlockWriter.writeBlock(DefaultBlockWriter.java:83)
        at org.janelia.saalfeldlab.n5.N5FSWriter.writeBlock(N5FSWriter.java:133)
        at org.janelia.saalfeldlab.n5.imglib2.N5LabelMultisets.saveLabelMultisetBlock(N5LabelMultisets.java:308)
        at org.janelia.saalfeldlab.n5.imglib2.N5LabelMultisets.saveLabelMultisetBlock(N5LabelMultisets.java:337)
        at org.janelia.saalfeldlab.label.spark.convert.ConvertToLabelMultisetType.lambda$convertToLabelMultisetType$dce05d2$1(ConvertToLabelMultisetType.java:226)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1040)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.TraversableOnce$class.reduceLeft(TraversableOnce.scala:185)
        at scala.collection.AbstractIterator.reduceLeft(Iterator.scala:1336)
        at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1015)
        at org.apache.spark.rdd.RDD$$anonfun$reduce$1$$anonfun$14.apply(RDD.scala:1013)
        at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2130)
        at org.apache.spark.SparkContext$$anonfun$33.apply(SparkContext.scala:2130)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:109)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
```

Probably a dependency version conflict between Spark and N5 (or other libraries) that does not surface with the shaded fat jar.
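One mitigation that might be worth trying (an assumption on my part, not something this commit does): Spark's `userClassPathFirst` settings make the driver and executors prefer the application's jars over Spark's bundled copies, so the jgo-resolved commons-compress would be the one loaded:

```shell
# Hypothetical workaround (not part of this commit): prefer application jars
# over Spark's bundled ones when resolving classes.
SUBMIT_ARGS=""
SUBMIT_ARGS="${SUBMIT_ARGS} --conf spark.driver.userClassPathFirst=true"
SUBMIT_ARGS="${SUBMIT_ARGS} --conf spark.executor.userClassPathFirst=true"
echo "${SUBMIT_ARGS}"
```

Both settings are marked experimental in Spark's configuration documentation, so they can break other class lookups.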
hanslovsky committed Oct 23, 2019
1 parent 4718964 commit cd86533
Showing 1 changed file with 62 additions and 0 deletions.
flintstone-jgo.sh (new file)
@@ -0,0 +1,62 @@
#!/bin/bash

if [ "$#" -lt "3" ]; then
    echo "Not enough arguments! Usage: $(basename "$0") N_NODES COORDINATE CLASS [ARGS...]" 1>&2
    exit 1
fi

CONTAINING_DIRECTORY="$( dirname "${BASH_SOURCE[0]}" )"
SPARK_JANELIA="${SPARK_JANELIA:-${CONTAINING_DIRECTORY}/spark-janelia/spark-janelia}"
# MVN="${MVN:-/misc/local/maven-3.2.2/bin/mvn}"

RUNTIME="${RUNTIME:-8:00}"
SPARK_VERSION="${SPARK_VERSION:-2.3.1}"
TERMINATE="${TERMINATE:-1}"
MIN_WORKERS="${MIN_WORKERS:-1}"

N_EXECUTORS_PER_NODE="${N_EXECUTORS_PER_NODE:-6}"
N_CORES_PER_EXECUTOR="${N_CORES_PER_EXECUTOR:-5}"
MEMORY_PER_NODE="${MEMORY_PER_NODE:-300}"
SPARK_OPTIONS="${SPARK_OPTIONS:-}"

N_DRIVER_THREADS="${N_DRIVER_THREADS:-16}"

N_NODES="$1"; shift
COORDINATE="$1"; shift
CLASS="$1"; shift
# remaining arguments are forwarded to the main class
ARGV="$@"

# artifactId is the second element of the groupId:artifactId:version coordinate
ARTIFACT="$(echo "$COORDINATE" | tr ':' '\n' | head -n2 | tail -n1)"
# resolve the dependency tree into a local workspace without launching anything
WORKSPACE="$(jgo --repository scijava.public=https://maven.scijava.org/content/groups/public --resolve-only "$COORDINATE")"
# the main jar is the one matching <artifactId>-<version>.jar
MAIN_JAR="$(ls -1 "${WORKSPACE}"/*jar | grep -E "${ARTIFACT}-[0-9]+")"
# comma-separated list of all resolved jars, for spark-submit --jars
ALL_JARS="$(ls -1 "${WORKSPACE}"/*jar | tr '\n' ',' | sed 's/,$//')"

# echo "$MEMORY_PER_NODE / $N_EXECUTORS_PER_NODE"
# echo "$N_NODES * $N_EXECUTORS_PER_NODE"
export MEMORY_PER_EXECUTOR="$(($MEMORY_PER_NODE / $N_EXECUTORS_PER_NODE))"
export N_EXECUTORS="$(($N_NODES * $N_EXECUTORS_PER_NODE))"
export PARALLELISM="$(($N_EXECUTORS * $N_CORES_PER_EXECUTOR * 3))"

SUBMIT_ARGS="${SUBMIT_ARGS} --verbose"
SUBMIT_ARGS="${SUBMIT_ARGS} --conf spark.default.parallelism=$PARALLELISM"
SUBMIT_ARGS="${SUBMIT_ARGS} --conf spark.executor.instances=$N_EXECUTORS_PER_NODE"
SUBMIT_ARGS="${SUBMIT_ARGS} --conf spark.executor.cores=$N_CORES_PER_EXECUTOR"
SUBMIT_ARGS="${SUBMIT_ARGS} --conf spark.executor.memory=${MEMORY_PER_EXECUTOR}g"
SUBMIT_ARGS="${SUBMIT_ARGS} ${SPARK_OPTIONS}"
SUBMIT_ARGS="${SUBMIT_ARGS} --jars $ALL_JARS"
SUBMIT_ARGS="${SUBMIT_ARGS} --class $CLASS"
SUBMIT_ARGS="${SUBMIT_ARGS} ${MAIN_JAR}"
SUBMIT_ARGS="${SUBMIT_ARGS} ${ARGV}"

# make sure the log directory exists before the driver tries to write to it
mkdir -p "${HOME}/.sparklogs"
LOG_FILE="${HOME}/.sparklogs/${CLASS}.o%J"

"${SPARK_JANELIA}" \
    --nnodes="${N_NODES}" \
    --no_check \
    --driveronspark \
    --silentlaunch \
    --minworkers="${MIN_WORKERS}" \
    --hard_runtime="${RUNTIME}" \
    --submitargs="${SUBMIT_ARGS}" \
    --driveroutfile="${LOG_FILE}" \
    lsd
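The coordinate parsing and resource arithmetic in the script can be exercised standalone with made-up inputs (hypothetical coordinate and jar names; no jgo, maven repository, or cluster needed):

```shell
# Hypothetical inputs; the real script takes these from its arguments and jgo.
COORDINATE="org.example:my-artifact:1.2.3"
# second element of groupId:artifactId:version
ARTIFACT="$(echo "$COORDINATE" | tr ':' '\n' | head -n2 | tail -n1)"
# join a jar list into the comma-separated form spark-submit --jars expects
ALL_JARS="$(printf '%s\n' dep-a.jar dep-b.jar my-artifact-1.2.3.jar | tr '\n' ',' | sed 's/,$//')"
# per-executor memory and derived totals, using the script's default numbers
MEMORY_PER_NODE=300; N_EXECUTORS_PER_NODE=6; N_NODES=10; N_CORES_PER_EXECUTOR=5
MEMORY_PER_EXECUTOR="$((MEMORY_PER_NODE / N_EXECUTORS_PER_NODE))"
N_EXECUTORS="$((N_NODES * N_EXECUTORS_PER_NODE))"
PARALLELISM="$((N_EXECUTORS * N_CORES_PER_EXECUTOR * 3))"
echo "$ARTIFACT $ALL_JARS $MEMORY_PER_EXECUTOR $N_EXECUTORS $PARALLELISM"
# → my-artifact dep-a.jar,dep-b.jar,my-artifact-1.2.3.jar 50 60 900
```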
