gregoryg/cdsw-hail-genetics-tutorial

Hail tutorial

Created by Tom White

Hail is an open-source, scalable framework for exploring and analyzing genetic data. This repo contains the Hail Tutorial, lightly reformatted to run in Cloudera Data Science Workbench.

Status: In Progress
Use Case: Genetics

Steps:

  1. Go to Project > Settings > Environment and set Spark Configuration to hail-genetics-tutorial/spark-defaults.conf
  2. Open a CDSW terminal and run setup.sh
  3. Create a Python Session and run tutorial.py
  4. When finished, run cleanup.sh in the terminal
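
Before starting, it can help to confirm that the files named in the steps above are present. The following is a minimal sketch, not part of the tutorial itself: the helper name `missing_tutorial_files` is hypothetical, and it assumes all four files sit together in one directory, which may not match your checkout.

```python
from pathlib import Path

# File names are taken from the steps above. That they all live in the
# same directory is an assumption made for this sketch.
REQUIRED_FILES = ["spark-defaults.conf", "setup.sh", "tutorial.py", "cleanup.sh"]


def missing_tutorial_files(project_dir="."):
    """Return the names of required tutorial files not found in project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_tutorial_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All tutorial files found")
```

Pass the directory that holds the tutorial files as `project_dir` if it is not the current working directory.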

Recommended Session Sizes:

Estimated Runtime:

Notes:

  1. Hail requires Java 8. If multiple versions of Java are installed on your system, set JAVA_HOME, PATH, etc. in the project settings' environment variables.
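
To see which Java version a session actually picks up, a check along these lines can be run from a Python session. This is a sketch under assumptions: the function names are hypothetical, and it relies on the usual `java -version` banner format (for example `openjdk version "1.8.0_292"`), which may differ between Java distributions.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (e.g. "1.8.0_292").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def current_java_major():
    """Run the `java` found on PATH and return its major version, or None."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` normally writes its banner to stderr, not stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = current_java_major()
    if major == 8:
        print("Java 8 found")
    else:
        print("Expected Java 8 but found:", major)
```

If the check reports a different version, adjusting JAVA_HOME and PATH as described in the note above and starting a new session should change which `java` is found.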

Recommended Jobs/Pipeline: None

Demo Script TBD

Related Content: Video (Internal Only!): https://cloudera.webex.com/cloudera/ldr.php?RCID=af7861670238dc884a134c59ce55049e
