This is a Docker image based on dropSeqPipe. The image can be used to process Drop-seq or Seq-Well data sequenced with Illumina (and possibly other formats). It adds some functionality on top of the pipeline, including merging `fastq.gz` files and automatically creating the `samples.csv` file required by dropSeqPipe by extracting the sample names from the names of the input files.
Under normal circumstances no
The environment variables used in this container include:

- `SAMPLENAMES`: the names of the samples, used to select and merge `fastq.gz` files. For example, `SAMPLENAMES=sample1 sample2` would merge all the fastq files starting with `sample1` and `sample2`. By default `SAMPLENAMES=""` (i.e. if not specified), and sample names are determined automatically with the regular expression `r"(.*)_S\d+_L\d{3}_R\d_\d{3}.fastq.gz"`.
- `NUMCELLS`: the number of cells/beads to extract from each sample (counted after merging the fastq files). If you don't want to discard any barcodes, set this to a large number. This may raise errors at the last stages of dropSeqPipe, where it tries to merge all count tables into one, but you can safely ignore those and use the intermediate count tables.
- `JOBS`: the number of processes to be used by snakemake.
- `TARGETS`: the type of analysis to perform; the default is `all`. If you want to perform just a preliminary QC, select one of the available targets from https://github.com/Hoohm/dropSeqPipe/wiki/Running-dropSeqPipe#modes. The same variable can be used to execute additional target rules (for plots, etc.).

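To illustrate how sample names could be derived when `SAMPLENAMES` is empty, here is a minimal Python sketch that applies the regular expression quoted above to a list of file names and groups the files by the captured sample name. The function name `group_by_sample`, the example file names, and the use of `fullmatch` are assumptions made for illustration; the script inside the image may match and group files differently.

```python
import re
from collections import defaultdict

# Pattern quoted from the text above; the unescaped dots are kept as written.
PATTERN = re.compile(r"(.*)_S\d+_L\d{3}_R\d_\d{3}.fastq.gz")


def group_by_sample(filenames):
    """Map each detected sample name to the sorted list of its files.

    Hypothetical helper: files whose names do not follow the
    Illumina-style naming scheme are skipped.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = PATTERN.fullmatch(name)
        if match:
            groups[match.group(1)].append(name)
    return {sample: sorted(files) for sample, files in groups.items()}


# Example file names (made up for this sketch).
files = [
    "sample1_S1_L001_R1_001.fastq.gz",
    "sample1_S1_L001_R2_001.fastq.gz",
    "sample1_S1_L002_R1_001.fastq.gz",
    "sample2_S2_L001_R1_001.fastq.gz",
    "notes.txt",
]
groups = group_by_sample(files)
print(sorted(groups))
```

With these example names, the three `sample1` files end up in one group and the single `sample2` file in another, while `notes.txt` is ignored.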
The pipeline expects the following volumes to be mounted:

- `/input:ro`, the location of the `fastq.gz` files.
- `/raw_data:rw`, an empty directory where the merged fastq files will be stored.
- `/results:rw`, where the results will be stored. This directory must contain the adapter file and `gtf_biotypes.yaml`.
- `/ref:rw`, the location of the annotation and genome files. The script looks for `genome.fa` and `annotation.gtf` files, then generates the STAR index and other files necessary for the pipeline. If such files already exist, the pipeline reuses them without regenerating them.
- `/samples.csv` (optional), if you want to provide a specific `samples.csv` file according to the dropSeqPipe standard. If not present, it is generated automatically based on `SAMPLENAMES`, or on file names if `SAMPLENAMES` is not provided.
- `/config/config.yaml` (optional), to provide a customized version of the config file.
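
As a rough sketch of how the environment variables and mounts described above fit together, the following Python snippet assembles a `docker run` command line. The image name `example/dropseqpipe:latest`, the host paths, and the values chosen for `NUMCELLS` and `JOBS` are placeholders invented for this example and are not taken from the image's documentation; only the container paths, access modes, and variable names come from the text. The helper `build_docker_command` is likewise hypothetical, and the snippet only builds and prints the command without running it.

```python
import shlex

# Hypothetical image name: the text does not state the repository or tag.
IMAGE = "example/dropseqpipe:latest"


def build_docker_command(input_dir, raw_dir, results_dir, ref_dir,
                         env=None, image=IMAGE):
    """Return the argv list for `docker run` with the documented mounts."""
    command = ["docker", "run", "--rm"]
    # Environment variables named in the text: SAMPLENAMES, NUMCELLS, JOBS, TARGETS.
    for key, value in (env or {}).items():
        command += ["-e", f"{key}={value}"]
    # Container paths and access modes as listed in the text.
    mounts = [
        (input_dir, "/input", "ro"),
        (raw_dir, "/raw_data", "rw"),
        (results_dir, "/results", "rw"),
        (ref_dir, "/ref", "rw"),
    ]
    for host, container, mode in mounts:
        command += ["-v", f"{host}:{container}:{mode}"]
    command.append(image)
    return command


# Host paths and numeric values below are illustrative assumptions.
cmd = build_docker_command(
    "/data/fastq", "/data/raw", "/data/results", "/data/ref",
    env={"SAMPLENAMES": "sample1 sample2", "NUMCELLS": 500,
         "JOBS": 4, "TARGETS": "all"},
)
print(shlex.join(cmd))
```

The optional `/samples.csv` and `/config/config.yaml` mounts could be added to the `mounts` list in the same way when custom files are provided.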