add option to specify assemblies via config file
pmenzel committed Jan 19, 2024
1 parent b11d31e commit 8f923f4
Showing 2 changed files with 14 additions and 1 deletion.
9 changes: 9 additions & 0 deletions README.md
@@ -80,6 +80,15 @@ snakemake -s /opt/software/ont-assembly-snake/Snakefile --use-conda --cores 20

Assemblies created in each step are contained in the files `output.fa` in each folder and symlinked as `.fa` files in the `assemblies/` folder, see the example below.

As an alternative to create subfolders within `assemblies/`, it is also possible to use a YAML file to list the desired assemblies, e.g. in a file called `samples.yaml`:
```
assemblies:
- mysample_flye+medaka
- mysample+filtlongMB500_flye+racon2+medaka
- mysample_raven2+medaka+pilon
```
and add the argument `--configfile samples.yaml` to the Snakemake command line.

## Test dataset
A test dataset containing a pair of ONT and Illumina sequencing data from the same bacterial isolate is available in the repository [ont-assembly-snake-testdata](https://github.com/pmenzel/ont-assembly-snake-testdata). See the instructions therein on how to download the dataset and run the ont-assembly-snake and [score-assemblies](https://github.com/pmenzel/score-assemblies) workflows.
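
The assembly names listed in the `samples.yaml` example from the README hunk appear to follow the pattern `<sample>[+<read filter>...]_<assembler>[+<polisher>...]`. The sketch below splits such a name into its parts. It is only an illustration: `split_assembly_name` is a hypothetical helper that is not part of ont-assembly-snake, and it assumes that sample names contain no underscore.

```python
# Hypothetical helper, not part of ont-assembly-snake.
# Assumed pattern: <sample>[+<read filter>...]_<assembler>[+<polisher>...]
def split_assembly_name(name):
    # The first underscore separates the read set from the assembly steps.
    reads, _, steps = name.partition("_")
    # The read set is the sample name, optionally followed by read filters.
    sample, *read_filters = reads.split("+")
    # The steps are the assembler, optionally followed by polishing tools.
    assembler, *polishers = steps.split("+")
    return {
        "sample": sample,
        "read_filters": read_filters,
        "assembler": assembler,
        "polishers": polishers,
    }


if __name__ == "__main__":
    # The three names from the README example.
    for name in [
        "mysample_flye+medaka",
        "mysample+filtlongMB500_flye+racon2+medaka",
        "mysample_raven2+medaka+pilon",
    ]:
        print(name, "->", split_assembly_name(name))
```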

6 changes: 5 additions & 1 deletion Snakefile
@@ -10,7 +10,7 @@ if config.get("run_score_assemblies", False):
    module score_assemblies:
        snakefile:
            github("pmenzel/score-assemblies", path="Snakefile", branch="master")
            #"score-assemblies/Snakefile"
            # "score-assemblies/Snakefile"
        config:
            config

@@ -75,6 +75,10 @@ def get_R2_fq(wildcards):
# ignore symlinks in assemblies/folder, e.g. sample_flye.fa -> assemblies/sample_flye/output.fa
sample_assemblies = [a for a in sample_assemblies if not re.search("\.fa", a)]

# if config files with list of assemblies is given, then use this instead of folders in assemblies/
if config.get("assemblies", False):
    sample_assemblies = list(set(config["assemblies"]))

# if any desired assembly requires homopolish then at least one reference genome should be provided
if not references and [
string for string in sample_assemblies if "homopolish" in string
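
The override added in this hunk can be sketched as a standalone function. This is an illustration, not the Snakefile's code: `select_assemblies` and `folder_assemblies` are hypothetical names, and `config` is modelled as a plain dict standing in for the one Snakemake fills from `--configfile`. The commit removes duplicates with `list(set(...))`, which does not guarantee any order; the sketch swaps in `dict.fromkeys` so that the order of the YAML list is kept.

```python
# Hypothetical sketch of the override logic; names are not from the Snakefile.
def select_assemblies(config, folder_assemblies):
    """Return the config list if one is given, else the folder-derived names."""
    if config.get("assemblies", False):
        # dict.fromkeys drops duplicates and keeps first-seen order,
        # unlike list(set(...)) as used in the commit.
        return list(dict.fromkeys(config["assemblies"]))
    # No (or an empty) "assemblies" key: fall back to the folders in assemblies/.
    return list(folder_assemblies)


if __name__ == "__main__":
    from_folders = ["mysample_flye"]
    from_config = {
        "assemblies": [
            "mysample_flye+medaka",
            "mysample_raven2+medaka+pilon",
            "mysample_flye+medaka",  # duplicate entry, dropped below
        ]
    }
    print(select_assemblies(from_config, from_folders))
    print(select_assemblies({}, from_folders))
```

An empty or missing `assemblies` key falls back to the folder names, which mirrors the `config.get("assemblies", False)` test in the commit.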