Basecalling and Genome Assembly

MinION basecalling and assembly

1. Convert fast5 files to pod5 format:

pod5 convert fast5 <directory-of-fast5s> --output output_pod5s/ --one-to-one <directory-of-fast5s>
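After conversion it can help to confirm that nothing was dropped. The sketch below assumes that `--one-to-one` writes one pod5 file per input fast5 file, so the two counts should match. The function names `count_files` and `check_conversion` are hypothetical helpers, not part of the pod5 toolkit.

```shell
#!/usr/bin/env bash

# Count files with a given extension under a directory (recursive).
count_files() {
    local dir="$1" ext="$2"
    find "$dir" -type f -name "*.${ext}" | wc -l | tr -d ' '
}

# Compare fast5 and pod5 counts. Returns 0 when they match, non-zero otherwise.
# Assumption: the conversion was run with --one-to-one (one pod5 per fast5).
check_conversion() {
    local fast5_dir="$1" pod5_dir="$2"
    local n_fast5 n_pod5
    n_fast5=$(count_files "$fast5_dir" fast5)
    n_pod5=$(count_files "$pod5_dir" pod5)
    echo "fast5: ${n_fast5}  pod5: ${n_pod5}"
    [ "$n_fast5" -eq "$n_pod5" ]
}
```

Usage: `check_conversion <directory-of-fast5s> output_pod5s/ || echo "file counts differ"`.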

2A. Dorado basecalling on laptop:

dorado download --model <model-name>
dorado basecaller --emit-fastq <model-name> <directory-of-pod5s> | gzip > output.fq.gz # TODO: switch to pigz for parallel compression
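One possible way to add the pigz compression mentioned in the comment above is a small wrapper that uses pigz when it is installed and falls back to gzip otherwise. This is a minimal sketch: `compress_stream` is a hypothetical helper name, and the default of 4 threads is an arbitrary assumption.

```shell
#!/usr/bin/env bash

# Compress stdin to stdout in gzip format.
# Uses pigz (multi-threaded) when available, otherwise plain gzip.
# $1: number of compression threads for pigz (assumed default: 4).
compress_stream() {
    local threads="${1:-4}"
    if command -v pigz >/dev/null 2>&1; then
        pigz -c -p "$threads"
    else
        gzip -c
    fi
}
```

To use it, replace `gzip` in the basecalling pipeline with `compress_stream 8` (or another thread count). The output is still a standard `.gz` file either way.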

2B. Dorado basecalling on LCC cluster using dorado.sh script:

sbatch $scripts/dorado.sh pod5_directory

3. Use the canu.sh SLURM script for Canu assembly:

assembly=<assembly-prefix>
nano_reads=<fastq directory>
canu -d ${assembly}_canu_run -p $assembly genomeSize=45m useGrid=false gridOptionsOVS=" --time 96:00:00 --partition=CAC48M192_L --ntasks=1 --cpus-per-task=4 " minReadLength=1000 -nanopore-raw $nano_reads

Rescuing raw files from failed MinION runs:

./recover_reads <Representative-fast5-file> </Library/MinKNOW/data/queued_reads/complete-reads-directory> --output-directory Recovered_fast5

Access MinION genomes

Determine contig lengths

Run the SeqLen.pl script:

perl SeqLen.pl <genome.fasta>
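If SeqLen.pl is not at hand, contig lengths can also be computed with awk. The sketch below is a stand-in, not the SeqLen.pl script itself, and its output format (contig name, tab, length) is an assumption rather than a copy of what SeqLen.pl prints. `seq_lengths` is a hypothetical function name.

```shell
#!/usr/bin/env bash

# Print "<contig-name><TAB><length>" for each record in a FASTA file.
# Handles sequences wrapped over multiple lines.
seq_lengths() {
    awk '
        /^>/ {
            # New record: report the previous one, then reset the counter.
            if (name != "") print name "\t" len
            name = substr($1, 2)   # first word of the header, without ">"
            len = 0
            next
        }
        {
            gsub(/[ \t\r]/, "")    # ignore whitespace and Windows line endings
            len += length($0)
        }
        END { if (name != "") print name "\t" len }
    ' "$1"
}
```

Usage: `seq_lengths <genome.fasta> | sort -k2,2nr` lists contigs from longest to shortest.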
