diff --git a/README.md b/README.md
index e22307b..c45769c 100644
--- a/README.md
+++ b/README.md
@@ -31,7 +31,28 @@ If you use this workflow in a paper, don't forget to give credits to the authors
 
 ## Workflow overview
 
-TODO: include first part of the figure here.
+
+
+
+----------
+
+This workflow is a best-practice workflow for the analysis of ribosome footprint sequencing (Ribo-Seq) data.
+The workflow is built using [snakemake](https://snakemake.readthedocs.io/en/stable/) and consists of the following steps:
+
+1. Obtain genome database in `fasta` and `gff` format (`python`, [NCBI Datasets](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/))
+   1. Using automatic download from NCBI with a `RefSeq` ID
+   2. Using user-supplied files
+2. Check quality of input sequencing data (`FastQC`)
+3. Cut adapters and filter by length and/or sequencing quality score (`cutadapt`)
+4. Deduplicate reads by unique molecular identifier (UMI, `umi_tools`)
+5. Map reads to the reference genome (`STAR aligner`)
+6. Sort and index for aligned seq data (`samtools`)
+7. Filter reads by feature type (`bedtools`)
+8. Generate summary report for all processing steps (`MultiQC`)
+9. Shift ribo-seq reads according to the ribosome's P-site alignment (`R`, `ORFik`)
+10. Return report as HTML and PDF files (`R markdown`, `weasyprint`)
+
+If you want to contribute, report issues, or suggest features, please get in touch on [github](https://github.com/MPUSP/snakemake-bacterial-riboseq).
 
 ## Installation
 
diff --git a/resources/images/logo.png b/resources/images/logo.png
new file mode 100644
index 0000000..d364c94
Binary files /dev/null and b/resources/images/logo.png differ
diff --git a/resources/images/logo.svg b/resources/images/logo.svg
new file mode 100644
index 0000000..9e69a26
--- /dev/null
+++ b/resources/images/logo.svg
@@ -0,0 +1,1212 @@
[SVG markup not preserved. Text labels recovered from the figure:]
[Title: snakemake-bacterial-riboseq]
[Input: fastq.gz files from Ribo-Seq; RefSeq ID or gff / fasta files from user]
[Process reads: QC filtering; cut adapters; extract UMIs]
[Mapping: align reads to ref genome; sort and index bam files; deduplicate based on UMI; filter and export bam files]
[Export: filtered bam files; statistics; QC report]
[Shifting: shift reads to align P-site; 3' or 5'-end based]
[Tool labels: NCBI tools, cutadapt, FastQC, MultiQC, UMI extract, UMI dedup, STAR, samtools, R, ORFik, R markdown]
[Group labels: reference, seq data, reports]
[File-type labels: gff, fasta, fastq, bam, bai, vcf, csv, html, pdf]
[Authors: Rina Ahmed-Begrich, Michael Jahn]
[Affiliation: Max-Planck-Unit for the Science of Pathogens]
[Page: https://github.com/MPUSP/snakemake-bacterial-riboseq]
[Icons: from nf-core/sarek (https://nf-co.re), CC-BY 4.0]
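
Step 9 of the workflow list in the README hunk (shifting reads to the ribosome's P-site) may be easier to review with a concrete picture of what "shifting" means. The following is a minimal Python sketch of 5'-end-based shifting, not the workflow's implementation: the workflow delegates this step to `ORFik` in `R`, and nothing here mirrors the ORFik API. The function name `psite_position`, the offset table `EXAMPLE_OFFSETS`, and its values are hypothetical, and the sketch assumes ungapped alignments given as 0-based, half-open intervals.

```python
from typing import Dict, Optional

# Hypothetical offsets (nt from the read's 5' end to the P-site), keyed by
# read length. Illustrative values only; real offsets are estimated from data.
EXAMPLE_OFFSETS: Dict[int, int] = {27: 11, 28: 12, 29: 12}


def psite_position(start: int, end: int, strand: str,
                   offsets: Dict[int, int]) -> Optional[int]:
    """Return the 0-based P-site coordinate of one ungapped aligned read.

    start, end: 0-based, half-open alignment interval [start, end).
    strand: '+' or '-'.
    Returns None when no offset is defined for this read length.
    """
    length = end - start
    offset = offsets.get(length)
    if offset is None:
        return None
    if strand == "+":
        # The 5' end is the leftmost base; move downstream (to the right).
        return start + offset
    if strand == "-":
        # The 5' end is the rightmost base; downstream means to the left.
        return end - 1 - offset
    raise ValueError(f"unknown strand: {strand!r}")


if __name__ == "__main__":
    print(psite_position(100, 128, "+", EXAMPLE_OFFSETS))
    print(psite_position(100, 128, "-", EXAMPLE_OFFSETS))
```

A 3'-end-based variant, which the figure labels also mention, would anchor the offset at the opposite end of the read; the strand logic stays symmetric.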