Falsitron identification

Main scripts for the paper:

Direct long-read RNA sequencing identifies a subset of questionable exitrons likely arising from reverse transcription artifacts

Falsitron detection pipeline and example

Isoform annotation

Isoforms are identified from long-read data following the ONT pipeline, which is based on StringTie and other tools.

cDNA and direct RNA (dRNA) reads are processed independently, using the pipeline's default parameters.

Falsitron search

We use the script candidate_search.R in the following manner:

Rscript candidate_search.R <input_file>

The input file contains the following lines:

  • Library query
  • Library target
  • path to the gffcompare tracking file in the form: query_target.tracking
  • path to the gffcompare tracking file in the form: target_query.tracking
  • path to the query GFF annotation file
  • path to the target GFF annotation file
  • path to the query BAM file
  • path to the target BAM file
  • output path, including the prefix to use for the output files

An example of this file is contained here.

query = Library with potential artifacts (usually cDNA)

target = Library to compare against (usually dRNA)
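
The nine-line input file described above can be assembled programmatically. The following is a minimal sketch in Python, not part of this repository. The library names (cDNA, dRNA), the directory layout, the .gff and .bam file names, and the helper names build_input_lines and render_input_file are all hypothetical placeholders. The sketch also assumes that the query and target parts of the tracking file names match the library names given on the first two lines.

```python
from pathlib import Path


def build_input_lines(query, target, tracking_dir, gff_dir, bam_dir, out_prefix):
    """Return the nine input lines in the order listed in this README.

    All paths are built from a hypothetical directory layout; adapt them
    to wherever your own files live.
    """
    return [
        query,                                        # library query
        target,                                       # library target
        f"{tracking_dir}/{query}_{target}.tracking",  # query_target.tracking
        f"{tracking_dir}/{target}_{query}.tracking",  # target_query.tracking
        f"{gff_dir}/{query}.gff",                     # query GFF annotation
        f"{gff_dir}/{target}.gff",                    # target GFF annotation
        f"{bam_dir}/{query}.bam",                     # query BAM
        f"{bam_dir}/{target}.bam",                    # target BAM
        out_prefix,                                   # output path + prefix
    ]


def render_input_file(lines):
    """Check that there are nine non-empty, single-line entries and join them."""
    if len(lines) != 9:
        raise ValueError(f"expected 9 lines, got {len(lines)}")
    for i, line in enumerate(lines, start=1):
        if not line.strip() or "\n" in line:
            raise ValueError(f"line {i} is empty or spans several lines")
    return "\n".join(lines) + "\n"


def write_input_file(path, lines):
    """Write the rendered input file and return its path."""
    path = Path(path)
    path.write_text(render_input_file(lines))
    return path


# Hypothetical example values, printed instead of written to disk.
example = build_input_lines(
    "cDNA", "dRNA", "gffcompare", "annotation", "alignments",
    "results/cDNA_vs_dRNA",
)
print(render_input_file(example), end="")
```

After calling write_input_file("input.txt", example), the resulting file would be passed to the script as: Rscript candidate_search.R input.txt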

The script will output the candidates under different filters, as described in the figure above.

Repeat search

The script repeat_search.R processes the candidate file obtained with filter F3 to search for direct repeats in the candidates.
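
To illustrate what a direct repeat at a candidate looks like, here is a minimal sketch in Python. It is not the logic of repeat_search.R, which is not reproduced here. It uses a generic junction-homology check: given a sequence and the 0-based, half-open coordinates of the excised region, it extends matches to the left and right of both boundaries and reports the repeated sequence. The function name direct_repeat_at_junction and the toy sequence are hypothetical.

```python
def direct_repeat_at_junction(seq, start, end):
    """Return the direct repeat shared by both boundaries of seq[start:end].

    The excised region is seq[start:end] (0-based, half-open). A direct
    repeat means the bases around `start` are identical to the bases
    around `end`, so the junction could be shifted without changing the
    spliced product. Returns "" when there is no repeat.
    """
    seq = seq.upper()
    if not 0 <= start < end <= len(seq):
        raise ValueError("need 0 <= start < end <= len(seq)")

    max_len = end - start  # the two repeat copies must not overlap

    # Extend leftwards: end of the upstream flank vs end of the excised region.
    left = 0
    while (left < max_len
           and start - 1 - left >= 0
           and seq[start - 1 - left] == seq[end - 1 - left]):
        left += 1

    # Extend rightwards: start of the excised region vs start of the
    # downstream flank.
    right = 0
    while (left + right < max_len
           and end + right < len(seq)
           and seq[start + right] == seq[end + right]):
        right += 1

    return seq[start - left:start + right]


# Toy sequence: AAA | CGTG | GGTTTTT | CGTG | CCC
toy = "AAACGTGGGTTTTTCGTGCCC"
print(direct_repeat_at_junction(toy, 3, 14))  # prints CGTG
print(direct_repeat_at_junction(toy, 5, 16))  # same repeat, shifted junction
```

Both calls report the same repeat because a direct repeat makes the exact junction position ambiguous: any boundary placed inside the repeat yields the same spliced sequence.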