
Releases: esteinig/vircov

0.6.0

18 Feb 23:53
3327d01

Major updates making applications more useful 🥳

Two short-hand command-line arguments (-i and -T) break with previous versions 💀

  • Release binaries CI/CD
  • Input alignment format (-i/--alignment) is inferred from the file extension (bam|sam|cram|paf) or set explicitly with --alignment-format
  • Added --aligned/--group-aligned filters to supplement the filters on unique aligned reads (--reads/--group-reads)
  • Pretty table output short argument is now -T (previously -t)
  • Input alignment short argument is now -i (previously -A)
  • Added -H argument to print machine-readable header to non-pretty table output [#13]
  • Reference alignment grouping by field in header and automated reference selection:
    - Requires annotation in reference sequence header (description) e.g. taxid=9606; segment="M"
    - Leading and trailing whitespace around header fields and values is trimmed internally on parsing
    - --group-by <field>: group alignments by this field
    - --group-sep <delimiter>: the delimiter with which fields in the header are separated
    - --group-select-split <dir>: selects a single reference per group and outputs it to a file in <dir> ({group_id}.fasta)
    - --group-select-by <coverage|reads>: selection by highest coverage or max reads
    - --group-select-order: outputs the selected references with index prefixes, sorted by the select-by metric ({idx}-{group_id}.fasta)
    - Example: --group-by "taxid=" --group-sep ";" --group-select-split ref_seqs/ --group-select-by coverage
  • If segment fields are specified, the selected reference for each segment is output, chosen by highest coverage or reads
    - Command line: --segment-field and --segment-field-nan
    - Example: --segment-field "segment=" --segment-field-nan "segment=N/A"
  • Grouped filtering and outputs behave differently from non-grouped filtering and outputs:
    - Non-group filters (--regions, --reads, --aligned, --coverage, --length) are applied before grouping
    - Group filters can be applied (--group-regions, --group-reads, --group-coverage, --group-aligned)
    - Grouped output fields differ from the non-grouped fields as follows (described in --help):
      * Reference sequence identifier is the grouped-by value followed by the number of group members in brackets, e.g. 9606 (5)
      * Distinct alignment regions are summed across group members
      * Alignments are summed across group members
      * Unique reads aligned are recomputed across group members
      * Covered bases and reference lengths are set to 0
      * Coverage is the highest value among the group members
  • Conditional coverage threshold for the --regions filter: the regions filter is applied only if coverage is below this threshold
    - This rescues high-coverage sequences, as these usually have few regions
    - --regions-coverage <0.0-1.0>: a suitable value can be somewhere around 0.3-0.6
    - The short argument -t now sets the conditional coverage filter; pretty table output has moved to -T
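
The grouping behaviour described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in these notes, not vircov's implementation: the `Record` class, its field names, and the helpers `header_field`, `passes_regions`, and `group_records` are all hypothetical, and quoted values such as `"M"` are returned with their quotes intact.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical per-reference summary; field names are illustrative only."""
    description: str  # reference header description, e.g. 'taxid=9606; segment="M"'
    regions: int      # distinct alignment regions
    alignments: int   # number of alignments
    reads: set        # names of unique aligned reads
    coverage: float   # fraction of the reference covered (0.0-1.0)


def header_field(description, field, sep=";"):
    """Return the trimmed value of `field` (e.g. 'taxid='), or None if absent."""
    for part in description.split(sep):
        part = part.strip()
        if part.startswith(field):
            return part[len(field):].strip()
    return None


def passes_regions(rec, min_regions, regions_coverage):
    """Apply the regions filter only when coverage is below the threshold."""
    if rec.coverage >= regions_coverage:
        return True  # high-coverage references skip the regions filter
    return rec.regions >= min_regions


def group_records(records, field, sep=";"):
    """Group records by a header field and summarise each group."""
    groups = defaultdict(list)
    for rec in records:
        key = header_field(rec.description, field, sep)
        if key is not None:
            groups[key].append(rec)
    summary = {}
    for key, members in groups.items():
        summary[key] = {
            "name": f"{key} ({len(members)})",
            "regions": sum(m.regions for m in members),
            "alignments": sum(m.alignments for m in members),
            # unique reads are recomputed over the union of member read sets
            "reads": len(set().union(*(m.reads for m in members))),
            "covered_bases": 0,
            "length": 0,
            "coverage": max(m.coverage for m in members),
        }
    return summary


records = [
    Record('taxid=9606; segment="M"', 3, 10, {"r1", "r2"}, 0.20),
    Record('taxid=9606 ; segment="L"', 1, 4, {"r2", "r3"}, 0.75),
    Record('taxid=11320; segment="M"', 1, 2, {"r4"}, 0.05),
]

# Non-group filters run first, then the surviving records are grouped.
kept = [r for r in records if passes_regions(r, min_regions=2, regions_coverage=0.5)]
grouped = group_records(kept, "taxid=")
```

In this toy input the second record has a single region but is kept because its coverage is above the conditional threshold, while the third is dropped before grouping. The two remaining records share the group `9606`, whose read count is taken over the union of their reads rather than their sum.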

0.5.0

23 Mar 04:48
d80f9bd

Command line:

  • --paf | --bam accept "-" to read input from stdin
  • changed long name of --cov-reg to --regions
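
The "-" convention for stdin is common among command-line tools. A minimal sketch of the idea in Python follows; `open_input` is a hypothetical helper written for illustration and says nothing about how vircov handles its input internally.

```python
import sys
from contextlib import contextmanager


@contextmanager
def open_input(path):
    """Yield a text handle for `path`; "-" is treated as standard input."""
    if path == "-":
        # stdin is owned by the process, so it is not closed here
        yield sys.stdin
    else:
        with open(path) as handle:
            yield handle
```

A caller would then iterate over lines the same way in both cases, for example `with open_input("-") as handle: ...`.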

Main:

  • added SAM/BAM/CRAM support [#3]
  • rewrote interval parsing for PAF format [#8]
  • fixed bug in filtering coverage plot outputs [#9]
  • added table output confirmation test [#5]
  • added basic BAM reader tests, including query alignment length from CIGAR [#5]
  • reimplemented custom PAF parser due to variable CIGAR tags [#8]
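
For readers unfamiliar with the CIGAR-based query alignment length mentioned above, here is a small Python sketch. It assumes the usual SAM convention that the aligned portion of the query is covered by the M, I, = and X operations, with soft and hard clips excluded. Whether vircov counts exactly these operations is an assumption, and `query_alignment_length` is a hypothetical name.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

# Operations that consume query bases within the aligned part of the read.
QUERY_ALIGNED_OPS = {"M", "I", "=", "X"}


def query_alignment_length(cigar):
    """Sum the lengths of query-consuming, non-clipping CIGAR operations."""
    if cigar == "*":  # CIGAR unavailable
        return 0
    return sum(
        int(length)
        for length, op in CIGAR_OP.findall(cigar)
        if op in QUERY_ALIGNED_OPS
    )


example = query_alignment_length("5S90M2I3D10M5H")
```

For the CIGAR string `5S90M2I3D10M5H` this counts 90 + 2 + 10 = 102 bases: the deletion consumes only reference bases and the clipped ends are ignored.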

Other:

  • replaced noodles fasta parsing with rust-bio
  • removed csv crate

Test coverage:

  • could not cover one line in the file name match statement [#14]
  • slight regression in coverage from reader functions

0.4.0

18 Mar 11:02
971f996

Operational, added features / command line options:

  • input alignment now required arg: vircov test.paf [previously: --paf option]
  • filter results output by:
    - --seq-len: minimum reference sequence length
    - --cov-reg: minimum number of detected coverage regions
  • long help menu with --help
  • pretty table output with --table
  • 100% test coverage 🥳
  • continuous integration for Linux and macOS

0.3.0

16 Mar 04:25
  • coverage plot implemented [#2]
  • added tests (~40% coverage)

0.2.0

15 Mar 22:41
aa55281
  • output enhanced with useful summary statistics [#1]

0.1.0

12 Mar 10:24
6002c1e

Working prototype