Releases: cfe-lab/CFEIntact

Barnacle

19 Jan 21:27
9bd0ae4

This release, codename Barnacle, strengthens the “known-good” edges of the pipeline with a focus on subtype handling, ORF length/distance cutoffs, and CLI + packaging ergonomics.

New Features and Improvements

  • Subtype Handling: Reference Sequences & BLAST DB Assets

    • Subtype reference sequences have been replaced with a newly curated/edited set (edited sequences carry a “.CfE” suffix in their names in the source set). This improves consistency for subtype-based alignment and intactness evaluation.
    • Added BLAST database metadata (*.fasta.njs) for subtype alignment databases, making the shipped subtype BLAST assets more explicit and inspectable.
  • CLI Usability Improvements

    • The --subtype argument is now optional and defaults to all, reducing friction for users who don’t need subtype-specific evaluation.
  • ORF Length/Distance Cutoffs Updated

    • ORF length tolerance limits were updated, aligning detection logic with revised empirical thresholds.
    • Distance cutoffs were updated for major ORFs (gag, pol, env) and small ORFs (vif, vpr, tat/rev exons, vpu, nef).
  • Defect Detection Logic: Reading-Frame Insertions

    • Removed a redundant early-return condition in reading frame insertion checking. This makes insertion detection more consistent (i.e., insertions are evaluated rather than being skipped under a “good-enough match” condition).
  • Packaging, Dependencies, and Tooling

    • Added uv.lock to improve reproducibility.
    • Python version constraints were corrected, and NumPy was explicitly installed/added where needed.
    • Dependabot configuration was updated; pyproject.toml metadata was adjusted (including improving the LICENSE field and removing README from pyproject.toml to avoid Dependabot issues).
    • Dependency bump: aligntools 1.2.1 → 1.2.2.

Breaking Changes and Impact for Users

  • Supported Subtypes Have Changed

    • Because the subtype reference sequence set was replaced, the effective list of supported subtypes has changed. Any downstream tooling that assumes a specific subtype list (or specific reference identifiers) should be reviewed and updated.
  • Intactness Calls May Shift Due to New Cutoffs

    • The updated distance and ORF length tolerance defaults can change pass/fail outcomes (especially near threshold boundaries). If you compare results across versions, expect some sequences to flip classification.
  • CLI Behavior Change

    • --subtype is no longer required (defaults to all). If your wrappers always passed --subtype, they will continue to work; if you relied on the CLI rejecting missing subtype, that behavior is now different.
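
To see which sequences change classification between two versions, one option is to diff the defect codes reported per sequence. The following is a minimal sketch, not part of CFEIntact: it assumes each run wrote a defects CSV with a "code" column (as named in the Squidward notes) plus some sequence identifier column. The identifier column name is not stated in these notes, so it is passed in by the caller, and both function names are hypothetical.

```python
import csv
from collections import defaultdict


def load_defect_codes(path, id_field, code_field="code"):
    """Map each sequence identifier to the set of defect codes reported for it.

    id_field is an assumption supplied by the caller: use whatever column
    identifies the sequence in your defects file.
    """
    codes = defaultdict(set)
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            codes[row[id_field]].add(row[code_field])
    return dict(codes)


def changed_calls(old, new):
    """Return {sequence: (old_codes, new_codes)} for sequences whose codes differ.

    A sequence absent from one mapping is treated as having no defects there.
    """
    changed = {}
    for seq_id in sorted(set(old) | set(new)):
        before = old.get(seq_id, set())
        after = new.get(seq_id, set())
        if before != after:
            changed[seq_id] = (sorted(before), sorted(after))
    return changed
```

Running the same input through the old and new versions and passing both defects files to these helpers gives a short list of sequences to re-inspect, which is usually easier than comparing whole files by eye.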

Documentation, Examples, and Further Notes

  • Docs were updated for:

    • Revised cutoffs + a direct “source of truth” link for those values.
    • A simplified installation command and an updated example command.
    • Minor link/formatting fixes.

Summary

This release improves subtype robustness (new curated references + BLAST DB metadata), updates ORF distance/length tolerances to better match observed distributions, simplifies CLI usage by defaulting subtype to all, and strengthens packaging reproducibility with uv.lock and dependency/tooling updates. Users who depend on the previous subtype set or on exact threshold behavior should re-validate their pipelines against the new defaults.

Full Changelog: v1.23.2...v1.26.0

Squidward

13 Mar 22:19
6ae623e

This release marks a significant milestone in enhancing our tool’s precision and robustness. Our team has focused on refining key algorithms, improving defect messaging, and standardizing output formats while maintaining tight integration with third‐party alignment and BLAST tools. This is still CFEIntact version 1, codename "Squidward."

New Features and Improvements

  • Defects and Output Files

    • The internal defect classes have been renamed and improved for clarity. For example, "DeletionInOrf" and "InsertionInOrf" are now simply "Deletion" and "Insertion"; the start/stop defect names have changed as well.
    • The defect messages have been refined to include more precise language (for example, reporting exactly the number of insertions/deletions relative to the accepted tolerance).
    • In the outputs, the "errors" file has been renamed to "defects", and field names (e.g. "code", "message", and "region") have been updated accordingly.
    • Output CSV (or JSON) now uses a "regions" file that holds ORF/region information along with holistic information and defects; the fields coming from the FoundORF structure now also include subtype start/end values plus amino acid and nucleotide representations.
  • ORF Detection & Alignment Enhancements

    • Improvements in the ORF detection algorithm (e.g. changes in how candidate positions are computed, using the translation table and "biggest protein" detection) help improve precision.
    • Small refinements in the calculation of the reading frame, handling of frameshifts, and the computation of indel impact contribute to a more accurate intactness determination.
  • Code Robustness and Error Handling

    • In the "wrappers" module (both MAFFT and BLAST calls), additional exception handling has been added. Now, if a third‐party tool fails (or is interrupted), a clear UserError is raised with guidance on how to check the FASTA file formatting.
    • The get_biggest_protein function has been reworked to allow for better detection of the longest valid translated region based on whether a start codon is required.
  • Dependency and Build Process Enhancements

    • GitHub workflows have been updated to upgrade pip automatically before installation.
    • The Dockerfile now sets a fixed working directory to /w.
    • The docs' Dockerfile and gitignore have been slightly reworked for clarity and consistency.
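
The wrapper error handling described above can be pictured roughly as follows. This is an illustrative sketch under stated assumptions, not the project's actual code: the UserError class here is a local stand-in, run_tool is a hypothetical helper name, and the real wrappers may catch different exceptions or word the message differently.

```python
import subprocess


class UserError(Exception):
    """Local stand-in for an error that should be reported to the user."""


def run_tool(command, fasta_path):
    """Run an external tool and convert failures into a readable UserError.

    command is the full argument list; fasta_path is only used in the message.
    """
    try:
        return subprocess.run(command, check=True, capture_output=True, text=True)
    except FileNotFoundError as err:
        # The executable itself is missing.
        raise UserError(
            f"Could not run {command[0]!r}; is it installed and on PATH?"
        ) from err
    except (subprocess.CalledProcessError, KeyboardInterrupt) as err:
        # Non-zero exit status, or the user interrupted the run.
        raise UserError(
            f"{command[0]} failed or was interrupted. "
            f"Check that {fasta_path} is a well-formed FASTA file."
        ) from err
```

The point of the pattern is that callers see one exception type with actionable guidance instead of a raw traceback from the subprocess layer.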

Breaking Changes and Impact for Users

  • Defect Messages and Terminology

    • The printed messages in defect objects have changed. If you rely on parsing specific strings from CFEIntact output (for example, "DeletionInOrf" is now "Deletion", "InternalStopInOrf" is now "InternalStop", etc.), please update your downstream tools and scripts accordingly.
    • The field "error" in the output files has been replaced by "code", and the "orf" field is now "region".
  • Output File Names

    • Previously the GitHub tests expected an "errors.csv" file; it is now written as "defects.csv."
    • The overall "orfs" output is now published as "regions.csv." If you have any automation or custom reporting that relies on these filenames or header fields, please update them.
  • API and Command‐Line Changes

    • The command "cfeintact check" now by default tests for defects using the updated messages and writes a "defects.csv" file.
    • In GitHub workflow and Docker run commands, flags (e.g. "--ignore-distance" and "--ignore-packaging-signal") remain the same, but tests have been updated so that the existence of "defects.csv" is verified rather than "errors.csv."
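
For downstream scripts that still hold old-format output, the renames listed above could be applied with a small converter. This is a hedged sketch: it covers only the renames these notes name explicitly (other defect names also changed), it assumes the old "error" column holds the defect class name, and the function names are hypothetical.

```python
import csv

# Header renames stated in the notes above.
FIELD_RENAMES = {"error": "code", "orf": "region"}

# Only the defect renames named explicitly above; extend this for your data.
CODE_RENAMES = {
    "DeletionInOrf": "Deletion",
    "InsertionInOrf": "Insertion",
    "InternalStopInOrf": "InternalStop",
}


def upgrade_row(row):
    """Return a copy of one old-format row using the new field and code names."""
    upgraded = {}
    for field, value in row.items():
        new_field = FIELD_RENAMES.get(field, field)
        if new_field == "code":
            value = CODE_RENAMES.get(value, value)
        upgraded[new_field] = value
    return upgraded


def upgrade_errors_csv(old_path, new_path):
    """Rewrite an old errors.csv-style file with the new header and code names."""
    with open(old_path, newline="") as src, open(new_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        fieldnames = [FIELD_RENAMES.get(name, name) for name in (reader.fieldnames or [])]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            writer.writerow(upgrade_row(row))
```

Because the defect messages themselves were also reworded, a converter like this only aligns names; for results that must match the new release exactly, rerunning the analysis is the safer route.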

Documentation, Examples, and Further Notes

  • Updated documentation pages now include revised navigation (with an updated Quick Start that adds an "Installation" and "Data Preparation" section), so users and developers should refer to the online docs for the latest instructions.
  • Output fields and defect messages have changed. If you integrate CFEIntact in automated workflows, please verify that your parsing scripts (or Codecov integrations) work with the new structure.

Summary

This release includes "small improvements and fixes" with notable enhancements to defect detection messages, output file structure, and precision in ORF analysis. While these changes improve overall usability and accuracy, they may require adjustments for users relying on previously defined output formats and error message contents.

Full Changelog: v1.18.8...v1.23.2

v1.18.8

12 Jul 21:44
7c67d4b

Full Changelog: v1.18.7...v1.18.8

Stable release 1

08 Jul 21:00
2e8a134

Initial release

05 Apr 22:33
f77d120

The first release after renaming the project.