Releases: btmartin721/SNPio

SNPio Version 1.2.1

07 Jan 07:08

Changelog

This document outlines the changes made to the project with each
release.

Version 1.2.1 (2025-02-22)

Features

  • Added the neis_genetic_distance() method to the PopGenStatistics class. The method calculates Nei's genetic distance between populations and returns a pandas DataFrame with the genetic distances.
  • The PopGenStatistics class now has the following public (user-facing) methods:

    • neis_genetic_distance
    • calculate_d_statistics
    • detect_fst_outliers
    • summary_statistics
    • amova
  • The AMOVA method now returns a dictionary with the AMOVA results. Its functionality has been greatly extended to follow the methods of Excoffier et al. (1992) and Excoffier et al. (1999). The method now calculates the variance components (within populations, within regions among populations, and among regions), Phi-statistics, and p-values via bootstrapping. A regionmap dictionary is now required to map populations to regions/groups. The method also has the following new parameters:

    • `n_bootstraps`: The number of bootstraps to perform.
    • `n_jobs`: The number of jobs to run in parallel.
    • `random_seed`: The random seed for reproducibility.
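
As a rough illustration of the quantity behind neis_genetic_distance(), the sketch below computes Nei's (1972) standard genetic distance, D = -ln(J_xy / sqrt(J_x * J_y)), from per-locus allele frequencies. It assumes the 1972 standard distance is the one intended, which the notes above do not state. The function names and the nested-dict return value are hypothetical and are not SNPio's API (the release notes say the real method returns a pandas DataFrame).

```python
import math


def neis_standard_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance between two populations.

    Each argument is a list of loci; each locus is a list of allele
    frequencies summing to 1, with alleles in the same order in both inputs.
    """
    n_loci = len(freqs_x)
    # J_x and J_y: mean over loci of the summed squared allele frequencies.
    j_x = sum(sum(p * p for p in locus) for locus in freqs_x) / n_loci
    j_y = sum(sum(q * q for q in locus) for locus in freqs_y) / n_loci
    # J_xy: mean over loci of the summed cross-products of frequencies.
    j_xy = sum(
        sum(p * q for p, q in zip(lx, ly)) for lx, ly in zip(freqs_x, freqs_y)
    ) / n_loci
    identity = j_xy / math.sqrt(j_x * j_y)
    if identity == 0:
        # No shared alleles at any locus: the distance is unbounded.
        return math.inf
    return -math.log(identity)


def pairwise_distances(pop_freqs):
    """Return a nested dict of distances for every pair of populations."""
    names = sorted(pop_freqs)
    return {
        a: {
            b: 0.0 if a == b else neis_standard_distance(pop_freqs[a], pop_freqs[b])
            for b in names
        }
        for a in names
    }


# Hypothetical allele frequencies: one biallelic locus, two populations.
example = {"pop1": [[0.5, 0.5]], "pop2": [[1.0, 0.0]]}
matrix = pairwise_distances(example)
```

The nested dict is symmetric, so it could be passed to pandas.DataFrame to get a labelled distance matrix.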

Enhancements

  • Improved the PopGenStatistics class to
    include new functionality to calculate observed and expected
    heterozygosity per population and nucleotide diversity per population.

  • Improved the PopGenStatistics class to
    include new functionality to calculate Weir and Cockerham's Fst
    between populations.

  • Improved aesthetics of the Fst heatmap plot.

  • Improved the PopGenStatistics class to
    include new functionality to plot D-statistics (Patterson's,
    Partitioned, and D-foil) and save them as CSV files.

  • Improved the PopGenStatistics class to
    include new functionality to calculate Nei's genetic distance between
    populations.

  • Improved the PopGenStatistics class to
    include new functionality to plot Nei's distance matrix between
    populations.

  • Improved the PopGenStatistics class to include new functionality to plot Fst outliers, which can be detected in two ways:

    • DBSCAN clustering method
    • Bootstrapping method
  • Improved the PopGenStatistics class to
    include new functionality to plot summary statistics. The method now
    returns a dictionary with the summary statistics.

  • Improved the PopGenStatistics class to
    include new functionality to calculate AMOVA results. The method now
    returns a dictionary with the AMOVA results.

  • Improved the PopGenStatistics class to
    include new functionality to calculate genetic distances between
    populations. The method calculates Nei's genetic distance between
    populations and returns a pandas DataFrame with the genetic distances.
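
For readers unfamiliar with the heterozygosity statistics mentioned in this list, here is a minimal sketch of observed and expected heterozygosity for a single locus in a single population. The function name and the genotype encoding (2-tuples of alleles, None for missing) are assumptions made for illustration and are not SNPio's API. The expected value uses the plain 1 - sum(p_i^2) form with no sample-size correction, which may differ from what the library computes.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    `genotypes` is a list of diploid genotypes given as 2-tuples of alleles.
    Missing genotypes are given as None and skipped.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return float("nan"), float("nan")
    # Observed: fraction of called individuals carrying two different alleles.
    observed = sum(a != b for a, b in called) / len(called)
    # Expected: 1 minus the sum of squared allele frequencies.
    counts = Counter(allele for g in called for allele in g)
    total = sum(counts.values())
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical genotypes for five individuals, one of them missing.
h_obs, h_exp = heterozygosity(
    [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G"), None]
)
```

Per-population values would come from applying the same calculation to each population's individuals and averaging across loci.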

Changes

  • Much of the code has been refactored to improve readability and
    maintainability. This includes moving the
    neis_genetic_distance() method to the
    genetic_distance module, the
    amova() method to the
    amova module, and the
    fst_outliers() method to the
    fst_outliers module. The
    summary_statistics() method has been
    moved to the summary_statistics module,
    and the D-statistics methods have been moved to the
    d_statistics module.

Deprecations

The following method has been deprecated:

  • `wrights_fst()`: Use
    weir_cockerham_fst_between_populations()
    instead.
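
A deprecation like this is commonly signalled with Python's warnings module. The sketch below shows the general pattern on a hypothetical stand-in class. It is not the real PopGenStatistics implementation, and forwarding the old method to the new one is an assumption rather than something these notes confirm.

```python
import warnings


class PopGenStatsSketch:
    """Hypothetical stand-in; not the real PopGenStatistics class."""

    def weir_cockerham_fst_between_populations(self):
        # Placeholder result; the real method computes Fst between populations.
        return "fst-result"

    def wrights_fst(self):
        # Emit a DeprecationWarning that points at the caller's line.
        warnings.warn(
            "wrights_fst() is deprecated; use "
            "weir_cockerham_fst_between_populations() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Assumed behavior: forward to the replacement method.
        return self.weir_cockerham_fst_between_populations()
```

Callers who want to find remaining uses of the old method can run their code with `python -W error::DeprecationWarning`, which turns the warning into an exception.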

Bug Fixes

  • Fixed bug where the PopGenStatistics
    class did not have the verbose and
    debug attributes.
  • Fixed bug where the PopGenStatistics
    class did not have the genotype_data
    attribute.
  • Fixed font family warnings in the
    snpio.plotting.plotting.Plotting class.
  • Fixed bug in the VCFReader class that caused
    an error when reading an uncompressed, non-tabix-indexed VCF file.
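
One general way to handle both compressed and uncompressed VCF files is to check the gzip magic bytes before choosing how to open the file. The sketch below illustrates that idea with the standard library only. It is not how VCFReader is implemented, and the helper names are hypothetical.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip (and bgzip) file


def open_vcf_text(path):
    """Open a VCF as text, whether or not it is gzip-compressed."""
    path = Path(path)
    with path.open("rb") as handle:
        magic = handle.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt")
    return path.open("rt")


def count_records(path):
    """Count non-header, non-empty lines in a VCF file."""
    with open_vcf_text(path) as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith("#")
        )
```

Checking the file contents rather than the file extension means a plain-text file named with a .gz suffix, or the reverse, is still opened correctly.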

SNPio v1.1.4

27 Oct 19:20

Fixed issue with paths in snpio/docs/source/getting_started.rst

Docs would not build on readthedocs without fixing.

v1.1.3.0

27 Oct 18:15
Moved docs to snpio/docs.

SNPio v1.1.3

27 Oct 18:08

Version 1.1.3 (2024-10-25)

Features

  • Updated the tree parsing functionality to conform to the refactor and added it to the TreeParser class in the analysis/tree_parser.py module. Added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
  • The format of siterates and qmatrix files is now detected automatically, whether IQ-TREE format or a simple tab-delimited or comma-delimited format.
  • site_rates and qmat are now read in as pandas DataFrames with less complex logic.
  • Added unit test for tree parsing.
  • Added integration test for tree parsing.
  • Added documentation for tree parsing.
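
To illustrate the kind of format detection described for siterates and qmatrix files, the sketch below guesses between tab-delimited and comma-delimited text and skips lines starting with "#". Treating "#" lines as comments is an assumption, and the function is a hypothetical stand-in rather than SNPio's parser. It returns plain lists where the library reads pandas DataFrames.

```python
import csv
import io


def read_delimited(text):
    """Parse tab- or comma-delimited text into rows, skipping '#' lines."""
    lines = [
        ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")
    ]
    if not lines:
        return []
    # Choose whichever candidate delimiter is more frequent in the first row.
    first = lines[0]
    delimiter = "\t" if first.count("\t") > first.count(",") else ","
    return list(csv.reader(io.StringIO("\n".join(lines)), delimiter=delimiter))


# Hypothetical file contents in the two simple formats.
tab_rows = read_delimited("Site\tRate\n1\t0.5\n")
comma_rows = read_delimited("# comment line\nSite,Rate\n1,0.5\n")
```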

Bug Fixes

  • Fixed bug where the PhylipReader and StructureReader classes did not have the verbose and debug attributes.

Changes

  • q property is now called qmat for clarity and easier searching in files.
  • Removed the redundant siterates_iqtree and qmatrix_iqtree arguments and attributes from the GenotypeData, VCFReader, PhylipReader, StructureReader, and TreeParser classes.
  • Added error handling for tree parsing.
  • Added error handling for siterates and qmatrix files.

SNPio v1.1.2

25 Oct 20:17

Version 1.1.1 (2024-10-25)

Features

  • Updated the tree parsing functionality to conform to the refactor and added it to the TreeParser class in the analysis/tree_parser.py module. Added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
  • The format of siterates and qmatrix files is now detected automatically, whether IQ-TREE format or a simple tab-delimited or comma-delimited format.
  • site_rates and qmat are now read in as pandas DataFrames with less complex logic.
  • Added unit test for tree parsing.
  • Added integration test for tree parsing.
  • Added documentation for tree parsing.

Bug Fixes

  • Fixed bug where the PhylipReader and StructureReader classes did not have the verbose and debug attributes.

Changes

  • q property is now called qmat for clarity and easier searching in files.
  • Removed the redundant siterates_iqtree and qmatrix_iqtree arguments and attributes from the GenotypeData, VCFReader, PhylipReader, StructureReader, and TreeParser classes.
  • Added error handling for tree parsing.
  • Added error handling for siterates and qmatrix files.

SNPio v1.1.1.0

25 Oct 20:10

Version 1.1.1 (2024-10-25)

Features

  • Updated the tree parsing functionality to conform to the refactor and added it to the TreeParser class in the analysis/tree_parser.py module. Added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
  • The format of siterates and qmatrix files is now detected automatically, whether IQ-TREE format or a simple tab-delimited or comma-delimited format.
  • site_rates and qmat are now read in as pandas DataFrames with less complex logic.
  • Added unit test for tree parsing.
  • Added integration test for tree parsing.
  • Added documentation for tree parsing.

Bug Fixes

  • Fixed bug where the PhylipReader and StructureReader classes did not have the verbose and debug attributes.

Changes

  • q property is now called qmat for clarity and easier searching in files.
  • Removed the redundant siterates_iqtree and qmatrix_iqtree arguments and attributes from the GenotypeData, VCFReader, PhylipReader, StructureReader, and TreeParser classes.
  • Added error handling for tree parsing.
  • Added error handling for siterates and qmatrix files.

SNPio v1.1.1

25 Oct 19:52
7f28ba5

Version 1.1.1 (2024-10-25)

Features

  • Updated the tree parsing functionality to conform to the refactor and added it to the TreeParser class in the analysis/tree_parser.py module. Added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
  • The format of siterates and qmatrix files is now detected automatically, whether IQ-TREE format or a simple tab-delimited or comma-delimited format.
  • site_rates and qmat are now read in as pandas DataFrames with less complex logic.
  • Added unit test for tree parsing.
  • Added integration test for tree parsing.
  • Added documentation for tree parsing.

Bug Fixes

  • Fixed bug where the PhylipReader and StructureReader classes did not have the verbose and debug attributes.

Changes

  • q property is now called qmat for clarity and easier searching in files.
  • Removed the redundant siterates_iqtree and qmatrix_iqtree arguments and attributes from the GenotypeData, VCFReader, PhylipReader, StructureReader, and TreeParser classes.
  • Added error handling for tree parsing.
  • Added error handling for siterates and qmatrix files.

v1.1.0.1

12 Oct 05:33
b609e33
  • Added UserManual.pdf via pandoc
  • Tidied up API docs

v1.1.0

11 Oct 05:20
da5c3d2

Version 1.1.0 (2024-10-08)

Features

  • Full refactor of the codebase to improve user-friendliness, maintainability and readability.
    • Method chaining: All functions now return the object itself, allowing for method chaining and custom filtering orders with NRemover2.
    • Most objects now just take a GenotypeData object as input, making the code more modular and easier to maintain.
    • Improved documentation and docstrings.
    • Improved error handling.
    • Improved logging. All logging is now done with the Python logging module via the custom LoggerManager class.
    • Improved testing.
    • Improved performance.
      • Reduced memory usage.
      • Reduced disk usage.
      • Reduced CPU usage.
      • Reduced execution time, particularly for reading, loading, filtering, and processing large VCF files.
    • Improved plotting.
    • Improved data handling.
    • Improved file handling. All filenames now use pathlib.Path objects.
    • Code modularity: Many functions are now in separate modules for better organization.
    • Full unit tests for all functions.
    • Full integration tests for all functions.
    • Full documentation for all functions.
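
The method-chaining pattern described above comes down to each method returning the object itself. The sketch below shows the pattern on a hypothetical filter class. The class name, method names, and data layout are invented for illustration and are not the NRemover2 API.

```python
class ChainableFilterSketch:
    """Hypothetical filter object; not the NRemover2 API."""

    def __init__(self, sites):
        # Each site is a list of genotype calls; None marks a missing call.
        self.sites = list(sites)

    def drop_missing_above(self, threshold):
        """Drop sites whose proportion of missing calls exceeds `threshold`."""
        self.sites = [
            s for s in self.sites if s.count(None) / len(s) <= threshold
        ]
        return self  # returning self is what enables chaining

    def drop_monomorphic(self):
        """Drop sites with only one distinct non-missing call."""
        self.sites = [
            s for s in self.sites if len({g for g in s if g is not None}) > 1
        ]
        return self


# Filters run in the order they are chained, so the order is up to the caller.
filtered = (
    ChainableFilterSketch([[0, 1, None], [0, 0, 0], [None, None, 1], [1, 2, 0]])
    .drop_missing_above(0.5)
    .drop_monomorphic()
)
```

Because every step returns the same object, reordering the filters only requires reordering the chained calls.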

Version 1.0.4

10 Sep 22:19
3ffcb9d

Version 1.0.4 (2023-09-10)

Features

  • Added functionality to filter out linked SNPs using CHROM and POS fields from VCF file.
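
The note above does not say which criterion is used to decide that SNPs are linked. One simple possibility, shown here purely as an assumption, is to thin SNPs by physical distance using the CHROM and POS values: a SNP is kept only if it lies at least a minimum number of base pairs from the last kept SNP on the same chromosome. The function name and input layout are hypothetical.

```python
def thin_by_distance(snps, min_distance):
    """Keep SNPs at least `min_distance` bp from the last kept SNP.

    `snps` is an iterable of (chrom, pos) tuples taken from the VCF
    CHROM and POS fields. Distances are only compared within a chromosome.
    """
    kept = []
    last_pos = {}  # chromosome -> position of the most recently kept SNP
    for chrom, pos in sorted(snps):
        if chrom not in last_pos or pos - last_pos[chrom] >= min_distance:
            kept.append((chrom, pos))
            last_pos[chrom] = pos
    return kept


# Hypothetical SNP coordinates on two chromosomes.
thinned = thin_by_distance(
    [("chr1", 100), ("chr1", 150), ("chr1", 1200), ("chr2", 120), ("chr2", 130)],
    min_distance=1000,
)
```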

Performance

  • Made the Sankey plot function more modular and dynamic.

Bug Fixes

  • Fixed spacing in output printed to STDOUT.