Releases: btmartin721/SNPio
SNPio Version 1.2.1
Changelog
This document outlines the changes made to the project with each
release.
Version 1.2.1 (2025-02-22)
Features
- Improved the `PopGenStatistics` class to include new functionality to calculate genetic distances between populations:
  - The `neis_genetic_distance()` method calculates Nei's genetic distance between populations and returns a pandas DataFrame with the genetic distances.
- The `PopGenStatistics` class now has the following public (user-facing) methods:
  - `neis_genetic_distance`
  - `calculate_d_statistics`
  - `detect_fst_outliers`
  - `summary_statistics`
  - `amova`
- The AMOVA method now returns a dictionary with the AMOVA results. Its functionality has been greatly extended to follow the methods of Excoffier et al. (1992) and Excoffier et al. (1999). The method now calculates the variance components (within populations, among populations within regions, and among regions), Phi-statistics, and p-values via bootstrapping. A `regionmap` dictionary is now required to map populations to regions/groups. The method also has the following new parameters:
  - `n_bootstraps`: The number of bootstraps to perform.
  - `n_jobs`: The number of jobs to run in parallel.
  - `random_seed`: The random seed for reproducibility.
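The Phi-statistics named in the AMOVA entry are ratios of the three variance components. As a rough illustration of those relationships as usually presented following Excoffier et al. (1992), and not SNPio's own implementation, the sketch below computes them from components that have already been estimated. The function name `phi_statistics`, its argument names, and the example `regionmap` contents are hypothetical; only the idea that `regionmap` maps populations to regions comes from the notes.

```python
from collections import defaultdict


def phi_statistics(var_among_regions, var_among_pops, var_within_pops):
    """Return Phi_CT, Phi_SC, and Phi_ST from three AMOVA variance components.

    Hypothetical helper for illustration only; not part of the SNPio API.
    """
    total = var_among_regions + var_among_pops + var_within_pops
    return {
        # Among regions, relative to total variance.
        "Phi_CT": var_among_regions / total,
        # Among populations within regions, relative to within-region variance.
        "Phi_SC": var_among_pops / (var_among_pops + var_within_pops),
        # Among all populations, relative to total variance.
        "Phi_ST": (var_among_regions + var_among_pops) / total,
    }


# Assumed shape of a region map: population ID -> region/group label.
regionmap = {"pop1": "north", "pop2": "north", "pop3": "south"}

# Invert the map to list the populations that fall in each region.
pops_by_region = defaultdict(list)
for pop, region in regionmap.items():
    pops_by_region[region].append(pop)

phi = phi_statistics(2.0, 1.0, 7.0)
print(dict(pops_by_region))
print(phi)
```

With these definitions the three statistics satisfy (1 - Phi_ST) = (1 - Phi_SC)(1 - Phi_CT), which is a quick sanity check on any reported AMOVA table.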
Enhancements
- Improved the `PopGenStatistics` class to include new functionality to calculate observed and expected heterozygosity per population and nucleotide diversity per population.
- Improved the `PopGenStatistics` class to include new functionality to calculate Weir and Cockerham's Fst between populations.
- Improved aesthetics of the Fst heatmap plot.
- Improved the `PopGenStatistics` class to include new functionality to plot D-statistics (Patterson's, Partitioned, and D-foil) and save them as CSV files.
- Improved the `PopGenStatistics` class to include new functionality to calculate Nei's genetic distance between populations.
- Improved the `PopGenStatistics` class to include new functionality to plot Nei's distance matrix between populations.
- Improved the `PopGenStatistics` class to include new functionality to plot Fst outliers.
  - Two ways:
    - DBSCAN clustering method
    - Bootstrapping method
- Improved the `PopGenStatistics` class to include new functionality to plot summary statistics. The method now returns a dictionary with the summary statistics.
- Improved the `PopGenStatistics` class to include new functionality to calculate AMOVA results. The method now returns a dictionary with the AMOVA results.
- Improved the `PopGenStatistics` class to include new functionality to calculate genetic distances between populations. The method calculates Nei's genetic distance between populations and returns a pandas DataFrame with the genetic distances.
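For readers unfamiliar with the statistic, the following is a minimal sketch of Nei's (1972) standard genetic distance for biallelic loci, computed from per-population allele frequencies. It is an assumption that this variant is the one of interest: the notes do not say which form of Nei's distance SNPio computes, and the sketch applies no sample-size correction. The function names and the nested-dictionary output (instead of a pandas DataFrame) are illustrative choices, not SNPio's API.

```python
import math


def neis_standard_distance(freqs_x, freqs_y):
    """Nei's (1972) standard genetic distance for biallelic loci.

    freqs_x, freqs_y: per-locus frequencies of the same allele in each
    population (the other allele has frequency 1 - p).
    """
    n = len(freqs_x)
    # Average within-population identity across loci.
    j_x = sum(p * p + (1 - p) ** 2 for p in freqs_x) / n
    j_y = sum(q * q + (1 - q) ** 2 for q in freqs_y) / n
    # Average between-population identity across loci.
    j_xy = sum(p * q + (1 - p) * (1 - q) for p, q in zip(freqs_x, freqs_y)) / n
    identity = j_xy / math.sqrt(j_x * j_y)
    # No shared alleles at any locus gives an infinite distance.
    return -math.log(identity) if identity > 0 else math.inf


def pairwise_distances(freqs_by_pop):
    """Build a symmetric distance matrix as a nested dictionary."""
    pops = sorted(freqs_by_pop)
    return {
        a: {b: neis_standard_distance(freqs_by_pop[a], freqs_by_pop[b]) for b in pops}
        for a in pops
    }


# Made-up allele frequencies at two loci for two populations.
freqs = {"popA": [0.5, 1.0], "popB": [0.5, 0.0]}
matrix = pairwise_distances(freqs)
print(matrix["popA"]["popB"])
```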
Changes
- Much of the code has been refactored to improve readability and maintainability. This includes moving the `neis_genetic_distance()` method to the `genetic_distance` module, the `amova()` method to the `amova` module, and the `fst_outliers()` method to the `fst_outliers` module. The `summary_statistics()` method has been moved to the `summary_statistics` module, and the D-statistics methods have been moved to the `d_statistics` module.
Deprecations
The following method has been deprecated:
- `wrights_fst()`: Use `weir_cockerham_fst_between_populations()` instead.
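A common Python pattern for this kind of deprecation is a thin wrapper that warns and then forwards to the replacement. The sketch below shows the pattern with stand-in functions; whether SNPio's `wrights_fst()` actually emits a `DeprecationWarning` or forwards its arguments is an assumption, and the function bodies here are placeholders, not the library's code.

```python
import warnings


def weir_cockerham_fst_between_populations(*args, **kwargs):
    """Placeholder standing in for the replacement method."""
    return "fst-result"


def wrights_fst(*args, **kwargs):
    """Deprecated entry point that warns, then forwards to the replacement."""
    warnings.warn(
        "wrights_fst() is deprecated; use "
        "weir_cockerham_fst_between_populations() instead.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this wrapper
    )
    return weir_cockerham_fst_between_populations(*args, **kwargs)


# Record warnings so the deprecation can be inspected programmatically.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = wrights_fst()

print(result, [str(w.message) for w in caught])
```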
Bug Fixes
- Fixed bug where the `PopGenStatistics` class did not have the `verbose` and `debug` attributes.
- Fixed bug where the `PopGenStatistics` class did not have the `genotype_data` attribute.
- Fixed font family warnings in the `snpio.plotting.plotting.Plotting` class.
- Fixed bug in the `VCFReader` class that caused an error when reading a non-tabix-indexed, uncompressed VCF file.
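To illustrate the kind of case the VCF fix concerns, here is a minimal sketch of opening a VCF file whether or not it is compressed, by checking for the gzip magic bytes before choosing a reader. This is not SNPio's `VCFReader` logic; the helper names `open_vcf` and `count_records` are hypothetical.

```python
import gzip
import os
import tempfile
from pathlib import Path


def open_vcf(path):
    """Open a VCF as text, using gzip only if the file starts with gzip magic bytes."""
    path = Path(path)
    with path.open("rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return path.open("rt")


def count_records(path):
    """Count data lines, skipping blank lines and '#' header lines."""
    with open_vcf(path) as fh:
        return sum(1 for line in fh if line.strip() and not line.startswith("#"))


# Write the same tiny VCF both uncompressed and gzip-compressed, then read both.
vcf_text = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr1\t100\t.\tA\tG\n"
    "chr1\t200\t.\tC\tT\n"
)
with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "example.vcf")
    gz_path = os.path.join(tmp, "example.vcf.gz")
    with open(plain_path, "w") as fh:
        fh.write(vcf_text)
    with gzip.open(gz_path, "wt") as fh:
        fh.write(vcf_text)
    plain_count = count_records(plain_path)
    gz_count = count_records(gz_path)

print(plain_count, gz_count)
```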
SNPio v1.1.4
Fixed issue with paths in snpio/docs/source/getting_started.rst. The docs would not build on Read the Docs without this fix.
v1.1.3.0
moved docs to snpio/docs
SNPio v1.1.3
Version 1.1.3 (2024-10-25)
Features
- Updated tree parsing functionality and added it to the `TreeParser` class in the `analysis/tree_parser.py` module to conform to the refactor, and added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
- `siterates` and `qmatrix` files are now checked dynamically to determine whether they are in IQ-TREE format or in a simple tab-delimited or comma-delimited format.
- `site_rates` and `qmat` are now read in as pandas DataFrames with less complex logic.
- Added unit test for tree parsing.
- Added integration test for tree parsing.
- Added documentation for tree parsing.
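The format detection described for `siterates` and `qmatrix` files can be pictured with a small standard-library sketch. It assumes, purely for illustration, that annotated files may lead with `#` comment lines and that the delimiter can be inferred from the first remaining line. The function name `read_table` is hypothetical, and the actual rules SNPio applies are not stated in these notes.

```python
import csv
import io


def read_table(text):
    """Parse delimited text into rows, skipping '#' comments and blank lines.

    The delimiter is a tab if the first data line contains one, else a comma.
    """
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    if not lines:
        return []
    delimiter = "\t" if "\t" in lines[0] else ","
    return list(csv.reader(io.StringIO("\n".join(lines)), delimiter=delimiter))


# A tab-delimited file with a leading comment line, and a plain comma-delimited file.
tab_rows = read_table("# comment line\nSite\tRate\n1\t0.5\n")
comma_rows = read_table("1,0.5\n2,0.7\n")
print(tab_rows)
print(comma_rows)
```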
Bug Fixes
- Fixed bug where the `PhylipReader` and `StructureReader` classes did not have the `verbose` and `debug` attributes.
Changes
- `q` property is now called `qmat` for clarity and easier searching in files.
- Removed redundant `siterates_iqtree` and `qmatrix_iqtree` arguments and attributes from the `GenotypeData`, `VCFReader`, `PhylipReader`, `StructureReader`, and `TreeParser` classes.
- Added error handling for tree parsing.
- Added error handling for `siterates` and `qmatrix` files.
SNPio v1.1.2
Version 1.1.1 (2024-10-25)
Features
- Updated tree parsing functionality and added it to the `TreeParser` class in the `analysis/tree_parser.py` module to conform to the refactor, and added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
- `siterates` and `qmatrix` files are now checked dynamically to determine whether they are in IQ-TREE format or in a simple tab-delimited or comma-delimited format.
- `site_rates` and `qmat` are now read in as pandas DataFrames with less complex logic.
- Added unit test for tree parsing.
- Added integration test for tree parsing.
- Added documentation for tree parsing.
Bug Fixes
- Fixed bug where the `PhylipReader` and `StructureReader` classes did not have the `verbose` and `debug` attributes.
Changes
- `q` property is now called `qmat` for clarity and easier searching in files.
- Removed redundant `siterates_iqtree` and `qmatrix_iqtree` arguments and attributes from the `GenotypeData`, `VCFReader`, `PhylipReader`, `StructureReader`, and `TreeParser` classes.
- Added error handling for tree parsing.
- Added error handling for `siterates` and `qmatrix` files.
SNPio v1.1.1.0
Version 1.1.1 (2024-10-25)
Features
- Updated tree parsing functionality and added it to the `TreeParser` class in the `analysis/tree_parser.py` module to conform to the refactor, and added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
- `siterates` and `qmatrix` files are now checked dynamically to determine whether they are in IQ-TREE format or in a simple tab-delimited or comma-delimited format.
- `site_rates` and `qmat` are now read in as pandas DataFrames with less complex logic.
- Added unit test for tree parsing.
- Added integration test for tree parsing.
- Added documentation for tree parsing.
Bug Fixes
- Fixed bug where the `PhylipReader` and `StructureReader` classes did not have the `verbose` and `debug` attributes.
Changes
- `q` property is now called `qmat` for clarity and easier searching in files.
- Removed redundant `siterates_iqtree` and `qmatrix_iqtree` arguments and attributes from the `GenotypeData`, `VCFReader`, `PhylipReader`, `StructureReader`, and `TreeParser` classes.
- Added error handling for tree parsing.
- Added error handling for `siterates` and `qmatrix` files.
SNPio v1.1.1
Version 1.1.1 (2024-10-25)
Features
- Updated tree parsing functionality and added it to the `TreeParser` class in the `analysis/tree_parser.py` module to conform to the refactor, and added new functionality to parse, modify, draw, and save Newick and NEXUS tree files.
- `siterates` and `qmatrix` files are now checked dynamically to determine whether they are in IQ-TREE format or in a simple tab-delimited or comma-delimited format.
- `site_rates` and `qmat` are now read in as pandas DataFrames with less complex logic.
- Added unit test for tree parsing.
- Added integration test for tree parsing.
- Added documentation for tree parsing.
Bug Fixes
- Fixed bug where the `PhylipReader` and `StructureReader` classes did not have the `verbose` and `debug` attributes.
Changes
- `q` property is now called `qmat` for clarity and easier searching in files.
- Removed redundant `siterates_iqtree` and `qmatrix_iqtree` arguments and attributes from the `GenotypeData`, `VCFReader`, `PhylipReader`, `StructureReader`, and `TreeParser` classes.
- Added error handling for tree parsing.
- Added error handling for `siterates` and `qmatrix` files.
v1.1.0.1
- Added UserManual.pdf via pandoc
- tidied up API docs
v1.1.0
Version 1.1.0 (2024-10-08)
Features
- Full refactor of the codebase to improve user-friendliness, maintainability and readability.
- Method chaining: All functions now return the object itself, allowing for method chaining and custom filtering orders with `NRemover2`.
- Most objects now just take a `GenotypeData` object as input, making the code more modular and easier to maintain.
- Improved documentation and docstrings.
- Improved error handling.
- Improved logging. All logging is now done with the Python logging module via the custom `LoggerManager` class.
- Improved testing.
- Improved performance.
- Reduced memory usage.
- Reduced disk usage.
- Reduced CPU usage.
- Reduced execution time, particularly for reading, loading, filtering, and processing large VCF files.
- Improved plotting.
- Improved data handling.
- Improved file handling. All filenames now use pathlib.Path objects.
- Code modularity: Many functions are now in separate modules for better organization.
- Full unit tests for all functions.
- Full integration tests for all functions.
- Full documentation for all functions.
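The method-chaining design mentioned in this release relies on each method returning the object it was called on. The toy class below sketches the pattern with made-up filters. `LocusFilter`, `filter_missing`, and `filter_maf` are hypothetical names chosen for illustration and should not be read as the `NRemover2` API.

```python
class LocusFilter:
    """Toy filter whose methods return self so calls can be chained in any order."""

    def __init__(self, loci):
        self.loci = list(loci)

    def filter_missing(self, max_missing):
        # Keep loci whose missing-data proportion is at most max_missing.
        self.loci = [loc for loc in self.loci if loc["missing"] <= max_missing]
        return self

    def filter_maf(self, min_maf):
        # Keep loci whose minor allele frequency is at least min_maf.
        self.loci = [loc for loc in self.loci if loc["maf"] >= min_maf]
        return self


loci = [
    {"id": "snp1", "missing": 0.10, "maf": 0.20},
    {"id": "snp2", "missing": 0.60, "maf": 0.30},
    {"id": "snp3", "missing": 0.05, "maf": 0.01},
]

# The order of the chained calls is up to the caller.
kept = LocusFilter(loci).filter_missing(0.5).filter_maf(0.05).loci
kept_ids = [loc["id"] for loc in kept]
print(kept_ids)
```

Because every step returns the same object, the filtering order is expressed directly in the call chain and can be rearranged without intermediate variables.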
Version 1.0.4
Version 1.0.4 (2023-09-10)
Features
- Added functionality to filter out linked SNPs using CHROM and POS fields from VCF file.
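One plausible reading of filtering linked SNPs by CHROM and POS is distance-based thinning: within each chromosome, keep a SNP only if it lies far enough from the last one kept. The sketch below shows that idea only. The criterion SNPio actually applies is not described in these notes, and the function name `thin_linked_snps` and the `min_distance` parameter are assumptions.

```python
def thin_linked_snps(variants, min_distance):
    """Keep SNPs at least min_distance bp from the previously kept SNP on the same chromosome.

    variants: iterable of (chrom, pos) tuples taken from the VCF CHROM and POS fields.
    """
    kept = []
    last_kept_pos = {}  # chrom -> position of the most recently kept SNP
    for chrom, pos in sorted(variants):
        if chrom not in last_kept_pos or pos - last_kept_pos[chrom] >= min_distance:
            kept.append((chrom, pos))
            last_kept_pos[chrom] = pos
    return kept


variants = [("chr1", 100), ("chr1", 150), ("chr1", 400), ("chr2", 120)]
thinned = thin_linked_snps(variants, min_distance=200)
print(thinned)
```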
Performance
- Made the Sankey plot function more modular and dynamic.
Bug Fixes
- Fixed spacing in output printed to STDOUT.