Releases: PGScatalog/pgsc_calc
v2.0.0-beta.1
Changelog
Bug fixes
- Fix samplesheet parsing error warnings by @smlmbrt in #322
- Write consistent column sets to variant information files by @nebfield in #330
Full Changelog: v2.0.0-beta...v2.0.0-beta.1
v2.0.0-beta
Changelog
Graduating to beta with the release of our preprint 🎉
Improvements
- Improve aggregation PGScatalog/pygscatalog#23
- Improve matching performance PGScatalog/pygscatalog#22
- Improve match error docs #311
- Publish dependencies to Bioconda to improve conda profile UX
Bug fixes
- Fix for PGScatalog/pygscatalog#21
- Closes #301
- Specify modules explicitly to fix #312
- Fix bim input to pgscatalog-aggregate #319
pgsc_calc v2.0.0-alpha.6
Changelog
2024-05-28 update: We're investigating unexpected pgscatalog.core.lib.pgsexceptions.MatchRateError errors in some environments (e.g. UK Biobank on an HPC). This release has been downgraded to a pre-release.
Please note the minimum required nextflow version has been updated to v23.10.0, released in October 2023. Run nextflow self-update to upgrade your nextflow version.
Improvements
- Migrate our custom python tools to new pygscatalog packages
- Reference / target intersection now considers allelic frequency and variant missingness to determine PCA eligibility
- Downloads from PGS Catalog should be faster (async)
- Packages are now documented
- Update plink version to alpha 5.10 final #179
- Add docs describing cloud execution
- Add correlation test comparing calculated scores against known good scores
- When matching variants, matching logs are now written before scorefiles to improve debugging UX
- Improvements to PCA quality (ensuring low missingness and suitable MAF for PCA-eligible variants in target samples).
- This could allow us to implement MAF/missingness filters for scoring file variants in the future.
Bug fixes
- Fix ancestry adjustment with VCFs #252
- Fix support for scoring files that only have one effect type column #280
- Fix adjusting PGS with zero variance (skip them) #283
- Check for reserved characters in sampleset names
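
The zero-variance fix above can be illustrated with a minimal sketch. This is not the pipeline's implementation: the function name zscore_adjust and the score identifiers are hypothetical, and plain Z-score standardisation is assumed only for illustration. A score whose values are all identical has a standard deviation of zero, so dividing by it would fail; the sketch skips such scores and reports them instead.

```python
import statistics


def zscore_adjust(scores_by_pgs):
    """Standardise each PGS, skipping any PGS whose values have zero variance.

    scores_by_pgs maps a (hypothetical) score ID to a list of per-sample values.
    Returns the adjusted scores and the list of skipped score IDs.
    """
    adjusted, skipped = {}, []
    for pgs_id, values in scores_by_pgs.items():
        sd = statistics.pstdev(values)
        if sd == 0:
            # Dividing by zero is undefined, so leave this score unadjusted.
            skipped.append(pgs_id)
            continue
        mean = statistics.fmean(values)
        adjusted[pgs_id] = [(v - mean) / sd for v in values]
    return adjusted, skipped


# Toy data: PGS_B is constant across samples, so it cannot be standardised.
example = {"PGS_A": [0.1, 0.3, 0.5], "PGS_B": [2.0, 2.0, 2.0]}
adjusted, skipped = zscore_adjust(example)
```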
Known bug
- Incorrectly adjusting the AVG in --run_ancestry mode #301
- Unexpected pgscatalog.core.lib.pgsexceptions.MatchRateError in some environments (e.g. UK Biobank on an HPC)
pgsc_calc v2.0.0-alpha.5
Changelog
Improvements
- Automatically mount directories inside singularity containers without setting any configuration
- Improve permanent caching of ancestry processes with --genotypes_cache parameter
- Resync with nf-core framework
- Refactor combine_scorefiles to improve speed and quality control processes
Bug fixes
- Fix semantic storeDir definitions causing problems with cloud execution (Google Batch)
- Fix missing DENOM values with multiple custom scoring files (score calculation not affected)
- Fix liftover failing silently with custom scoring files (thanks Brooke!)
Misc:
- Move aggregation step out of report
- Improve speed of ANCESTRY_ANALYSIS
pgsc_calc v2.0.0-alpha.4
Changelog
Improvements
- Give a more helpful error message when there are no valid matches in match_combine
Bug fixes
- Fix retrying downloads when the EBI servers are sleepy on a Monday morning
- Fix numeric sample identifiers breaking ancestry analysis
- Check chr prefix in samplesheets
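
The download retry fix above follows a common pattern that can be sketched as below. This is an illustration under stated assumptions, not the code used by the pipeline: the names retry and flaky_download are hypothetical, and exponential backoff is assumed as the waiting strategy.

```python
import time


def retry(func, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Call func(); on a network-style error wait base_delay * 2**n, then retry.

    The final failure is re-raised so the caller still sees the error.
    """
    for n in range(attempts):
        try:
            return func()
        except OSError:  # ConnectionError and TimeoutError are subclasses
            if n == attempts - 1:
                raise
            sleep(base_delay * 2 ** n)


# Hypothetical stand-in for a download that fails twice before succeeding.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("server busy")
    return "scorefile contents"


result = retry(flaky_download)
```

Passing sleep as a parameter keeps the sketch easy to test, since a no-op can be substituted for the real delay.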
pgsc_calc v2.0.0-alpha.3
Improvements:
- Automatically retry scoring with more RAM on larger datasets
- Describe scoring precision in docs
- Change handling of VCFs to reduce errors when recoding
- Internal changes to improve support for custom reference panels
Bug fixes:
- Fix VCF input to ancestry projection subworkflow (thanks frahimov and AWS-crafter for patiently debugging)
- Fix scoring options when reading allelic frequencies from a reference panel (thanks raimondsre for reporting the changes from v1.3.2 -> 2.0.0-alpha)
- Fix conda profile action
pgsc_calc v2.0.0-alpha.2
Changelog
- Bump pgscatalog_utils v0.4.0 -> v0.4.1
- Closes #165
pgsc_calc v2.0.0-alpha.1
This patch fixes a bug when running the workflow directly from github with the test profile (i.e. without cloning first). Thanks to @staedlern for reporting the problem.
pgsc_calc v2.0.0-alpha
This is the alpha release of the pgsc_calc pipeline's major new feature: to compare samples to a reference population in order to adjust PGS with genetic ancestry data (see documentation for details). The normal calculation of PGS is largely unaffected and directly comparable with previous versions of the calculator and PGS calculated with other tools.
Features
Major
- Breaking changes to samplesheet structure to provide more flexible support for extra genomic file types in the future.
- Genetic ancestry group similarity is calculated to a population reference panel (default: 1000 Genomes) when the --run_ancestry flag is supplied. This runs using PCA and projection implemented in the fraposa_pgsc (v0.1.0) package.
- Calculated PGS can be adjusted for genetic ancestry using empirical PGS distributions from the most similar reference panel population or continuous PCA-based regressions.
These new features are optional and don't run in the default workflow.
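
As a rough illustration of the empirical adjustment described above, the sketch below compares one target sample's PGS with the score distribution of its most similar reference population. It is a simplified assumption of how such an adjustment could look, not the pipeline's method: the function name, the population labels POP_A and POP_B, and the toy values are all hypothetical, and the PCA-based regression approach is not shown.

```python
import statistics


def adjust_to_reference(target_pgs, reference_pgs_by_population, most_similar_population):
    """Express a target PGS relative to one reference population's distribution.

    Returns a Z-score (using the sample standard deviation) and an empirical
    percentile (share of reference scores at or below the target score).
    """
    ref = reference_pgs_by_population[most_similar_population]
    mean = statistics.fmean(ref)
    sd = statistics.stdev(ref)
    z = (target_pgs - mean) / sd
    percentile = 100 * sum(r <= target_pgs for r in ref) / len(ref)
    return z, percentile


# Toy reference panel scores for two hypothetical populations.
reference = {
    "POP_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "POP_B": [10.0, 11.0, 12.0, 13.0, 14.0],
}
z, pct = adjust_to_reference(4.0, reference, "POP_A")
```

The same raw score would sit at a very different position in POP_B, which is why the most similar population has to be chosen first.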
Minor
- Speed optimizations for PGS scoring (skipping allele frequency calculation). Thanks to @mglev1n for the suggestion!
Credits
pgsc_calc v1.3.2
This patch fixes a bug that caused the effect weight column in some PGS Catalog scoring files to be read as strings instead of floats, which triggered an assertion error. Thanks to @j0n-a for reporting the problem.
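
The class of bug described above can be sketched as follows. Values read from a delimited text file arrive as strings unless they are converted explicitly, so arithmetic on them fails or misbehaves. The column names and the two-row scoring file here are assumptions for illustration, and read_weights is a hypothetical helper, not the function that was patched.

```python
import csv
import io

# Toy tab-separated scoring file; the column names are assumed for illustration.
scorefile = "rsID\teffect_allele\teffect_weight\nrs1\tA\t0.25\nrs2\tG\t-1.5e-2\n"


def read_weights(handle):
    """Read scoring file rows, converting effect_weight from str to float."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # csv returns every field as a string; convert before doing arithmetic.
        row["effect_weight"] = float(row["effect_weight"])
        rows.append(row)
    return rows


rows = read_weights(io.StringIO(scorefile))
total = sum(r["effect_weight"] for r in rows)
```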