Releases: Ensembl/plant-scripts
11012024
This release ships with updates to GET_PANGENES: code changes since the publication of the manuscript, involving:
- fixed a bug in the handling of minus-strand coordinates in sub query2ref_coords
- sub _parseCIGARfeature now correctly handles 1 bp CS-type SNPs when computing overlap with an optional query coordinate
- tested rename_pangenes.pl with the MAGIC16 rice dataset; see the AgBioData nomenclature rules referenced at https://github.com/Ensembl/plant-scripts/blob/df9cfdef5e49e6f463a08e7ed8ec8a04556735ff/pangenes/rename_pangenes.pl#L5C48-L5C57 . Code to update a previous cluster set is not yet in place.
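To illustrate the kind of coordinate conversion the first two fixes concern, here is a minimal sketch in Python. It is not the Perl code shipped in GET_PANGENES. The function name query_to_ref and the pattern CS_OP are hypothetical. It assumes PAF-style 0-based, half-open coordinates and a short-form cs string (as written by minimap2 with --cs) that is laid out in reference direction, so on the minus strand the query is walked from its end. Intron operations (~) are not handled.

```python
import re

# Short-form cs operations: ":N" match run, "*xy" 1 bp substitution,
# "+seq" insertion in the query, "-seq" deletion from the query.
CS_OP = re.compile(r"(:\d+|\*[a-z][a-z]|\+[a-z]+|-[a-z]+)")

def query_to_ref(q, qstart, qend, tstart, strand, cs):
    """Map 0-based query coordinate q to a 0-based reference coordinate.

    Returns None if q lies outside [qstart, qend) or inside an insertion.
    """
    if not (qstart <= q < qend):
        return None
    # Number of query bases that precede q along the alignment;
    # on the minus strand the alignment starts at the query end.
    offset = q - qstart if strand == "+" else qend - 1 - q
    qpos = 0        # query bases consumed so far
    tpos = tstart   # current reference position
    for op in CS_OP.findall(cs):
        kind = op[0]
        if kind == ":":                  # run of identical bases
            n = int(op[1:])
            if offset < qpos + n:
                return tpos + (offset - qpos)
            qpos += n
            tpos += n
        elif kind == "*":                # 1 bp SNP: one base on each side
            if offset == qpos:
                return tpos
            qpos += 1
            tpos += 1
        elif kind == "+":                # bases present only in the query
            n = len(op) - 1
            if offset < qpos + n:
                return None
            qpos += n
        else:                            # "-": bases present only in the reference
            tpos += len(op) - 1
    return None
```

For a segment with query interval [100, 110), reference start 1000 and cs string ":5*ag:4", query position 105 falls on the SNP when the strand is "+", whereas on the "-" strand it is query position 104 that does.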
15112023
This release ships with updates to:
- GET_PANGENES: code and documentation changes since the publication of the manuscript, involving improved handling of input GFF files and calculation of overlap coordinates from WGA segments on different strands.
- REST-based recipes.
pangenes_benchmark
Pangene sets of Arabidopsis (ACK), rice, wheat and barley datasets produced while benchmarking get_pangenes as described at https://doi.org/10.1186/s13059-023-03071-z and https://www.biorxiv.org/content/10.1101/2023.01.03.520531v2
The HOWTO* files contain the actual commands required to produce these results with the input FASTA & GFF files (32GB), which should first be downloaded from
test_rice
Toy dataset to test the scripts for pan-gene analysis.
nrTEplants
Release 0.3 (Jun2020) of the nrTEplants library of plant transposable elements, which minimizes overlap with sequences containing protein domains known to be part of NLR genes. This sequence set was computed by combining TREP, SINEbase, REdat, RepetDB, EDTArice, EDTAmaize, SoyBaseTE, TAIR10TE, SunflowerTE, MelonTE, RosaTE and SUNREP and then obtaining a non-redundant collection with GET_HOMOLOGUES-EST.
Check the code and documentation at https://github.com/Ensembl/plant_tools/tree/master/bench/repeat_libs
Citation: Contreras-Moreira,B., Filippi,C.V., Naamati,G., Girón,C.G., Allen,J.E. and Flicek,P. (2021) Efficient masking of plant genomes by combining kmer counting and curated repeats Genomics. Plant Genome https://doi.org/10.1002/tpg2.20143
23102020
This release was created to obtain a DOI from Zenodo