Another test data failure after fresh conda install #1073

Open
amcomeau opened this issue Oct 22, 2024 · 2 comments

@amcomeau

I'm hitting an error similar to those reported in many other issues, but this is with the most current version installed into a fresh environment, so I'm not sure why the conda package does not pull in all dependencies. I have had to install a few things manually to get other sections working, and now AUGUSTUS fails even though it is installed as part of the conda environment.

Are you using the latest release?
Yes, directly installed from conda.

Describe the bug
Test run fails with AUGUSTUS intron errors.

What command did you issue?
funannotate test -t all --cpus 40

Logfiles
The Clean, Mask and Predict (unit testing) modules all complete...then we hit the error at the BUSCO training module:

Running funannotate predict BUSCO-mediated training unit testing
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --cpus 40 --species Awesome busco

[Oct 21 10:38 PM]: OS: Ubuntu 20.04, 48 cores, ~ 264 GB RAM. Python: 3.8.19
[Oct 21 10:38 PM]: Running funannotate v1.8.17
[Oct 21 10:38 PM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[Oct 21 10:38 PM]: Skipping CodingQuarry as no --rna_bam passed
[Oct 21 10:38 PM]: Parsed training data, run ab-initio gene predictors as follows:
  Program      Training-Method
  augustus     busco
  glimmerhmm   busco
  snap         busco
[Oct 21 10:38 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Oct 21 10:38 PM]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-p2g.py:14: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
from pkg_resources import parse_version
[Oct 21 10:38 PM]: Mapping 1,065 proteins to genome using diamond and exonerate
[Oct 21 10:38 PM]: Found 1,505 preliminary alignments with diamond in 0:00:01 --> generated FASTA files for exonerate in 0:00:00
Progress: 1505 complete, 0 failed, 0 remaining
[Oct 21 10:38 PM]: Exonerate finished in 0:00:11: found 1,272 alignments
[Oct 21 10:38 PM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Oct 21 10:40 PM]: 370 valid BUSCO predictions found, validating protein sequences
[Oct 21 10:41 PM]: 202 BUSCO predictions validated
[Oct 21 10:41 PM]: Training Augustus using BUSCO gene models
Error: In sequence CP022970.1_48453-52721: One CDS exon does not begin properly after the previous CDS exon.602 >= 600
GBProcessor::getGeneList(): Intron has non-positive length.
Encountered error after reading 0 annotations.

...this then continues for multiple instances of the above error...

augustus: ERROR
No genbank sequences found.

Traceback (most recent call last):
  File "/home/an351485/bin/miniconda3/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/predict.py", line 2094, in main
    lib.trainAugustus(
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/library.py", line 10971, in trainAugustus
    train_results = getTrainResults(
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/library.py", line 10708, in getTrainResults
    float(values1[1]),
UnboundLocalError: local variable 'values1' referenced before assignment
#########################################################
Traceback (most recent call last):
  File "/home/an351485/bin/miniconda3/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 407, in main
    runBuscoTest(args)
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 200, in runBuscoTest
    assert 1500 <= countGFFgenes(os.path.join(
  File "/home/an351485/bin/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 45, in countGFFgenes
    with open(input, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test-busco_cc4bc53d-61d2-4825-b1bd-5dde79eb56b6/annotate/predict_results/Awesome_busco.gff3'
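
The first traceback above (UnboundLocalError on 'values1') is the kind of failure that appears when a variable is only assigned inside a loop branch that never runs, for example because the Augustus training report was empty after the "No genbank sequences found" error. Below is a minimal Python sketch of that pattern and a defensive alternative. It is not funannotate's actual getTrainResults code; the function names and the report line format are hypothetical and chosen only for illustration.

```python
def parse_accuracy_unsafe(report_lines):
    """Hypothetical parser: 'values1' is bound only if a matching line exists."""
    for line in report_lines:
        if line.startswith("nucleotide level"):
            values1 = line.split("|")
    # With no matching line, 'values1' was never assigned and this raises
    # UnboundLocalError, mirroring the traceback in the log.
    return float(values1[1])


def parse_accuracy_safe(report_lines):
    """Same idea, but returns None when the expected line is missing."""
    values1 = None
    for line in report_lines:
        if line.startswith("nucleotide level"):
            values1 = line.split("|")
    if values1 is None:
        return None  # caller can report "training produced no results"
    return float(values1[1])


if __name__ == "__main__":
    empty_report = []  # stands in for an empty training report
    print(parse_accuracy_safe(empty_report))
    # Assumed (illustrative) report line format, not copied from Augustus output
    print(parse_accuracy_safe(["nucleotide level | 0.95 | 0.90 |"]))
```

If this reading is right, the UnboundLocalError is a downstream symptom: the underlying problem is that Augustus read 0 annotations from the training set, and the second traceback (missing Awesome_busco.gff3) follows from the same aborted run.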

OS/Install Information

Checking dependencies for 1.8.17

You are running Python v 3.8.19. Now checking python packages...
biopython: 1.76
goatools: 1.4.12
matplotlib: 3.7.3
natsort: 8.4.0
numpy: 1.24.4
pandas: 2.0.3
psutil: 5.7.0
requests: 2.32.3
scikit-learn: 1.3.2
scipy: 1.10.1
seaborn: 0.13.2
All 11 python packages installed

You are running Perl v b'5.032001'. Now checking perl modules...
Carp: 1.50
Clone: 0.46
DBD::SQLite: 1.72
DBD::mysql: 4.050
DBI: 1.643
DB_File: 1.858
Data::Dumper: 2.183
File::Basename: 2.85
File::Which: 1.24
Getopt::Long: 2.58
Hash::Merge: 0.302
JSON: 4.10
LWP::UserAgent: 6.67
Logger::Simple: 2.0
POSIX: 1.94
Parallel::ForkManager: 2.03
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.14
Tie::File: 1.06
URI::Escape: 5.17
YAML: 1.30
local::lib: 2.000029
threads: 2.25
threads::shared: 1.61
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/home/an351485/bin/miniconda3/envs/funannotate/funannotate_db/
$PASAHOME=/home/an351485/bin/miniconda3/envs/funannotate/opt/pasa-2.5.3
$TRINITY_HOME=/home/an351485/bin/miniconda3/envs/funannotate/opt/trinity-2.15.2
$EVM_HOME=/home/an351485/bin/miniconda3/envs/funannotate/opt/evidencemodeler-2.1.0
$AUGUSTUS_CONFIG_PATH=/home/an351485/bin/miniconda3/envs/funannotate/config/
ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir

Checking external dependencies...
PASA: 2.5.3
CodingQuarry: 2.0
Trinity: 2.15.2
augustus: 3.5.0
bamtools: bamtools 2.5.2
bedtools: bedtools v2.31.1
blat: BLAT v39x1
diamond: 2.1.10
emapper.py: 2.1.12
ete3: 3.1.3
exonerate: exonerate 2.4.0
fasta: 36.3.8g
glimmerhmm: 3.0.4
gmap: 2024-10-10
hisat2: 2.2.1
hmmscan: HMMER 3.4 (Aug 2023)
hmmsearch: HMMER 3.4 (Aug 2023)
java: 22.0.1-internal
kallisto: 0.46.1
mafft: v7.526 (2024/Apr/26)
makeblastdb: makeblastdb 2.16.0+
minimap2: 2.28-r1209
pigz: 2.8
proteinortho: 6.3.2
pslCDnaFilter: no way to determine
salmon: salmon 1.10.3
samtools: samtools 1.21
snap: 2006-07-28
stringtie: 2.2.3
tRNAscan-SE: 2.0.12 (Nov 2022)
tantan: tantan 50
tbl2asn: 25.8
tblastn: tblastn 2.16.0+
trimal: trimAl v1.5.rev0 build[2024-05-27]
trimmomatic: 0.39
ERROR: gmes_petap.pl not installed
ERROR: signalp not installed

Note that I'm not interested in using GeneMark-ES or SignalP for the moment, so I'm ignoring those two errors for now; the test should still complete without them.

@ceneg

ceneg commented Nov 7, 2024

I did a fresh install of Funannotate as well.
I ran "funannotate test -t all --cpus 60" with both of the following augustus builds:

  • augustus=3.5.0=pl5321h95201ac_4
  • augustus=3.5.0=pl5321heb9362c_5

In both cases the test fails with many errors of the type
GBProcessor::getGeneList(): Intron has non-positive length.
Encountered error after reading 0 annotations.

Has there been any development on this issue?

Update: this only happens in the "BUSCO-mediated training unit testing", right after
Training Augustus using BUSCO gene models

The initial BUSCO prediction seems to run without problems:

[Nov 07 08:34 AM]: Running Augustus gene prediction using saccharomyces parameters
     Progress: 11 complete, 0 failed, 0 remaining        
[Nov 07 08:35 AM]: 1,485 predictions from Augustus
[Nov 07 08:35 AM]: Pulling out high quality Augustus predictions
[Nov 07 08:35 AM]: Found 371 high quality predictions from Augustus (>90% exon evidence)

and running only "funannotate test -t predict" finishes successfully.
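
One way to narrow down which BUSCO training models trigger the Augustus message is to check the CDS coordinates of each model for overlapping or touching exons. The sketch below assumes 1-based inclusive (start, end) pairs, as in GenBank or GFF3 features, and treats an intron as non-positive when the next exon starts at or before the position right after the previous exon's end. The function name and the example coordinates are hypothetical (the 602 and 600 values only echo the numbers in the log), and this is not the check Augustus itself performs.

```python
def find_bad_introns(cds_exons):
    """Return (prev_exon, next_exon, intron_length) for each non-positive intron.

    cds_exons: iterable of (start, end) tuples, 1-based inclusive (assumed).
    """
    ordered = sorted(cds_exons)
    problems = []
    for (prev_start, prev_end), (next_start, next_end) in zip(ordered, ordered[1:]):
        # Number of bases strictly between the two exons
        intron_len = next_start - prev_end - 1
        if intron_len <= 0:
            problems.append(((prev_start, prev_end), (next_start, next_end), intron_len))
    return problems


if __name__ == "__main__":
    # Hypothetical model: the second exon starts before the first one ends
    print(find_bad_introns([(100, 602), (600, 900)]))
    # Well-separated exons produce no findings
    print(find_bad_introns([(1, 100), (200, 300)]))
```

Running a check like this over the BUSCO-derived models would at least show whether the coordinates handed to Augustus already overlap, or whether the problem appears later when the GenBank training file is written.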

@hassantarabai

hassantarabai commented Nov 13, 2024

Same error here with a fresh installation and using the suggested augustus version.
