Augustus Training Error: Missing ExonModel File and UnboundLocalError in Funannotate v1.8.17 Despite Pretrained Species Use #1069

Open
kalonji08 opened this issue Sep 20, 2024 · 5 comments

Comments

@kalonji08

Are you using the latest release?
Yes, I am using funannotate version v1.8.17, and the issue persists.

Describe the bug
While running the funannotate predict pipeline, I encountered an issue during the Augustus training step. Augustus fails to open the trametes_hisurta_exon_probs.pbl file during ab initio prediction. I tried running the pipeline with a pretrained species model as well, but the same error occurred. This pipeline worked fine with a previous version of funannotate, but with the current version, it results in the following error:

augustus: ERROR
        ExonModel: Couldn't open file /media/external/FA-Trametes/new_trametes_run/prediction_trametes_1/predict_misc/ab_initio_parameters/augustus/species/trametes_hisurta/trametes_hisurta_exon_probs.pbl

Additionally, the species folder contains the following files, but not the exon_probs.pbl file named in the error:

trametes_hisurta_metapars.cfg  
trametes_hisurta_metapars.cgp.cfg  
trametes_hisurta_metapars.utr.cfg  
trametes_hisurta_parameters.cfg  
trametes_hisurta_weightmatrix.txt
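
A quick way to confirm which parameter files are absent is sketched below. The suffix list is an assumption based on the files usually found in an Augustus species folder, and `missing_species_files` is a hypothetical helper, not part of funannotate or Augustus.

```python
from pathlib import Path

# Assumed set of per-species files; adjust if your Augustus version differs.
EXPECTED_SUFFIXES = (
    "_parameters.cfg",
    "_weightmatrix.txt",
    "_metapars.cfg",
    "_exon_probs.pbl",
    "_intron_probs.pbl",
    "_igenic_probs.pbl",
)


def missing_species_files(species_dir):
    """Return the expected file names that are absent from a species folder.

    The species name is taken from the folder name, e.g.
    .../species/trametes_hisurta -> trametes_hisurta.
    """
    species_dir = Path(species_dir)
    species = species_dir.name
    return [
        species + suffix
        for suffix in EXPECTED_SUFFIXES
        if not (species_dir / (species + suffix)).is_file()
    ]
```

Calling `missing_species_files(...)` on the species folder from the error message would list the probability files that training was expected to write but did not.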

The traceback error in funannotate is:

UnboundLocalError: local variable 'values1' referenced before assignment

What command did you issue?

funannotate predict -i Trametes-hirsuta-S2-contigs_masked.fasta -o prediction_trametes_1 -s "Trametes hisurta" --cpus 22

Logfiles
Here are the relevant portions from the log:

[Sep 20 12:48 PM]: OS: Ubuntu 20.04, 32 cores, ~ 82 GB RAM. Python: 3.8.19
[Sep 20 12:48 PM]: Running funannotate v1.8.17
[Sep 20 12:48 PM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[Sep 20 12:48 PM]: Skipping CodingQuarry as no --rna_bam passed
[Sep 20 02:54 PM]: Training Augustus using BUSCO gene models
augustus: ERROR
        ExonModel: Couldn't open file /media/external/FA-Trametes/new_trametes_run/prediction_trametes_1/predict_misc/ab_initio_parameters/augustus/species/trametes_hisurta/trametes_hisurta_exon_probs.pbl

OS/Install Information
Ubuntu 20.04
Python 3.8.19
funannotate v1.8.17

Output of funannotate check --show-versions:

-------------------------------------------------------
Checking dependencies for funannotate v1.8.17
-------------------------------------------------------
You are running Python 3.8.19. Now checking python packages...
biopython: 1.79
goatools: 1.4.12
matplotlib: 3.5.2
natsort: 8.4.0
numpy: 1.24.4
pandas: 1.4.3
psutil: 5.9.4
requests: 2.32.3
scikit-learn: 1.3.2
scipy: 1.8.1
seaborn: 0.11.2
All 11 python packages installed

You are running Perl v5.032001. Now checking perl modules...
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/home/kalonjilab/funannotate_db
$PASAHOME=/home/kalonjilab/miniconda3/envs/funannotate/opt/pasa-2.5.3
$TRINITY_HOME=/home/kalonjilab/miniconda3/envs/funannotate/opt/trinity-2.15.2
$EVM_HOME=/home/kalonjilab/miniconda3/envs/funannotate/opt/evidencemodeler-2.1.0
$AUGUSTUS_CONFIG_PATH=/home/kalonjilab/miniconda3/envs/funannotate/config/
        ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir
-------------------------------------------------------
Checking external dependencies...
PASA: 2.5.3
CodingQuarry: 2.0
Trinity: 2.15.2
augustus: 3.5.0
bedtools: v2.31.1
diamond: 2.1.8
exonerate: 2.4.0
samtools: 1.21
tbl2asn: 25.8
        ERROR: gmes_petap.pl not installed
        ERROR: signalp not installed
@kalonji08
Author

I also ran this:
funannotate test -t predict --cpus 12

And I got a similar error:
_misc/EVM/CP022973.1/CP022973.1_294245-357021/evm.out.log'],), **{})
error: [Errno 2] No such file or directory: '/home/kalonjilab/miniconda3/envs/funannotate/opt/evidencemodeler-2.1.0/evidence_modeler.pl' run(*(['/home/kalonjilab/miniconda3/envs/funannotate/opt/evidencemodeler-2.1.0/evidence_modeler.pl', '-G', '/home/kalonjilab/testing_FA/test_mamba/test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_misc/EVM/CP022973.1/CP022973.1_431922-576084/genome.softmasked.fa', '-g', '/home/kalonjilab/testing_FA/test_mamba/test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_misc/EVM/CP022973.1/CP022973.1_431922-576084/gene_predictions.gff3', '-w', '/home/kalonjilab/testing_FA/test_mamba/test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_misc/weights.evm.txt', '--min_intron_length', '10', '--exec_dir', '/home/kalonjilab/testing_FA/test_mamba/test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_misc/EVM/CP022973.1/CP022973.1_431922-576084', '-p', '/home/kalonjilab/testing_FA/test_mamba/test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_misc/EVM/CP022973.1/CP022973.1_431922-576084/protein_alignments.gff3', '/home/kalonjilab/testing_FA/test_mamba/test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_misc/EVM/CP022973.1/CP022973.1_431922-576084/evm.out', '/home/kalonjilab/testing_FA/test_mamba/test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_misc/EVM/CP022973.1/CP022973.1_431922-576084/evm.out.log'],), **{})
Progress: 38 complete, 0 failed, 0 remaining
[Sep 20 04:30 PM]: Converting to GFF3 and collecting all EVM results
[Sep 20 04:30 PM]: Evidence modeler has failed, exiting
#########################################################
Traceback (most recent call last):
  File "/home/kalonjilab/miniconda3/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/home/kalonjilab/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/kalonjilab/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 405, in main
    runPredictTest(args)
  File "/home/kalonjilab/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 160, in runPredictTest
    assert 1500 <= countGFFgenes(os.path.join(
  File "/home/kalonjilab/miniconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/test.py", line 45, in countGFFgenes
    with open(input, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test-predict_b960c5d6-9f1a-433b-9fa5-d84404cf30ca/annotate/predict_results/Awesome_testicus.gff3'

@hyphaltip
Collaborator

Hard to say. The test failure looks EVM-related rather than Augustus-related.
Do you have a partially failed Augustus training run? I would remove
/media/external/FA-Trametes/new_trametes_run/prediction_trametes_1/predict_misc/ab_initio_parameters
and rerun predict to see whether that helps.
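
As a minimal sketch of that cleanup step, assuming the output folder layout shown in the error message (`<output>/predict_misc/ab_initio_parameters`): the helper name `clear_augustus_training` is hypothetical and not part of funannotate.

```python
import shutil
from pathlib import Path


def clear_augustus_training(output_dir):
    """Delete <output_dir>/predict_misc/ab_initio_parameters if it exists.

    Returns True when a directory was removed, False when there was
    nothing to remove. Only this one subdirectory is touched.
    """
    target = Path(output_dir) / "predict_misc" / "ab_initio_parameters"
    if target.is_dir():
        shutil.rmtree(target)
        return True
    return False
```

For the run in this issue, the argument would be the `prediction_trametes_1` folder passed to `-o`. After the directory is gone, rerunning the same `funannotate predict` command should start Augustus training from scratch.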

@hyphaltip
Collaborator

hyphaltip commented Sep 23, 2024

I wonder whether the pretrained parameters reflect a different version of Augustus. It may be that the pretrained model was built for an earlier Augustus version than the one you have installed now, since it sounds like the exon probability file was not generated.

An alternative problem, which the testing step is revealing, is that Augustus or something else is not really working in your install. Can you test running Augustus alone? It is reporting a version number, so I assume it is at least partially OK.
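
One way to run Augustus on its own, outside funannotate, is sketched below. It uses the standard `augustus --species=SPECIES queryfile` invocation; the species name and FASTA path passed in are placeholders you would supply, and both function names are hypothetical.

```python
import shutil
import subprocess


def augustus_command(species, fasta):
    """Build the argument list for a stand-alone Augustus run."""
    return ["augustus", f"--species={species}", str(fasta)]


def run_augustus_alone(species, fasta):
    """Run Augustus directly and return the CompletedProcess.

    Raises FileNotFoundError if no augustus executable is on PATH.
    """
    if shutil.which("augustus") is None:
        raise FileNotFoundError("augustus not found on PATH")
    return subprocess.run(
        augustus_command(species, fasta),
        capture_output=True,
        text=True,
    )
```

A non-zero `returncode` or an error message in `stderr` from a bundled species on a small FASTA file would point at the Augustus install itself rather than at funannotate.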

@JWDebler

JWDebler commented Oct 7, 2024

I'm running into the same Augustus BUSCO training error when running the test pipeline.
@kalonji08, your EVM error is caused by a filename or location change. Depending on how you installed it, you can fix it by going to the evidencemodeler directory and creating a link to the Perl file funannotate is looking for.

cd /data/mamba_envs/envs/funannotate/opt/evidencemodeler-2.1.0/
ln -s ./EvmUtils/evidence_modeler.pl ./evidence_modeler.pl

However, after fixing that you will likely run into the error discussed here.

[Oct 07 04:05 AM]: Training Augustus using BUSCO gene models
Error: In sequence CP022970.1_52703-54400: One CDS exon does not begin properly after the previous CDS exon.602 >= 600
GBProcessor::getGeneList(): Intron has non-positive length.
.
.
.

augustus: ERROR
        No genbank sequences found.

Traceback (most recent call last):
  File "/data/mamba_envs/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/predict.py", line 2094, in main
    lib.trainAugustus(
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/library.py", line 10971, in trainAugustus
    train_results = getTrainResults(
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/library.py", line 10708, in getTrainResults
    float(values1[1]),
UnboundLocalError: local variable 'values1' referenced before assignment
#########################################################
Traceback (most recent call last):
  File "/data/mamba_envs/envs/funannotate/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/test.py", line 407, in main
    runBuscoTest(args)
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/test.py", line 200, in runBuscoTest
    assert 1500 <= countGFFgenes(os.path.join(
  File "/data/mamba_envs/envs/funannotate/lib/python3.9/site-packages/funannotate/test.py", line 45, in countGFFgenes
    with open(input, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test-busco_e2cf0158-c60d-4c98-bc2c-41a5406a46d0/annotate/predict_results/Awesome_busco.gff3'

Deleting intermediate data of a failed test run fixed this problem.
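
The `UnboundLocalError` in the traceback above is consistent with a common Python pattern: a variable is assigned only when a particular line appears in a tool's output, then used unconditionally. The sketch below is a hypothetical illustration, not funannotate's actual `getTrainResults` code; the function names and the `nucleotide level` marker are assumptions chosen for the example.

```python
def parse_fragile(lines):
    """Fails with UnboundLocalError when no matching line exists."""
    for line in lines:
        if line.startswith("nucleotide level"):
            values1 = line.split("|")
    # If the loop never matched, values1 was never bound.
    return float(values1[1])


def parse_guarded(lines):
    """Same parsing, but reports a missing line explicitly."""
    values1 = None
    for line in lines:
        if line.startswith("nucleotide level"):
            values1 = line.split("|")
    if values1 is None:
        raise ValueError(
            "no 'nucleotide level' line found; Augustus training likely failed"
        )
    return float(values1[1])
```

Read this way, the Python error is a symptom: Augustus produced no training report (here because of "No genbank sequences found"), so there was nothing to parse.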

@nextgenusfs
Owner

nextgenusfs commented Oct 22, 2024

Per #1071, let's try to see which Augustus-specific conda builds are failing. @JWDebler has two environments, one working and one not; my guess is that one of the Augustus builds from conda is the issue.

I develop on Mac, so I end up using a dockerized Augustus method, because Augustus is highly sensitive to compiler issues that aren't detected in the normal build process: https://github.com/nextgenusfs/dockerized-augustus. It is highly annoying.
