funannotate-predict.py: error: unrecognized arguments: --stopCodonExcludedFromCDS=False #1064

Open
wangpeng-design opened this issue Aug 28, 2024 · 3 comments

@wangpeng-design

Are you using the latest release?
If you are not using the latest release of funannotate, please upgrade; if the bug persists, report it here.
v1.8.13
Describe the bug
A clear and concise description of what the bug is.
funannotate-predict.py: error: unrecognized arguments: --stopCodonExcludedFromCDS=False
What command did you issue?
Copy/paste the command used.
funannotate predict -i PGChrGenome.softmask.fasta --species "Pleurotus ostreatus" --transcript_alignments transcript_alignments.gff3:8 --protein_alignments protein_alignments.gff:4 --augustus_gff gene_predictions.gff:1 --trnascan tRNA.out -o output_folder --stopCodonExcludedFromCDS=False
Logfiles
Please provide relevant log files of the error.

OS/Install Information

  • output of funannotate check --show-versions

Checking dependencies for 1.8.13

You are running Python v 3.8.15. Now checking python packages...
biopython: 1.83
goatools: 1.2.3
matplotlib: 3.4.3
natsort: 8.4.0
numpy: 1.24.3
pandas: 1.4.2
psutil: 6.0.0
requests: 2.32.3
scikit-learn: 1.1.1
scipy: 1.10.1
seaborn: 0.13.2
All 11 python packages installed

You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
threads: 2.15
threads::shared: 1.56
ERROR: local::lib not installed, install with cpanm local::lib

Checking Environmental Variables...
$FUNANNOTATE_DB=/public/home/bs20233171040/Genomic_data/funannotate_db
$PASAHOME=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/opt/pasa-2.4.1
$TRINITY_HOME=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/config/
$GENEMARK_PATH=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/gmes_petap.pl
All 6 environmental variables are set

Checking external dependencies...
samtools: /public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/../lib/libtinfow.so.6: no version information available (required by samtools)
samtools: /public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
samtools: /public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.4.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.1.8
emapper.py: 2.1.12
ete3: 3.1.3
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
gmes_petap.pl: 4.33
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.9.1-internal
kallisto: 0.46.1
mafft: v7.525 (2024/Mar/13)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.28-r1209
pigz: pigz 2.8
proteinortho: 6.0.34
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.15.1
signalp: environment.
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 31
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
All 37 external dependencies are installed

@hyphaltip
Collaborator

That's an augustus parameter, not a funannotate parameter, so you should not pass it to funannotate predict.

The directions that mention that command-line parameter describe how to run augustus OUTSIDE of funannotate and then provide the resulting GFF file to funannotate, if you want to do it your own way. However, if you run funannotate normally, where it trains and runs augustus for you automatically, this parameter is already sent to augustus.
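
To make the two steps concrete, here is a minimal shell sketch, not a definitive recipe. It assumes augustus and funannotate are both on your PATH. The species model name in AUGUSTUS_SPECIES is a hypothetical placeholder, and the function names run_augustus and run_predict are made up for illustration. All funannotate arguments are copied unchanged from the command in the report, except that the rejected option is gone.

```shell
# Sketch only: the two steps are wrapped in functions so nothing runs until you call them.
GENOME="PGChrGenome.softmask.fasta"
AUGUSTUS_SPECIES="your_augustus_species"  # hypothetical placeholder, not from the report

# Step 1 (only if you run augustus yourself): the stop-codon option belongs here.
run_augustus() {
    augustus --species="$AUGUSTUS_SPECIES" --gff3=on \
        --stopCodonExcludedFromCDS=False "$GENOME" > gene_predictions.gff
}

# Step 2: pass the pre-computed results to funannotate, without the augustus option.
run_predict() {
    funannotate predict -i "$GENOME" --species "Pleurotus ostreatus" \
        --transcript_alignments transcript_alignments.gff3:8 \
        --protein_alignments protein_alignments.gff:4 \
        --augustus_gff gene_predictions.gff:1 \
        --trnascan tRNA.out -o output_folder
}
```

Calling run_augustus and then run_predict corresponds to the "run it yourself" route. If you let funannotate train and run augustus on its own, step 1 is unnecessary and you would presumably leave out --augustus_gff as well.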

@hyphaltip
Collaborator

Here are all the command-line options for predict:

funannotate predict

Usage:       funannotate predict <arguments>
version:     1.8.17

Description: Script takes genome multi-fasta file and a variety of inputs to do a comprehensive whole
             genome gene prediction.  Uses AUGUSTUS, GeneMark, Snap, GlimmerHMM, BUSCO, EVidence Modeler,
             tbl2asn, tRNAScan-SE, Exonerate, minimap2.
Required:
  -i, --input              Genome multi-FASTA file (softmasked repeats)
  -o, --out                Output folder name
  -s, --species            Species name, use quotes for binomial, e.g. "Aspergillus fumigatus"

Optional:
  -p, --parameters         Ab intio parameters JSON file to use for gene predictors
  --isolate                Isolate name, e.g. Af293
  --strain                 Strain name, e.g. FGSCA4
  --name                   Locus tag name (assigned by NCBI?). Default: FUN_
  --numbering              Specify where gene numbering starts. Default: 1
  --maker_gff              MAKER2 GFF file. Parse results directly to EVM.
  --pasa_gff               PASA generated gene models. filename:weight
  --other_gff              Annotation pass-through to EVM. filename:weight
  --rna_bam                RNA-seq mapped to genome to train Augustus/GeneMark-ET
  --stringtie              StringTie GTF result
  -w, --weights            Ab-initio predictor and EVM weight. Example: augustus:2 or pasa:10
  --augustus_species       Augustus species config. Default: uses species name
  --min_training_models    Minimum number of models to train Augustus. Default: 200
  --genemark_mode          GeneMark mode. Default: ES [ES,ET]
  --genemark_mod           GeneMark ini mod file
  --busco_seed_species     Augustus pre-trained species to start BUSCO. Default: anidulans
  --optimize_augustus      Run 'optimze_augustus.pl' to refine training (long runtime)
  --busco_db               BUSCO models. Default: dikarya. `funannotate outgroups --show_buscos`
  --organism               Fungal-specific options. Default: fungus. [fungus,other]
  --ploidy                 Ploidy of assembly. Default: 1
  -t, --tbl2asn            Assembly parameters for tbl2asn. Default: "-l paired-ends"
  -d, --database           Path to funannotate database. Default: $FUNANNOTATE_DB

  --protein_evidence       Proteins to map to genome (prot1.fa prot2.fa uniprot.fa). Default: uniprot.fa
  --protein_alignments     Pre-computed protein alignments in GFF3 format
  --p2g_pident             Exonerate percent identity. Default: 80
  --p2g_diamond_db         Premade diamond genome database for protein2genome mapping
  --p2g_prefilter          Pre-filter hits software selection. Default: diamond [tblastn]
  --transcript_evidence    mRNA/ESTs to align to genome (trans1.fa ests.fa trinity.fa). Default: none
  --transcript_alignments  Pre-computed transcript alignments in GFF3 format
  --augustus_gff           Pre-computed AUGUSTUS GFF3 results (must use --stopCodonExcludedFromCDS=False)
  --genemark_gtf           Pre-computed GeneMark GTF results
  --trnascan               Pre-computed tRNAscanSE results

  --min_intronlen          Minimum intron length. Default: 10
  --max_intronlen          Maximum intron length. Default: 3000
  --soft_mask              Softmasked length threshold for GeneMark. Default: 2000
  --min_protlen            Minimum protein length. Default: 50
  --repeats2evm            Use repeats in EVM consensus model building
  --keep_evm               Keep existing EVM results (for rerunning pipeline)
  --evm-partition-interval Min length between genes to make a partition: Default: 1500
  --no-evm-partitions      Do not split contigs into partitions
  --repeat_filter          Repetitive gene model filtering. Default: overlap blast [overlap,blast,none]
  --keep_no_stops          Keep gene models without valid stops
  --SeqCenter              Sequencing facilty for NCBI tbl file. Default: CFMR
  --SeqAccession           Sequence accession number for NCBI tbl file. Default: 12345
  --force                  Annotated unmasked genome
  --cpus                   Number of CPUs to use. Default: 2
  --no-progress            Do not print progress to stdout for long sub jobs
  --tmpdir                 Volume/location to write temporary files. Default: /tmp
  --header_length          Maximum length of FASTA headers. Default: 16

ENV Vars:  If not specified at runtime, will be loaded from your $PATH
  --EVM_HOME
  --AUGUSTUS_CONFIG_PATH
  --GENEMARK_PATH
  --BAMTOOLS_PATH

@wangpeng-design
Author

Thank you!
