LoReAn (Long Read Annotation) for automated eukaryotic genome annotation incorporating long-reads

The LoReAn software is an automated annotation pipeline designed for eukaryotic genome annotation. It builds on previously established annotation rationale and programs; its key improvement is the incorporation of single-molecule cDNA sequencing data, such as that produced by Oxford Nanopore and PacBio platforms. We find this significantly improves automated annotations and reduces the requirements for time-consuming manual annotation.

We are working to improve the LoReAn documentation. Meanwhile, more information about LoReAn can be found at bioRxiv. For those familiar with the annotation process and with Docker, there should be enough information to run the program. If you have problems, please open an issue.

This is how LoReAn works: LoReAn schematic view

HOW TO RUN

LoReAn requires three mandatory inputs:

Protein Sequences
Reference genome sequence
Genome name

To install the software:

Please see the installation instructions for details.

After installation, the software can be run with:

lorean.py -pr protein.fasta -sp species genome.fasta

The full list of options can be found at option instructions or by:

lorean.py --help
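Before launching a long annotation run, it can help to confirm that the mandatory inputs listed above (protein sequences, reference genome, genome name) are actually in place. The following is a minimal sketch, not part of LoReAn: `lorean_cmd` is a hypothetical helper name, and it uses only the `-pr` and `-sp` options shown in this README. It checks that both FASTA files exist, are non-empty, and start with a `>` header, then prints the command line for review instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with LoReAn): validate the mandatory
# inputs and print the lorean.py command line shown in this README.
lorean_cmd() {
    if [ "$#" -ne 3 ]; then
        echo "usage: lorean_cmd PROTEINS.fasta SPECIES GENOME.fasta" >&2
        return 2
    fi
    local proteins="$1" species="$2" genome="$3"
    local f first
    for f in "$proteins" "$genome"; do
        # -s is true only if the file exists and is not empty
        if [ ! -s "$f" ]; then
            echo "error: missing or empty file: $f" >&2
            return 1
        fi
        # A FASTA file is expected to begin with a '>' header line
        first=$(head -c 1 "$f")
        if [ "$first" != ">" ]; then
            echo "error: $f does not look like FASTA" >&2
            return 1
        fi
    done
    # Print the command rather than executing it, so it can be inspected first
    printf 'lorean.py -pr %s -sp %s %s\n' "$proteins" "$species" "$genome"
}
```

For example, `lorean_cmd protein.fasta species genome.fasta` prints the same command line as above once both files pass the checks; the printed line can then be copied into the shell or a job script.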

LoReAn can run BRAKER to improve Augustus gene prediction. To do so, short-read or long-read RNA-seq data need to be provided.

EXAMPLE DATASET

We have made two datasets available for testing LoReAn. The first dataset is Nanopore data from Verticillium dahliae strain JR2, while the second is PacBio data from Plicaturopsis crispa. Both datasets can be downloaded from LoReAn Examples.

SOFTWARE USED IN THE PIPELINE

TransDecoder-3.0.1
samtools v0.1.19-96b5f2294a
bedtools v2.25.0
bowtie v1.1.2
bamtools v2.4.1
AATpackage r03052011
iAssembler v1.3.2.x64
GeneMark-ES/ET v.4.33 64bit (THIS SOFTWARE IS NOT FREE FOR EVERYONE; check the installation instructions)
PASApipeline v2.1.0
augustus v3.3
trinityrnaseq v2.5.1
STAR v2.5.3a
gmap-gsnap v2017-06-20
fasta v36.3.8e
BRAKER v2.0
EVidenceModeler v1.1.1
gffread v0.9.9
genometools v1.5.9

AUTHORS:

Luigi Faino
David Cook
Jose Espejo
