Weird formatting of column 9 in gff3 lift-over #8

Open
TDDB-limagrain opened this issue May 30, 2024 · 0 comments
Labels: bug, feature_request


TDDB-limagrain commented May 30, 2024

Hi @Kuanhao-Chao,
I was able to run LiftOn successfully, using one plant reference genome and a new genome of the same species to annotate.
The command was:

lifton -g ref.gff3 -o liftover.gff3 -P ref.pep.fasta -copies -sc 0.95 newgenome.fasta refgenome.fasta

The resulting lift-over file looks quite good for many gene models, but for some of them the exon IDs in column 9 are duplicated. Below is an example of a gene with 8 exons in the reference genome: the first seven exons all carry the ID suffix exon:001, and only the last one is numbered exon:008.

newgenomeLG00 LiftOn gene 279378 285818 . + . ID=newgenomeLG00g00180;Name=newgenomeLG00g00180;source=Liftoff
newgenomeLG00 LiftOn mRNA 279378 285818 . + . ID=newgenomeLG00g00180.1;Parent=newgenomeLG00g00180;Name=newgenomeLG00g00180.1;mutation=frameshift;protein_identity=0.999;dna_identity=1.000;status=LiftOn_chaining_anewgenomeLGorithm
newgenomeLG00 LiftOn exon 279378 280607 . + . ID=_newgenomeLG00_1g00140.1:exon:001;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn exon 281921 282006 . + . ID=_newgenomeLG00_1g00140.1:exon:001;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn exon 282142 282299 . + . ID=_newgenomeLG00_1g00140.1:exon:001;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn exon 282401 283715 . + . ID=_newgenomeLG00_1g00140.1:exon:001;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn exon 283912 284186 . + . ID=_newgenomeLG00_1g00140.1:exon:001;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn exon 284379 284585 . + . ID=_newgenomeLG00_1g00140.1:exon:001;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn exon 284668 284824 . + . ID=_newgenomeLG00_1g00140.1:exon:001;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn exon 285132 285818 . + . ID=_newgenomeLG00_1g00140.1:exon:008;Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 280585 280607 . + 0 Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 281921 282006 . + 1 Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 282142 282299 . + 2 Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 282401 283715 . + 0 Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 283912 284186 . + 2 Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 284379 284585 . + 0 Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 284668 284824 . + 0 Parent=newgenomeLG00g00180.1
newgenomeLG00 LiftOn CDS 285132 285430 . + 2 Parent=newgenomeLG00g00180.1
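
To find every gene model affected by the duplication shown above, the output can be scanned for exon IDs that occur on more than one row. The following is a minimal sketch, not part of LiftOn: the function names `parse_attributes` and `duplicated_ids` are hypothetical, and it assumes a standard tab-separated GFF3 file with nine columns and `key=value` attributes separated by semicolons.

```python
from collections import Counter


def parse_attributes(column9):
    """Split a GFF3 attribute string (column 9) into a dict of key -> value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def duplicated_ids(gff_lines, feature_type="exon"):
    """Return {ID: row count} for IDs shared by more than one row of feature_type."""
    counts = Counter()
    for line in gff_lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Ignore malformed rows and rows of other feature types.
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        feature_id = parse_attributes(cols[8]).get("ID")
        if feature_id is not None:
            counts[feature_id] += 1
    return {fid: n for fid, n in counts.items() if n > 1}
```

For example, `duplicated_ids(open("liftover.gff3"))` would list each repeated exon ID together with the number of rows that use it.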

In addition, would it be possible to automatically add a unique ID to each CDS? This can be mandatory for some downstream applications.
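
Until that is supported, the IDs could be added in a post-processing step. This is only a sketch of a possible workaround, not LiftOn behavior: the function name `add_cds_ids` and the `<Parent>:cds:NNN` naming scheme are assumptions chosen for illustration, and the input is assumed to be tab-separated GFF3 with nine columns.

```python
from collections import defaultdict


def add_cds_ids(gff_lines):
    """Return GFF3 lines in which every CDS row lacking an ID gets a unique one.

    IDs follow the hypothetical pattern <Parent>:cds:NNN, numbered per parent
    transcript in file order. All other rows are returned unchanged.
    """
    counters = defaultdict(int)
    out = []
    for line in gff_lines:
        stripped = line.rstrip("\n")
        cols = stripped.split("\t")
        # Pass through comments, malformed rows, and non-CDS features.
        if stripped.startswith("#") or len(cols) != 9 or cols[2] != "CDS":
            out.append(stripped)
            continue
        attrs = [field for field in cols[8].split(";") if field]
        # Leave rows alone if they already carry an ID.
        if any(field.startswith("ID=") for field in attrs):
            out.append(stripped)
            continue
        parent = next(
            (field[len("Parent="):] for field in attrs if field.startswith("Parent=")),
            "unknown",
        )
        counters[parent] += 1
        new_id = f"ID={parent}:cds:{counters[parent]:03d}"
        cols[8] = ";".join([new_id] + attrs)
        out.append("\t".join(cols))
    return out
```

Note that the GFF3 specification also allows the rows of one multi-segment CDS to share a single ID, so a tool that expects that convention would need one ID per parent transcript rather than one per row.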

Thanks!

Kuanhao-Chao self-assigned this Jun 1, 2024
Kuanhao-Chao added the feature_request and bug labels Jun 1, 2024