Hi @Kuanhao-Chao ,
While running LiftOn on some genomes, we noticed an ID format in some gff3 files that causes LiftOn to fail silently. It occurs when the ID field of the mRNA ends with an underscore followed by an integer (e.g. ID=GCA_013396205.1-transcript_rna-gnl-WGS:JAAOAN-mrna.FMUND_1). When the suffix is changed to an underscore, a string and an integer, LiftOn runs successfully (e.g. ID=GCA_013396205.1-transcript_rna-gnl-WGS:JAAOAN-mrna.FMUND_X1). I've put a full example below:
Uncorrected:
Corrected:
When run uncorrected, LiftOn appears to complete, but the resulting gff3 file contains no "source=lifton" features, only miniprot ones. When corrected, it contains both:
Uncorrected
Corrected
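For anyone wanting to run the same check on their own output, a minimal sketch is to tally the source column of the resulting gff3 file. This assumes "source=lifton" refers to column 2 of the GFF3 record and compares it case-insensitively; `count_sources` and the file name in the usage comment are hypothetical, not part of LiftOn.

```python
from collections import Counter


def count_sources(lines):
    """Count GFF3 records per source (column 2), lowercased.

    Assumption: the "source" in question is the second tab-separated
    column, not a key in the attributes column.
    """
    counts = Counter()
    for line in lines:
        # Skip directives, comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 2:
            counts[cols[1].lower()] += 1
    return counts


# Usage (hypothetical file name):
# with open("lifton_output.gff3") as handle:
#     print(count_sources(handle))
```

An output with no `lifton` key at all would match the failing case described above.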
I was able to trace the issue to step 7 of LiftOn.py, but I wasn't able to isolate the specific place where it fails. My guess is that it has something to do with how gffutils processes features during the chaining stage, but I could be mistaken. The log files for both runs are similar, but the run that fails terminates early (I've attached them).
out_LiftOn.Uncorrected.log
out_LiftOn.Corrected.log
I'm more than happy to share the data and commands we used. It's probably best for me to send over a Dropbox link; let me know if that would be useful for you.
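In the meantime, the manual correction described above could be scripted roughly as follows. This is only a sketch under stated assumptions: it treats any mRNA ID ending in `_<integer>` as affected, inserts an `X` before the integer (mirroring the `FMUND_1` to `FMUND_X1` example), and updates matching `Parent` values so child features stay linked. The function names are hypothetical and nothing here is part of LiftOn or gffutils.

```python
import re

# Assumption: an affected ID is one that ends in "_<digits>".
TRAILING_INT = re.compile(r"_(\d+)$")


def _columns(line):
    """Return the 9 GFF3 columns, or None for comments and malformed lines."""
    if line.startswith("#"):
        return None
    cols = line.rstrip("\n").split("\t")
    return cols if len(cols) == 9 else None


def collect_renames(lines):
    """Map each affected mRNA ID to its renamed form (e.g. A_1 -> A_X1)."""
    renames = {}
    for line in lines:
        cols = _columns(line)
        if cols is None or cols[2] != "mRNA":
            continue
        for field in cols[8].split(";"):
            key, _, value = field.partition("=")
            if key == "ID" and TRAILING_INT.search(value):
                renames[value] = TRAILING_INT.sub(r"_X\1", value)
    return renames


def apply_renames(lines, renames):
    """Rewrite ID and Parent values that appear in the rename map."""
    out = []
    for line in lines:
        cols = _columns(line)
        if cols is None:
            out.append(line)  # leave directives and odd lines untouched
            continue
        fields = []
        for field in cols[8].split(";"):
            key, sep, value = field.partition("=")
            if key in ("ID", "Parent"):
                # Parent may hold a comma-separated list of IDs.
                value = ",".join(renames.get(v, v) for v in value.split(","))
            fields.append(key + sep + value)
        cols[8] = ";".join(fields)
        ending = "\n" if line.endswith("\n") else ""
        out.append("\t".join(cols) + ending)
    return out
```

The two passes are deliberate: collecting the mRNA IDs first means only those IDs (and references to them) are changed, rather than every attribute that happens to end in a number.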