NCBIGene:6052 doesn't have a label, although it should #246

Open
gaurav opened this issue Jan 5, 2024 · 1 comment

@gaurav
Contributor

gaurav commented Jan 5, 2024

NCBIGene:6052 doesn't have a label in our output, but NCBI does list one for it at https://www.ncbi.nlm.nih.gov/gene/?term=6052

It also has name and label information in gene_info.gz:

#tax_id GeneID  Symbol  LocusTag        Synonyms        dbXrefs chromosome      map_location    description     type_of_gene    Symbol_from_nomenclature_authority      Full_name_from_nomenclature_authority   Nomenclature_status     Other_designations      Modification_date       Feature_type
9606    6052    RNR1    -       -       MIM:180450|HGNC:HGNC:10082      13      13p12   RNA, ribosomal 45S cluster 1    other   RNR1    RNA, ribosomal 45S cluster 1    O       45S rDNA cluster 1|RNA, ribosomal 1|RNA, ribosomal cluster 1|Ribosomal RNA-1    20220508        -       

This appears to be because its type_of_gene is "other", and entries of that type are skipped when generating the label and synonym files:

https://github.com/TranslatorSRI/Babel/blob/f7ed8f0736c799c1389d3f5b0a2d045634da2e46/src/datahandlers/ncbigene.py#L60-L61
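To illustrate the effect, here is a minimal sketch of a type-based filter applied to the RNR1 row quoted above. It is not the code at the linked lines: the function names, the `SKIPPED_TYPES` set, and the choice of `Symbol` as the label are assumptions made for illustration. Only the column names and the row values come from the gene_info.gz excerpt.

```python
# Column names from the gene_info.gz header quoted above (leading "#" dropped).
COLUMNS = [
    "tax_id", "GeneID", "Symbol", "LocusTag", "Synonyms", "dbXrefs",
    "chromosome", "map_location", "description", "type_of_gene",
    "Symbol_from_nomenclature_authority",
    "Full_name_from_nomenclature_authority", "Nomenclature_status",
    "Other_designations", "Modification_date", "Feature_type",
]

# Assumption: the real skip list may contain more types; the issue only
# confirms that "other" is skipped.
SKIPPED_TYPES = {"other"}


def parse_gene_info_line(line):
    """Split one tab-separated gene_info row into a dict keyed by column name."""
    return dict(zip(COLUMNS, line.rstrip("\n").split("\t")))


def label_for(record, skipped_types=SKIPPED_TYPES):
    """Return (curie, label), or None when the record's type is skipped.

    Hypothetical helper: using Symbol as the label is an assumption.
    """
    if record["type_of_gene"] in skipped_types:
        return None
    return f"NCBIGene:{record['GeneID']}", record["Symbol"]


# The RNR1 row from the issue, rebuilt with explicit tab separators.
RNR1_LINE = "\t".join([
    "9606", "6052", "RNR1", "-", "-", "MIM:180450|HGNC:HGNC:10082",
    "13", "13p12", "RNA, ribosomal 45S cluster 1", "other", "RNR1",
    "RNA, ribosomal 45S cluster 1", "O",
    "45S rDNA cluster 1|RNA, ribosomal 1|RNA, ribosomal cluster 1|Ribosomal RNA-1",
    "20220508", "-",
])

record = parse_gene_info_line(RNR1_LINE)
print(label_for(record))                       # skipped, so no label
print(label_for(record, skipped_types=set()))  # label recovered if not skipped
```

With the filter in place the row yields no label even though `Symbol` and `description` are both populated, which matches the behaviour reported here.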

We should pick up a preferred label from OMIM, but we don't appear to use OMIM labels at all.

@gaurav
Contributor Author

gaurav commented Aug 2, 2024

I guess the logic is that since this is a "ribosomal 45S cluster", we don't want it to show up as a gene at all. But we're pulling it in from OMIM: it shows up in babel_downloads/OMIM/mim2gene.txt:

180450	gene	6052	RNR1
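A rough way to find other entries in the same situation is to scan mim2gene.txt for gene IDs that have no label. The sketch below assumes the file is tab-separated with the MIM number in the first column and the NCBI gene ID in the third (as in the line above), that comment lines start with `#`, and that labels are available as a dict keyed by CURIE. The function names and the `OMIM:` prefix are hypothetical.

```python
def omim_gene_pairs(lines):
    """Yield (omim_curie, ncbigene_curie) for rows that carry a gene ID."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or not fields[2]:
            continue  # row has no NCBI gene ID
        yield f"OMIM:{fields[0]}", f"NCBIGene:{fields[2]}"


def unlabeled(pairs, labels):
    """Return the pairs whose NCBIGene CURIE is missing from labels."""
    return [(omim, gene) for omim, gene in pairs if gene not in labels]


sample = ["# comment line", "180450\tgene\t6052\tRNR1"]
missing = unlabeled(omim_gene_pairs(sample), labels={})
print(missing)
```

Running this over the full file against the generated label file would show how many OMIM-linked genes are affected, which may help decide how urgent this is.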

So we should either:

  1. Add OMIM labels so we can apply a label to this entry.
  2. Include entries of type "other" from NCBIGene so this looks more like a Gene in NodeNorm.
  3. Come up with some way to correctly classify this entry as a non-Gene thing based on OMIM information.
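
As a sketch of what option 1 could look like, the function below fills in missing NCBIGene labels from the symbol column of mim2gene.txt. Everything here is an assumption for illustration: the function name, the dict-of-labels shape, and the four-column layout taken from the line quoted above. Note that mim2gene.txt carries only a gene symbol, not a full OMIM title, so real OMIM labels would need another source.

```python
def apply_omim_fallback(labels, mim2gene_lines):
    """Return a copy of labels with gaps filled from mim2gene symbols.

    Existing labels are never overwritten; OMIM is used only as a fallback.
    """
    merged = dict(labels)
    for line in mim2gene_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # no symbol column on this row
        gene_id, symbol = fields[2], fields[3]
        if gene_id and symbol:
            merged.setdefault(f"NCBIGene:{gene_id}", symbol)
    return merged


rows = ["180450\tgene\t6052\tRNR1"]
print(apply_omim_fallback({}, rows))
print(apply_omim_fallback({"NCBIGene:6052": "existing label"}, rows))
```

Using `setdefault` keeps any label that NCBIGene already provides, so the fallback only affects entries that currently have none.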

@cbizon Any thoughts? I'm considering just classifying this as "Non-urgent" to deal with later, but I don't know if non-gene OMIM entries are likely to show up in results, and therefore should be properly dealt with.
