Switch to use uPheno base and import phenotype ontologies separately #94

matentzn · 2024-11-18T18:43:15Z

But really this is a refactoring of the full import system.

This ensures that we always have the latest phenotype ontologies in phenio, independent of any particular phenio release. I have eyeballed the resulting ontology and it still has way too many dangling classes; I will try to clean those up in subsequent commits.

@caufieldjh I made this a draft because I need to do quite a bit of cleanup of phenio, which has become quite messy in terms of dangling classes. I will come back to you with updates, but feel free to look at the PR and tell me your general opinion of my suggestion.

caufieldjh · 2024-11-18T22:26:50Z

makes sense to me conceptually - seems like it could become an issue if there are discrepancies between the most recent version of a phenotype ontology and upheno, but I will trust you on this

matentzn · 2024-11-19T16:48:33Z

src/ontology/phenio-odk.yaml

@caufieldjh @kevinschaper can you review just this file please while I am cleaning stuff up? Every line matters; if you don't understand a change, please ask. Big implications for everything.

Is there a bridge file for upheno to OBA already?

Some of these newly imported ID resources like CGNC are new to me - are they coming in because Upheno uses them?

https://github.com/monarch-initiative/phenio/pull/94/files#r1848735596

matentzn · 2024-11-19T17:02:31Z

src/ontology/phenio-odk.yaml

+    - <http://identifiers.org/hgnc/*>
+    - <http://identifiers.org/ncbigene/*>
+    # From PRO:
+    - <http://www.ncbi.nlm.nih.gov/gene/*>


@caufieldjh all of the above are excluded ID spaces, which is why this list is important. If you feel something is wrongly excluded, we need to import it properly.

ah! I see. In that case I will also suggest excluding:

http://purl.obolibrary.org/obo/STATO_*

DOID

Wikidata prefixes (WD_Entity and WD_Prop)

otherwise I'm happy with the list barring any objection from @kevinschaper
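To make the effect of the wildcard patterns discussed above concrete, here is a minimal Python sketch of how such glob-style IRI patterns partition a set of class IRIs into kept and dropped. It is only an illustration of the matching idea, not how ODK or ROBOT implement exclusion; the function names and the sample IRIs in the usage note are assumptions made for this example.

```python
from fnmatch import fnmatchcase

# Patterns copied from the diff above, plus the STATO pattern suggested in the reply.
EXCLUDED_PATTERNS = [
    "http://identifiers.org/hgnc/*",
    "http://identifiers.org/ncbigene/*",
    "http://www.ncbi.nlm.nih.gov/gene/*",
    "http://purl.obolibrary.org/obo/STATO_*",
]


def is_excluded(iri, patterns=EXCLUDED_PATTERNS):
    """Return True if the IRI matches any glob pattern (case-sensitive)."""
    return any(fnmatchcase(iri, pattern) for pattern in patterns)


def split_by_exclusion(iris, patterns=EXCLUDED_PATTERNS):
    """Partition IRIs into (kept, dropped) lists, preserving input order."""
    kept = [iri for iri in iris if not is_excluded(iri, patterns)]
    dropped = [iri for iri in iris if is_excluded(iri, patterns)]
    return kept, dropped
```

For example, `split_by_exclusion(["http://purl.obolibrary.org/obo/HP_0000118", "http://identifiers.org/hgnc/1100"])` keeps the HP IRI and drops the HGNC one. Anything dropped this way that is still referenced elsewhere would need a proper import instead, which is the point made in the comment above.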

It has become too big

1. added custom goals for go and emapa to enable additional processing steps (in particular creating an artificial root term for emapa so the ontology looks better in a browser)
2. Remove the biolink node assignment entirely from phenio
3. Exclude many more namespaces
4. Removing all the uberon bridges in favour of amazingly simple approach using sssom:inject (with uberon and cl mappings).
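As a rough illustration of the input that the mapping-based approach works from, the following Python sketch reads an SSSOM TSV mapping set and selects the rows whose object is an Uberon or CL term. It does not reproduce what sssom:inject does (that command derives axioms from mappings according to a rules file); it only shows the shape of the data. The subject IDs and the `EX` prefix are hypothetical, and the helper names are assumptions for this example.

```python
import csv
import io

# Hypothetical mapping set: the EX subject IDs are made up for illustration.
SAMPLE_SSSOM = "\n".join([
    "# curie_map:",
    "#   EX: http://example.org/",
    "\t".join(["subject_id", "predicate_id", "object_id", "mapping_justification"]),
    "\t".join(["EX:0000001", "skos:exactMatch", "UBERON:0000955", "semapv:ManualMappingCuration"]),
    "\t".join(["EX:0000002", "skos:exactMatch", "CL:0000540", "semapv:ManualMappingCuration"]),
    "\t".join(["EX:0000003", "skos:exactMatch", "EX:0000004", "semapv:ManualMappingCuration"]),
])


def read_sssom_rows(tsv_text):
    """Parse SSSOM TSV text into dicts, skipping the '#'-prefixed metadata block."""
    data_lines = [ln for ln in tsv_text.splitlines() if ln and not ln.startswith("#")]
    return list(csv.DictReader(io.StringIO("\n".join(data_lines)), delimiter="\t"))


def mappings_to(rows, prefixes=("UBERON:", "CL:")):
    """Keep only the mappings whose object CURIE starts with one of the prefixes."""
    return [row for row in rows if row["object_id"].startswith(prefixes)]
```

Calling `mappings_to(read_sssom_rows(SAMPLE_SSSOM))` returns the first two sample rows, which are the ones a bridge to Uberon and CL would be built from.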

src/ontology/config/mappings-to-uberon-bridge.rules

src/ontology/phenio-odk.yaml

gouttegd · 2024-11-23T13:27:43Z

src/ontology/phenio.Makefile

+		remove --select "<http://purl.obolibrary.org/obo/OPL_*>" \
+		remove --select "<https://bioregistry.io/lipidmaps*>" \
+		remove --term owl:Thing --term owl:Nothing \
+		rename --mappings config/property-map.tsv --allow-missing-entities true --allow-duplicates true \


Not wanting to push too hard for my own software, but if you want to use a real SSSOM file for the “property map”, you could use the sssom:rename command of the SSSOM plugin.

That’s what we do in Uberon, using this mapping set as source.
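For readers unfamiliar with the rename step in the diff above, here is a simplified Python model of what a two-column rename table and the two "allow" flags amount to. It is a sketch under stated assumptions, not ROBOT's implementation: the column layout (old IRI, then new IRI, with a header row), the function names, and the example.org IRIs in the usage note are all assumptions for illustration.

```python
import csv
import io


def load_rename_map(tsv_text, allow_duplicates=False):
    """Read a header row plus (old IRI, new IRI) rows into a dict."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row
    mapping = {}
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or incomplete rows
        mapping[row[0].strip()] = row[1].strip()
    targets = list(mapping.values())
    if not allow_duplicates and len(set(targets)) != len(targets):
        raise ValueError("two or more old IRIs map to the same new IRI")
    return mapping


def rename_entities(entities, mapping, allow_missing=False):
    """Apply the rename map to a list of IRIs, leaving unmapped IRIs unchanged."""
    present = set(entities)
    missing = [old for old in mapping if old not in present]
    if missing and not allow_missing:
        raise KeyError("entities to rename were not found: %s" % ", ".join(missing))
    return [mapping.get(iri, iri) for iri in entities]
```

With a table that maps `http://example.org/old_part_of` to `http://example.org/part_of`, calling `rename_entities` on a list containing the old IRI returns the list with the new one in its place. Setting both flags to true, as in the diff, means that neither an absent old IRI nor two old IRIs sharing a target stops the run.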

Previously we were using the Mondo gene imports as a basis, which led to a bunch of dangling gene classes. This should now be fixed.

This is a big refactor, removing all the uberon and upheno bridges as hard imports and replacing them with SSSOM-based pipelines using java-sssom.

caufieldjh · 2024-11-25T15:59:22Z

What remains to be done on this PR before it's ready to merge? Just reviewing?

matentzn · 2024-11-25T17:10:13Z

@caufieldjh It is ready from my perspective!

caufieldjh · 2024-11-25T17:17:32Z

Great - thanks @matentzn and @gouttegd !

matentzn mentioned this pull request Nov 18, 2024

phenio drops xrefs #91

Closed

Update phenio-odk.yaml

5fed463

matentzn commented Nov 19, 2024

matentzn added 10 commits November 20, 2024 10:59

Update phenio-odk.yaml

b7569bf

Removing unnecessary namespaces from final phenio.owl

8d54f6d

Stripping the edit file of all things but metadata

4e7a5cc

Add CL and Uberon mappings

e43b533

Remove import from git index

7c09490

It has become too big

Update ODK config and related files

6e0a853

Create file for ROBOT renames

78b71f6

Added MAXO component

77c9a1f

Add MAXO to ODK config

14ab476

matentzn commented Nov 23, 2024

src/ontology/config/mappings-to-uberon-bridge.rules