Add upper- and lowercase prefix synonyms #969
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #969 +/- ##
==========================================
+ Coverage 40.57% 40.90% +0.32%
==========================================
Files 148 138 -10
Lines 8244 7916 -328
Branches 1910 1849 -61
==========================================
- Hits 3345 3238 -107
+ Misses 4690 4475 -215
+ Partials 209 203 -6
This PR gets rid of code that focuses on lists of `curies.Record` objects and instead works directly with `curies.Converter` objects. Along the way, this also identified data integrity issues in MIRIAM, N2T, and Prefix Commons with respect to the TAIR resources (`tair.gene` and `tair.protein`), which all used non-specific, overlapping URLs. Therefore, these needed to be cleaned out before being imported. Why do this? If we work directly with converters, we can use the CURIE prefix reconciliation tooling to more cleanly refactor the Bioregistry-to-Converter pipeline (which is causing issues when adding prefix casing variants in the related PR #969).
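To illustrate the kind of overlap described above, here is a minimal sketch in plain Python of how two prefixes claiming the same non-specific URI prefix could be detected. The function name `find_overlapping_uri_prefixes`, the dictionary layout, and the `example.org` URLs are all hypothetical placeholders; this is not the check used in this PR, and the URLs are not the real TAIR entries.

```python
from collections import defaultdict


def find_overlapping_uri_prefixes(records):
    """Return URI prefixes that are claimed by more than one prefix.

    Hypothetical helper: ``records`` is a list of plain dictionaries with
    ``prefix``, ``uri_prefix``, and optional ``uri_prefix_synonyms`` keys.
    """
    claimed = defaultdict(set)
    for record in records:
        uri_prefixes = [record["uri_prefix"], *record.get("uri_prefix_synonyms", [])]
        for uri_prefix in uri_prefixes:
            claimed[uri_prefix].add(record["prefix"])
    # Keep only the URI prefixes shared by two or more prefixes
    return {
        uri_prefix: sorted(prefixes)
        for uri_prefix, prefixes in claimed.items()
        if len(prefixes) > 1
    }


# Placeholder URLs (assumption), chosen only to show a shared, non-specific URI prefix
records = [
    {
        "prefix": "tair.gene",
        "uri_prefix": "https://example.org/tair/gene/",
        "uri_prefix_synonyms": ["https://example.org/tair/"],
    },
    {
        "prefix": "tair.protein",
        "uri_prefix": "https://example.org/tair/protein/",
        "uri_prefix_synonyms": ["https://example.org/tair/"],
    },
]
conflicts = find_overlapping_uri_prefixes(records)
```

In this toy input, the shared `https://example.org/tair/` entry is reported against both prefixes, which is the sort of record that would need to be cleaned out before import.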
This currently isn't working because, in the OBO context, the OBO synonyms are prioritized, which gives the Geographical Entity ontology the
Closes #935
This PR automatically adds both the upper- and lowercase variants of all prefix synonyms for each record. This makes it much simpler to create comprehensive EPMs (instead of having to rely on programmatic logic for matching).
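As a rough illustration of the idea, the following plain-Python sketch expands a list of prefix synonyms with their upper- and lowercase forms. The helper name `add_case_variants` is hypothetical, and also generating variants of the canonical prefix itself is an assumption made here for the example; the actual implementation in this PR may differ.

```python
def add_case_variants(prefix, synonyms):
    """Return the synonyms plus their upper- and lowercase variants.

    Hypothetical helper, not the Bioregistry implementation. The canonical
    prefix is excluded from the result so it is never its own synonym.
    """
    variants = set()
    # Assumption: casing variants of the canonical prefix are included too
    for name in [prefix, *synonyms]:
        variants.update({name, name.lower(), name.upper()})
    variants.discard(prefix)
    return sorted(variants)


expanded = add_case_variants("go", ["Gene_Ontology"])
```

With the variants stored on the record, a lookup for `GO`, `go`, or `GENE_ONTOLOGY` can be resolved from the extended prefix map alone, without case-folding logic at match time.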
Depends on