Distinguish core mappings from metadata and introduce core_mapping_id
#361
Comments
Not a fan of splitting the discussion in multiple places, but oh well.

Regardless of the “why” people might need a way to refer to a core mapping (curious to see the use cases people will bring in #360), I want to state again that we don’t need a `core_mapping_id` field. An identifier for a triple (or a quadruple, if we want to include the predicate modifier) can always be derived on the fly from the triple itself; there is no need to explicitly store it. Let the spec define a standard derivation algorithm, let SSSOM-Py and SSSOM-Java provide helper methods to perform the derivation, but let’s not clutter the format with a field that would merely duplicate what is already contained in other fields.

In fact the more I think about it, the more strongly I object to the creation of a `core_mapping_id` field. I’ve quickly mentioned it in the original discussion, but I will expand more here on one of the reasons I object to such a field: it will make editing a set needlessly more difficult.

Let’s say that I am creating this mapping record about the core mapping {FBbt:1234, skos:exactMatch, CL:5678} (psst, see what I did here? I just referred to a core mapping, and I didn’t need an identifier to do that):
And let’s say we decide that core mapping identifiers should be derived from the core mapping by concatenating the elements of the triple and hashing them with MD5, so that the identifier for the core mapping above is an MD5 digest. There’s no way I am going to derive the identifier myself (my interest in cryptography does not go far enough for me to know how to compute a hash in my head), so I’m going to need some tool to post-process the file in order to add the identifier:
And that, already, makes a […]

OK, then maybe the derivation algorithm does not need to include a hashing step? Wouldn’t change much. Let’s say the core mapping identifier is generated instead by representing the triple as a canonical S-expression:
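As a rough sketch of the two derivation schemes described above (concatenate-and-hash with MD5, and a canonical S-expression), something like the following could compute an identifier on the fly from the triple. The function names, the `|` separator, and the exact S-expression layout are assumptions made for illustration; none of them is defined by the SSSOM spec, and neither function is an existing SSSOM-Py helper.

```python
import hashlib


def core_mapping_md5(subject_id: str, predicate_id: str, object_id: str,
                     sep: str = "|") -> str:
    """Hypothetical derivation: join the triple's elements and hash with MD5.

    The separator is an assumption; a real spec would have to fix it
    (and the text encoding) so that all tools derive the same value.
    """
    joined = sep.join((subject_id, predicate_id, object_id))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def core_mapping_sexpr(subject_id: str, predicate_id: str, object_id: str) -> str:
    """Hypothetical derivation without hashing: a one-line S-expression."""
    return f"({subject_id} {predicate_id} {object_id})"


# The core mapping used as an example in this comment.
triple = ("FBbt:1234", "skos:exactMatch", "CL:5678")

md5_id = core_mapping_md5(*triple)      # 32 hexadecimal characters
sexpr_id = core_mapping_sexpr(*triple)  # "(FBbt:1234 skos:exactMatch CL:5678)"
```

Either way the identifier is a pure function of the triple, so a tool can recompute it whenever needed instead of reading it from a stored column.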
Thanks for moving this over here. In addition to the "core vs record mapping" axis discussed in #359, there is also the axis of "globally unique vs not"; …
I accidentally derailed #359 to a discussion about identifying the core mapping, so I am moving this here now.
Adding `mapping_id` as a required slot is out of the question I think, because identifier management is too much churn for most users that just want to add a quick table with mappings to their repo.

Let's start with discussing the why, then the how.
Because of the format, let's have the discussion here: #360