-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
providers: provide DataCite-like DOI locally #125
providers: provide DataCite-like DOI locally #125
Conversation
e2ddff0
to
e8eb369
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!!! 💯 🚀
May I ask to slightly change the code to be even more reusable?
I would like to use base32
for normal recids, not only DOIs.
123456789
-> ABCD-EDFG
can we "plug it in" somehow in the recordid.py
? Optionally, to avoid to break backward-compatibility.
547cef3
to
df0febf
Compare
Had to add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, tho in the separate module as discussed IRL.
0c5c6ea
to
058dc1c
Compare
058dc1c
to
c467623
Compare
- Generate a random, configurable length, base32, URI-friendly, hyphen-separated, optionally checksummed DOI suffix - Cross-document - Closes inveniosoftware#124
c467623
to
9e569ce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Good job! 👍 🚀 💯
Question, maybe for @lnielsen too: do we want in the long run in Invenio to have recids with this new base32
format? If yes, shall we create another RecordId provider that generate such IDs?
In our case, we will have to re-create that provider both in ILS and future CDS... I am wondering if it makes sense to put it here so that it is available to everyone...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an overall issue here, that I'm sorry I haven't detected before. We need a PIDProvider that's able to generate internal identifiers (base32) - a bit similar to RecordIdProvider. The DataCite provider should generate the DOI by taking the internal identifier and use it as suffix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 Great job!
A few minors, check in depth the test comments, there might be an inconsistency in the "default" length/split.
Approved: Assuming passing tests :)
db52720
to
28cbe26
Compare
Thanks for the review @ppanero . Changes made. Just needs an architect to also approve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Fantastic work! Thanks for this! :)
A couple of comments:
- I guess the
V2
naming/pattern and deprecation of currentRecordIdProvider
was agreed with @lnielsen and/or @ppanero? - I understand that the potential collision of a random generated base32 id is probably very minimal, but I am wondering if we should check if there is a collision, e.g. (pseudocode):
while RecordIdProviderV2.get(pid_value):
pid_value = cls.generate_id(options)
kwargs['pid_value'] = pid_value
I fear we might have the db raising exception if the generated id is already in the table...
28cbe26
to
8154467
Compare
Fixed the 1- Naming is always going to be hard. I think it's fine.
I don't fear it... I hope for it? We do want an exception if the id is already in the table and we |
8154467
to
20f37d0
Compare
cdfaf4b
to
f6aa1c6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding retrying on collisions: I would leave it as is, which is consistent with how we deal with uuid's for e.g. records.
@lnielsen Build is passing and it has been approved: this can be merged. |
@lnielsen : to be more explcit: neither Pablo or I can merge this, so if an architect could merge it, it would unblock us :) |
optionally checksummed DOI
It would probably be good to have an @inveniosoftware/architects reviewing this too because it generates a PID (DOI) which is a core element of Invenio and it does so in a non-trivial way (I think).