include ids for taxon-name #68
Comments
the example we used |
TaxPub: use object-id note pending issue for addition of vocab attrs to |
@myrmoteras Here's a CoL ID example. What Guido explained to me is that these have been added to new uploads since January 2023 and are now being added retroactively to the backlog via the Big Batch. |
And here's one with an ENA ID Source: |
Using current TaxPub markup,
|
Current sample does not contain CoL ids, so this issue will have to wait for development |
Which example are you talking about? It's true that we've only been linking the treatment taxa and cited taxa to CoL since the start of this year, mainly, as you correctly say, because the CoL name IDs have only been stable since late 2022. The linking of treatments coming into SRS was originally a way of preempting the Big Batch, and it also serves as a means of adding the links after the fact: at the time of the original IMF import, CoL might not yet have an ID for a given name (and how could it, in the case of a newly published original description or new combination?). This is why the linking on the way into SRS will stay active. |
I'm guessing Donat refers to Terry's "Current sample does not contain CoL ids" which, if memory serves, is from the list of papers I sent him during the last sprint, which means pre-Big Batch. |
Well, in that light, from more recent memory (Geneva in early November), the test set for articles is mainly focused on documents that don't contain treatments ... We can change that policy, of course, but please keep in mind that the taxon names that currently don't get linked are also subject to far less scrutiny in QC, and have lower error severities as well, so linking might not be as reliable unless we also increase QC for taxon names outside treatments. |
Only treatment taxa are linked with a CoL ID; all the rest are not linked.
|
Taxon names in treatment citations are linked as well, as are type species ... the only thing strictly restricted to treatment taxa is the additional link to the ENA/NCBI taxonomic backbone. The cited effort mainly pertains to a huge number of API lookups (to ChecklistBank), not actual computations to be made ... reducing the barrage on the ChecklistBank API is the main objective of the current restriction to treatment taxa, treatment citations, and the likes of type species. |
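The linking scope described in this comment can be summarised in a small sketch. Everything here is hypothetical and only illustrates the stated policy: the `NameContext` categories, the `planned_links` function, and the `"CoL"` / `"ENA"` labels are assumptions made for the example, not names from the actual pipeline.

```python
from enum import Enum
from typing import Set


class NameContext(Enum):
    """Hypothetical categories for where a taxon name occurs in a document."""
    TREATMENT_TAXON = "treatment taxon"
    TREATMENT_CITATION = "treatment citation"
    TYPE_SPECIES = "type species"
    OTHER = "other"


# Contexts that, per the comment above, currently receive a CoL link.
COL_LINKED_CONTEXTS = frozenset({
    NameContext.TREATMENT_TAXON,
    NameContext.TREATMENT_CITATION,
    NameContext.TYPE_SPECIES,
})


def planned_links(context: NameContext) -> Set[str]:
    """Return the identifier kinds that would be looked up for a name."""
    links: Set[str] = set()
    if context in COL_LINKED_CONTEXTS:
        links.add("CoL")
    # The ENA/NCBI link is the only one restricted to treatment taxa.
    if context is NameContext.TREATMENT_TAXON:
        links.add("ENA")
    return links
```

Names falling into `NameContext.OTHER` yield an empty set, which is what keeps the number of lookups down under the current restriction.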
@gsautter I don't understand the argument about lookups. I thought we had a local version of CLB, and especially CoL, and thus would not have to use external lookups? |
We do have a local version of CoL, yes, built every year from the annual version ... however, CLB might well get ahead, so a miss in the local CoL needs following up with a CLB lookup. The ENA taxon ID is yet another thing and always requires a CLB lookup: that mapping changes too frequently to include it in CoL local, and it would also inflate the data structure, needlessly so for all applications except this specific type of lookup ... I'm always thinking about Jeremy's laptop in this sort of context: CoL local has to stay sufficiently slim to work on such machines ... |
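The lookup order described in this comment (local CoL first, ChecklistBank only on a miss, and ChecklistBank always for ENA taxon IDs) might look roughly like the sketch below. It is an illustration under assumptions, not the actual implementation: the function names are hypothetical, the local CoL is modelled as a plain dictionary, and the remote lookups are passed in as callables so that no real ChecklistBank API call or endpoint is implied.

```python
from typing import Callable, Dict, Optional

# Hypothetical signature for a remote lookup: name in, ID (or None) out.
RemoteLookup = Callable[[str], Optional[str]]


def resolve_col_id(
    name: str,
    local_col: Dict[str, str],
    clb_lookup: RemoteLookup,
) -> Optional[str]:
    """Resolve a CoL ID, consulting the local annual CoL copy first."""
    col_id = local_col.get(name)
    if col_id is not None:
        return col_id  # local hit: no remote call needed
    # Local miss: CLB may be ahead of the annual snapshot, so ask it.
    return clb_lookup(name)


def resolve_ena_id(name: str, clb_ena_lookup: RemoteLookup) -> Optional[str]:
    """Resolve an ENA taxon ID; this mapping is not kept in the local CoL."""
    return clb_ena_lookup(name)
```

Passing the remote lookup in as a parameter keeps the local data structure slim and makes it easy to count or throttle the calls that actually reach the remote service.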
annotation id
when CoL;