minimal JATS/Taxpub for article #60
Hi Terry, The current version of the stylesheet is here: https://github.com/plazi/ggxml2taxpub/blob/master/xslt/gg2tp_l1.xsl

Recent reports on validation errors in the results are here: https://github.com/plazi/ggxml2taxpub/blob/master/errs/sample_500_errors_20240312_frq.txt

Some cursory analysis of the errors shows them to be
Given that, I suggest moving forward to implement a conversion pipeline using the current state of the transformation. Ideally the pipeline would include a post-transformation validation phase that passes valid instances through and routes invalid ones elsewhere for further analysis. It's extremely unlikely that the conversion will ever be close to 100% correct, so it's essential to introduce error handling into the workflow. Further, it's likely that the source GGXML will sometimes undergo revision, which might or might not require developing a procedure to handle it.

The newly added exporter is now ingesting the article collection and pushing out all articles that (a) don't have TaxPub originals (which are arguably preferable over the GG XML round-tripped version), (b) have no gatekeeper objections, and (c) transform into valid TaxPub.

However, isn't it the case that we decided to provide them not (or not only) with JATS TaxPub, but with BIOC JSON, or at least that the JSON was the preference? The JATS would be supplementary and also added to BLR as an additional file format. The BIOC should probably be generated from the GGXML anyway, relying as it does on offsets. So in the big picture the missing piece is the GGXML to BIOC conversion, so should that not be the priority?

So, let me know which annotations and attributes to include, and maybe an example (e.g. based upon a treatment with a good deal of details, like https://tb.plazi.org/GgServer/html/03A8FF2FFD36FFE31FB2EB1EFDE6F7CB or https://tb.plazi.org/GgServer/html/C63D87EE5417FFFFFDCFFB34FAA0E778)

Best,
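For discussion, the post-transformation validation phase suggested above could look roughly like the sketch below. It is not the exporter's actual code: `check_article` is a hypothetical stand-in that only tests well-formedness and the root element name, where a real pipeline would validate against the TaxPub DTD or schema, and the function and directory names are assumptions.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def check_article(path: Path) -> List[str]:
    """Hypothetical stand-in validator: returns error messages, empty if OK.

    A real pipeline would validate against the TaxPub DTD/schema here.
    """
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    if root.tag != "article":
        return [f"unexpected root element: {root.tag}"]
    return []


def route_articles(
    files: Iterable[Path],
    validate: Callable[[Path], List[str]],
    valid_dir: Path,
    invalid_dir: Path,
) -> Dict[str, List[str]]:
    """Copy valid files to valid_dir; copy invalid ones, plus an error
    report, to invalid_dir. Returns the errors found per file name."""
    valid_dir.mkdir(parents=True, exist_ok=True)
    invalid_dir.mkdir(parents=True, exist_ok=True)
    report: Dict[str, List[str]] = {}
    for src in files:
        errors = validate(src)
        target = invalid_dir if errors else valid_dir
        shutil.copy2(src, target / src.name)
        if errors:
            # Keep the errors next to the rejected file for later analysis.
            error_file = invalid_dir / (src.name + ".errors.txt")
            error_file.write_text("\n".join(errors), encoding="utf-8")
        report[src.name] = errors
    return report
```

Passing the validator in as a parameter keeps the routing independent of how validation is done, so the same loop would work if the check is later swapped for a schema-based one.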
Hi Donat, We did decide to provide a JATS version, and also a BIOC version if we can produce it, which we thought was doable. I will schedule a meeting for next Monday at 5pm to discuss how to coordinate the production of JATS XML for journals.
Is this OK?

As for what we need to provide BIOC as well, please also refer to my previous mail: which annotations to include, and which attributes? We just had the GG XML overload in WADM, and I don't exactly expect BIOC to be less verbose, so using the two example treatments I linked in my previous mail to create an example BIOC with all the desired detail would be great.

Best,
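To make the offset question concrete, here is a minimal sketch of how inline XML annotations could be flattened into a BIOC JSON style passage with stand-off annotations. It only illustrates the offset mechanics and is not a proposal for which annotations or attributes to include. The element and attribute names in the usage example are illustrative placeholders rather than a statement about the actual GGXML vocabulary, and `inline_to_bioc_passage` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def inline_to_bioc_passage(paragraph_xml: str, passage_offset: int = 0) -> dict:
    """Flatten one inline-annotated paragraph into a BioC-style passage.

    Each descendant element becomes an annotation whose location is the
    character offset (relative to the document) and length of its text.
    """
    root = ET.fromstring(paragraph_xml)
    parts = []   # text fragments in document order
    spans = []   # (start, end, element) for every descendant element
    pos = 0

    def walk(elem, is_root):
        nonlocal pos
        start = pos
        if elem.text:
            parts.append(elem.text)
            pos += len(elem.text)
        for child in elem:
            walk(child, False)
            if child.tail:
                parts.append(child.tail)
                pos += len(child.tail)
        if not is_root:
            spans.append((start, pos, elem))

    walk(root, True)
    text = "".join(parts)
    # Document order, with enclosing annotations before nested ones.
    spans.sort(key=lambda s: (s[0], -s[1]))
    annotations = []
    for i, (start, end, elem) in enumerate(spans, 1):
        infons = {"type": elem.tag}
        infons.update(elem.attrib)  # XML attributes carried over as infons
        annotations.append({
            "id": str(i),
            "infons": infons,
            "text": text[start:end],
            "locations": [{"offset": passage_offset + start, "length": end - start}],
        })
    return {
        "offset": passage_offset,
        "infons": {"type": root.tag},
        "text": text,
        "annotations": annotations,
    }
```

For example, with a made-up paragraph:

```python
import json

sample = ('<paragraph>Found <taxonomicName rank="species">Aus bus'
          '</taxonomicName> in <location>Peru</location>.</paragraph>')
print(json.dumps(inline_to_bioc_passage(sample, passage_offset=100), indent=2))
```

Every attribute that is carried over becomes one more infon on the annotation, which is why the choice of annotations and attributes directly determines how verbose the BIOC output gets.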
issue
For this we need a minimal level of annotations. Which one is this?
goal
define a minimal level of JATS/Taxpub for articles in general, similar to what we have for treatment taxpub for SIBiLS
solution
dependencies