Metadata mappings for CORE OAI-PMH #866

Open
amyehodge opened this issue Dec 6, 2023 · 10 comments

@amyehodge
Collaborator

Create metadata mappings so that we can provide access to the ETD materials (those submitted via the ETD application, not via H2 or other means) and items in the Stanford University Open Access Articles collection (druid:nk071jz2236) via OAI-PMH to the CORE service (http://core.ac.uk).

See Best practice for CORE harvesting of data providers-v3, particularly section 3.2 on metadata configuration.

@amyehodge
Collaborator Author

Proposal for this work can be found in #795.

@arcadiafalcone
Collaborator

@amyehodge Is there a desire to expand CORE harvesting to non-ETD items in future?

@amyehodge
Collaborator Author

Yes. Definitely to the open access and other research publications, but potentially to other text content like non-ETD theses and capstones, grey literature, technical reports, etc. But I'd say OA publications would be top of that expanded list.

@arcadiafalcone
Collaborator

Recommended CORE meta-tags:

  • Title
  • Author
  • Publication date
  • Journal title (or conference title) [not relevant for ETDs]
  • Publisher, that is, the institution (for theses and dissertations)

Title and author are already extracted for schema.org tags. Publication date will always be present in the descriptive metadata (an ETD record should include only two dates, publication and copyright, so the correct date may be easily identified). Publisher would be Stanford University for all objects.
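As a rough illustration of how these fields could be rendered as page meta-tags for an ETD, here is a minimal Python sketch. The `citation_*` names follow the Highwire Press convention used by Google Scholar and are an assumption here; the CORE best-practice document should be checked for the exact tag names it expects. The `etd_meta_tags` function and the sample values are hypothetical and not part of any existing codebase.

```python
from html import escape

# Publisher is constant for all objects, per the comment above.
PUBLISHER = "Stanford University"


def etd_meta_tags(title, authors, publication_date):
    """Return HTML <meta> tags for one ETD (hypothetical helper).

    Tag names are assumed (Highwire Press style), not confirmed against
    the CORE guidelines.
    """
    pairs = [("citation_title", title)]
    # One tag per author, in the order given.
    pairs += [("citation_author", author) for author in authors]
    pairs.append(("citation_publication_date", publication_date))
    pairs.append(("citation_dissertation_institution", PUBLISHER))
    return [
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in pairs
    ]


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    for tag in etd_meta_tags("A Sample Dissertation", ["Doe, Jane"], "2023"):
        print(tag)
```

Journal or conference title is omitted because it is not relevant for ETDs; it would need its own tag if harvesting expands to open access articles.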

@arcadiafalcone
Collaborator

CORE metadata schemas:

  • Dublin Core/Extended Dublin Core (minimal)
  • OpenAIRE (supported)
  • RIOXX (recommended)

If the focus is on ETDs (metadata automatically generated from a template) and open-access articles deposited via H2 (user-created metadata restricted by interface), a relatively small and constant set of metadata fields needs to be mapped.
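To give a sense of scale for the minimal option, here is a sketch of a fixed field mapping to unqualified Dublin Core wrapped in the standard `oai_dc` container. The two namespace URIs are the standard ones; the internal field names (`title`, `authors`, and so on), the `FIELD_TO_DC` table, and the `to_oai_dc` function are hypothetical and do not reflect our actual descriptive metadata model or the current Dublin Core mapping.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# Hypothetical internal field name -> Dublin Core element name.
FIELD_TO_DC = {
    "title": "title",
    "authors": "creator",
    "publication_date": "date",
    "publisher": "publisher",
    "resource_type": "type",
}


def to_oai_dc(record):
    """Serialize a dict of fields as an oai_dc:dc XML string (sketch only)."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field, dc_name in FIELD_TO_DC.items():
        value = record.get(field)
        if value is None:
            continue  # skip fields absent from this record
        # Repeatable fields (e.g. authors) are passed as lists.
        values = value if isinstance(value, list) else [value]
        for item in values:
            ET.SubElement(root, f"{{{DC_NS}}}{dc_name}").text = item
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    print(to_oai_dc({
        "title": "A Sample Dissertation",
        "authors": ["Doe, Jane"],
        "publication_date": "2023",
        "publisher": "Stanford University",
    }))
```

Because the set of fields is small and constant, a lookup table like this may be enough for Dublin Core; OpenAIRE or RIOXX would need additional elements and a different target namespace.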

@lwrubel
Contributor

lwrubel commented Dec 18, 2023

Noting here that our current Dublin Core mapping is suboptimal. When proceeding, we should consider whether it is worth putting effort into improving that or if it would be more effective to use one of the other supported/recommended metadata schemas.

@arcadiafalcone
Collaborator

@amyehodge @lwrubel Do we already have a sense of which of the three CORE metadata schemas we want to use?

@amyehodge
Collaborator Author

@arcadiafalcone This topic came up in a meeting with Tom, Vivian, and Rochelle last week and was noted as a point that needs discussion. No, we do not yet have a sense of which of the three options we would want to use.

@arcadiafalcone
Collaborator

@amyehodge What information do we need to make the decision? I can do analysis from the ease/completeness-of-mapping standpoint, but am less familiar with what other concerns might be.

@amyehodge
Collaborator Author

@arcadiafalcone I think that information would be really helpful if you'd like to start on that. There is a meeting tomorrow where I might be able to at least start sorting this out. But ease and completeness of mapping will definitely factor in.


3 participants