Improve Data Package License Machine Readability with SPDX #86

Open
clnsmth opened this issue Apr 10, 2024 · 6 comments

@clnsmth
Contributor

clnsmth commented Apr 10, 2024

Background

Data package licensing is crucial for clearly communicating data usage terms to consumers. Currently, EDI assists data authors by presenting a couple of license options aligned with open data principles, along with a third option for custom content. This information is stored in EML's intellectualRights element, which is required by the EML Congruence Checker.

However, the intellectualRights element accepts loosely formed TextType metadata. While human-readable, this format is largely unintelligible to machines and hinders automated license interpretation.

Proposed Changes

To enhance current practices and align with Science-On-Schema.org practices for license interoperability, consider encouraging the use of EML's licensed element. This element accepts a URL to a machine-resolvable, linked-data-compliant license. EDI can offer SPDX license identifiers as linked data URIs alongside the current practice of using the intellectualRights element. Eventually, phasing out the free-text intellectualRights element might be considered (see note below).
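As a rough illustration of what populating the licensed element could look like, here is a minimal Python sketch. The child element names (licenseName, url, identifier) follow the BLE-LTER example quoted later in this thread; the helper name `build_licensed` is hypothetical, and the URI pattern `https://spdx.org/licenses/<identifier>` is an assumption based on the SPDX URLs discussed in this issue.

```python
import xml.etree.ElementTree as ET


def build_licensed(name: str, identifier: str) -> ET.Element:
    """Build a <licensed> element that points at an SPDX license URI.

    Hypothetical helper for illustration; not part of any EDI tool.
    """
    licensed = ET.Element("licensed")
    ET.SubElement(licensed, "licenseName").text = name
    # Assumption: SPDX license URIs have the form below, with no .html suffix.
    ET.SubElement(licensed, "url").text = f"https://spdx.org/licenses/{identifier}"
    ET.SubElement(licensed, "identifier").text = identifier
    return licensed


element = build_licensed("Creative Commons Zero v1.0 Universal", "CC0-1.0")
print(ET.tostring(element, encoding="unicode"))
```

The free-text intellectualRights element would sit next to this element unchanged, so existing packages would not need to drop anything.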

The current set of EDI-recommended licenses and their corresponding Creative Commons and SPDX identifiers are:

Creative Commons Zero v1.0 Universal

Attribution 4.0 International

Choosing between Creative Commons and SPDX may have future implications for supporting other licenses. SPDX encompasses all Creative Commons licenses and offers greater flexibility for accommodating additional licenses in the future.
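For reference, the two recommended licenses can be written down as a small lookup keyed by SPDX short identifier. This is only a sketch: the `LicenseRef` type and `RECOMMENDED` table are hypothetical names, and the URIs are the commonly published Creative Commons and SPDX addresses for these licenses rather than values copied from this issue.

```python
from typing import NamedTuple


class LicenseRef(NamedTuple):
    """One license with its Creative Commons and SPDX URIs (illustrative)."""
    name: str
    creative_commons_uri: str
    spdx_uri: str


# Keyed by SPDX short identifier. A license that is not a Creative Commons
# license could be added under its own SPDX identifier in the same way.
RECOMMENDED = {
    "CC0-1.0": LicenseRef(
        "Creative Commons Zero v1.0 Universal",
        "https://creativecommons.org/publicdomain/zero/1.0/",
        "https://spdx.org/licenses/CC0-1.0",
    ),
    "CC-BY-4.0": LicenseRef(
        "Creative Commons Attribution 4.0 International",
        "https://creativecommons.org/licenses/by/4.0/",
        "https://spdx.org/licenses/CC-BY-4.0",
    ),
}

for spdx_id, ref in RECOMMENDED.items():
    print(spdx_id, ref.spdx_uri, ref.creative_commons_uri)
```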

Note: Phasing out the free-text intellectualRights element would eliminate potential contradictions with the licensed element and limit support for arbitrary, non-standardized data use terms. However, it may restrict the ability to express nuanced usage rights not yet formalized by the broader community.

Affected Systems

Some affected systems include:

@gremau
Collaborator

gremau commented Nov 8, 2024

The upcoming release of the EML Best Practices document encourages use of the <licensed> EML element and will also recommend using the SPDX vocabulary of license URLs in the licensed/url child element. See chapter 6. It does not recommend doing away with <intellectualRights>, since the ECC requires it and it will probably continue to be used by LTER data contributors for the foreseeable future.

We can consider resolving this once the new version of the document goes to production.

@clnsmth
Contributor Author

clnsmth commented Nov 8, 2024

Thanks for the heads-up @gremau.

I'll raise this issue at the next developer meeting to let everyone know it is moving forward.

@srearl
Collaborator

srearl commented Nov 9, 2024

@clnsmth do you have any examples handy of data packages that include this element?

@clnsmth
Contributor Author

clnsmth commented Nov 11, 2024

You bet @srearl! BLE-LTER makes use of this pattern. For example:

Beaufort Lagoon Ecosystems LTER and V. Lougheed. 2020. Carbon flux from aquatic ecosystems of the Arctic Coastal Plain along the Beaufort Sea, Alaska, 2010-2018 ver 7. Environmental Data Initiative. https://doi.org/10.6073/pasta/e6c261fbd143e720af5a46a9a131a616.

where in the EML you'll find:

<licensed>
  <licenseName>Creative Commons Zero v1.0 Universal</licenseName>
  <url>https://spdx.org/licenses/CC0-1.0.html</url>
  <identifier>CC0-1.0</identifier>
</licensed>

listed alongside the <intellectualRights>, and which displays on the data package summary page as:

[Screenshot: the license as displayed on the data package summary page]
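To check other packages for this pattern, something like the following Python sketch could read the licensed element out of an EML document. The `SAMPLE` string is a trimmed, hypothetical dataset fragment that reuses the licensed block quoted above as posted, the function name `summarize_rights` is made up for illustration, and the sketch assumes the elements carry no namespace prefix.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical fragment; a real EML document has many more elements.
SAMPLE = """
<dataset>
  <intellectualRights>
    <para>This data package is released to the public domain.</para>
  </intellectualRights>
  <licensed>
    <licenseName>Creative Commons Zero v1.0 Universal</licenseName>
    <url>https://spdx.org/licenses/CC0-1.0.html</url>
    <identifier>CC0-1.0</identifier>
  </licensed>
</dataset>
"""


def summarize_rights(eml_text: str) -> dict:
    """Report every <licensed> entry and whether <intellectualRights> exists."""
    root = ET.fromstring(eml_text)
    licensed = [
        {child.tag: (child.text or "").strip() for child in node}
        for node in root.iter("licensed")
    ]
    has_rights = next(root.iter("intellectualRights"), None) is not None
    return {"licensed": licensed, "has_intellectual_rights": has_rights}


print(summarize_rights(SAMPLE))
```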

@clnsmth
Contributor Author

clnsmth commented Nov 25, 2024

Note: The provided BLE example (#86 (comment)) doesn't fully adhere to the recommended approach. The example's <url> value includes a .html extension (https://spdx.org/licenses/CC0-1.0.html), which is incompatible with linked data principles. For a linked data URI, the .html extension should be removed (correct format: https://spdx.org/licenses/CC0-1.0).
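A small normalization step could catch this automatically. The sketch below is one possible approach, not what any EDI or BLE tooling actually does; the function name `to_linked_data_uri` is hypothetical, and it only touches URLs under https://spdx.org/licenses/.

```python
SPDX_PREFIX = "https://spdx.org/licenses/"
HTML_SUFFIX = ".html"


def to_linked_data_uri(url: str) -> str:
    """Strip a trailing .html from an SPDX license URL; leave other URLs alone."""
    url = url.strip()
    if url.startswith(SPDX_PREFIX) and url.endswith(HTML_SUFFIX):
        return url[: -len(HTML_SUFFIX)]
    return url


print(to_linked_data_uri("https://spdx.org/licenses/CC0-1.0.html"))
```

URLs that are already in the recommended form pass through unchanged, so the function is safe to apply to every licensed/url value.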

@twhiteaker
Collaborator

Noted. We've fixed Metabase and future datasets will use the correct format.
