Add identifiers and splits #3

Closed
wants to merge 14 commits into from
Changes from 12 commits
144 changes: 143 additions & 1 deletion pull.py
@@ -47,6 +47,20 @@
'CC-BY-ND-3.0',
'CC-BY-ND-4.0',
],
'CC-BY': [ # any version
Owner

I don't see any CC-BY slug on their page. This link should not take you to any specific entry: https://www.gnu.org/licenses/license-list.html#CC-BY

Owner

I've added checks in master's pull.py to watch out for this sort of thing.

Contributor Author

I just pushed a correction for this and for CC-BY-SA to the pull request branch.
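
A check of the kind discussed in this thread could look roughly like the sketch below. It is not the code that was added to master's pull.py, which is not shown here. It assumes a hypothetical set `upstream_tags` holding the tags actually found on the FSF page, and it reports any `SPLITS` key that has no counterpart there. The sample data is illustrative only.

```python
def unknown_tags(splits, upstream_tags):
    """Return the SPLITS keys that are absent from the upstream FSF tags."""
    return sorted(tag for tag in splits if tag not in upstream_tags)


# Hypothetical data for illustration; not taken from the FSF page.
splits = {
    'CC-BY': ['CC-BY-1.0', 'CC-BY-4.0'],
    'FDL': ['FDLv1.1', 'FDLv1.2', 'FDLv1.3'],
}
upstream_tags = {'FDL', 'ccby'}

missing = unknown_tags(splits, upstream_tags)
if missing:
    print('SPLITS keys with no matching FSF tag:', ', '.join(missing))
```

Failing loudly on such keys would surface a mistyped or invented tag at pull time rather than in the published data.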

'CC-BY-1.0',
'CC-BY-2.0',
'CC-BY-2.5',
'CC-BY-3.0',
'CC-BY-4.0',
],
'CC-BY-SA': [ # any version
'CC-BY-SA-1.0',
'CC-BY-SA-2.0',
'CC-BY-SA-2.5',
'CC-BY-SA-3.0',
'CC-BY-SA-4.0',
],
'FDL': [
'FDLv1.1',
'FDLv1.2',
@@ -57,8 +71,40 @@
'FDLv1.2',
'FDLv1.3',
],
'FreeArt': [ # any version
'LAL-1.2',
Owner

Here is another case where the FSF doesn't say “any version”. The version they link is 1.3. Until we get clarity on this, can we drop the SPLITS entry (and the associated LAL-1.2 IDENTIFIERS entry) and come back to them in follow-up work?

They also link the English translation, while we currently include only the canonical French version. Until we sort out translations (spdx/license-list-XML#438), we should remove the LAL-1.3 IDENTIFIERS entry as well.

'LAL-1.3',
],
'FreeBSDDL': ['FreeBSD'], # unify (multi-tag)
# FIXME: still working through this
'NPL': [ #any version
Owner

The FSF wording for this isn't “any version”, it's “versions 1.0 and 1.1”.

'NPL-1.0',
'NPL-1.1',
],
'OSL': [ # any version through 3.0
'OSL-1.0',
'OSL-1.1',
'OSL-2.0',
'OSL-2.1',
'OSL-3.0',
],
'RPL': [ # any version - Note that FSF website does not state any version, but references version 1.3 in the URL. It is assumed that it also covers version 1.1 and 1.5, but this should be verified with FSF.
Owner

The FSF at least occasionally says “any version” when that's their intention. Until we get clarity on this, can we drop it from this PR (and the associated RPL-* IDENTIFIERS entries) and come back to them in follow-up work?

'RPL-1.1',
'RPL-1.3',
'RPL-1.5',
],
'Unicode': [ # any version
'Unicode-DFS-2015',
'Unicode-DFS-2016',
Owner

Here is another case where the FSF doesn't say “any version”. The version they link is from 2012 (based on the copyright) and differs from Unicode-DFS-2015 at least in:

the above copyright notice(s) and this permission notice appear with all copies of the Data Files or Software,

vs. Unicode-DFS-2015's

this copyright and permission notice appear with all copies of the Data Files or Software,

Until we get clarity on this, can we drop it from this PR (and the associated Unicode-DFS-* IDENTIFIERS entries) and come back to them in follow-up work?
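
A comparison like the one above can be reproduced with a word-level diff. The following is a minimal sketch using Python's standard `difflib`; the helper name `word_changes` is made up for illustration, and the two strings are the excerpts quoted in this comment, not the full license texts.

```python
import difflib

fsf_linked = ('the above copyright notice(s) and this permission notice appear '
              'with all copies of the Data Files or Software,')
spdx_2015 = ('this copyright and permission notice appear '
             'with all copies of the Data Files or Software,')


def word_changes(old, new):
    """Return (words only in old, words only in new) from a word-level diff."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    removed, added = [], []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != 'equal':
            removed.extend(old_words[i1:i2])
            added.extend(new_words[j1:j2])
    return removed, added


removed, added = word_changes(fsf_linked, spdx_2015)
print('only in FSF-linked text:', removed)
print('only in Unicode-DFS-2015:', added)
```

Two texts that match exactly produce two empty lists, which is the condition an exact-match mapping would need.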

],
'W3C': [ # any version
'W3C',
'W3C-20150513',
Owner

Here is another case where the FSF doesn't say “any version”. The version they link is from 2002-12-31 (based on the URI) which matches our W3C. Our W3C-19980720, on the other hand, has “This W3C Work”, and our W3C-20150513 has “Permission to copy, modify, and distribute this work”. Until we get clarity on this, can we drop the SPLITS entry from this PR (and all but the W3C IDENTIFIERS entries) and come back to them in follow-up work?

'W3C-19980720',
],
'Zope2.0': [ # Versions 2.0 and later
Owner

The FSF wording for this isn't “Versions 2.0 and later”, it's “versions 2.0 and 2.1”.

'ZPL-2.0',
'ZPL-2.1',
],
}

IDENTIFIERS = {
@@ -70,9 +116,21 @@
'AcademicFreeLicense2.1': {'spdx': 'AFL-2.1'},
'AcademicFreeLicense3.0': {'spdx': 'AFL-3.0'},
'Aladdin': {'spdx': 'Aladdin'},
'apache1.1': {'spdx': 'Apache-1.1'},
'apache1': {'spdx': 'Apache-1.0'},
'apache2': {'spdx': 'Apache-2.0'},
'apsl1': {'spdx': 'APSL-1.0'},
'apsl2': {'spdx': 'APSL-2.0'},
'ArtisticLicense': {'spdx': 'Artistic-1.0'},
'ArtisticLicense2': {'spdx': 'Artistic-2.0'},
'BerkeleyDB': {'spdx': 'Sleepycat'},
'bittorrent': {'spdx': 'BitTorrent-1.1'},
'boost': {'spdx': 'BSL-1.0'},
'CC-BY-1.0': {'spdx': 'CC-BY-1.0'},
'CC-BY-2.0': {'spdx': 'CC-BY-2.0'},
'CC-BY-2.5': {'spdx': 'CC-BY-2.5'},
'CC-BY-3.0': {'spdx': 'CC-BY-3.0'},
'CC-BY-4.0': {'spdx': 'CC-BY-4.0'},
'CC-BY-NC-1.0': {'spdx': 'CC-BY-NC-1.0'},
'CC-BY-NC-2.0': {'spdx': 'CC-BY-NC-2.0'},
'CC-BY-NC-2.5': {'spdx': 'CC-BY-NC-2.5'},
@@ -83,16 +141,25 @@
'CC-BY-ND-2.5': {'spdx': 'CC-BY-ND-2.5'},
'CC-BY-ND-3.0': {'spdx': 'CC-BY-ND-3.0'},
'CC-BY-ND-4.0': {'spdx': 'CC-BY-ND-4.0'},
'CC-BY-SA-1.0': {'spdx': 'CC-BY-SA-1.0'},
'CC-BY-SA-2.0': {'spdx': 'CC-BY-SA-2.0'},
'CC-BY-SA-2.5': {'spdx': 'CC-BY-SA-2.5'},
'CC-BY-SA-3.0': {'spdx': 'CC-BY-SA-3.0'},
'CC-BY-SA-4.0': {'spdx': 'CC-BY-SA-4.0'},
'CC0': {'spdx': 'CC0-1.0'},
'CDDL': {'spdx': 'CDDL-1.0'},
'CPAL': {'spdx': 'CPAL-1.0'},
'CeCILL': {'spdx': 'CECILL-2.0'},
'CeCILL-B': {'spdx': 'CECILL-B'},
'CeCILL-C': {'spdx': 'CECILL-C'},
'ClarifiedArtistic': {'spdx': 'ClArtistic'},
'clearbsd': {'spdx': 'BSD-3-Clause-Clear'},
'CommonPublicLicense10': {'spdx': 'CPL-1.0'},
'cpol': {'spdx': 'CPOL-1.02'},
'Condor': {'spdx': 'Condor-1.1'},
'ECL2.0': {'spdx': 'ECL-2.0'},
'eCos11': {'spdx': 'RHeCos-1.1'},
'eCos2.0': {'spdx': 'GPL-2.0+ WITH eCos-exception-2.0'},
'EPL': {'spdx': 'EPL-1.0'},
'EPL2': {'spdx': 'EPL-2.0'}, # not in license-list-XML yet
'EUDataGrid': {'spdx': 'EUDatagrid'},
@@ -103,8 +170,83 @@
'FDL1.2': {'spdx': 'GFDL-1.2'},
'FDL1.3': {'spdx': 'GFDL-1.3'},
'FreeBSD': {'spdx': 'BSD-2-Clause-FreeBSD'},
'freetype': {'spdx': 'FTL'},
'GNUAllPermissive': {'spdx': 'FSFAP'},
'GNUGPLv3': {'spdx': 'GPL-3.0'},
'gnuplot': {'spdx': 'gnuplot'},
'GPLv2': {'spdx': 'GPL-2.0'},
'HPND': {'spdx': 'HPND'},
'IBMPL': {'spdx': 'IPL-1.0'},
'iMatix': {'spdx': 'iMatix'},
'imlib': {'spdx': 'Imlib2'},
'ijg': {'spdx': 'IJG'},
'intel': {'spdx': 'Intel'},
'IPAFONT': {'spdx': 'IPA'},
'ISC': {'spdx': 'ISC'},
'JSON': {'spdx': 'JSON'},
'LAL-1.2': {'spdx':'LAL-1.2'},
'LAL-1.3': {'spdx':'LAL-1.3'},
'LGPLv3': {'spdx': 'LGPL-3.0'},
'LGPLv2.1': {'spdx': 'LGPL-2.1'},
'LPPL-1.2': {'spdx': 'LPPL-1.2'},
'LPPL-1.3a': {'spdx': 'LPPL-1.3a'},
'lucent102': {'spdx': 'LPL-1.02'},
'ModifiedBSD': {'spdx': 'BSD-3-Clause'},
'MPL': {'spdx': 'MPL-1.1'},
'MPL-2.0': {'spdx':'MPL-2.0'},
'ms-pl': {'spdx': 'MS-PL'},
'ms-rl': {'spdx': 'MS-RL'},
'NASA': {'spdx': 'NASA-1.3'},
'NCSA': {'spdx':'NCSA'},
'newOpenLDAP': {'spdx': 'OLDAP-2.7'},
'Nokia': {'spdx': 'Nokia'},
'NoLicense': {'spdx': 'NONE'},
'NOSL': {'spdx': 'NOSL'},
'NPL-1.0': {'spdx': 'NPL-1.0'},
'NPL-1.1': {'spdx': 'NPL-1.1'},
'ODbl': {'spdx': 'ODbL-1.0'},
'oldOpenLDAP': {'spdx': 'OLDAP-2.7'},
'OpenPublicL': {'spdx': 'OPL-1.0'},
'OpenSSL': {'spdx': 'OpenSSL'},
'OriginalBSD': {'spdx': 'BSD-4-Clause'},
'OSL-1.0': {'spdx': 'OSL-1.0'},
'OSL-1.1': {'spdx': 'OSL-1.1'},
'OSL-2.0': {'spdx': 'OSL-2.0'},
'OSL-2.1': {'spdx': 'OSL-2.1'},
'OSL-3.0': {'spdx': 'OSL-3.0'},
'PHP-3.01': {'spdx': 'PHP-3.01'},
'Python': {'spdx': 'Python-2.0'}, # Note: references later versions which are not in the SPDX license list
'QPL': {'spdx': 'QPL-1.0'},
'RPL-1.1': {'spdx': 'RPL-1.1'},
'RPL-1.5': {'spdx': 'RPL-1.5'},
'RPSL': {'spdx': 'RPSL-1.0'},
'Ruby': {'spdx': 'Ruby'}, # Note that the text linked is not an exact match to the SPDX license list
'SGIFreeB': {'spdx': 'SGI-B-2.0'},
'SILOFL': {'spdx': 'OFL-1.1'},
'SISSL': {'spdx': 'SISSL'}, # Note that the header on the FSF website states version 1.0, but the link points to version 1.1. The SPDX license is version 1.1
'SPL': {'spdx': 'SPL-1.0'},
'StandardMLofNJ': {'spdx': 'SMLNJ'},
'Unicode-DFS-2015': {'spdx': 'Unicode-DFS-2015'},
'Unicode-DFS-2016': {'spdx': 'Unicode-DFS-2016'},
'Unlicense': {'spdx': 'Unlicense'},
'UPL': {'spdx': 'UPL-1.0'},
'Vim': {'spdx': 'Vim'},
'W3C': {'spdx': 'W3C'},
'W3C-20150513': {'spdx': 'W3C-20150513'},
'W3C-19980720': {'spdx': 'W3C-19980720'},
'Watcom': {'spdx': 'Watcom-1.0'},
'WTFPL': {'spdx': 'WTFPL'},
'X11License': {'spdx': 'X11'},
'XFree861.1License': {'spdx': 'XFree86-1.1'},
'xinetd': {'spdx': 'xinetd'},
'Yahoo': {'spdx': 'YPL-1.1'},
'Zend': {'spdx': 'Zend-2.0'},
'Zimbra': {'spdx': 'Zimbra-1.3'},
'ZLib': {'spdx': 'Zlib'},
'Zope': {'spdx': 'ZPL-1.1'}, # Note the FSF refers to version 1.0 and SPDX uses version 1.1 - it should be verified that 1.1 should be included
Owner

I'd rather leave these inexact matches off. If someone is looking up GPL-compat in this API, for example, false negatives are much less problematic than false positives.

Contributor Author

If we want to take a more conservative approach, I would remove the SPDX IDs for Zope and Ruby. I consider SISSL to be an internal inconsistency on the FSF website, and I believe they intend it to match the referenced SPDX license.

In your review comment, you included Zlib, Zimbra and Zend. I'm assuming your comment only referred to Zope - let me know if that is incorrect or if there are other license matches that concern you.

Owner

> In your review comment, you included Zlib, Zimbra and Zend. I'm assuming your comment only referred to Zope - let me know if that is incorrect or if there are other license matches that concern you.

I'd rather have this bulk PR only cover exact matches. Can you drop anything you consider questionable and file those in per-license(-group) PRs?

Contributor Author

Done - I already removed the matches I consider questionable and added them to the list of FSF license ids without an associated SPDX license ID.
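
The exact-match policy agreed in this thread can be illustrated with a small lookup sketch. It is not code from pull.py. It assumes that each value in `SPLITS` names a key in `IDENTIFIERS`, which is inferred from the diff rather than confirmed, and the helper `spdx_ids` is hypothetical. Anything without an exact SPDX mapping is dropped, so the lookup errs toward false negatives.

```python
def spdx_ids(tag, splits, identifiers):
    """Map one FSF tag to SPDX IDs, keeping only known exact matches."""
    ids = []
    # A tag with no SPLITS entry is treated as a single license.
    for part in splits.get(tag, [tag]):
        spdx = identifiers.get(part, {}).get('spdx')
        if spdx is not None:
            ids.append(spdx)
    return ids


# Sample data shaped like the tables in this diff.
splits = {'Zope2.0': ['ZPL-2.0', 'ZPL-2.1']}
identifiers = {
    'ZPL-2.0': {'spdx': 'ZPL-2.0'},
    'ZPL-2.1': {'spdx': 'ZPL-2.1'},
    'Ruby': {},  # assumption: a questionable match is left without an SPDX ID
}

print(spdx_ids('Zope2.0', splits, identifiers))
print(spdx_ids('Ruby', splits, identifiers))
```

A consumer checking GPL compatibility through such a lookup would see no SPDX ID for a questionable license instead of a possibly wrong one.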

'ZPL-2.0': {'spdx': 'ZPL-2.0'},
'ZPL-2.1': {'spdx': 'ZPL-2.1'},

# FIXME: still working through this
}

47 changes: 47 additions & 0 deletions unassociated-license-ids.md
@@ -0,0 +1,47 @@
# The following FSF license tags did not have any obvious match to an SPDX license
Owner

I don't think we need this in version control, because you can generate it from licenses.json with jq. Can we remove it here? And, if you like, open an issue with a list of unchecked boxes?

Owner

jq for this is:

$ jq -r 'to_entries[] | select(.value.identifiers.spdx | not) | .key' licenses.json
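
For readers without jq, the same filter can be approximated in Python. This is a sketch under the assumption that licenses.json is an object keyed by FSF tag, with an optional `identifiers.spdx` field per entry; that shape is inferred from the jq expression above. The function name and sample data are hypothetical.

```python
import json


def tags_without_spdx(licenses):
    """Return the keys whose entry has no usable identifiers.spdx value."""
    missing = []
    for tag, entry in licenses.items():
        spdx = (entry.get('identifiers') or {}).get('spdx')
        # jq's `not` treats only null and false as falsy.
        if spdx is None or spdx is False:
            missing.append(tag)
    return missing


# Hypothetical sample; real data would be loaded from licenses.json,
# for example with json.load() on the opened file.
sample = json.loads('''
{
  "ZLib": {"identifiers": {"spdx": "Zlib"}},
  "Ruby": {"identifiers": {}},
  "PINE": {}
}
''')

for tag in tags_without_spdx(sample):
    print(tag)
```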

Contributor Author

I'll replace it with an issue since some of the items have comments that would not be retained by using jq.

* ACDL
* Arphic
* ATTPublicLicense
* CryptixGeneralLicense
* DOR
* dsl
* ecfonts
* GNUVerbatim
* GPL-PA
* GPLFonts - this may be an exception?
* GPLOther
* HESSLA
* informal
* Jahia
* josl
* ksh93
* Lha
* Ms-SS
* PerlLicense - Not included because the description on the FSF website does not seem to represent this license
* RPL - References version 1.3, which is not in the SPDX license list. SPDX has versions 1.1 and 1.5
* OculusRift
* OpenContentL
* OpenPublicationL
* PINE
* Plan9
* PPL
* PublicDomain
* Phorum
* Python1.6a2
* PythonOld
* Scilab
* Scratch
* SML
* Squeak
* SunCommunitySourceLicense
* SunSolarisSourceCode
* SystemC-3.0
* Truecrypt-3.0
* UtahPublicLicense
* WebM
* YaST