Document the XML schema somewhere in this repository #391
Comments
Thanks Trevor, those are definitely in the plans. As you may have gleaned, SPDX is in the midst of two migrations: one to GitHub (from LF-hosted git + bugzilla) and another to XML. Neither migration is complete yet. The XML schema isn't finalized (and this repo is marked not-authoritative in its readme), so the implicit schema embodied in these files will likely change as the schema attribute and element names are finalized. At the same time, we are working on documentation for contributors and license requestors, which we hope to have ready once the schema is stable and all the existing licenses have been imported. You can find an overview of the XML project here: And a fairly recently updated status on the XML project (with outstanding action items) here: |
On Wed, Feb 22, 2017 at 11:53:44AM -0800, Bradlee H. Edmondson wrote:
The XML schema isn't finalized yet (and this repo is marked
not-authoritative in its readme)…
Yup, non-authoritative-ness is well marked. That doesn't mean you
need to keep the schema out of version control, and in fact, is
probably even more reason to keep the schema *in* version control ;).
I'm happy to try and assemble a pull request for the currently-landed
licenses if that would be useful. Do you have a preference for
tooling? RELAX NG? XSD?
And a fairly recently updated status on the XML project (with outstanding action items) here:
http://wiki.spdx.org/view/Legal_Team/Templatizing/ActionPlan
How can I get involved in this? For example, I'd rather have a
different solution for deprecated licenses (#392) than the one listed
in the action plan, but I'm not sure how/where to advocate for it.
|
Following the wiki lead turned up existing docs for the XML in [1]
which links the existing field notes in [2]. I recommend moving [1]
from the wiki into this repository as a first pass at resolving this
issue. The wiki page has a browsable history [3], but the UI for that
is a lot harder to use than Git history, and wiki changes are probably
not peer-reviewed before they land (while Git commits can be reviewed
before they land). If this sounds useful, I'm happy to provide a pull
request, with or without preserving the wiki's history, just let me
know.
If, on the other hand, the intention is to keep the schema
documentation in the wiki, then I think we should at least link to [1]
from this repo's README.
[1]: http://wiki.spdx.org/view/Legal_Team/Templatizing/tags-matching
[2]: https://spdx.org/spdx-license-list/license-list-overview#fields
[3]: http://wiki.spdx.org/index.php?title=Legal_Team/Templatizing/tags-matching&action=history
|
@wking I've been working on a draft XML schema based on the currently proposed XML property and attribute names. The results are documented at https://docs.google.com/document/d/1z9n44xLH2MxT576KS_AbTOBtecyl5cw6RsrrQHibQtg. If you have bandwidth and expertise in schema definitions, I could use some help in a couple of areas; just let me know. Once I have something close, I plan to add it to the Git repository for the XML licenses under a folder "schema" with a file named "ListedLicense.xsd". |
On Thu, Feb 23, 2017 at 09:49:10AM -0800, goneall wrote:
If you have bandwidth and expertise in Schema definitions, I could
use some help on a couple areas - just let me know.
Not much expertise, but I don't mind rooting through docs ;).
Once I have something close, I plan to add it to the Git repository
for the XML licenses under a folder "schema" with a file named
"ListedLicense.xsd".
license.xsd is probably sufficient (we're unlikely to want a separate
schema for unlisted licenses ;). Can you push whatever WIP you have
to a branch somewhere? I'll file pull-requests against it as I have
time to move things forward, and once it looks good enough to you we
can land it in master here.
|
I went ahead and pushed the schema to a new branch "schemadev". Where I got stuck was representing the license body, which can contain text, bold, lists, etc. in any order. I was doing some reading on the standard to see how to represent that type of structure. This is a very early-stage attempt. |
@wking I found the solution to the mixed type: https://www.w3schools.com/xml/schema_complex_mixed.asp I'll update the schema file with a solution in the next few hours. |
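For readers following along, the mixed-content approach mentioned above can be sketched roughly as follows. This is only an illustration, not the schema from the schemadev branch: the type and element names (`formattedTextType`, `listType`, `license`, `b`, `p`, `list`, `li`) are hypothetical placeholders. The key pieces are `mixed="true"`, which allows character data between child elements, and an unbounded `xs:choice`, which allows those children in any order and any number of times.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: all type and element names below are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- mixed="true" permits plain text interleaved with the child elements. -->
  <xs:complexType name="formattedTextType" mixed="true">
    <!-- An unbounded choice permits the children in any order, repeated freely. -->
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="b" type="formattedTextType"/>
      <xs:element name="p" type="formattedTextType"/>
      <xs:element name="list" type="listType"/>
    </xs:choice>
  </xs:complexType>

  <!-- A list holds one or more items, each of which may itself be formatted text. -->
  <xs:complexType name="listType">
    <xs:sequence>
      <xs:element name="li" type="formattedTextType" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical root element whose body is free-form formatted text. -->
  <xs:element name="license" type="formattedTextType"/>
</xs:schema>
```

Under these assumptions, an instance such as `<license>Some <b>bold</b> text<list><li>item</li></list></license>` would be accepted, since text and the listed elements may be freely interleaved.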
@wking @bradleeedmondson I just pushed a first pass at the XML schema to the schemadev branch. Could you please review it and create pull requests for any improvements? There are quite a few additional restrictions we could add to the attributes and elements, but I thought it would be good to start simple. |
On Fri, Feb 24, 2017 at 07:28:13PM -0800, goneall wrote:
@wking @bradleeedmondson I just pushed a first pass at the XML
schema to the schemadev branch.
I had been kicking this around locally on Friday afternoon, but was
getting hung up on the license body. If you have time to look over my
WIP before I have time to look over your recent changes, I've pushed
them up to [1]. As I make adjustments, I've been working on the
Apache 2.0 license example to test validation [2].
How have you been testing against the XSD? xmllint doesn't seem to
like my inlining the XHTML spec, and I haven't had enough time to test
the Xerces-C++ StdInParse example. Or are we intending to redefine the
markup we need for the license text instead of handing that off to
XHTML?
[1]: https://github.com/wking/license-list-XML/tree/schema
[2]: https://github.com/wking/license-list-XML/blob/schema/src/Apache-2.0.xml
|
Back online (sorry about the delay in responding). I have been testing the XSD that is pushed to the schemadev branch using the online W3C validator. I have not tested it against any licenses. |
On Wed, Mar 22, 2017 at 10:29:34AM -0700, goneall wrote:
It looks like you are using some of the current element and
attribute names as opposed to the terms we are planning to move to
(e.g. LicenseCollection as opposed to SPDXLicenseCollection).
…
One of the challenges with testing is the current licenses are not
using the correct element and attribute names.
This is one reason I wasn't using all the updated names ;). But in
some cases (e.g. LicenseCollection vs. SPDXLicenseCollection), I think
the name in [1] is overly specific. A document like:
<?xml version="1.0" encoding="UTF-8"?>
<LicenseCollection xmlns="http://www.spdx.org/license" …>
…
</LicenseCollection>
is already SPDX-namespaced (via the xmlns attribute) without needing
to prefix the element name itself.
I could do a manual update to Apache 2.0 and post it to my local
repo for testing.
I have a somewhat-updated Apache 2.0 in my branch, if that helps.
As far as next steps - I would suggest we decide if we want to use
the XHTML notation, merge the 2 schemas, and create a test license.
+1. I'm fine skipping XHTML for now, since I was having trouble
getting a validator to handle the inclusion. If we do roll our own
markup for license text, we'll need to be very clear about what
elements we do allow where though.
[1]: https://docs.google.com/document/d/1z9n44xLH2MxT576KS_AbTOBtecyl5cw6RsrrQHibQtg
|
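To illustrate the namespacing point from the comment above: a schema can bind unprefixed element names to the SPDX namespace through `targetNamespace`, so the element name itself needs no "SPDX" prefix. The sketch below is an assumption-laden illustration, not the project's actual schema. Only the namespace URI and the `LicenseCollection` name come from the thread; the `license` child and its `xs:anyType` content are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the "license" child element is a hypothetical placeholder. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.spdx.org/license"
           elementFormDefault="qualified">

  <!-- Declared in the target namespace, so its expanded name is
       {http://www.spdx.org/license}LicenseCollection. -->
  <xs:element name="LicenseCollection">
    <xs:complexType>
      <xs:sequence>
        <!-- xs:anyType stands in for the real license content model. -->
        <xs:element name="license" type="xs:anyType" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With a schema along these lines, a document whose root is `<LicenseCollection xmlns="http://www.spdx.org/license">` is matched by namespace URI plus local name, regardless of which prefix (if any) the instance document uses.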
We discussed this on legal call today. Mark had mentioned DTD vs XSD, but none of us on the call know enough about XML to say one way or the other. In the end we think we're good with @goneall and @wking proceeding toward an XSD, but we can always discuss further when we finish up the current merges, do the schema conversion, and do any further cleanup. |
I'm OK with DTDs, but XSDs are generally more widely accepted because they use XML syntax. I'll be working on the XSD next week. If anyone sees a need for a DTD representation and would like to contribute a DTD in addition to the XSD, that would be fine as well. |
I just updated the schema file in the schemadev branch. I also updated Apache-2.0.xml to use the new schema. I believe it is done, but I am not sure the LicenseType should have a "" child. I changed it to choice from any since the formatted text elements can occur more than once. @wking @bradleeedmondson Could you review the schema definition and the Apache-2.0.xml in the schemadev branch? |
I'm going to close this since it was originally intended to document the fact that we need a schema. This is now resolved. Some other issues were discussed in the comments. If any of them remain, I suggest we open separate issues to track them and make the specifics more visible. |
Also documenting more about how this project is supposed to work (e.g. what content is under which licenses? How folks should go about contributing, etc.) would be nice ;).