DFXML Schema could use XML Schema 1.1 validator #1

Open
ajnelson opened this issue Sep 11, 2013 · 3 comments

Comments

@ajnelson
Member

Initial development of the schema was based on Schema validation as performed by xmllint. This tool reports this compatibility:

Unfortunately, that version of the XML Schema language constrains the choice of element children.

  • Arbitrary order is allowed with the <all> child element specifier, but each element may occur only 0 or 1 times; OR
  • Multiple (0 to unbounded) instances of an element are allowed with the <sequence> child element specifier, but they must occur in exactly the order specified in the schema.

This makes generating DFXML awkward, particularly with some extensions I'd like to propose. One of them adds timestamp sources for NTFS files; the code that outputs timestamps would look more awkward if all <mtime>s, <atime>s, etc., had to occur contiguously.
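To illustrate the workaround a <sequence>-based schema forces on a generator, here is a minimal Python sketch that buffers a file object's children and regroups them before serializing. The child order in `ASSUMED_SEQUENCE_ORDER` is a placeholder, not the order dfxml.xsd actually declares; the element names are unqualified and the timestamp values are made up for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical child order; the real order is whatever dfxml.xsd declares.
ASSUMED_SEQUENCE_ORDER = ["filename", "filesize", "mtime", "atime", "crtime"]


def reorder_children(parent, order=ASSUMED_SEQUENCE_ORDER):
    """Regroup parent's children so each tag is contiguous and follows `order`."""
    rank = {tag: i for i, tag in enumerate(order)}
    # sorted() is stable, so repeated tags keep the order they were emitted in.
    # Tags missing from `order` are pushed to the end.
    parent[:] = sorted(parent, key=lambda child: rank.get(child.tag, len(order)))
    return parent


# A generator that emits timestamps source by source interleaves the tags.
fileobject = ET.Element("fileobject")
for tag, text in [
    ("filename", "a.txt"),
    ("mtime", "2013-09-11T00:00:00Z"),  # made-up value, first source
    ("atime", "2013-09-11T00:00:01Z"),
    ("mtime", "2013-09-11T00:00:02Z"),  # made-up value, second source
]:
    ET.SubElement(fileobject, tag).text = text

reorder_children(fileobject)
print([child.tag for child in fileobject])
# ['filename', 'mtime', 'mtime', 'atime']
```

With an XML Schema 1.1 <all> group that permits repeated elements, this regrouping step would not be needed.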

The XML Schema 1.1 language allows for a more sensible middle ground, but finding a free, open-source validator has proven a challenge. More on that below.

Can somebody supply a working example of an XML Schema 1.1 validator? With an example validator invocation, we can relax the rigidity in the current DFXML schema (v1.1.0rfc0 at the time of this writing).

@ajnelson
Member Author

I tried running Xerces on the DFXML schema, paired with a DFXML instance that xmllint validates. I coded the whole experiment as a Makefile here. The results (below) tell me that the beta Xerces fails too mysteriously to be useful. I welcome input from anybody more experienced with that code base who could help.

Version of the Xerces sample validator program (libxerces2-java-doc package): 2.11.0-6
Version of the "Beta" tarball that supports the XML Schema language v1.1: 2.11.0-xml-schema-1.1-beta

Shell transcript of the default rule:

alex@alex-virtual-machine:~/6527288$ ./try_xerces_on_dfxml.mk 
/usr/bin/xmllint --noout --schema dfxml_schema/dfxml.xsd sample.dfxml
sample.dfxml validates
java jaxp/SourceValidator -a dfxml_schema/dfxml.xsd -i sample.dfxml
sample.dfxml: 11ms
java jaxp/SourceValidator -l http://www.w3.org/XML/XMLSchema/v1.1 -a dfxml_schema/dfxml.xsd
error: Parse error occurred - http://www.w3.org/XML/XMLSchema/v1.1
java.lang.IllegalArgumentException: http://www.w3.org/XML/XMLSchema/v1.1
    at javax.xml.validation.SchemaFactory.newInstance(Unknown Source)
    at jaxp.SourceValidator.main(SourceValidator.java:413)
Note that Java will exit status 0 even if the above commands failed.

Shell transcript of the beta rule (which shows yet another discrepancy between the beta tarball and the documentation on the Xerces site; that documentation fails to mention that it is about the beta tarball):

alex@alex-virtual-machine:~/6527288$ ./try_xerces_on_dfxml.mk results_using_beta_tarball
java jaxp/SourceValidator -xsd11 -a dfxml_schema/dfxml.xsd -i sample.dfxml
error: unknown option (xsd11).
error: Parse error occurred - http://www.w3.org/2001/XMLSchema
java.lang.IllegalArgumentException: http://www.w3.org/2001/XMLSchema
    at javax.xml.validation.SchemaFactory.newInstance(Unknown Source)
    at jaxp.SourceValidator.main(SourceValidator.java:413)
Note that Java will exit status 0 even if the above commands failed.

Edited, Sep. 12: Updated the Makefile to use a DFXML sample that validates; found that the example invocation that doesn't try to use v1.1 actually works. How to specify v1.1 is still a mystery to me.
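Since the transcripts note that Java exits with status 0 even when the commands failed, a caller cannot rely on the exit status alone. One possible way around that, sketched in Python, is to scan the validator's output for the failure markers seen above (lines starting with "error:" and Java exception names). The function names here are hypothetical, and the markers are taken only from these transcripts, so other failure modes may print something different.

```python
import subprocess


def output_indicates_failure(output):
    """Return True if the text contains failure markers seen in the transcripts."""
    for line in output.splitlines():
        stripped = line.strip()
        # "error: ..." lines and "java.lang.SomethingException: ..." lines
        # both appeared in the failing runs.
        if stripped.startswith("error:") or "Exception" in stripped:
            return True
    return False


def run_validator(argv):
    """Run argv; succeed only if the exit status is 0 AND no marker appears."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so both are scanned
        text=True,
    )
    return proc.returncode == 0 and not output_indicates_failure(proc.stdout)
```

For example, `run_validator(["java", "jaxp/SourceValidator", "-a", "dfxml_schema/dfxml.xsd", "-i", "sample.dfxml"])` would wrap the invocation that worked above, assuming the same classpath and working directory as in the Makefile.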

@Raanelom

Does an XML Schema 1.0 validator still work? I have made many attempts to validate an XML file against the provided dfxml.xsd file; however, each tool I used (Xerces, JAXB, xjc) keeps telling me that the file violates "'Unique Particle Attribution'. During validation against this schema, ambiguity would be created for those two particles." This is reported at, among other places, line 52 and line 285 of the file.

@ajnelson-nist
Contributor

@Raanelom: This commit should address the issue you encountered. Thank you for the report, and apologies that I missed the comment.
