Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix for Blank node ordering, RDF lists and DecimalFormat #15

Merged
merged 14 commits into from
Sep 13, 2024

Conversation

fkleedorfer
Copy link
Contributor

@fkleedorfer fkleedorfer commented Sep 7, 2024

fixes #13 - A quick fix. The writeListResource method was not handling non-blank nodes properly

fixes #14 - Another quick fix, setting the locale to US

fixes #8 (!) - Not so quick. After some tinkering I found out that neither model nor graph retain any actually useful information about (blank) node ordering, but it's actually quite simple to intercept the parsing code and just keep track of newly created blank nodes during RDF parsing, which gives you their ordering. At the same time, I found out that you can also grab any labels. However, this requires formatting the RDF string representation, not the model, so the TurtleFormatter.apply(Model) will not give you preservation of blank node ordering (and the accept(Model, OutputStream) method won't either).

In while fixing #8 I found a few other bugs related to blank nodes :

  • handling of cycles
  • when blank nodes need to be labeled (more than one incoming arcs or part of a blank node cycle, but in each cycle, only one needs to be labeled)

Oh and it may be able to address #2, at least partly, based on the approach introduced here.

EDIT: a blank node cycle is e.g.:

_:a :has _:b .
_:b :has _:a .

@fkleedorfer fkleedorfer changed the title Bugfix list and numberformat Bugfix for RDF lists and DecimalFormat Sep 7, 2024
@fkleedorfer fkleedorfer changed the title Bugfix for RDF lists and DecimalFormat Bugfix for Blank node ordering, RDF lists and DecimalFormat Sep 9, 2024
@fkleedorfer
Copy link
Contributor Author

There is still a bug in there regarding blank node labeling, commit coming up

@fkleedorfer
Copy link
Contributor Author

@atextor Being new to gradle, I did not see an option to format the source code (ironically ;-). Is there any build target for formatting?

@fkleedorfer
Copy link
Contributor Author

fkleedorfer commented Sep 10, 2024

@atextor I understand this is quite an intrusive PR - I tried to minimize that but it comes with the territory.

In that vein, two more refactorings that I'm planning now :

  • put the BlankNodeMetadata into the State (so we don't clutter the method signatures with it)
  • change the ComparableRDFNodeFactory such that it returns a comparator, not a Comparable, so that we don't create a new Object for each comparison

@fkleedorfer
Copy link
Contributor Author

I think this is ready to merge.

I would appreciate publication to maven central soon, so I can then finish the contribution of the RDF format to spotless, which uses turtle-formatter for TTL

@atextor
Copy link
Owner

atextor commented Sep 12, 2024

Hi @fkleedorfer,
first of all, thank you very much for your contribution! 🎆

@atextor Being new to gradle, I did not see an option to format the source code (ironically ;-). Is there any build target for formatting?

Unfortunately there is no such thing with the project being mostly a small one-person project (until now that is :-)). I have created issue #25 to document/check/automate this. There is a plan to (1) turn the owl-cli project into a Maven project and (2) merge the turtle-formatter project into owl-cli, this would enable usage of e.g. the formatter-maven-plugin. I think this is along the lines of what you have in mind. For now, I'd just ignore formatting issues in your PR and fix it manually subsequently.

@atextor I understand this is quite an intrusive PR - I tried to minimize that but it comes with the territory.

I was aware that is no small fix when creating issue #8, so no problem from my side. I'd rather compliment you on going all the way of fixing it, even when it turned out to be no small thing.

I think this is ready to merge.

In #8 (comment) you mention checking for collisions with other generated blank node labels, is this already addressed?
Update: I've seen it in the PR

I would appreciate publication to maven central soon, so I can then finish the contribution of the RDF format to spotless, which uses turtle-formatter for TTL

I'm happy to do a release as soon as the PR is merged.

One more thing: If you like, please feel free to add yourself to the CONTRIBUTORS file (need to pull main first, I just created it).

@atextor
Copy link
Owner

atextor commented Sep 12, 2024

Please edit the PR description so that it does not contain "fix #2" any more, because that would lead to GH automatically closing that issue with this PR, which is not what we want (since it's only one step towards the solution).

- stable ordering
- preserve label if present
- detect blank node cycles
Hard (impossible) to test at this point because labels are only retained when formatting an TTL source string, and in TTL, you cannot have anonymous blank nodes in those constellations that would cause them to be labeled by the algorithm.
@fkleedorfer
Copy link
Contributor Author

ready

@fkleedorfer
Copy link
Contributor Author

(really ready)

@fkleedorfer
Copy link
Contributor Author

There is a plan to (1) turn the atextor/owl-cli#19 and (2) atextor/owl-cli#20, this would enable usage of e.g. the formatter-maven-plugin.There is a plan to (1) turn the atextor/owl-cli#19 and (2) atextor/owl-cli#20, this would enable usage of e.g. the formatter-maven-plugin.

Have you looked into writing a plugin for the formatter-maven-plugin? I chose spotless because we already use it. It's not entirely straightforward to contribute a language, though. Allows for maven, gradle and sbt support if you can pull it off, which is nice.

Copy link
Owner

@atextor atextor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@atextor
Copy link
Owner

atextor commented Sep 13, 2024

There is a plan to (1) turn the atextor/owl-cli#19 and (2) atextor/owl-cli#20, this would enable usage of e.g. the formatter-maven-plugin.There is a plan to (1) turn the atextor/owl-cli#19 and (2) atextor/owl-cli#20, this would enable usage of e.g. the formatter-maven-plugin.

Have you looked into writing a plugin for the formatter-maven-plugin? I chose spotless because we already use it. It's not entirely straightforward to contribute a language, though. Allows for maven, gradle and sbt support if you can pull it off, which is nice.

I was only talking about using the formatter-maven-plugin to format the Java code, to make contributions like yours more convenient. I've not looked into enhancing it to support Turtle.

@fkleedorfer
Copy link
Contributor Author

fkleedorfer commented Sep 13, 2024

I was only talking about using the formatter-maven-plugin to format the Java code, to make contributions like yours more convenient. I've not looked into enhancing it to support Turtle.

The spotless project provides java formatting for gradle as well. Might be worth looking into.

@atextor atextor merged commit bfcd99d into atextor:main Sep 13, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants