RDF object mapping #186
Comments
@frensjan Thanks for this comprehensive treatment! I'll discuss with some of my colleagues. I'll just comment on the output format:
Thanks @VladimirAlexiev. I think the main 'contribution' of the example is to a) flexibly perform sub-queries and b) generate 'objects' for easy consumption, without being restricted by the tabular / tuple form of SELECT or the RDF form of CONSTRUCT.

I like GraphQL a lot for its simplicity and flexibility to express (simple) queries. But for the RDF use-cases I'm working on, it's not powerful enough without resorting to a complex schema with a lot of 'custom' fields to express computations. I think the proposed line of object construction could provide a powerful foundation for a GraphQL layer. But it pushes a large chunk of the implementation into the SPARQL layer, in my view in such a way that the interface provides a lot more expressive power.

As for GraphQL and JSON-LD: the proposed approach can be used to generate JSON-LD, just not 'automatically'. For instance the blog post example can be queried with something like:

select object {
"@context": "https://schema.org",
"@type": "BlogPosting",
"@id": ?post,
"headline": ?headline,
"alternativeHeadline": ?alternativeHeadline,
"image": ?image,
"award": ?award,
...
"author": {
"@type": "Person",
"name": ?authorName
}
}
where {
?post a :BlogPosting ;
:headline ?headline ;
:alternativeHeadline ?alternativeHeadline ;
:image ?image ;
:award ?award ;
...
:author [
a :Person ;
:name ?authorName
] .
}

Perhaps if the verbosity needs to come down, some abbreviation can be introduced, e.g. referencing statement objects (values) through path expressions:

select object {
"@context": "https://schema.org",
"@type": "BlogPosting",
"@id": ?post,
:headline,
:alternativeHeadline,
:image,
:award,
...
"author": {
"@type": "Person",
"name": :author/:name
}
}
where {
?post a :BlogPosting
}

Although I'd say it's a bit implicit what the focus node is in this case.

The GraphQL with JSON-LD combination is attractive. But GraphQL is also a little bit restrictive. SELECT queries have a lot of flexibility in terms of what to pull from the database, but the tabular form is restrictive. CONSTRUCT queries allow for more flexibility, but being tied to the RDF model is a tad restrictive in form and very limiting in terms of integration with other technologies.

As for JSON-LD: I'm not a user, so it's probably a lack of knowledge on my part. I can see it being very convenient to generate from other data environments. Mapping objects in JavaScript, Java, Python, etc. to JSON-LD is probably pretty easy and useful. Consuming it as RDF is easy if there's a parser, but I don't see much benefit in parsing a JSON-LD input over parsing, say, a Turtle input. JSON-LD output is to me also not that useful unless you guarantee some sort of shape: if a client can generate code that matches the expected structure, this really helps a developer, e.g. as with OpenAPI / JSON Schema or GraphQL. But I don't see that with just JSON-LD.
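Read per solution, the blog post query above amounts to filling a template from one row of variable bindings. The following is a minimal Python sketch of that reading. The `instantiate` helper, the `"?var"` placeholder convention and the sample bindings are all assumptions made up for illustration; none of this is part of an existing SPARQL implementation.

```python
from typing import Any

# Hypothetical template mirroring the "select object" example; strings that
# start with "?" stand for SPARQL variables.
TEMPLATE = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "@id": "?post",
    "headline": "?headline",
    "author": {"@type": "Person", "name": "?authorName"},
}


def instantiate(template: Any, binding: dict) -> Any:
    """Recursively replace "?var" placeholders with values from one solution."""
    if isinstance(template, dict):
        return {key: instantiate(value, binding) for key, value in template.items()}
    if isinstance(template, list):
        return [instantiate(item, binding) for item in template]
    if isinstance(template, str) and template.startswith("?"):
        # Assumption: unbound variables become None (null in JSON).
        return binding.get(template[1:])
    return template


# Made-up solution, for illustration only.
solution = {
    "post": "http://example.org/posts/1",
    "headline": "Hello",
    "authorName": "Jane Doe",
}
post = instantiate(TEMPLATE, solution)
```

Constant members such as `"@context"` pass through unchanged, which is why the result can be JSON-LD without the query language knowing anything about JSON-LD.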
We have created a GraphQL based approach that generates SPARQL queries and a corresponding mapping of the result set to JSON. Our design decision was to make the approach field centric, so you start with the GraphQL field and define it with a SPARQL graph pattern (using

The following examples should give an impression of how to use this approach. Note that after executing the GraphQL query of the demos, you can click on the

Documentation and demos can be found here. Feedback is very welcome.

Example: Movies

The Movie Browser Demo creates JSON from Wikidata movie data and renders it with plain JavaScript. For the sake of example, the source code of the demo is completely framework free. You can e.g. enter

query movies @debug
@prefix(map: {
rdfs: "http://www.w3.org/2000/01/rdf-schema#",
xsd: "http://www.w3.org/2001/XMLSchema#",
schema: "http://schema.org/",
wd: "http://www.wikidata.org/entity/"
wdt: "http://www.wikidata.org/prop/direct/"
})
{
Movies(limit: 10) @pattern(of: "SELECT ?s { ?s wdt:P31 wd:Q11424 . FILTER (exists { ?s rdfs:label ?l . FILTER(langMatches(lang(?l), 'en')) FILTER(CONTAINS(LCASE(STR(?l)), LCASE(''))) }) }") {
id @bind(of: "?s")
label @one @pattern(of: "?s rdfs:label ?l. FILTER(LANG(?l) = 'en')")
genres @pattern(of: "SELECT DISTINCT ?s (STR(?l) AS ?x) { ?s wdt:P136/rdfs:label ?l . FILTER(langMatches(lang(?l), 'en')) }")
}
}

{
"data": {
"Movies": [
{
"id": "http://www.wikidata.org/entity/Q1000094",
"label": "You\u0027re Dead",
"genres": [
"comedy film",
"thriller film"
]
}
]
}
}

Example: Nested subjects, predicates and objects

query moviesSPO @debug
@prefix(map: {
wd: "http://www.wikidata.org/entity/"
wdt: "http://www.wikidata.org/prop/direct/"
})
{
subjects(limit: 10) @pattern(of: "?s1 wdt:P31 wd:Q11424", to: "s1") @index(by: "?s1", oneIf: "true") {
predicates @pattern(of: "?s2 ?p2 ?o2", from: "s2", to: ["s2", "p2"]) @index(by: "?p2", oneIf: "true") {
objects @pattern(of: "?s3 ?p3 ?o3", from: ["s3", "p3"], to: "o3")
}
}
}

{
"data": {
"http://www.wikidata.org/entity/Q1000094": {
"http://schema.org/description": {
"objects": [
"1999 film directed by Andy Hurst"
]
},
"http://www.w3.org/2000/01/rdf-schema#label": {
"objects": [
"You\u0027re Dead"
]
},
"http://www.wikidata.org/prop/direct/P136": {
"objects": [
"http://www.wikidata.org/entity/Q157443",
"http://www.wikidata.org/entity/Q2484376"
]
}
}
}
}

Example: Custom GeoJSON Mapping

A demo for creating custom JSON structures, such as GeoJSON.
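For readers unfamiliar with the target format: a GeoJSON FeatureCollection is a list of Feature objects, each with a geometry and a properties map. The sketch below builds one from flat query rows in Python. The row keys (`id`, `label`, `lon`, `lat`) and the sample values are assumptions for illustration, and the code does not reproduce how the demo itself performs the mapping.

```python
import json


def rows_to_feature_collection(rows: list) -> dict:
    """Turn flat rows (one per place) into a GeoJSON FeatureCollection."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "id": row["id"],
            # GeoJSON positions are ordered longitude, latitude.
            "geometry": {"type": "Point", "coordinates": [row["lon"], row["lat"]]},
            "properties": {"label": row["label"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical rows, shaped like the solutions of a SELECT query.
rows = [
    {"id": "http://example.org/place/1", "label": "Place one", "lon": 4.9, "lat": 52.37},
]
collection = rows_to_feature_collection(rows)
print(json.dumps(collection, indent=2))
```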
I'd like to add yet another option to the (already long) list of suggestions to make consuming data from RDF data(bases) easier.
Why?
As indicated already elsewhere: capturing data in RDF and making it accessible through SPARQL has its merits. However, integration with other ecosystems is sometimes challenging because of the technology mismatch. Most of the time a 'simple' JSON interface is requested, which results in hand-written query-to-object mapping.
Previous work
A lot of related work is provided in: #39, #48, #126, #127, #128
The proposed solution also bears resemblance with JSON construction functions in SQL databases. E.g. jsonb_build_array, json_object_agg and others in PostgreSQL.
Making RDF data accessible through GraphQL is another example. It has the benefit of riding its hype cycle. However, the mapping is not completely straightforward, especially when more complex graph patterns are required, aggregations are involved, etc.
The proposed solution also draws from the ability to express (perhaps implicitly) correlated sub-queries as described in the proposal for LATERAL. As also linked in other suggestions for SPARQL 1.2, Stardog supports the array aggregation function which relates to this.
Proposed solution
The meat of this proposal is to construct trees of objects (akin to object trees from GraphQL queries) based on solutions from SPARQL queries.
The solution would (probably) require a new query (sub) type, e.g. indicated by SELECT OBJECT, CONSTRUCT OBJECT, CONSTRUCT FRAMED, JSON or something else. For the examples below I took the liberty to introduce the 'SELECT OBJECT' keyword.
Scalars, lists and objects/maps
The abstract data model for results could consist of elements like scalars, lists and objects / maps.
Scalars could be selected by:
Examples of the syntax:
Lists could be selected by:
Examples of the syntax:
Objects / maps could be selected by scalar and list expressions keyed by strings. Examples of the syntax:
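To make the abstract data model concrete, here is a small Python sketch that treats a result as a scalar, a list of values, or a string-keyed map of values. Using plain Python types as stand-ins for RDF terms is an assumption of this sketch, as is the `is_value` helper; the example object reuses the names from the solutions further down.

```python
import json

# Scalar types used as a JSON-friendly stand-in for RDF terms (an assumption).
SCALAR_TYPES = (str, int, float, bool)


def is_value(candidate: object) -> bool:
    """Check that an object fits the scalar / list / object model."""
    if candidate is None or isinstance(candidate, SCALAR_TYPES):
        return True
    if isinstance(candidate, list):
        return all(is_value(item) for item in candidate)
    if isinstance(candidate, dict):
        # Objects / maps are keyed by strings only.
        return all(isinstance(k, str) and is_value(v) for k, v in candidate.items())
    return False


# An object holding a scalar and a list of scalars.
example = {"id": ":p2", "names": ["Tim Brown", "Touchdown Timmy"]}
print(is_value(example), json.dumps(example))
```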
Partitioning / grouping
I must note that I haven't fully figured out what the semantics of the (implicit) partitioning / aggregation should be. I'm not that enthusiastic about solutions that depend on sorting the solutions before grouping / partitioning / aggregation.
A possible route here could be to define all variables outside of a list as the composite key for the aggregation of the contents of the list. For example given the query:
with solutions:
:p1 | John Abrams
:p2 | Tim Brown
:p2 | Touchdown Timmy
:p3 | William Clark
could be queried with
with solutions
As indicated this could also work for composite keys such as with the following query:
to generate a result such as:
Note that in this case it's a bit muddy as here also the key in the object is
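The composite-key idea from this section can be sketched in Python: every variable outside the list forms the grouping key, and the values of the list variable are collected per key without sorting the solutions first. The variable names `person` and `name`, the output key `names` and the `group_solutions` helper are assumptions of this sketch, since the original query text is not shown here.

```python
def group_solutions(solutions, key_vars, list_name, list_var):
    """Group solutions by key_vars and collect list_var values under list_name."""
    groups = {}  # dicts keep insertion order, so no sorting is needed
    for solution in solutions:
        key = tuple(solution[var] for var in key_vars)
        if key not in groups:
            obj = {var: solution[var] for var in key_vars}
            obj[list_name] = []
            groups[key] = obj
        groups[key][list_name].append(solution[list_var])
    return list(groups.values())


# The four solutions listed above, with assumed variable names.
solutions = [
    {"person": ":p1", "name": "John Abrams"},
    {"person": ":p2", "name": "Tim Brown"},
    {"person": ":p2", "name": "Touchdown Timmy"},
    {"person": ":p3", "name": "William Clark"},
]
people = group_solutions(solutions, key_vars=["person"], list_name="names", list_var="name")
```

Passing more than one name in `key_vars` gives the composite-key behaviour: two solutions end up in the same object only if they agree on all key variables.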
Further examples
The following query is an example of this idea:
The intent of this query is to query threads in a forum with their title and the total number of posts, together with information on the five latest messages posted in each thread on the one hand and the top three authors posting in the thread on the other.
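As a rough illustration of the intended result shape, the Python sketch below computes the same summary client-side from flat post rows. The row keys (`thread`, `title`, `author`, `posted`), the output keys and the sample data are hypothetical; the point is only to show the nested lists that a single object query would return directly.

```python
from collections import Counter, defaultdict


def summarise_threads(posts, latest=5, top=3):
    """Build one object per thread with nested lists for posts and authors."""
    by_thread = defaultdict(list)
    for post in posts:
        by_thread[post["thread"]].append(post)

    result = []
    for thread, thread_posts in by_thread.items():
        # ISO 8601 date strings sort chronologically as plain strings.
        newest_first = sorted(thread_posts, key=lambda p: p["posted"], reverse=True)
        author_counts = Counter(p["author"] for p in thread_posts)
        result.append({
            "thread": thread,
            "title": thread_posts[0]["title"],
            "postCount": len(thread_posts),
            "latestPosts": [
                {"author": p["author"], "posted": p["posted"]}
                for p in newest_first[:latest]
            ],
            "topAuthors": [
                {"author": author, "posts": count}
                for author, count in author_counts.most_common(top)
            ],
        })
    return result


# Made-up rows, one per post.
posts = [
    {"thread": ":t1", "title": "Welcome", "author": ":alice", "posted": "2023-01-01"},
    {"thread": ":t1", "title": "Welcome", "author": ":bob", "posted": "2023-01-02"},
    {"thread": ":t1", "title": "Welcome", "author": ":alice", "posted": "2023-01-03"},
    {"thread": ":t2", "title": "Help", "author": ":bob", "posted": "2023-01-04"},
]
threads = summarise_threads(posts, latest=2, top=1)
```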
Also, I can imagine a syntax where nested sub-queries are part of the projection, as supported by some SQL databases such as MySQL.
An example of the latter could be something like:
The intent of this query is to select the last three posts and retrieve information from the thread each was posted in as well as the author that created it.
Considerations for backward compatibility
None directly with regards to SPARQL.
Output serialisation
It should be considered whether the output of such a query would always imply a JSON serialisation (as e.g. with the JSON syntax from Apache Jena) or whether this is a more generic object / tree structure which could also map to other serialisation formats.
JSON-compatible formats
Mapping to newline separated JSON, CBOR, YAML or other 'JSON-compatible' formats is probably easily supported through content negotiation.
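A minimal Python sketch of what such content negotiation could look like for two of these formats, using only the standard library. The `serialise` function is hypothetical and the media type `application/x-ndjson` for newline separated JSON is an assumption (it is commonly used but other names exist); CBOR and YAML would need third-party libraries and are left out.

```python
import json


def serialise(results: list, media_type: str) -> str:
    """Serialise the same list of result objects according to the media type."""
    if media_type == "application/json":
        # One JSON array holding all result objects.
        return json.dumps(results)
    if media_type == "application/x-ndjson":
        # One JSON document per line, one line per result object.
        return "\n".join(json.dumps(item) for item in results)
    raise ValueError(f"unsupported media type: {media_type}")


results = [{"person": ":p1"}, {"person": ":p2"}]
as_json = serialise(results, "application/json")
as_ndjson = serialise(results, "application/x-ndjson")
```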
JSON-LD
Compatibility with JSON-LD could be considered, but it was not my intent to address this. JSON-LD compatible output could definitely be produced by something like the proposed solution.
XML
Mapping to XML is perhaps a bit more difficult in the face of root tags / objects and XML namespaces.