Generalize prefix over HTTP and HTTPS URIs #156
Comments
I think this is a very nice usability enhancement. I wonder whether this should be a generic adaptation, applying not just to the prefix declaration but to all IRI equality testing, or be treated as a special kind of entailment. |
Isn't this a bad practice by schema.org? If so, why should it be normalized? |
I agree that this could be viewed as special kind of entailment, but I am very skeptical about using the
If this kind of entailment is desired, I think it would be cleaner to treat it explicitly as a form of entailment, using existing mechanisms for specifying entailment regimes. |
The problem is not specific to Schema.org, but relevant for all vocabularies etc. that want to migrate their URIs from HTTP to HTTPS.
I agree with the downsides that you mention. Most important to me is solving this issue on the SPARQL-level, not which particular SPARQL solution is picked. So if we forget about the
|
I have always viewed this problem as part of the usual need to normalize one's data as part of the data intake or ETL process. In other words, normalize those URIs to The normalization could also be done within a SPARQL server, using URI pattern matching and rewriting, etc., and storing the normalized result to a separate graph, but the SPARQL code that's needed to do that is a bit messy. URI munging is not SPARQL's strong suit. |
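A minimal sketch of the intake-time normalization described above, in Python rather than SPARQL. The mapping table, the choice of HTTPS as the canonical form, and the function name `normalize_iri` are all assumptions made for illustration; the sketch also assumes the caller only passes terms already known to be IRIs, not literals.

```python
# Hypothetical table of legacy namespaces and the canonical namespace to
# rewrite them to. Which scheme counts as canonical is an assumption here.
CANONICAL_NAMESPACES = {
    "http://schema.org/": "https://schema.org/",
}


def normalize_iri(iri: str) -> str:
    """Rewrite an IRI that starts with a known legacy namespace.

    IRIs outside the table are returned unchanged.
    """
    for legacy, canonical in CANONICAL_NAMESPACES.items():
        if iri.startswith(legacy):
            return canonical + iri[len(legacy):]
    return iri


def normalize_triple(triple):
    """Normalize every term of a triple whose terms are all IRIs."""
    return tuple(normalize_iri(term) for term in triple)
```

Running every subject, predicate, and object IRI through such a function before loading means queries only ever need one namespace per vocabulary, at the cost of maintaining the table.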
For me this is a usability issue. It's easy to forget which ontology dataset uses https and which one uses http. When federating queries, it is even harder. |
Ideally, this would be fixed at data intake, and ideally we would use entailment. However, as a query client, you have no guarantees about the dataset or the entailment regime. Also, entailment is a rather complex way to solve such a common issue, and it can yield results that are surprising to the client ("how did these URIs get in here? They are nowhere in my query."). You can definitely have both, of course. So I agree with @JervenBolleman: this is about improving usability for the person writing the query. I wonder whether you could have something like a UNION PREFIX, similar to what GraphQL has for types? |
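Absent any language-level feature, a query client can cover both variants itself. The following is a rough sketch of one such client-side workaround, assuming a hypothetical helper `both_scheme_values` that emits a standard SPARQL 1.1 `VALUES` clause; it is not part of any proposal in this thread.

```python
def both_scheme_values(var: str, iri: str) -> str:
    """Build a SPARQL VALUES clause binding `var` to the http and https
    forms of `iri`.

    `iri` must start with http:// or https://; anything else is rejected.
    """
    scheme, sep, rest = iri.partition("://")
    if not sep or scheme not in ("http", "https"):
        raise ValueError(f"not an http(s) IRI: {iri!r}")
    return f"VALUES ?{var} {{ <http://{rest}> <https://{rest}> }}"


# Example: match instances typed with either variant of the class IRI.
query = (
    "SELECT ?s WHERE { "
    + both_scheme_values("type", "https://schema.org/Person")
    + " ?s a ?type }"
)
```

This keeps the query itself plain SPARQL 1.1, but every client has to repeat the trick, which is the usability cost discussed above.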
I agree this is a usability issue that should be solved somehow, but I'm not a big fan of solutions that are based on modifying the query syntax (for the reasons listed by @dbooth-boston). If I understand correctly, the suggested
PREFIX schema: <schema.org/>
SELECT * WHERE {
  SERVICE <urn:endpoint1> { ?s a ?type }
  SERVICE <urn:endpoint2> { ?s a ?type }
}
I think introducing a dedicated (and lightweight) entailment regime might be acceptable for this, especially since the implementation of this feature will require entailment in any case. |
I agree that handling it at data ingestion and as an implementation feature is a better route. (The relative URI syntax is already legal!) I'm also not keen on addressing migration issues as a permanent feature of the language. What would be good is a "practice and experience" note. |
@JervenBolleman makes a very important point #156 (comment): this is only one aspect of IRI equality testing. Sadly, the same IRI written with and without percent-encoding is neither equal nor equivalent:
select (?iri1=?iri2 as ?equal) (sameTerm(?iri1,?iri2) as ?same) {
  values (?iri1 ?iri2) {(<urn:foo%2Dbar> <urn:foo-bar>)}
}
Most modern websites redirect http to https for any resource. I think this is good behavior. I think that schema.org gives a mixed signal by promoting https variants of their semantic terms. But regardless of this mixed signal, thousands of website admins will use https in their data, and thousands more will use http. |
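The percent-encoding case above can be handled by normalizing before comparison. Below is a small sketch, assuming the percent-encoding normalization rules of RFC 3986 section 6.2.2 (decode escapes of unreserved characters, uppercase the hex digits of the rest); the function name is hypothetical, and SPARQL itself applies no such normalization when comparing IRIs.

```python
import re

# Unreserved characters per RFC 3986 section 2.3:
# ALPHA / DIGIT / "-" / "." / "_" / "~"
_UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
_PCT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_percent_encoding(iri: str) -> str:
    """Decode percent-escapes of unreserved characters and uppercase
    the hex digits of every escape that has to stay encoded."""

    def replace(match):
        char = chr(int(match.group(1), 16))
        if char in _UNRESERVED:
            return char
        return "%" + match.group(1).upper()

    return _PCT_ESCAPE.sub(replace, iri)
```

With this, `<urn:foo%2Dbar>` and `<urn:foo-bar>` compare equal after normalization, while escapes of reserved characters such as `%2F` stay encoded so that the meaning of the IRI is not changed.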
#158: IANA rebukes
Curiously, it fails to render such rebuke for |
Why?
Vocabularies are transitioning from HTTP to HTTPS URIs, for example Schema.org and CreativeCommons. Because the HTTP scheme is – unhappily – part of the URI, this change has implications for SPARQL queries. This problem will become even more widespread in the future when more vocabularies change their HTTP scheme.
When executing SPARQL queries against resources for which it can’t be predicted whether they will be using HTTP or HTTPS URIs, workarounds as well as normalization are necessary in client applications. For example:
UNION, once with PREFIX schema: <http://schema.org/> and once with PREFIX schema: <https://schema.org/>;
Previous work
None that I know of.
Proposed solution
Solve the problem generically at the SPARQL level, so that client-side workarounds are no longer necessary. For example, SPARQL could accept prefixes without an HTTP scheme, which would then match both HTTP and HTTPS URIs:
Considerations for backward compatibility