Unions revisited #88
Comments
On 2021-03-26, at 17:25, Stefan Goessner ***@***.***> wrote:
$['a','b'].u
$['a'].u | $['b'].u
So
$[‘a’,’b’][‘u’,’v’]
➔
$[‘a’][‘u’,’v’] | $[’b’][‘u’,’v’]
➔
$[‘a’][‘u’] | $[‘a’][‘v’] |
$[‘b’][‘u’] | $['b’][‘v’]
(i.e., multiplying out)?
(Please excuse the smartquotes)
Regards, Carsten
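The expansion above can be sketched in a few lines of Python. This is only an illustration, not part of any JSONPath implementation: the function name `singularize` is made up, and the sketch assumes plain name and index items (no filters, no slices, and no commas or brackets inside quoted names).

```python
import re
from itertools import product

# Matches one bracketed selector such as ['a','b'] or [0,1].
# Assumption: items contain no brackets or commas of their own.
BRACKET = re.compile(r"\[([^\[\]]*)\]")

def singularize(path):
    """Expand every comma-separated bracket into all single-item combinations."""
    parts = []  # alternating literal text and lists of single-item brackets
    pos = 0
    for m in BRACKET.finditer(path):
        parts.append([path[pos:m.start()]])
        items = [item.strip() for item in m.group(1).split(",")]
        parts.append(["[" + item + "]" for item in items])
        pos = m.end()
    parts.append([path[pos:]])
    # The Cartesian product over all brackets is the "multiplying out" step.
    return ["".join(combo) for combo in product(*parts)]

print(" | ".join(singularize("$['a','b']['u','v']")))
```

With two brackets of two items each, the sketch yields four singular paths, in the same order as written out in the comment above.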
Alternatively, we could spec the behaviour for a single JSONPath query and leave it up to implementations to support lists of queries. Since we don't seem to be in the business of standardising the API to implementations, that keeps our job a bit simpler. And we could do away with or at least simplify "unions".
Note that whether to remove duplicate nodes would still need to be decided for cases such as
@cabo wrote:
Indeed. Sparing keystrokes ... and not having to mentally compute what all the single paths are. I don't see "unions" as merely syntactic sugar.
Daniel
@goessner wrote:
I don't agree with the conclusion. The notational convenience of unions has value: not having to repeat the first part of the path is significant, especially with combinations such as in @cabo's example. I don't think the performance concerns are as important. Singularized expression evaluation can be optimized, and unions can also execute in parallel. With wildcards in the path, though, I think the edge goes to unions.
But I think this way of looking at the problem is helpful, because it suggests what the allowed items in a union expression should be. The proposal notes that union expressions can always be replaced by a set of singularized expressions. If conversely we require that a set of singularized expressions can always be replaced by a single expression with unions, it suggests that the allowed items in a union should include all of indices, identifiers, slices, wildcards, and relative path expressions beginning with
For example, given the root value,
and single expressions
the result would be
A corresponding expression with unions could be
and the result would be the same. An alternative equivalent union would be
which suggests that a convenient way to provide a set of singular paths is through a union.
This understanding of unions is supported in the jsoncons implementation, and its author thinks it's a natural generalization of the union concept.
"Variable Expressions" don't really fit into this dual view (although the jsoncons implementation supports them, with the parentheses providing disambiguation). Personally, I think "Variable Expressions" could be dropped, or kept for historical reasons only.
Daniel
@goessner wrote:
I'm not convinced. I think the issue of duplicates is orthogonal to looking at unions in this way. Consider the root value
and singular paths
The resulting values and paths are
The issue of whether to remove the duplicate item
Note that it would be possible for an implementation to provide an option to return results with duplicates or without duplicates. The jsoncons implementation supports both options.
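As a rough illustration of such an option, the following Python sketch removes duplicates by normalized path while keeping the original order. It is not the jsoncons API; the function name `unique_by_path` and the sample node list are hypothetical, and the sketch assumes each result node is available as a (path, value) pair.

```python
def unique_by_path(nodes):
    """Keep the first occurrence of each normalized path, preserving order."""
    seen = set()
    result = []
    for path, value in nodes:
        if path not in seen:
            seen.add(path)
            result.append((path, value))
    return result

# Hypothetical result list: the node at $[0] was selected twice,
# and $[2] is a different node that happens to hold an equal value.
nodes = [("$[0]", "x"), ("$[0]", "x"), ("$[1]", "y"), ("$[2]", "x")]

with_duplicates = nodes
without_duplicates = unique_by_path(nodes)
print(without_duplicates)
```

Comparing paths rather than values means that two distinct nodes with equal values are both kept; only the same node selected twice is collapsed.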
I think "union" is fine for "multiple indices combined into a single bracket-notation selector." Previous usages of this term had been for what we now call a "selector." This was my primary argument in #21.
While I'm happy to have multiple indices, I think each index needs to be valid unto itself. The current syntax wouldn't accept
Additionally, I'm not sure I like the idea of paths being indices, whether or not they use
Agreed.
@gregsdennis wrote:
I don't fully understand this point (putting aside concerns about the @ notation). The grammar as currently presented in the draft doesn't distinguish between bracketed expressions with one entry and unions. Bracketed expressions are defined entirely in terms of union elements. The grammar is currently incomplete, but I'm interested in what the grammar does with "*" and filters.
As another data point on unions and filters, @cburgmer's Proposal A has a clever restricted syntax within filters which disallows "*" and ensures that comparisons inside filters are only operating on single values.
I'm perfectly happy with the union (more than one index in a bracket). My concern was the
@glyn wrote:
@glyn, thanks for the link. As far as I can tell, the grammar in the draft hasn't changed since your original upload; it would be nice to see it move forward :-) Or do you feel that it's gone as far as it can before other issues are resolved?
I'd like to see a PR for
For the record, I agreed with the WG chairs to focus on the compliance test suite and reference implementation (and thereby provide a counterbalance to, and critique of, the "pure" spec work) rather than doing more spec work. Also, I'd like the spec details to genuinely be a product of multiple minds, and I don't have much written evidence that many others have yet engaged with the details of how selectors are combined into JSONPaths. When I see the PRs I just mentioned start to appear, I'll be very happy...
@goessner wrote:
But also note that in XPath, the union and | operators are equivalent, and parentheses are supported, so I think the XPath style | operator equivalent of
hmm ... if we agreed, that would break existing implementations ... !?
@goessner wrote:
Assuming this is replying to this, of course, I'm not proposing that notation :-) That notation would change the JSONPath parse tree from a simple list of selectors to a full tree with operands and operator precedence, as it is in XPath and JMESPath. I think the existing union notation is fine. I'm only suggesting that
On this point, there would be no break to existing implementations; it would be a generalization only.
@gregsdennis wrote:
Okay, putting aside the specific notation, I'll just note that the union in XPath that inspired the union in JSONPath allows expressions as union elements (in the same way as I suggested with the @ notation, they're evaluated against the current item), and the somewhat analogous
These are some motivations. I think @goessner's thoughts about the equivalence of an | operator and a union provide additional motivation if interpreted in the right way, meaning the way in which I want them to be interpreted :-)
There are a few implementations in JSONPath comparisons that support an example of this but without the leading '@'. There have been requests for this feature on Stack Overflow and elsewhere. I think that covers the reasons in favour.
@gregsdennis wrote:
I think the issue here is with the bracket notation being overloaded in JSONPath for both indexes, on the one hand, and XPath style unions, on the other. For "indexes" interpreted broadly, it's natural to restrict to numbers, slices and wildcards, and perhaps identifiers. For unions, it's natural to allow paths, as XPath does.
JMESPath also uses brackets for both indices and multi-select-list (analogous to unions), but in the grammar distinguishes between them. It distinguishes between a bracket specifier, with one element,
and a multi-select-list, which only allows comma separated expressions (paths). In JMESPath, identifiers cannot start with a number, so there is no ambiguity. But even without that grammatical distinction, there would be no ambiguity to allow both indexes and paths as union elements in JSONPath unions.
The reason for raising this in this issue is that it fits naturally with @goessner's discussion about the equivalence of the 'or' operator and unions, and clearly in the
I think we need to put what qualifies as an "index" (used here to describe a single element inside the bracket notation, e.g.
I am fine with combining more than one "index" separated by commas. Further, whatever is decided to qualify as an "index" should be unionable in this way. For example, these should all be valid:
where
Putting this here b/c I'm not sure where else to put it.
A StackOverflow question regarding support for dotted paths inside brackets.
The main point of confusion here is the idea that a dotted path could legitimately represent a key. Consider the following JSON:
{"foo.bar": 0, "foo": {"bar": 1}}
What would the path
The same argument applies whether a path starts with a
I think this is something that we need to cover in the spec, even if we decide not to support it. (This would be one of the options for a key or index or whatever we end up calling a thing inside the brackets.)
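To make the ambiguity concrete, here is a small Python sketch over the JSON quoted above. It does not use any JSONPath library; it only shows the two readings of the dotted name with plain dictionary lookups, and the variable names are made up for illustration.

```python
import json

doc = json.loads('{"foo.bar": 0, "foo": {"bar": 1}}')

# Reading "foo.bar" as a single member name:
as_key = doc["foo.bar"]

# Reading "foo.bar" as a two-step path, foo then bar:
as_path = doc["foo"]["bar"]

print(as_key, as_path)
```

Both readings are valid for this document and select different values, so a spec would have to pick one reading or reject the unquoted form.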
Starting to interpret indexing strings as if they were JSONPath syntax leads to a slippery slope. Strong opinion against that.
Following @gregsdennis' hint I raised a new issue
PS: Excuse me for introducing
PPS: in slightly provocative
I'm happy to close this issue.
I don't think anything actionable remains. Closing.
Consider the following JSON value ...
1. Basic Usage
Examples are ...
$['a','b'].u
[[1,2],[11,12]]
$.c[0,1]
[[21,22],[23,24]]
$.a..['u',1]
[[1,2],2,4]
2. Unions are Syntactic Sugar
Example expressions above can easily be singularized, i.e.
$['a','b'].u
$['a'].u | $['b'].u
$.c[0,1]
$.c[0] | $.c[1]
$.a..['u',1]
$.a..['u'] | $.a..[1]
using the XPath operator | for "or".
So if all union expressions are only syntactic sugar, why do they exist? Possible answers are:
The last point is the more serious one. Only implementors can show, each for their own implementation, whether processing union expressions performs better than multiple invocations of their JSONPath command / function with singularized expressions, or the other way round.
Maybe this is motivation enough to slightly change the spec from
to
in order to allow implementors to apply a list of queries to a JSON value and thus improve performance.
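The idea of applying a list of queries can be sketched with a toy evaluator in Python. Everything here is an assumption made for illustration: the functions `evaluate` and `evaluate_all` are hypothetical, a query is modelled as a list of steps where each step is a list of names or indices, and the JSON value is made up so that it agrees with the results listed in Ch. 1. Descendant selectors, wildcards, slices and filters are not handled.

```python
def evaluate(root, query):
    """Apply each step to every current node; a step is a list of names/indices."""
    nodes = [root]
    for step in query:
        next_nodes = []
        for node in nodes:
            for item in step:
                if isinstance(node, dict) and isinstance(item, str):
                    if item in node:
                        next_nodes.append(node[item])
                elif isinstance(node, list) and isinstance(item, int):
                    if -len(node) <= item < len(node):
                        next_nodes.append(node[item])
        nodes = next_nodes
    return nodes

def evaluate_all(root, queries):
    """Apply a list of queries and concatenate their results in order."""
    results = []
    for query in queries:
        results.extend(evaluate(root, query))
    return results

# Hypothetical value, chosen only to be consistent with the quoted results.
doc = {"a": {"u": [1, 2], "v": [3, 4]}, "b": {"u": [11, 12]},
       "c": [[21, 22], [23, 24]]}

union = [["a", "b"], ["u"]]                  # $['a','b'].u
singular = [[["a"], ["u"]], [["b"], ["u"]]]  # $['a'].u | $['b'].u

union_result = evaluate(doc, union)
list_result = evaluate_all(doc, singular)
print(union_result, list_result)
```

In this sketch the union query and the list of singular queries produce the same node list; whether one form is faster than the other is left to each implementation, as noted above.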
3. Duplicates and Ordering
Accepting the equivalence of a union expression and its set of singularized expressions according to Ch. 2, neither duplicates nor ordering needs to be discussed anymore with unions, since
$[0,0] yields the same result as $[0] | $[0], which is obviously twice the same value.
$[1,2] should yield the same result as $[1] | $[2]. The exact order might still not be deterministic.
4. Variable Expressions
I know only a handful of examples using the root selector that are of practical value. One of them is internal referencing à la $.a[($.id)] === $.a['v'] => [[3,4]]. Examples with the current node selector are questionable at best, like $.a.u[(@[0])] === $.a.u[1] => [2].
Results of those expressions are always interpreted as names or indices (JSON literals string and number) according to the nature of their parents.
Now some questions arise regarding this:
$.a[$.id] and $.a.u[@[0]].
$.a[@.u[@[0]]]?
$.a['u',$.id]?
5. Path Expressions
@danielaparker linked to an interesting discussion regarding unions containing paths, as in $..[id,a.u], wanting to get $.id | $.a.u. Applying the singularization principle yields
$..[id] ... ok, if 'id' was used.
$..[a.u] ... nonsense, since $.a.u was meant.
Using $..['id',@.a.u] according to Ch. 4 is completely different.
6. Summary
I simply took the term "union" from XPath 1.0. I now also agree with most others here that "union" should be replaced by a better term.