Support for math operators in expressions #419

gregsdennis · 2023-03-08T05:27:16Z

I think mathematical operators would be a beneficial addition (😏) to the expression syntax. It would allow things like

$[?@.a+@.b==@.c]

to check for model consistency. (Arguably, you just wouldn't serialize c as it should be a calculated field, but people do stranger things.)

There are doubtless other use cases.

I have support for this currently in my library. It's really easy to implement, and I don't think it would be too hard to specify.

I think this is within our charter as @goessner's original implementations supported "underlying scripting language" for expressions, which undoubtedly supported these operators.


gregsdennis · 2023-03-08T05:27:39Z

(I'm happy to defer this until after we've sorted out our function typing issues.)

goessner · 2023-03-08T08:59:34Z

Well ... this might indeed be useful. But implementing arithmetic and specifying it cleanly are two very different things.

We might deal then with:

The 0.1+0.2 == 0.3 problem.
Division by zero.
Defining EPSILON.
Should we allow the + operator to also concatenate strings?
An explicit number type.
Rounding.
sqrt ... where to stop?
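
A few of these concerns are easy to reproduce. The Python sketch below is only an illustration: the tolerance value and the helper names `nearly_equal` and `safe_div` are assumptions made for this example, not anything proposed in this thread. It shows the 0.1+0.2 issue with IEEE 754 doubles, one possible tolerance-based comparison, and one possible answer to division by zero.

```python
import math

# IEEE 754 doubles cannot represent 0.1, 0.2 or 0.3 exactly.
print(0.1 + 0.2 == 0.3)  # False

# Hypothetical tolerance; a spec would have to pick and justify a value.
EPSILON = 1e-9

def nearly_equal(x: float, y: float, eps: float = EPSILON) -> bool:
    """Compare two numbers using a relative tolerance."""
    return math.isclose(x, y, rel_tol=eps)

print(nearly_equal(0.1 + 0.2, 0.3))  # True

def safe_div(x: float, y: float):
    """Return None instead of raising when the divisor is zero."""
    return None if y == 0 else x / y
```

Whether a query language should compare with a tolerance at all is exactly the kind of decision that would have to be written down.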

Alternatively, I can imagine that a function similar to CSS calc would be easy to implement and easier to specify.

glyn · 2023-03-08T10:45:18Z

Yes, first class support for mathematical operators will entail a lot of spec work. Function extensions could be used instead.

I suggest we defer this issue and tag it "revisit-after-base-done".

gregsdennis · 2023-03-09T01:01:41Z

The comparison indicates that many implementations support a path like $[?(@.key+50==100)], but it's split about 50/50 between reading that as

a math operation: @.key + 50
a "key+50" key

I wonder how adding in a couple spaces would do: $[?(@.key + 50==100)]. This should differentiate whether math operations are supported.

cabo · 2023-03-09T07:47:18Z

member-name-shorthand cannot contain a +, so recognizing @.key is not a problem.
(The problem is that adding math adds a ton of additional considerations.
E.g., what if @.key is "50" and not 50, etc.)

gregsdennis · 2023-03-09T08:30:23Z

member-name-shorthand cannot contain a +

Yeah, it's understood that those implementations aren't spec-compliant.

The problem is that adding math adds a ton of additional considerations. E.g., what if @.key is "50" and not 50, etc.

Yeah, it's understood that we'd have to do that stuff. I don't think we should shy away from it, though.

I still think this is within our charter.

ohler55 · 2023-03-11T00:05:26Z

Personal bias here, but I've found simple math operators (-, +, *, /) very useful in practice. Given a decision on what to return for a divide by zero, I think most end users would like the extra flexibility that math operators provide.

The one limitation users have questioned is why a - character cannot appear in a token; the reason is that it could be confused with a minus sign when the token is used in an expression.

gregsdennis · 2023-03-11T01:35:42Z

We currently forbid - in the shorthand name syntax (requiring the brackets syntax instead), so that's not a problem.

glyn · 2023-03-12T16:28:27Z

Deferring until after base done.

goessner · 2023-03-23T13:18:03Z

Follow up of #449:

Take the following arithmetic example: (a + b + c)*d/e <= 42, where a,b,c,d,e are members of the current node.

Using a set of small (binary) functions results in the query

$.arr[?div(prod(sum(sum(@.a,@.b),@.c),@.d),@.e) <= 42]

whereas using a calc function looks like

$.arr[?calc('(@.a+@.b+@.c)*@.d/@.e') <= 42]

I predict most users will prefer the latter syntax.

We need here a function calc

expecting a single argument of type string.
returning the resulting number value or false (or Nothing) in case of an invalid argument.
having access to its environment via closure concept.

The string argument must contain a pure arithmetic expression, that means

only a limited set of (binary?) arithmetic operators is allowed (+,-,*,/,%,**).
operands need to be
- number literals
- singular nodelists containing number values
- functions returning number values or singular nodelists containing number values
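
The requirements above can be sketched in Python. This is a minimal sketch under several assumptions, not a proposed implementation: the current node is passed in explicitly as a dict rather than captured through a closure, only `@.name` references are supported (no function operands), `None` stands in for Nothing, exponent notation such as `1e5` is rejected for simplicity, and operator precedence is simply borrowed from Python by reusing its `ast` module.

```python
import ast
import operator
import re

# The operators named in the list above, keyed by Python AST node type.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Mod: operator.mod, ast.Pow: operator.pow,
}
_REF = re.compile(r"@\.([A-Za-z_][A-Za-z0-9_]*)")

def _is_number(value):
    # JSON true/false are not numbers, although bool subclasses int in Python.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def calc(expr, node):
    """Evaluate a pure arithmetic expression against `node` (a dict).

    Returns a number, or None if the expression is invalid or cannot
    be evaluated (division by zero, missing or non-number member).
    """
    # Anything left after removing the @.name references must be free of
    # letters and stray '@' characters, so no other names can sneak in.
    if re.search(r"[A-Za-z_@]", _REF.sub("0", expr)):
        return None
    # Rewrite each @.name reference into a plain identifier for the parser.
    source = _REF.sub(lambda m: "ref_" + m.group(1), expr)

    def ev(n):
        if isinstance(n, ast.Constant) and _is_number(n.value):
            return n.value
        if isinstance(n, ast.Name) and n.id.startswith("ref_"):
            value = node.get(n.id[4:])
            if _is_number(value):
                return value
            raise ValueError("member is missing or not a number")
        if isinstance(n, ast.BinOp) and type(n.op) in _OPS:
            return _OPS[type(n.op)](ev(n.left), ev(n.right))
        if isinstance(n, ast.UnaryOp) and isinstance(n.op, ast.USub):
            return -ev(n.operand)
        raise ValueError("not a pure arithmetic expression")

    try:
        result = ev(ast.parse(source, mode="eval").body)
    except (SyntaxError, ValueError, ArithmeticError):
        return None
    # Guard against non-real results such as (-8) ** 0.5.
    return result if _is_number(result) else None

node = {"a": 2, "b": 3, "c": 5, "d": 8, "e": 2}
print(calc("(@.a+@.b+@.c)*@.d/@.e", node))  # 40.0
print(calc("@.a/0", node))                  # None
```

Even this small sketch has to decide what a missing member, a string value, or a complex result means, which is the specification work under discussion here.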

When Greg says regarding inline arithmetic:

I have support for this currently in my library. It's really easy to implement, and I don't think it would be too hard to specify.

Then implementing the calc function would be even easier due to encapsulation. An implementation that can parse JSONPath queries shouldn't find parsing isolated arithmetic expressions extremely challenging. Specifying that function should be a lot easier than specifying inline arithmetic with all its side effects.

Then there is another charming aspect of this approach.

Imagine the following scenario: A user is supplying a set of parts of simple geometry, holding the part-descriptions in a JSON array.

Each part description is redundancy-free and holds geometric and material properties. The part mass might be a measure of the selling price. So if we want to find all cuboids with a mass less than 20 (kg), we can start the query

$.parts[[email protected]=='cuboid' && calc('@.a*@.b*@.c*@.rho') < 20]

where a,b,c in [m] are the cuboid dimensions and rho its density in [kg/m^3].

In case we know - as the JSON author - that the part mass is frequently requested, we can even put into the header section of the JSON data

{  mass: {
     cuboid:"@.a*@.b*@.c*@.rho",
     sphere:"4/3*3.14*@.r**3*@.rho",
     cylinder:"3.14*@.r**2*@.h*@.rho"
   },
   parts: [...]
}

also the arithmetic expressions for other part masses. This way we can reformulate the query above to

$.parts[?calc($.mass[?index(@)!=''][email protected]) < 20]

which of course then requires the useful index function most recently discussed in #156.

Apart from that, having simple strings holding arithmetic expressions allows us to store them in JSON for reuse, in the same way as we can with JSONPath queries or, preferably, with normalized paths as strings.

That you cannot do conceptually with the barely readable mult/div/sum approach.

@gregsdennis:

I fail to see how a calc() function would be any different than just including math operators in expressions. You'd still have to specify what is valid as a parameter to calc() and how that works. It seems easier to just define math operators and be done with it.

... no, due to the strong encapsulation and sharply restricted syntax explained above.

@cabo:

Of course, this would break any attempt to have an extensible function interface, ...

I don't see this, please elaborate.

... because calc would need to include half of JSONPath’s syntax and would need access to all the related functionality as well.

... again no. Due to the strong encapsulation and sharply restricted syntax of pure arithmetic expressions, implementation should be easy, as Greg already mentioned above.

Stefan

ohler55 · 2023-03-23T13:43:59Z

If we are considering ease of use for the end user, I would think $.parts[?(@.x == @.y + 3)] or $.parts[?(@.x == (@.y + 3))] would be the most natural.

It shouldn't really matter how hard it is to implement if it is better for the end users. Anyone undertaking the task of implementing the spec will have to be competent anyway, so a little more work shouldn't be that large a hurdle. (IMHO)

goessner · 2023-03-23T14:09:07Z

@ohler55 ... I do understand this very well from a user's point of view. But on the way there, a lot of spec work will have to be done. So what we are discussing here is how functions, and in which form, can help to add arithmetic expressions to queries while still having sufficient user acceptance.

I would applaud if some implementers gain experience meanwhile by implementing side by side

query inline arithmetic.
encapsulate it in a calc function.

Then they can help to identify edge cases, type collisions and handling of numeric anomalies.

danielaparker · 2023-03-23T14:35:47Z

Follow up of #449:

Take the following arithmetic example: (a + b + c)*d/e <= 42, where a,b,c,d,e are members of the current node.

Using a set of small (binary) functions results in the query
$.arr[?div(prod(sum(sum(@.a,@.b),@.c),@.d),@.e) <= 42]
whereas using a calc function looks like
$.arr[?calc('(@.a+@.b+@.c)*@.d/@.e') <= 42]

But you don't need a calc function to support that notation, it's very straightforward to incorporate numeric operators into the script expression language, with the usual precedence and associativity. For example, for two C++ and .Net implementations described here, given the following document,

{"arr":[{"a":2,"b":3,"c":5,"d":8,"e":2},{"a":2,"b":3,"c":5,"d":10,"e":2}]}

and query

$.arr[?(@.a+@.b+@.c)*@.d/@.e <= 42]

the result is

[{"a":2,"b":3,"c":5,"d":8,"e":2}]

That is, it's very straightforward if @.a, @.b, etc. evaluate to values; I'm not sure what it would mean if they were to evaluate to nodelists.
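
The arithmetic in this example can be checked by hand: (2+3+5)*8/2 is 40, which passes the filter, while (2+3+5)*10/2 is 50, which does not. The short Python sketch below reproduces that check with plain dictionaries; it does not use any JSONPath library, and the list comprehension is only a hand-written stand-in for the filter expression.

```python
import json

doc = json.loads(
    '{"arr":[{"a":2,"b":3,"c":5,"d":8,"e":2},{"a":2,"b":3,"c":5,"d":10,"e":2}]}'
)

# Hand-written equivalent of the filter (@.a+@.b+@.c)*@.d/@.e <= 42
selected = [
    n for n in doc["arr"]
    if (n["a"] + n["b"] + n["c"]) * n["d"] / n["e"] <= 42
]

print(json.dumps(selected, separators=(",", ":")))
# [{"a":2,"b":3,"c":5,"d":8,"e":2}]
```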

Daniel

ohler55 · 2023-03-23T15:01:20Z

I took the approach described by @danielaparker in OjG, but there is no reason why all three of the proposed approaches could not be implemented. Having said that, picking one approach as the minimum and offering the others as extensions might be a way to resolve this.

gregsdennis · 2023-03-23T20:11:56Z

I agree with @danielaparker and @ohler55: these operators need to be supported in general expressions, not merely inside some function.

Then implementing the calc function would be even easier due to encapsulation... Specifying that function should be a lot easier than specifying inline arithmetic with all its side effects. - @goessner

I don't see how the level of effort for supporting them in a function is any less than to support them in general expressions. If anything I think it's more effort because you have to explain why this syntax is valid only inside of this function.

$.parts[[email protected]=='cuboid' && calc('@.a*@.b*@.c*@.rho') < 20]

From a parsing perspective, this is much more complicated than

$.parts[[email protected]=='cuboid' && @.a*@.b*@.c*@.rho < 20]

From a user perspective, calc() is unnecessary.

Regarding the "expressions in data" concept, we don't currently support data specifying a path anywhere, and doing so opens a whole new can of worms that we'd need to consider. It's paving the way for an exec() function that executes code.

That you cannot do conceptually with the barely readable mult/div/sum approach.

No one is advocating for this approach. Sure, calc() is better than these, but calc() is measurably worse than just supporting math in expressions.

goessner · 2023-03-24T08:19:01Z

Hmm ... as an outcome of this discussion, the realisation is growing that inline arithmetic is becoming a de facto standard in current implementations, which is also what users naturally expect.

It seems best to defer activities in that direction until after the base is done, which in fact was the reason why Glyn closed this issue.

gregsdennis · 2023-04-19T23:52:52Z

I came up with this for basic math support:

math-expr = binary-math-expr / unary-math-expr
binary-math-expr = math-operand binary-math-operator math-operand
unary-math-expr = unary-math-operator (number / singular-query / value-function-expr / math-group)
math-operand = number / singular-query / value-function-expr / math-expr / math-group
math-group = "(" math-expr ")"
binary-math-operator = "+" / "-" / "*" / "/"
unary-math-operator = "-"

We'd then add math-expr as an option on comparable

comparable = literal / singular-query / value-function-expr / math-expr

I believe this gives support for addition, subtraction, multiplication, division, and grouping, though it doesn't give operator precedence as yet (I'm working on that).

It does allow multiple negations (e.g. ----4), which is weird. There's also an ambiguity in -4 now between

negative 4 as a number
positive 4 that has been negated

In the end, I'm not sure it makes much of a difference; maybe it saves an operation to have it as "negative 4." Given the outcome is the same, maybe we just let implementations decide how they want to handle it.

It also doesn't prevent division by zero, but we'd have to contend with a path or a function returning zero anyway. I think the math-expr evaluating to Nothing is fine. That would result in a "false" comparison which just wouldn't select the node.

Similarly any path or ValueType function which returns a non-number could result in a Nothing evaluation as well.
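
The evaluation rules described here (division by zero and non-number operands both produce Nothing, and a comparison involving Nothing is false) could look roughly like the following Python sketch. The sentinel `NOTHING` and the helpers `binary` and `less_than` are hypothetical names chosen for this illustration; only the less-than comparison is shown.

```python
NOTHING = object()  # stand-in for "Nothing"; the name is an assumption

def is_number(value):
    # JSON true/false are not numbers, although bool subclasses int in Python.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def binary(op, left, right):
    """Apply one binary math operator; any failure yields NOTHING."""
    if not (is_number(left) and is_number(right)):
        return NOTHING  # non-number operand, e.g. "50" or NOTHING itself
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    if op == "*":
        return left * right
    if op == "/":
        return NOTHING if right == 0 else left / right
    raise ValueError(f"unknown operator {op!r}")

def less_than(left, right):
    """A comparison involving NOTHING is false, so the node is not selected."""
    if left is NOTHING or right is NOTHING:
        return False
    return left < right

nodes = [{"a": 10, "b": 2}, {"a": 10, "b": 0}, {"a": "10", "b": 2}]
# Hand-written equivalent of the filter  @.a / @.b < 20
selected = [n for n in nodes if less_than(binary("/", n["a"], n["b"]), 20)]
print(selected)  # [{'a': 10, 'b': 2}]
```

Because `binary` returns NOTHING whenever an operand is NOTHING, a failure deep inside a nested expression propagates to the comparison without any extra handling.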

This doesn't support string concatenation (yet).

gregsdennis · 2023-04-20T00:05:51Z

Does the ABNF need to give operator precedence?

4+5*6 is syntactically valid whether or not the syntax understands that * should be performed before +.

cabo · 2023-12-20T11:52:39Z

Does the ABNF need to give operator precedence?

The principle of least surprise says yes: Implementers will expect the AST they derive from the ABNF to be directly useful for a tree interpreter.
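
To illustrate the point about a directly usable AST, here is a small Python sketch of a recursive-descent parser in which each precedence level is its own rule, so the resulting tree already encodes that * binds tighter than +. The rule names in the comments (`sum`, `product`, `unary`, `atom`) are hypothetical and are not taken from the draft; operands are limited to number literals to keep the example short.

```python
import re

TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/()])")

def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    """Return a nested-tuple AST for an arithmetic expression."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def sum_expr():   # sum = product *( ("+" / "-") product )
        node = product()
        while peek() in ("+", "-"):
            op = take()
            node = (op, node, product())
        return node

    def product():    # product = unary *( ("*" / "/") unary )
        node = unary()
        while peek() in ("*", "/"):
            op = take()
            node = (op, node, unary())
        return node

    def unary():      # unary = ["-"] atom   (a single optional negation)
        if peek() == "-":
            take()
            return ("neg", atom())
        return atom()

    def atom():       # atom = number / "(" sum ")"
        tok = take()
        if tok == "(":
            node = sum_expr()
            if take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or tok in ("+", "-", "*", "/", ")"):
            raise SyntaxError("expected a number")
        return float(tok)

    tree = sum_expr()
    if peek() is not None:
        raise SyntaxError("trailing input")
    return tree

print(parse("4+5*6"))  # ('+', 4.0, ('*', 5.0, 6.0))
```

The loops in `sum_expr` and `product` also make the operators left-associative, so 8-2-1 groups as (8-2)-1, and allowing only one optional "-" in `unary` rules out inputs such as ----4.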

cabo added the enhancement New feature or request label Mar 9, 2023

glyn closed this as completed Mar 12, 2023

glyn added the revisit-after-base-done label Mar 12, 2023

gregsdennis mentioned this issue Mar 22, 2023

Types for Implementators vs. Consumers of JSON Path #449

Closed

glyn reopened this Dec 20, 2023
