Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filter with index in the parser does not filter #3260

Closed
dvd101x opened this issue Sep 1, 2024 · 13 comments
Closed

filter with index in the parser does not filter #3260

dvd101x opened this issue Sep 1, 2024 · 13 comments
Labels

Comments

@dvd101x
Copy link
Collaborator

dvd101x commented Sep 1, 2024

Hi,

Just found out this edge case in the parser on v13.1.1

filter([10, 20], f(x, index) = index == 1)     # yields [10, 20]
# expected [10]
filter([10, 20], f(x, index) = index == 1000)  # yields [10, 20]
# expected []

Outside the parser there is no issue.

math.filter([10, 20], (x, index) => index == 0)    // yields [10]
math.filter([10, 20], (x, index) => index == 1000) // yields []

I found it while working on #3256, I'm still trying to find the source of the error.

@josdejong
Copy link
Owner

That is a nice one. The tricky thing is that the index of the map and filter functions of mathjs is not a number but an array with one or multiple numbers, depending on the number of dimensions of the matrix. So the right way to do this is:

filter([10, 20], f(x, index) = index[1] == 1)    # [10] as expected
filter([10, 20], f(x, index) = index[1] == 1000) # []   as expected

The JavaScript code accidentally gives the right results because [0] == 0 and [1] == 1 returns true in JavaScript, whereas anything else like [0] === 1000 returns false. Apparently an array with one value is compared by taking the value from the array and compare that with the right hand side.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Sep 7, 2024

Thank you Jos!

My bad. So I missed that the index is an array of numbers, not a number and the type conversion that happens with == instead of ===. I wasn't aware that [10] == 10 is true in js.

Something curious is that filter only takes 1D arrays, so I think in filter the index is always of the form [i] and never [i, j, ...]. Thus for the filter to work in the parser with an index, the index should always be index[1] and in js when comparing with === it should be index[0].

Returning to the cases I supplied I think it went like this:

filter([10, 20], f(x, index) = index == 1)     # yields [10, 20]
# [1] == 1    yields [true] which is thruthy
# [2] == 1    yields [false] which is thruthy

filter([10, 20], f(x, index) = index == 1000)  # yields [10, 20]
# [1] == 1000    yields [false] which is thruthy
# [2] == 1000    yields [false] which is thruthy

Now I understand it's not an issue of mathjs as the convention is for the index to be an array of numbers because in most contexts it can take n dimensional arrays.

Understanding this is not an issue, I don't know if for convenience, it would be nice for contexts that are 1D arrays (like filter) to accept an index that is a number and internally converting it to an array of one number.

@josdejong josdejong mentioned this issue Sep 11, 2024
7 tasks
@josdejong
Copy link
Owner

That is a good point, filter only supports 1d arrays so having the index being an array with a single value doesn't make much sense except for consistency with the map method 🤔. Changing this would be a breaking change though, and in general it is good to minimize breaking changes as much as possible. Let's think this though a bit more.

I've added this to the list with breaking changes for v14 as a reminder.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Oct 7, 2024

I gave it some more thought.

What about an automatic conversion, if the index is a number, convert it to an array with only that number. Similar to what js already does where 123 == [123].

Would that be a breaking change?

@josdejong
Copy link
Owner

So then:

  • the index is a number for one dimensional arrays/matrices
  • the index is an array for 2d or multi-dimensional arrays/matrices

It is an interesting idea. I'm not sure if it is a good idea though, it is kind of an inconsistent behavior. This requires the callback to have knowledge about the number of dimensions of the array that it is operating on 🤔. Maybe it would be better to create a different version of map and filter that explicitly only work on 1d arrays, like map1 and filter1 or so? On the other hand, the issue may not be that big that it needs a solution. Using these functions from mathjs may give a WTF the first time you use it, but the API is constent and well documented so you probably figure it out quite fast.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Nov 7, 2024

Hi Jos,

My intention was to make a proposal for filter to automatically convert a value that is not an array to an array containing that value automatically.

I did some attempts to use the original callback but couldn't make it work and then got occupied with other stuff.

Just to let you know it's on the back of my mind but I don't think I will be able to make a proof of concept until December.

@josdejong
Copy link
Owner

Thanks for the update. Before we put effort in a PR or POC I think we should decide on whether we want to adjust this behavior or not. I'll give that some more thought soon.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Nov 8, 2024

Ok, thanks, I'm not sure if it was possible, so I was checking if there was an easy way to do both.

As a summary the idea was for both methods to work.

X = [10, 20, 30];
filter(X, f(x, i) = x + i[1]> 11)  # as index is always an array of one number it should always be i[1]
# Yields [20, 30]
filter(X, f(x, i) = x + i > 11)
# Currently yields [10, 20, 30] but the idea is to yield [20, 30]

In a similar fashion [101] == 101 yields true.

I understand there are more implications regarding the integrity of the language. I'll postpone any POC until we have a clearer view of the desired behavior.

@josdejong
Copy link
Owner

I understand your idea, though I don't have an idea on how to make that work 🤔.

@josdejong
Copy link
Owner

I've been thinking about pros and cons of changing index of function filter to be a number instead of array containing one number.

Pros:

  1. Simpler to use. There is no use case for index as an array, everyone has to read the first item from the array before being able to do something.
  2. Consistent with JavaScript's filter function

Cons:

  1. It is not consistent with map and forEach. When using filter and map in a chain for example, it is odd that in one case the index is a number and in an other case it is an array.
  2. Changing this behavior is a breaking change. It is best to minimize breaking changes.

I think the downsides are bigger than the upsides, I value consistency and non-breaking changes quite highly. Also, when using the index wrongly, it definitely results in a WTF moment, but I think in most cases it is not that hard to debug the issue. Are you ok with leaving the behavior as is @dvd101x? Or do you see more pros/cons?

I think we should at least improve the documentation of map, filter, and forEach: describe what index is and how to use it. Also, we should improve the type definitions of index for these three functions: that prevents the issue when in a TypeScript environment.

@dvd101x
Copy link
Collaborator Author

dvd101x commented Nov 15, 2024

Hi Jos, thanks for the through consideration and review.

I don't have further comments in favor of this behavior other than what was previously discussed. Specially since I was unable to make it work for both cases I tend to agree with your decision and I have no strong objections about keeping it what it is, since I seldom use filter with an index.

I agree further documentation of index is needed for the callbacks in map, forEach and filter.

As a side note, I noticed there is a function called index that isn't found on functions even though there is documentation for the index function and can be accessed from the see also on subset.

@josdejong
Copy link
Owner

josdejong commented Nov 20, 2024

I've improved the documentation and type definitions via 67ddc72. This is no breaking change.

@josdejong
Copy link
Owner

The improvements are published now in v13.2.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants