Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] Slicing an array backwards beyond the start doesn't include first item #38768

Closed
jorisvandenbossche opened this issue Nov 17, 2023 · 1 comment · Fixed by #39240
Closed

Comments

@jorisvandenbossche
Copy link
Member

From pandas-dev/pandas#55832

When slicing an array "backwards" (with negative step, -1 in the example), and when the stop is "beyond" the start (larger negative number than number of elements), the first item is not included:

In [9]: arr = pa.array([0, 1, 2, 3, 4])

In [10]: arr[4:-6:-1]
Out[10]: 
<pyarrow.lib.Int64Array object at 0x7faab0eaf8e0>
[
  4,
  3,
  2,
  1
]

While if we omit the stop (meaning, slice until the end, in this case of backwards slicing until the start), it works correctly:

In [11]: arr[4::-1]
Out[11]: 
<pyarrow.lib.Int64Array object at 0x7faab0eac340>
[
  4,
  3,
  2,
  1,
  0
]

I haven't yet checked if this is an issue in the underlying C++ slicing, or in the Python layer, eg in our "normalize_slice" logic:

def _normalize_slice(object arrow_obj, slice key):

@LucasG0
Copy link
Contributor

LucasG0 commented Dec 7, 2023

Hi, I am interested in working on this issue. I will have a look and submit a PR.

LucasG0 added a commit to LucasG0/arrow that referenced this issue Dec 15, 2023
jorisvandenbossche pushed a commit that referenced this issue Mar 15, 2024
…cludes first item. (#39240)

### What changes are included in this PR?

Minor changes in `_normalize_slice` so `start` and `stop` are both computed in a single if/else block instead of having them modified later in case of a negative `step`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* Closes: #38768

Authored-by: LucasG0 <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
@jorisvandenbossche jorisvandenbossche added this to the 16.0.0 milestone Mar 15, 2024
jorisvandenbossche added a commit that referenced this issue Apr 9, 2024
…is now empty (#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: #40642
* GitHub Issue: #38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
raulcd pushed a commit that referenced this issue Apr 10, 2024
…is now empty (#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: #40642
* GitHub Issue: #38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
verma-kartik pushed a commit to verma-kartik/arrow that referenced this issue Apr 11, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue Apr 15, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
tolleybot pushed a commit to tmct/arrow that referenced this issue May 2, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…start is now empty (apache#40682)

### What changes are included in this PR?

`_normalize_slice` now relies on `slice.indices` (https://docs.python.org/3/reference/datamodel.html#slice.indices).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Fixing wrong data returned in an edge case.
* GitHub Issue: apache#40642
* GitHub Issue: apache#38768

Lead-authored-by: LucasG0 <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Joris Van den Bossche <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants