Series.str.find fix for arrow strings when start < 0 #56412

Merged on Dec 8, 2023 (2 commits)

rohanjain101 (Contributor) commented Dec 8, 2023

@@ -1835,7 +1835,7 @@ def test_str_fullmatch(pat, case, na, exp):

 @pytest.mark.parametrize(
     "sub, start, end, exp, exp_typ",
-    [["ab", 0, None, [0, None], pa.int32()], ["bc", 1, 3, [2, None], pa.int64()]],
+    [["ab", 0, None, [0, None], pa.int32()], ["bc", 1, 3, [1, None], pa.int64()]],

>>> s = pd.Series(["abc"], dtype=pd.StringDtype())
>>> s.str.find("bc", 1, 3)
0    1
dtype: Int64
>>>

The updated expected value now matches the result produced by pd.StringDtype().
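
For reference, the expected value of 1 agrees with Python's built-in `str.find`, where the returned index is relative to the full string rather than to the `[start:end)` window. Below is a minimal pure-Python sketch of a slice-then-offset approach. `find_in_window` is a hypothetical helper used only for illustration; it is not the pandas or pyarrow implementation, and it assumes a non-empty `sub`.

```python
def find_in_window(text, sub, start=0, end=None):
    """Hypothetical helper: emulate text.find(sub, start, end) by slicing first.

    Assumes `sub` is non-empty; this is an illustration, not pandas code.
    """
    # Normalize possibly negative or out-of-range bounds the way slicing does.
    lo, hi, _ = slice(start, end).indices(len(text))
    window = text[lo:hi]
    pos = window.find(sub)
    # A hit inside the window must be shifted back by the normalized start,
    # not by the raw (possibly negative) start or by end - start.
    return pos if pos == -1 else pos + lo


# The case from the updated test parameters: "bc" is found at index 1.
print(find_in_window("abc", "bc", 1, 3))
# A negative start counts from the end of the string, as in str.find.
print(find_in_window("abc", "bc", -2) == "abc".find("bc", -2))
```

The point of the sketch is that the offset added back after slicing should be the normalized start, which is why a negative `start` needs separate handling.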

phofl merged commit 9893c43 into pandas-dev:main on Dec 8, 2023
44 checks passed
phofl (Member) commented Dec 8, 2023

thx @rohanjain101

phofl added this to the 2.2 milestone Dec 8, 2023
phofl added the Strings (String extension data type and string data) label Dec 8, 2023
rohanjain101 deleted the arrow_find branch December 8, 2023 23:53
Linked issue: BUG: Series.str.find with negative start produces unexpected results for arrow strings