Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incosistent behavior of loc-set with 2 indexes to Series #59333 #60052

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

anzber
Copy link

@anzber anzber commented Oct 15, 2024

Comments about pull-request:

  1. Fixed case with inconsistent behavior of different indexes
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=list("ABC"))
>>> df.loc[:, ['A', 'B', 'C']] = pd.Series([10, 20, 30]) 
#before fix
>>> df
    A   B   C
0  10  20  30
1  10  20  30
2  10  20  30
#after fix
>>> df
    A   B   C
0  10  10  10
1  20  20  20
2  30  30  30

Inconsistance with case:

#before and after fix (no changes)
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=list("ABC"))
>>> df.loc[df['A']>0, ['A', 'B', 'C']] = pd.Series([10, 20, 30]) 
>>> df
    A   B   C
0  10  10  10
1  20  20  20
2  30  30  30
  1. Fixed initial issue. And now it works consistently with test_multi_assign_broadcasting_rhs like this
>>> df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=list("ABC"))
>>> df.loc[:, pd.Series(['A', 'B', 'C'])] = pd.Series([10, 20, 30])
#before fix - FAIL
#after fix 
>>> df
    A   B   C
0  10  10  10
1  20  20  20

def test_multi_assign_broadcasting_rhs(self):
# broadcasting on the rhs is required
df = DataFrame(
{
"A": [1, 2, 0, 0, 0],
"B": [0, 0, 0, 10, 11],
"C": [0, 0, 0, 10, 11],
"D": [3, 4, 5, 6, 7],
}
)
expected = df.copy()
mask = expected["A"] == 0
for col in ["A", "B"]:
expected.loc[mask, col] = df["D"]
df.loc[df["A"] == 0, ["A", "B"]] = df["D"].copy()
tm.assert_frame_equal(df, expected)

@anzber anzber marked this pull request as ready for review October 16, 2024 08:01
@rhshadrach
Copy link
Member

CI is fixed now, can you merge main.

@mroeschke mroeschke requested a review from rhshadrach October 30, 2024 19:32
Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke - I'm not too familiar with this code, this looks okay but am not certain. Can you take a look as well?

ser_values = ser.reindex(obj.axes[0][indexer[0]])._values

# single indexer
if len(indexer) > 1 and not multiindex_indexer:
len_indexer = len(indexer[1])
if isinstance(indexer[1], slice):
len_indexer = len(obj.loc[indexer[1]].axes[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use axes[1][indexer[1]] instead of .loc.

Copy link
Author

@anzber anzber Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made changes requested, but failed test appeared

@rhshadrach
Copy link
Member

@anzber - test failures look related. Can you also give this PR an informative title? It is helpful to have titles that give devs a sense of what the PR is doing without having to go open the issue.

@anzber
Copy link
Author

anzber commented Nov 5, 2024

@anzber - test failures look related. Can you also give this PR an informative title? It is helpful to have titles that give devs a sense of what the PR is doing without having to go open the issue.

@rhshadrach added more information to title. Thank You for reviewing the code.

@anzber anzber requested a review from rhshadrach November 15, 2024 20:00
Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more information to title

The title of this PR is still Fix issue #59333.

Will also need a note in the whatsnew for v3.0.0.

pandas/tests/indexing/test_loc.py Outdated Show resolved Hide resolved
@anzber anzber changed the title Fix issue #59333 Fix incosistent behavior of multiindexed locset to Series #59333 Nov 17, 2024
@anzber anzber changed the title Fix incosistent behavior of multiindexed locset to Series #59333 Fix incosistent behavior of loc-set with 2 indexes to Series #59333 Nov 17, 2024
@anzber
Copy link
Author

anzber commented Nov 17, 2024

@rhshadrach
I've found a new problem related the issue

This code fails, but shouldn't :

df33 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns = list("ABC"))
df33.loc[:, pd.Series({"A": True, "B": True, "C": False})] = pd.Series([10, 20, 30])
df33

Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Dec 18, 2024
@mroeschke
Copy link
Member

Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.

@mroeschke mroeschke closed this Dec 29, 2024
@anzber anzber mentioned this pull request Dec 29, 2024
5 tasks
@rhshadrach rhshadrach reopened this Dec 30, 2024
@rhshadrach
Copy link
Member

Reopening (ref: #60621)

.gitignore Outdated Show resolved Hide resolved
@anzber anzber requested a review from rhshadrach December 30, 2024 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: locset with Series as column key fails inconsistently
3 participants