Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CoW warning mode: enable chained assignment warning for DataFrame setitem in default mode #56230

Conversation

jorisvandenbossche
Copy link
Member

Follow-up on #55522 (comment)

This ensures the chained assignment warning is shown in the default mode as well.
This does mean you currently get some noise cases of both this new warning and the existing SettingWithCopyWarning. But either this SettingWithCopy was a false positive and it was setting on a view (eg a far-fetched case like df[0:3][0:2] = 10), and then we really need to have the new warning as well because it will change behaviour. Or either it was a correct warning (e.g. df[mask][other_indexer] = .., and in that case this has never been working, and adding an additional warning for a case that never works does no harm (and so a user can only have by accident, or potentially unaware it doesn't do what they intend).

xref #56019

@@ -452,7 +452,8 @@ def test_detect_chained_assignment_undefined_column(
df.iloc[0:5]["group"] = "a"
else:
with pytest.raises(SettingWithCopyError, match=msg):
df.iloc[0:5]["group"] = "a"
with tm.raises_chained_assignment_error():
df.iloc[0:5]["group"] = "a"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These few test changes illustrate cases where you now get both SettingWithCopyWarning and the new chained assignment warning.

But for example the one above doesn't actually change df (so nobody should be doing that right now), so let's not care about the double warning?

(in theory I could specifically not raise the warning when the key is a single column name, because assigning to a column is never inplace and thus can never change behaviour)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah let's just live with both warnings, that shouldn't be something that users are doing anyway as you said

warn_copy_on_write()
or (
not warn_copy_on_write()
and any(b.refs.has_reference() for b in self._mgr.blocks)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't special case between the warning and non warning mode, just

any(b.refs.has_reference() for b in self._mgr.blocks)

(another PR of mine has a utility for this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied this from what I did for Series setitem (https://github.com/pandas-dev/pandas/pull/55522/files#diff-a5257444a1b322d619680fc77361cc6ea11ef36b363b4bb2289fdef0f41feb70R1231-R1233). I don't remember exactly what the reason was that I did it there, but it was to get some test passing. Will have a look again

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quickly checking which tests fail if I remove the special case for the warning mode. For example, then a case like df[mask]["col"] = 10 doesn't raise any warning. In the default mode, it also raise the SettingWithCopyWarning to alert you, but that SettingWithCopyWarning isn't present in the warning mode. And since mask gives a copy, there is no reference, and so with the has_reference() check, we wouldn't warn here.

I think it is good to keep at least one warning in this case. Of course, it's not super important, as in theory no-one should be developing in the warning mode (but only use that to check existing code). But since it's easy to always warn in CoW-warn mode, I thought better to do so (it's also a bit easier to test, otherwise I would have to special case the warning mode as the only one that does not raise the warning ;))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I didn't consider that the settingwithcopy warning is missing in the warning case, fine to keep in then

@jorisvandenbossche jorisvandenbossche merged commit 5165102 into pandas-dev:main Dec 4, 2023
44 checks passed
@jorisvandenbossche jorisvandenbossche deleted the cow-warnings-frame-setitem branch December 4, 2023 08:29
@jorisvandenbossche
Copy link
Member Author

Thanks for the fixup

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants