-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CoW warning mode: enable chained assignment warning for DataFrame setitem in default mode #56230
CoW warning mode: enable chained assignment warning for DataFrame setitem in default mode #56230
Conversation
…item in default mode
@@ -452,7 +452,8 @@ def test_detect_chained_assignment_undefined_column( | |||
df.iloc[0:5]["group"] = "a" | |||
else: | |||
with pytest.raises(SettingWithCopyError, match=msg): | |||
df.iloc[0:5]["group"] = "a" | |||
with tm.raises_chained_assignment_error(): | |||
df.iloc[0:5]["group"] = "a" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These few test changes illustrate cases where you now get both SettingWithCopyWarning and the new chained assignment warning.
But for example the one above doesn't actually change df
(so nobody should be doing that right now), so let's not care about the double warning?
(in theory I could specifically not raise the warning when the key is a single column name, because assigning to a column is never inplace and thus can never change behaviour)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah let's just live with both warnings, that shouldn't be something that users are doing anyway as you said
pandas/core/frame.py
Outdated
warn_copy_on_write() | ||
or ( | ||
not warn_copy_on_write() | ||
and any(b.refs.has_reference() for b in self._mgr.blocks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't special case between the warning and non warning mode, just
any(b.refs.has_reference() for b in self._mgr.blocks)
(another PR of mine has a utility for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this from what I did for Series setitem (https://github.com/pandas-dev/pandas/pull/55522/files#diff-a5257444a1b322d619680fc77361cc6ea11ef36b363b4bb2289fdef0f41feb70R1231-R1233). I don't remember exactly what the reason was that I did it there, but it was to get some test passing. Will have a look again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quickly checking which tests fail if I remove the special case for the warning mode. For example, then a case like df[mask]["col"] = 10
doesn't raise any warning. In the default mode, it also raise the SettingWithCopyWarning to alert you, but that SettingWithCopyWarning isn't present in the warning mode. And since mask gives a copy, there is no reference, and so with the has_reference() check, we wouldn't warn here.
I think it is good to keep at least one warning in this case. Of course, it's not super important, as in theory no-one should be developing in the warning mode (but only use that to check existing code). But since it's easy to always warn in CoW-warn mode, I thought better to do so (it's also a bit easier to test, otherwise I would have to special case the warning mode as the only one that does not raise the warning ;))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, I didn't consider that the settingwithcopy warning is missing in the warning case, fine to keep in then
Thanks for the fixup |
Follow-up on #55522 (comment)
This ensures the chained assignment warning is shown in the default mode as well.
This does mean you currently get some noise cases of both this new warning and the existing SettingWithCopyWarning. But either this SettingWithCopy was a false positive and it was setting on a view (eg a far-fetched case like
df[0:3][0:2] = 10
), and then we really need to have the new warning as well because it will change behaviour. Or either it was a correct warning (e.g.df[mask][other_indexer] = ..
, and in that case this has never been working, and adding an additional warning for a case that never works does no harm (and so a user can only have by accident, or potentially unaware it doesn't do what they intend).xref #56019