DEPR: pct_change method/limit keyword #53491
Comments
+1 |
The default for |
For 2.1.0, how to achieve the behavior of About this new change, the What's New page says to explicitly call |
Thanks @aalyousfi; agreed we should allow the operation without warning. I propose we do not warn when |
Is there an option that doesn't require a 2-step deprecation cycle? That's kind of heavy for a fairly small method. |
I don't think so. I'm curious why you describe this as heavy. I would only think of it as being heavy if it's holding up other improvements - is that the case here? Agreed it can be painfully slow, but if it's not holding up other work then I think we should put user experience at the forefront. |
Poor choice of words. I just don't like the 2-step deprecation pattern and would prefer to avoid it if at all possible. |
Since I worked on deprecating this keyword, should I now make another PR implementing the first stage of the 2-step deprecation? |
The warning message of this is really hard to get rid of since the series warns if any value is null and I have logic where I am expecting some series to be only NaNs, or where values at the very start of series are NaN. |
@Charlie-XIAO - I'm +1 on doing so. |
@rhshadrach @jbrockmendel I'm thinking that Also, I think we will need extra specific warning messages since this deprecation seems too complicated. I will open a PR for this once the above is confirmed. |
take |
I have just upgraded and now I get tons of warnings. Even when I'm sure my data is already ffilled, I still get the warnings. How do I turn off these warnings? |
@yellow1912 - https://docs.python.org/3/library/warnings.html. If you believe pandas is not behaving correctly, can you open a new issue? |
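For readers landing here from the warning itself, a minimal sketch of what the linked `warnings` documentation allows: ignoring only this one FutureWarning by its message prefix while leaving other warnings visible. `noisy_pct_change` is a hypothetical stand-in that emits the same kind of warning, so the snippet does not depend on a particular pandas version.

```python
import warnings


def noisy_pct_change():
    # Hypothetical stand-in for a pandas call that emits the deprecation warning.
    warnings.warn(
        "The default fill_method='pad' in Series.pct_change is deprecated",
        FutureWarning,
        stacklevel=2,
    )
    return 0.5


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # report everything by default
    # Ignore only FutureWarnings whose message starts with this text.
    warnings.filterwarnings(
        "ignore",
        message=r"The default fill_method",
        category=FutureWarning,
    )
    result = noisy_pct_change()  # filtered out
    warnings.warn("some unrelated deprecation", FutureWarning)  # still recorded

unrelated = [str(w.message) for w in caught]
```

Silencing the warning does not change the result, so this is only appropriate once the data is known to be filled the intended way.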
@rhshadrach I believe what @yellow1912 said was exactly related to this issue. We are raising warnings for |
Can you post a reproducible example? This issue is about the deprecation itself. If there are incorrect behaviors in the implementation of that deprecation they should go in a separate issue. |
@rhshadrach This is a reproducible example (main branch).
>>> ser = pd.Series([np.nan, 1, 2, 3, np.nan])
>>> ser
0 NaN
1 1.0
2 2.0
3 3.0
4 NaN
dtype: float64
>>> ser.ffill().pct_change()
0 NaN
1 NaN
2 1.0
3 0.5
4 0.0
dtype: float64
>>> ser.bfill().pct_change()
<stdin>:1: FutureWarning: The default fill_method='pad' in Series.pct_change is deprecated and will be removed in a future version. Call ffill before calling pct_change to retain current behavior and silence this warning.
0 NaN
1 0.0
2 1.0
3 0.5
4 0.0
dtype: float64
|
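A sketch of the same series with the fill made explicit and `fill_method=None` passed, so that `pct_change` does no filling of its own. This is one way to write the call unambiguously, not necessarily what the maintainers settled on.

```python
import numpy as np
import pandas as pd

ser = pd.Series([np.nan, 1, 2, 3, np.nan])

# Fill explicitly, then tell pct_change not to fill anything itself.
ffilled = ser.ffill().pct_change(fill_method=None)
bfilled = ser.bfill().pct_change(fill_method=None)

print(ffilled)
print(bfilled)
```

In the `bfill` case the trailing value stays NaN, because nothing forward-fills it once the implicit padding is switched off.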
Quick comment: the warning message is confusing when using groupby, since df.groupby('columnA')['values'].pct_change(fill_method='pad') != df.groupby('columnA')['values'].ffill().pct_change() an example:
Perhaps we can retain the original behaviour for groupby objects? |
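A small sketch of the mismatch described above, using made-up data with the column names from the comment. The per-group version is written with `transform` and a lambda purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "columnA": ["x", "x", "x", "y", "y", "y"],
        "values": [1.0, np.nan, 4.0, 10.0, np.nan, 40.0],
    }
)

# groupby(...).ffill() returns a plain Series, so the chained pct_change
# runs over the whole column and crosses the boundary between groups.
chained = df.groupby("columnA")["values"].ffill().pct_change(fill_method=None)

# Filling and computing the change inside each group keeps groups separate.
per_group = df.groupby("columnA")["values"].transform(
    lambda s: s.ffill().pct_change(fill_method=None)
)

print(pd.DataFrame({"chained": chained, "per_group": per_group}))
```

The first row of group `y` is where the two differ: the chained version compares it against the last row of group `x`, while the per-group version leaves it as NaN.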
A possible workaround might be:
import numpy as np
import pandas as pd
from pandas.testing import assert_series_equal
df = pd.DataFrame(
{
"a": range(1, 11),
"b": range(1, 11),
"c": range(1, 11),
}
)
df.iloc[5] = np.nan
df = df.melt()
df["pct"] = df.groupby("variable")["value"].pct_change()
df["pct2"] = df.groupby("variable")["value"].transform(lambda x: x.ffill().pct_change())
assert_series_equal(df.pct, df.pct2, check_names=False)  # This should pass
Maybe there are better workarounds, but at least the one above doesn't seem very clean... @jbrockmendel @rhshadrach What do maintainers think here? Should the deprecation on groupby objects be reverted, or should we just update the warning message? BTW @CaioLC I think you cannot use |
That's true, thank you for pointing that out :) Lambda functions are a valid workaround. My only concern is that the warning message states that simply adding ffill() before pct_change() will suffice, and that's not entirely true. And a wrong pct_change is hard to debug, because it will compute without failure most of the time |
Yeah we shouldn't require lambdas for this use case, |
|
Does this require calculating the grouper twice? |
Having to call groupby twice for a single operation, and for a fairly common func, seems unintuitive and a worse experience overall. The ideal would be for GroupBySeries.fill() to return a GroupBySeries object instead of a regular Series; then concatenation should be straightforward. But I understand this could be a major break and possibly not that trivial to implement... |
Because you can modify the DataFrame in place, you can actually get away with using the same grouper. As long as you're just adding columns, this is safe - but other modifications (e.g. removing rows) may not be.
I would not describe this as a single operation.
And then for use cases that just want to fill? |
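A sketch of the single-grouper approach described in the comment above, assuming a long-format frame like the melted one earlier in the thread. The column names `value_filled` and `pct` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "variable": ["a", "a", "a", "b", "b", "b"],
        "value": [1.0, np.nan, 4.0, 10.0, np.nan, 40.0],
    }
)

gb = df.groupby("variable")  # the grouping is set up once

# Adding a column in place keeps the row-to-group mapping valid,
# so the same groupby object can select the new column afterwards.
df["value_filled"] = gb["value"].ffill()
df["pct"] = gb["value_filled"].pct_change(fill_method=None)

print(df)
```

As noted above, this relies on only adding columns; dropping or reordering rows after creating `gb` would make the stored grouping stale.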
Yeah I think your solution seems fine to me, so the deprecation is fine as well |
Maybe just a small adjustment to the warning message then? so that users don't try to inadvertently chain function calls coming from a groupby object |
Anything actionable here? |
@CaioLC - the warning message seems okay to me, it merely tells users to fill values rather than telling users an incorrect way of filling the values. But if you have an alternative suggestion to the message, we can certainly consider it. If you'd like, add it here and we can reopen. Closing for now. |
We should deprecate these and have users do obj.fillna(...).pct_change() instead.
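A minimal sketch of that pattern for the `limit` keyword, assuming forward filling is the intended method: the limit moves onto the explicit fill call and `pct_change` is told not to fill.

```python
import numpy as np
import pandas as pd

ser = pd.Series([1.0, np.nan, np.nan, 4.0])

# Forward-fill at most one consecutive NaN, then compute the change
# with pct_change's own filling switched off.
result = ser.ffill(limit=1).pct_change(fill_method=None)

print(result)
```

Only the first gap is filled, so the change is defined at position 1 and stays NaN wherever an unfilled value is involved.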