Does pandarallel not support parallel_apply with multiple columns groupby? #253
Comments
I have exactly the same use case. Oddly enough, it was working until last week. Is this supported?
@akaymd what are the versions of Python, Pandas, NumPy, and Pandarallel?
Pandaral·lel is looking for a maintainer!
Is there a minimal working code example you can share, along with the versions of Python and the relevant packages?
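To answer the version question above, something like the following sketch could be pasted into the affected environment. It relies only on the standard library (`platform` and `importlib.metadata`); the helper name `report_versions` is made up for illustration and is not part of pandarallel.

```python
import platform
from importlib import metadata


def report_versions(packages=("pandas", "numpy", "pandarallel")):
    """Return one 'name: version' line for Python and each listed package."""
    lines = [f"Python: {platform.python_version()}"]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return lines


print("\n".join(report_versions()))
```

The printed lines can be copied into the issue as-is.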
Using groupby on multiple columns works fine for me.
import pandas as pd
import pandarallel
pandarallel.pandarallel.initialize()
df = pd.DataFrame({"foo": range(20), "bar": range(20, 40)})
df["even"] = df["foo"] % 2 == 0
df["four"] = df["foo"] % 4 == 0
assert df.groupby(["even", "four"]).apply(lambda x: x+1).equals(df.groupby(["even", "four"]).parallel_apply(lambda x: x+1))
Python: 3.10.13
In my environment, df.groupby(args).apply(func) on a single column can be replaced by parallel_apply. For example, this call works once apply is swapped for parallel_apply:
df.groupby("col1").apply(func)
However, the same call with a groupby on multiple columns did not run in parallel:
df.groupby(["col1", "col2", ...]).apply(func)
Does pandarallel not support parallel_apply with multiple columns groupby?
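One way to check whether a multi-column groupby is actually being spread across worker processes is to have the applied function return the ID of the process that handled each group. The sketch below is only a diagnostic under assumptions: the column names and data are invented, `worker_pid` and `group_pids` are hypothetical helpers, and it falls back to plain `apply` when pandarallel is not importable.

```python
import os

import pandas as pd

try:
    # pandarallel is optional so the sketch still runs without it.
    from pandarallel import pandarallel

    pandarallel.initialize(progress_bar=False, verbose=0)
    HAVE_PANDARALLEL = True
except ImportError:
    HAVE_PANDARALLEL = False


def worker_pid(group):
    """Hypothetical stand-in for func: return the PID that processed the group."""
    return os.getpid()


def group_pids(frame, keys, parallel=True):
    """Map each group key to the PID of the process that handled it."""
    grouped = frame.groupby(keys)
    if parallel and HAVE_PANDARALLEL:
        return grouped.parallel_apply(worker_pid)
    # Serial fallback: every group is handled by the current process.
    return grouped.apply(worker_pid)


# Invented sample data: 4 x 3 = 12 distinct (col1, col2) groups.
df = pd.DataFrame({
    "col1": [i % 4 for i in range(120)],
    "col2": [i % 3 for i in range(120)],
    "value": range(120),
})

if __name__ == "__main__":
    pids = group_pids(df, ["col1", "col2"])
    print("parent PID:", os.getpid())
    print("distinct PIDs seen by groups:", sorted(pids.unique()))
```

If every group reports the parent PID, the work ran serially. PIDs that differ from the parent suggest the groups were dispatched to workers.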