Does pandarallel not support parallel_apply with multiple columns groupby? #253

Open
akaymd opened this issue Oct 13, 2023 · 5 comments

Comments

@akaymd

akaymd commented Oct 13, 2023

In my environment, a single-column df.groupby(args).apply(func) call such as the following can be replaced by parallel_apply:

df.groupby("col1").apply(func)

However, the same replacement did not run in parallel when grouping by multiple columns:

df.groupby(["col1", "col2", ...]).apply(func)

Does pandarallel not support parallel_apply with multiple columns groupby?

@perveen-shaheen

I have the exact same use case. Weirdly enough, it was working until last week. Is this supported?

@AymanElsayeed

@akaymd, which versions of Python, pandas, NumPy, and pandarallel are you using?

@nalepae
Owner

nalepae commented Jan 23, 2024

Pandaral·lel is looking for a maintainer!
If you are interested, please open a GitHub issue.

@shermansiu

Is there a minimal working code example you can share, along with the versions of Python and the relevant packages?
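For anyone gathering that information, here is a minimal sketch of one way to print the versions from the affected environment. The helper name collect_versions and the default package list are illustrative assumptions, not part of pandarallel; it relies only on the standard library's importlib.metadata.

```python
import platform
from importlib import metadata


def collect_versions(packages=("pandas", "numpy", "pandarallel")):
    """Return a dict mapping "python" and each package name to a version string.

    Hypothetical helper for bug reports; the package list is an assumption.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```

Pasting the printed lines into the issue next to a short reproducing script should make it easier for others to compare environments.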

@shermansiu

Using groupby on multiple columns works fine for me.

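import pandas as pd
from pandarallel import pandarallel

# Register the parallel_* methods on pandas objects.
pandarallel.initialize()

df = pd.DataFrame({"foo": range(20), "bar": range(20, 40)})
df["even"] = df["foo"] % 2 == 0
df["four"] = df["foo"] % 4 == 0

# Group by two columns and check that the serial and parallel results match.
grouped = df.groupby(["even", "four"])
expected = grouped.apply(lambda x: x + 1)
actual = grouped.parallel_apply(lambda x: x + 1)
assert expected.equals(actual)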

Python: 3.10.13
Pandarallel: 1.6.5
Pandas: 2.2.0
