BUG: weird behaviour for returning group in groupby.apply #22546

Closed
h-vetinari opened this issue Aug 30, 2018 · 5 comments
Labels
Apply (Apply, Aggregate, Transform, Map), Bug, Groupby

Comments

@h-vetinari
Contributor

I was trying to document my experiences with the inconsistencies of DataFrame.groupby.apply (see #22545), and one of them was the following:

import numpy as np
import pandas as pd

N = 5
df = pd.DataFrame(index=range(N), columns=['id', 'x', 'y', 'z'])
df.loc[:, ['x', 'y', 'z']] = np.arange(N*3).reshape(N, 3)
df.id = np.random.randint(0, 3, (N,)) + 10

df
#    id   x   y   z
# 0  11   0   1   2
# 1  10   3   4   5
# 2  10   6   7   8
# 3  12   9  10  11
# 4  12  12  13  14

Then, even though the result returned by the function is exactly the same, the following outputs are different:

df.groupby('id', as_index=True).apply(lambda gr: gr)
#    id   x   y   z
# 0  11   0   1   2
# 1  10   3   4   5
# 2  10   6   7   8
# 3  12   9  10  11
# 4  12  12  13  14

df.groupby('id', as_index=True).apply(lambda gr: gr.iloc[:10 ** 6])
#       id   x   y   z
# id                  
# 10 1  10   3   4   5
#    2  10   6   7   8
# 11 0  11   0   1   2
# 12 3  12   9  10  11
#    4  12  12  13  14

The first call returns the original frame as-is, with no attempt to group the results the way the second output does. Furthermore, neither output should still contain the id column: it is now ambiguous between the index and the columns (e.g. if one wants to continue with another groupby after some further transformations).
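To illustrate the ambiguity concern, here is a minimal sketch. It does not go through groupby.apply at all; instead it builds a hypothetical stand-in frame (named `ambiguous` here, not part of the report) in which `id` is both the index name and a column, using fixed ids rather than the random ones above. Grouping such a frame by `'id'` again is expected to raise a ValueError in reasonably recent pandas versions.

```python
import numpy as np
import pandas as pd

# Same shape as the frame in the report, but with fixed ids (assumption:
# the values shown in the printed example, instead of a random draw).
df = pd.DataFrame(np.arange(15).reshape(5, 3), columns=["x", "y", "z"])
df.insert(0, "id", [11, 10, 10, 12, 12])

# Stand-in for the second output: "id" is an index level AND a column.
ambiguous = df.set_index("id", drop=False)

try:
    # pandas cannot tell whether "id" means the index level or the column.
    ambiguous.groupby("id").size()
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

Dropping the column (or the index level) before grouping again avoids the error, which is why the desired output below omits `id` from the columns.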

Desired output of both:

#        x   y   z
# id              
# 10 1   3   4   5
#    2   6   7   8
# 11 0   0   1   2
# 12 3   9  10  11
#    4  12  13  14
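
As a possible workaround, the desired layout can be assembled without relying on groupby.apply. The sketch below is only one way to do it, not a fix for the reported behaviour; it assumes fixed ids matching the printed example, and the name `desired` is introduced purely for illustration. It concatenates the per-group frames with the group key as the outer index level and drops the `id` column from each group.

```python
import numpy as np
import pandas as pd

# Frame from the report, with the ids fixed to the values shown above.
df = pd.DataFrame(np.arange(15).reshape(5, 3), columns=["x", "y", "z"])
df.insert(0, "id", [11, 10, 10, 12, 12])

# Iterate over the groups, drop the grouping column from each one, and
# let pd.concat use the dict keys as the outer index level named "id".
desired = pd.concat(
    {key: gr.drop(columns="id") for key, gr in df.groupby("id")},
    names=["id"],
)

print(desired)
```

The result has a two-level index (group key, original row label) and only the x, y, z columns, so a later groupby on `id` is unambiguous.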
@gfyoung
Member

gfyoung commented Sep 1, 2018

That looks very weird indeed! Investigation and PR are welcome!

@smithto1
Member

smithto1 commented Aug 7, 2020

take

@smithto1
Member

smithto1 commented Aug 8, 2020

I think this is another issue that falls under #34998

@grsjar

grsjar commented Aug 12, 2022

I think I've encountered this bug: I had groupby().apply() and groupby().transform() statements, both "with no attempt to actually group the results like the second output." Does anyone know what is happening here?

@rhshadrach
Member

Thanks @smithto1 - agreed. Closing.
