BUG: GroupBy.value_counts sorting order #56016
@pytest.mark.parametrize("normalize", [True, False])
def test_value_counts_sort(sort, vc_sort, normalize):
    # GH#55951
    df = DataFrame({"a": [2, 1, 1, 1], 0: [3, 4, 3, 3]})
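For readers following along, here is a rough sketch of what this test appears to exercise. The rest of the test body is not shown in this excerpt, so the loop below is an assumption rather than the PR's actual code: it calls `value_counts` on the same frame for every combination of the three flags, and the `results` dict is a hypothetical name used only for illustration. It deliberately says nothing about the resulting row order, since that is what the PR changes.

```python
import itertools

import pandas as pd

# Same frame as in the test above; note the integer column label 0.
df = pd.DataFrame({"a": [2, 1, 1, 1], 0: [3, 4, 3, 3]})

# Hypothetical container: one result per (sort, vc_sort, normalize) combination.
results = {}
for sort, vc_sort, normalize in itertools.product([True, False], repeat=3):
    gb = df.groupby("a", sort=sort)
    # `sort` on groupby controls group-key ordering; `sort` on value_counts
    # controls ordering by frequency within the result.
    results[(sort, vc_sort, normalize)] = gb[0].value_counts(
        sort=vc_sort, normalize=normalize
    )

# The counts themselves do not depend on the sort flags, only the row order does.
print(results[(True, True, False)])
```

Comparing `results[(True, False, False)]` with `results[(False, True, False)]` on a given pandas version is a quick way to see how the two sort flags interact.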
Could you also test with Categorical?
Thanks - good call. I added a test and found a bug calling .sortlevel() in the categorical section. But the categorical behavior is still buggy here; namely
pandas/pandas/core/groupby/groupby.py
Lines 2824 to 2826 in ef2c61a
multi_index, _ = MultiIndex.from_product(
    levels_list, names=[ping.name for ping in groupings]
).sortlevel()
is constructing the index by taking the cartesian product across all groupings; it should only be doing this for the categorical variables. That's not an easy fix, but will be handled in #55738 where we can just remove that whole block of code.
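To make the cartesian-product point concrete, here is a small standalone sketch. It does not reproduce the internals of `groupby.py`; the frame, the `categories` list, and the names `full`, `expected`, and `extra` are all made up for illustration. It assumes two non-categorical keys `a` and `b` plus one categorical key `c`, and compares the full product with the index you get when only the categorical level is expanded.

```python
import pandas as pd

# Hypothetical data: "a" and "b" stand in for non-categorical groupings.
df = pd.DataFrame({"a": [1, 1, 2], "b": [3, 4, 3]})
# Hypothetical categories of a third, categorical grouping "c".
categories = ["x", "y"]
names = ["a", "b", "c"]

# Product across ALL levels, as in the quoted snippet. sortlevel() returns
# a (sorted_index, indexer) tuple, hence the unpacking.
full, _ = pd.MultiIndex.from_product(
    [df["a"].unique(), df["b"].unique(), categories], names=names
).sortlevel()

# Product across the categorical level only: keep the observed (a, b) pairs
# and cross each of them with every category.
observed_pairs = list(df.drop_duplicates().itertuples(index=False, name=None))
expected, _ = pd.MultiIndex.from_tuples(
    [(a, b, c) for a, b in observed_pairs for c in categories], names=names
).sortlevel()

# Entries that exist only because unobserved (a, b) pairs were materialized.
extra = full.difference(expected)
print(len(full), len(expected))
print(list(extra))
```

Here the pair `(2, 4)` never occurs in the data, yet the full product still creates rows for it, which is the kind of spurious entry described above.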
Thanks @rhshadrach