Update nlargest nsmallest doc #56369

Merged
merged 6 commits into from
Dec 12, 2023
Changes from 4 commits
36 changes: 30 additions & 6 deletions pandas/core/frame.py
@@ -7505,8 +7505,8 @@ def nlargest(

 - ``first`` : prioritize the first occurrence(s)
 - ``last`` : prioritize the last occurrence(s)
-- ``all`` : do not drop any duplicates, even it means
-  selecting more than `n` items.
+- ``all`` : keep all the ties of the smallest item even if it means
+  selecting more than ``n`` items.

Returns
-------
@@ -7568,7 +7568,9 @@ def nlargest(
Italy 59000000 1937894 IT
Brunei 434000 12128 BN

-When using ``keep='all'``, all duplicate items are maintained:
+When using ``keep='all'``, the number of elements kept can go beyond ``n``
+if there are duplicate values for the smallest element; all the
+ties are kept:

>>> df.nlargest(3, 'population', keep='all')
population GDP alpha-2
@@ -7578,6 +7580,16 @@ def nlargest(
Maldives 434000 4520 MV
Brunei 434000 12128 BN

+However, ``nlargest`` does not keep ``n`` distinct largest elements:
+
+>>> df.nlargest(5, 'population', keep='all')
+population GDP alpha-2
+France 65000000 2583560 FR
+Italy 59000000 1937894 IT
+Malta 434000 12011 MT
+Maldives 434000 4520 MV
+Brunei 434000 12128 BN
+
To order by the largest values in column "population" and then "GDP",
we can specify multiple columns like in the next example.
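
As a rough, self-contained illustration of the ``keep='all'`` behaviour that the new ``nlargest`` text describes, the sketch below rebuilds a small frame from rows visible in this diff, plus the Iceland row from the ``nsmallest`` hunk. The frame is an assumed subset chosen for illustration, not the docstring's full example frame.

```python
import pandas as pd

# Assumed illustrative frame: rows copied from the outputs shown in this diff.
df = pd.DataFrame(
    {
        "population": [59000000, 65000000, 434000, 434000, 434000, 337000],
        "GDP": [1937894, 2583560, 12011, 4520, 12128, 17036],
        "alpha-2": ["IT", "FR", "MT", "MV", "BN", "IS"],
    },
    index=["Italy", "France", "Malta", "Maldives", "Brunei", "Iceland"],
)

# n=3: the third-largest value (434000) is tied three ways, so all three
# tied rows are kept and the result has 5 rows instead of 3.
top3 = df.nlargest(3, "population", keep="all")

# n=5: the same 5 rows come back. The result holds 5 rows but only
# 3 distinct population values, i.e. n counts rows, not distinct values.
top5 = df.nlargest(5, "population", keep="all")

print(len(top3), len(top5), top5["population"].nunique())
```

The point of comparing the two calls is that ``n`` is a row count that ties may extend; it is not a count of distinct values.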

@@ -7614,8 +7626,8 @@ def nsmallest(

 - ``first`` : take the first occurrence.
 - ``last`` : take the last occurrence.
-- ``all`` : do not drop any duplicates, even it means
-  selecting more than `n` items.
+- ``all`` : keep all the ties of the largest item even if it means
+  selecting more than ``n`` items.

Returns
-------
@@ -7669,14 +7681,26 @@ def nsmallest(
Tuvalu 11300 38 TV
Nauru 337000 182 NR

-When using ``keep='all'``, all duplicate items are maintained:
+When using ``keep='all'``, the number of elements kept can go beyond ``n``
+if there are duplicate values for the largest element; all the
+ties are kept:

>>> df.nsmallest(3, 'population', keep='all')
population GDP alpha-2
Tuvalu 11300 38 TV
Anguilla 11300 311 AI
Iceland 337000 17036 IS
Nauru 337000 182 NR

+However, ``nsmallest`` does not keep ``n`` distinct
+smallest elements:
+
+>>> df.nsmallest(4, 'population', keep='all')
+population GDP alpha-2
+Tuvalu 11300 38 TV
+Anguilla 11300 311 AI
+Iceland 337000 17036 IS
+Nauru 337000 182 NR
+
To order by the smallest values in column "population" and then "GDP", we can
specify multiple columns like in the next example.
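
The mirrored ``nsmallest`` behaviour can be checked the same way. This is a minimal sketch that assumes a small frame assembled from rows shown in this diff (the Malta row is borrowed from the ``nlargest`` hunk to provide one larger value); it is not the docstring's full example frame.

```python
import pandas as pd

# Assumed illustrative frame: rows copied from the outputs shown in this diff.
df = pd.DataFrame(
    {
        "population": [434000, 337000, 337000, 11300, 11300],
        "GDP": [12011, 17036, 182, 38, 311],
        "alpha-2": ["MT", "IS", "NR", "TV", "AI"],
    },
    index=["Malta", "Iceland", "Nauru", "Tuvalu", "Anguilla"],
)

# n=3: the third-smallest value (337000) is tied between Iceland and Nauru,
# so both tied rows are kept and the result has 4 rows instead of 3.
bottom3 = df.nsmallest(3, "population", keep="all")

# n=4: the same 4 rows come back, covering only 2 distinct population values.
bottom4 = df.nsmallest(4, "population", keep="all")

print(len(bottom3), len(bottom4), bottom4["population"].nunique())
```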