Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: deprecate returning a tuple from a callable in iloc indexing #53769

Merged
merged 12 commits into from
Nov 7, 2023

Conversation

wence-
Copy link
Contributor

@wence- wence- commented Jun 21, 2023

The current semantics are that tuple-destructuring of the key is
performed before unwinding any callables. As such, if a callable
returns a tuple for iloc, it will be handled incorrectly. To avoid
this, explicitly deprecate support for this behaviour.

The current semantics are that tuple-destructuring of the key is
performed before unwinding any callables. As such, if a callable
returns a tuple for iloc, it will be handled incorrectly. To avoid
this, explicitly deprecate support for this behaviour.

Closes pandas-dev#53533.
@wence-
Copy link
Contributor Author

wence- commented Jun 21, 2023

I would think it more natural that the order of events in producing "normalized" indices would be:

  1. apply callable if appropriate
  2. apply tuple destructuring to obtain row and column keys
  3. apply normal (non-callable/tuple) indexing rules to the row and column keys

...

This PR deprecates (by explicitly warning in case (2)) returning a tuple from a callable. With the intention that it will in the future always raise IndexError.

Currently callables operate on axis 0 only in loc/iloc. I'd be open to discussion on letting them operate using normal deconstruction, but that's maybe a bigger discussion, especially whether we should do the same for .loc. (For loc, there may be issues with that approach when an index is a MultiIndex).

@wence- wence- force-pushed the wence/dep/53533 branch from d75d758 to 268f5e5 Compare June 21, 2023 11:55
@mroeschke mroeschke added Indexing Related to indexing on series/frames, not to indexes themselves Deprecate Functionality to remove in pandas labels Jun 21, 2023
@mroeschke mroeschke requested a review from topper-123 June 21, 2023 17:56
Copy link
Contributor

@topper-123 topper-123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments.

doc/source/whatsnew/v2.1.0.rst Outdated Show resolved Hide resolved
pandas/core/indexing.py Outdated Show resolved Hide resolved
pandas/core/indexing.py Outdated Show resolved Hide resolved
pandas/tests/frame/indexing/test_indexing.py Outdated Show resolved Hide resolved
@wence-
Copy link
Contributor Author

wence- commented Jun 22, 2023

[Aside: I did not write this quoted comment, please don't edit my comments to put words in my mouth]

Currently callables operate on axis 0 only in loc/iloc.

This is not true, LocationIndexer has this implementation in __getitem__:

    def __getitem__(self, key):
        check_dict_or_set_indexers(key)
        if type(key) is tuple:
            key = tuple(list(x) if is_iterator(x) else x for x in key)
            key = tuple(com.apply_if_callable(x, self.obj) for x in key)
            if self._is_scalar_access(key):
                return self.obj._get_value(*key, takeable=self._takeable)
            return self._getitem_tuple(key)

So, if the key is a tuple, we destructure, and then apply the callable separately on the destructured axis-wise indexers if necessary.

Witness:

In [2]: df = pd.DataFrame({"a": [1, 2, 3], "b": [1, 2, 3], "c": [2, 3, 4]})

In [3]: df.iloc[[0, 1], lambda df: [1, 2]]
Out[3]: 
   b  c
0  1  2
1  2  3

In [4]: df.loc[[1, 0], lambda df: ["a", "c"]]
Out[4]: 
   a  c
1  2  3
0  1  2

@wence- wence- force-pushed the wence/dep/53533 branch from a854393 to d2004ed Compare June 22, 2023 11:27
@wence-
Copy link
Contributor Author

wence- commented Jun 26, 2023

Ready for another look, thanks

@wence-
Copy link
Contributor Author

wence- commented Jul 18, 2023

Any thoughts on this change?

pandas/core/indexing.py Outdated Show resolved Hide resolved
Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small nit and a merge conflict otherwise looks good

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Aug 24, 2023
@mroeschke mroeschke removed the Stale label Nov 7, 2023
@mroeschke mroeschke added this to the 2.2 milestone Nov 7, 2023
@mroeschke mroeschke merged commit 0e2277b into pandas-dev:main Nov 7, 2023
33 of 34 checks passed
@mroeschke
Copy link
Member

Thanks @wence-

@wence- wence- deleted the wence/dep/53533 branch November 7, 2023 17:36
@wence-
Copy link
Contributor Author

wence- commented Nov 7, 2023

Thanks, and sorry for not doing a better job babysitting this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: iloc-indexing with callable has inconsistent semantics if the return type is a tuple
3 participants