Type args and kwargs in pipe #823

Merged
merged 4 commits into from
Dec 27, 2023

Conversation


@paw-lu paw-lu commented Dec 7, 2023

Follow-up to pandas-dev/pandas#56359. Types args and kwargs in the pipe methods, and makes the typing of every pipe method consistent (#738).


paw-lu commented Dec 7, 2023

Started by typing frame's pipe to test things out, and I'm already running into issues 😅. To my surprise, mypy complains about the return type of the func passed to pipe.

tests/test_frame.py:1393: error: Expression is of type <nothing>, not "DataFrame"  [assert-type]
tests/test_frame.py:1394: error: Argument 1 to "pipe" of "DataFrame" has incompatible type "Callable[[DataFrame, int, str], DataFrame]"; expected "Callable[[DataFrame, int, str], <nothing>] | tuple[Callable[..., <nothing>], str]"  [arg-type]

Looking into this!


@Dr-Irv Dr-Irv left a comment

There is an issue here with respect to having pipe defined in both core/frame.pyi and core/series.pyi. I think if you took the concept proposed in this PR and used it in generic.pyi and removed it from core/frame.pyi, things would work for both Series and DataFrame.

So can you make that change, and then have tests for both Series.pipe() and DataFrame.pipe()?


paw-lu commented Dec 11, 2023

> I think if you took the concept proposed in this PR and used it in generic.pyi and removed it from core/frame.pyi
> So can you make that change, and then have tests for both Series.pipe() and DataFrame.pipe()?

Got it, that sounds good to me!


paw-lu commented Dec 20, 2023

Just to try things out, implemented the more complicated function with positional-only args, keyword-only args, etc. This fails in current tests.

tests/test_frame.py:1401: error: Expression is of type <nothing>, not "DataFrame"  [assert-type]
tests/test_frame.py:1403: error: Argument 1 to "pipe" of "DataFrame" has incompatible type "Callable[[DataFrame, int, list[float], str, NamedArg(tuple[int, int], 'keyword_only')], DataFrame]"; expected "Callable[[DataFrame, int, list[float], str, tuple[int, int]], <nothing>] | tuple[Callable[..., <nothing>], str]"  [arg-type]
Found 2 errors in 1 file (checked 224 source files)

However, when I upgrade mypy to the latest version (1.7.1), the test passes. The upgrade also raises a bunch of new errors elsewhere in the code, but I think they are all about now-unused type: ignore comments (easy fix).

mypy 1.7.1 errors
pandas-stubs/core/series.pyi:181: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:226: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:239: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:249: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:259: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:272: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:651: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:710: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:846: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1179: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1462: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1496: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1509: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1523: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1532: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1556: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:1922: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/frame.pyi:280: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/frame.pyi:296: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/frame.pyi:304: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/frame.pyi:312: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/frame.pyi:320: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/frame.pyi:565: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/algorithms.pyi:28: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/algorithms.pyi:30: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/algorithms.pyi:32: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/reshape/tile.pyi:53: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/interval.pyi:215: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/interval.pyi:292: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/interval.pyi:294: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/interval.pyi:300: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/interval.pyi:302: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/interval.pyi:311: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:69: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:80: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:91: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:102: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:113: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:126: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:138: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:149: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:160: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:171: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:182: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:193: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:204: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:215: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/indexes/base.pyi:424: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/groupby/generic.pyi:148: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/groupby/generic.pyi:170: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/interval.pyi:202: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/interval.pyi:204: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/interval.pyi:206: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/interval.pyi:210: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/interval.pyi:212: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/interval.pyi:214: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timestamps.pyi:216: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timestamps.pyi:218: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timestamps.pyi:220: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timestamps.pyi:224: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timestamps.pyi:226: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timestamps.pyi:228: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timedeltas.pyi:311: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timedeltas.pyi:313: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timedeltas.pyi:315: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timedeltas.pyi:322: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timedeltas.pyi:324: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/timedeltas.pyi:326: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/period.pyi:102: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/period.pyi:104: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/period.pyi:106: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/period.pyi:136: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/period.pyi:138: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/_libs/tslibs/period.pyi:140: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]

So in order to implement this I'm thinking we need to:

  1. Update mypy
  2. Fix the new errors that come with this upgrade (which I think is just deleting a bunch of # type: ignore's)

Happy to attempt this in another PR first if this is something we are interested in.
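To get a sense of the cleanup, the mypy output above can be grouped by file with a small script. This is a sketch, not part of the PR: `collect_unused_ignores` is a hypothetical helper, and the regular expression assumes the message format shown in the log. Note that the messages suggest narrowing each comment from `[misc]` to `[overload-overlap]`, which may be an alternative to deleting it.

```python
import re
from collections import defaultdict

# Matches lines such as:
# pandas-stubs/core/series.pyi:181: error: Unused "type: ignore" comment, ...
UNUSED_IGNORE = re.compile(
    r'^(?P<path>[^:]+):(?P<line>\d+): error: Unused "type: ignore" comment'
)


def collect_unused_ignores(mypy_output: str) -> dict[str, list[int]]:
    """Group the line numbers of unused type: ignore comments by file."""
    locations: dict[str, list[int]] = defaultdict(list)
    for raw_line in mypy_output.splitlines():
        match = UNUSED_IGNORE.match(raw_line.strip())
        if match:
            locations[match["path"]].append(int(match["line"]))
    return dict(locations)


sample = """\
pandas-stubs/core/series.pyi:181: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/series.pyi:226: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
pandas-stubs/core/frame.pyi:280: error: Unused "type: ignore" comment, use narrower [overload-overlap] instead of [misc] code  [unused-ignore]
Found 3 errors in 2 files (checked 224 source files)
"""

grouped = collect_unused_ignores(sample)
for path, lines in grouped.items():
    print(path, lines)
```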


I'm still trying this out in frame.pyi, and will move it to generic.pyi (along with the other feedback here) once we have a minimal implementation up and passing here.

@twoertwein
Member

Main uses mypy 1.7.1 now, so you can just rebase/merge and all the mypy warnings should disappear.

@paw-lu paw-lu force-pushed the pipe-typing branch 3 times, most recently from 07fc875 to 4fb7876 Compare December 21, 2023 05:53

paw-lu commented Dec 21, 2023

Cool, rebased on top of the latest changes and things mostly seem to pass. Added a lot of tests for frame.pyi, they all seem to pass as expected now.

I have some local failures right now on:

1. pytest
========================================== short test summary info ===========================================
ERROR tests/test_api_types.py - DeprecationWarning: datetime.datetime.utcfromtimestamp() is deprecated and scheduled for removal in a fut...
ERROR tests/test_config.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_dtypes.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_errors.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_extension.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_frame.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_holidays.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_indexes.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_interval.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_interval_index.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_io.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_merge.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_pandas.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_plotting.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_resampler.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_scalars.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_series.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_styler.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_testing.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_timefuncs.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_utility.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
ERROR tests/test_windowing.py - AttributeError: partially initialized module 'pandas' has no attribute '_pandas_datetime_CAPI' (most like...
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 22 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================= 22 errors in 1.74s =============================================

===========================================
Step: 'Run pytest' failed!
===========================================
2. pyright
Poe => pyright
/Users/pawlu/Documents/personal/pandas-stubs/pandas-stubs/_libs/tslibs/timestamps.pyi
  /Users/pawlu/Documents/personal/pandas-stubs/pandas-stubs/_libs/tslibs/timestamps.pyi:103:9 - error: Method "fromtimestamp" overrides class "datetime" in an incompatible manner
    Parameter 2 name mismatch: base parameter is named "timestamp", override parameter is named "t" (reportIncompatibleMethodOverride)
1 error, 0 warnings, 0 informations
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/pawlu/Documents/personal/pandas-stubs/scripts/test/run.py", line 15, in pyright_src
    subprocess.run(cmd, check=True)
  File "/opt/homebrew/Cellar/python@3.12/3.12.1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['pyright']' returned non-zero exit status 1.

I think they're unrelated to these changes, but I'll switch back to main to check if they pass locally on my machine later before proceeding with the rest of the changes.

@paw-lu paw-lu force-pushed the pipe-typing branch 2 times, most recently from 15bbd54 to 19a230b Compare December 22, 2023 05:38

paw-lu commented Dec 22, 2023

OK, it seems the tests that are currently failing under pyright and pytest were already failing in the commit prior to mine, so I'm going to ignore those. I've completed the implementation and the tests for frame.

I plan to just do one more thing from here: add tests for pipe to series.pyi.


@Dr-Irv Dr-Irv left a comment

This all looks pretty good. Can you add tests in test_series.py as well similar to what you have for test_frame.py ?


paw-lu commented Dec 22, 2023

Added tests in test_series.py.

Now that it seems like an upcoming pyright release will make overload work better (thanks @Dr-Irv!), I'll probably move the implementation over to that.


Dr-Irv commented Dec 22, 2023

> Added tests in test_series.py.
>
> Now that it seems like an upcoming pyright release will make overload work better (thanks @Dr-Irv!), I'll probably move the implementation over to that.

Yes, so make the change in your PR to use the overloads (as in the example at microsoft/pyright#6796 (comment)). You might get a failure in the tests if you don't include an ignore comment, so we'll wait until the new pyright is released, bump up the pyright version, and then all will be well.

We probably will have to wait until 12/26 or later to get this all done.

@paw-lu paw-lu force-pushed the pipe-typing branch 2 times, most recently from f4807a9 to abc5748 Compare December 25, 2023 23:51
@paw-lu paw-lu marked this pull request as ready for review December 25, 2023 23:51
@paw-lu paw-lu mentioned this pull request Dec 26, 2023

@Dr-Irv Dr-Irv left a comment

Thanks @paw-lu

@Dr-Irv Dr-Irv merged commit 117e97a into pandas-dev:main Dec 27, 2023
13 checks passed
Development

Successfully merging this pull request may close these issues.

GroupBy.pipe() missing types in arguments and correct return type