Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TYP: Correct type annotation for to_dict. #55130

Merged
merged 8 commits into from
Oct 2, 2023

Conversation

jsspencer
Copy link
Contributor

@jsspencer jsspencer commented Sep 13, 2023

The into argument of DataFrame.to_dict and Series.to_dict can be either a class or instance of a class of dict; this is covariant - subclasses of dict can also be used. The argument was annotated as type[dict] though, so type checkers marked passing initialized objects (required for collections.defaultdict) as an incorrect argument type.

Fix by annotating into to take either a subclass of dict or an initialized instance of a subclass of dict.

The `into` argument of DataFrame.to_dict and Series.to_dict can be
either a class or instance of a class of dict; this is covariant -
subclasses of dict can also be used. The argument was annotated as
`type[dict]` though, so type checkers marked passing initialized objects
(required for collections.defaultdict) as an incorrect argument type.

Fix by annotating `into` to take either a subclass of dict or an
initialized instance of a subclass of dict.
@jsspencer jsspencer force-pushed the to_dict_into_type_annotate branch from afbc799 to 449eb46 Compare September 13, 2023 21:29
@jsspencer jsspencer changed the title TYP: Correct type annotation for DataFrame.to_dict. TYP: Correct type annotation for to_dict. Sep 13, 2023
pandas/core/frame.py Outdated Show resolved Hide resolved
Unfortunately a generic type annotation with a default triggers an
existing mypy limitation (python/mypy#3737).
The current workaround is to use overloads and then not annotate the
implementation containing the default parameter; this still enables mypy
to deduce correct return types.

Two overloads are added for Series.to_dict, even though they could be
combined using a Union type, as at least two overloads are required for
a single method.
@jsspencer jsspencer requested a review from Dr-Irv as a code owner September 15, 2023 17:32
Copy link
Contributor

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to defer to @twoertwein on reviewing this one

index: bool = True,
) -> dict | list[dict]:
):
Copy link
Member

@twoertwein twoertwein Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add the annotations for the non-overload definition? If it triggers a mypy error within the function, it is okay to add an ignore comment.

edit: having the ignore comment will function as a TODO enforced by mypy (when a future mypy allows TypeVars+default )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsspencer I pushed to your branch so that pd.DataFrame([]).to_dict("dict") works correctly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should now work for all cases:

import pandas as pd
from typing import MutableMapping

class MyDict(dict): ...

def test(into: MutableMapping):
    # MutableMapping
    reveal_type(pd.DataFrame([]).to_dict("dict", into=into))
    reveal_type(pd.Series([]).to_dict(into=into))

    # dict
    reveal_type(pd.DataFrame([]).to_dict("dict"))
    reveal_type(pd.Series([]).to_dict())

    # MyDict
    reveal_type(pd.DataFrame([]).to_dict("dict", into=MyDict))
    reveal_type(pd.Series([]).to_dict(into=MyDict))

@mroeschke mroeschke added the Typing type annotations, mypy/pyright type checking label Sep 18, 2023
Copy link
Contributor

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created pandas-dev/pandas-stubs#784 so we make a similar change in the stubs

# error: Incompatible default for argument "into" (default has type "type[
# dict[Any, Any]]", argument has type "type[MutableMappingT] | MutableMappingT")
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self"], name="to_dict"
Copy link
Member

@twoertwein twoertwein Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this okay @mroeschke? It would make it consistent with DataFrame.to_dict (and allow the dict default case for typing) but it might be a bit "odd" to require keyword-only for a function that takes exactly one argument

(need to also fix the failing doc test for this change)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup the consistency argument makes sense to go forward with this deprecation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Green now

@mroeschke mroeschke added this to the 2.2 milestone Oct 2, 2023
@mroeschke mroeschke merged commit a0a6e04 into pandas-dev:main Oct 2, 2023
33 checks passed
@mroeschke
Copy link
Member

Thanks @jsspencer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants