Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Pin matplotlib < 3.8 #55210

Merged
merged 1 commit into from
Sep 20, 2023
Merged

CI: Pin matplotlib < 3.8 #55210

merged 1 commit into from
Sep 20, 2023

Conversation

lithomas1
Copy link
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@lithomas1 lithomas1 added the CI Continuous Integration label Sep 20, 2023
@lithomas1 lithomas1 added this to the 2.1.1 milestone Sep 20, 2023
@lithomas1 lithomas1 merged commit d303b7c into pandas-dev:main Sep 20, 2023
35 checks passed
@lithomas1 lithomas1 deleted the pin-mpl branch September 20, 2023 10:53
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Sep 20, 2023
MarcoGorelli pushed a commit that referenced this pull request Sep 20, 2023
Backport PR #55210: CI: Pin matplotlib < 3.8

Co-authored-by: Thomas Li <[email protected]>
@mroeschke
Copy link
Member

cc @twoertwein I believe the new matplotlib version created new typing issues if you have any familiarity

@twoertwein
Copy link
Member

cc @twoertwein I believe the new matplotlib version created new typing issues if you have any familiarity

I will look into it (I can't promise how soon). Is the only reason to pin it for typing reasons?

We also had a few issues in pandas-stubs: previously, mypy/pyright resolved any matplotlib types as Any/Unknown since it wasn't declared a py.typed library (I believe it is now). Technically, the new matplotlib release isn't causing new type issues but just exposing existing issues that previously weren't caught ;)

@twoertwein
Copy link
Member

I think two main sources triggering these many typing errors are:

  • pandas might be using private methods
  • the type annotations of matplotlib are sometimes too strict, might not yet cover the entire public API

Here are some type errors that seem to indicate that pandas is doing something fishy:

pandas/plotting/_matplotlib/core.py:485: error: "Axes" has no attribute "_get_lines"  [attr-defined]
pandas/plotting/_matplotlib/core.py:486: error: "Axes" has no attribute "_get_patches_for_fill"  [attr-defined]
pandas/plotting/_matplotlib/core.py:487: error: "Axes" has no attribute "right_ax"  [attr-defined]
pandas/plotting/_matplotlib/core.py:487: error: "_AxesBase" has no attribute "left_ax"  [attr-defined]
pandas/plotting/_matplotlib/core.py:1438: error: "Axes" has no attribute "_plot_data"; maybe "plot_date"?  [attr-defined]
pandas/plotting/_matplotlib/core.py:1586: error: "Axes" has no attribute "_stacker_pos_prior"  [attr-defined]
pandas/plotting/_matplotlib/core.py:1588: error: "Axes" has no attribute "_stacker_neg_prior"  [attr-defined]

@ksunden
Copy link

ksunden commented Sep 22, 2023

Here are my annotations on the type errors I was able to look at:

Unrelated to mpl/maybe need a config/mypy plugin/other optional install?:

- pandas/core/arrays/masked.py:1010: error: Slice index must be an integer or None  [misc]
- pandas/core/generic.py:4259: error: Slice index must be an integer or None  [misc]
- pandas/core/generic.py:11816: error: Slice index must be an integer or None  [misc]
- pandas/core/indexes/multi.py:3035: error: Slice index must be an integer or None  [misc]
- pandas/io/excel/_calamine.py:78: error: Unused "type: ignore" comment
- pandas/conftest.py:199: error: "Type[HealthCheck]" has no attribute "differing_executors"  [attr-defined]

Legitimate type conflict in usage of mpl (though perhaps rare cases):

- pandas/plotting/_matplotlib/tools.py:55: error: "Text" has no attribute "set_ha"  [attr-defined]
  • Uses dynamic alias behavior-- resolve by switching to cannonical version, set_horizontalalignment
  • Since these are dynamically generated, we don't statically type hint them
    • technically we probably could, but would be painting ourselves into a corner regarding if we ever want to inline type hints instead of stubs
    • dynamic behavior also affects e.g. docstrings and such
- pandas/plotting/_matplotlib/tools.py:58: error: Argument 1 to "maybe_adjust_figure" has incompatible type "Optional[Figure]"; expected "Figure"  [arg-type]
  • technically, ax.get_figure() can return None, though admittedly only for axes that cannot actually be drawn, so?
  • Threading through the | None to maybe adjust_figure/do_adjust_figure should work, may want to explicitly handle in do_adjust_figure to keep type checker happy (it'll work with the hasattr, but type won't narrow)
- pandas/plotting/_matplotlib/tools.py:372: error: Incompatible types in assignment (expression has type "List[Any]", variable has type "GrouperView[Any]")  [assignment]
- pandas/plotting/_matplotlib/tools.py:378: error: "List[Any]" has no attribute "get_position"  [attr-defined]
  • Probably should be resolved by using separate variables for the get_shared_[xy]_axes calls (which return GrouperView) and the get_siblings call on the GrouperView (which returns list[Axes])
- pandas/plotting/_matplotlib/converter.py:189: error: Argument 2 of "__call__" is incompatible with supertype "Formatter"; supertype defines the argument type as "Optional[int]"  [override]
- pandas/plotting/_matplotlib/converter.py:1068: error: Argument 2 of "__call__" is incompatible with supertype "Formatter"; supertype defines the argument type as "Optional[int]"  [override]
- pandas/plotting/_matplotlib/converter.py:1102: error: Argument 2 of "__call__" is incompatible with supertype "Formatter"; supertype defines the argument type as "Optional[int]"  [override]
  • arg pos is unused anyway, for consistency with mpl, should probably default to None and be typed as int | None
- pandas/tests/plotting/common.py:125: error: Item "Axes" of "Union[Axes, Sequence[Axes]]" has no attribute "__iter__" (not iterable)  [union-attr]
  • pandas _flatten_visible/flatten_axes method is actually type narrowing there, but type checker doesn't know that
- pandas/plotting/_matplotlib/timeseries.py:119: error: Incompatible types in assignment (expression has type "None", variable has type "str")  [assignment]
  • changing the type of a variable
  • should declare the variable with expanded type
- pandas/plotting/_matplotlib/core.py:1251: error: "None" not callable  [misc]
- pandas/plotting/_matplotlib/core.py:1251: error: Item "None" of "Optional[Colormap]" has no attribute "N"  [union-attr]
  • if you know cmap can't be None, you need to inform the type checker here, I don't see anything preventing it
  • or handle the None case explicitly

New mpl typing provides better way of handling this, though backwards compatibility is a potential problem:

- pandas/plotting/_matplotlib/style.py:272: error: Argument 1 to "to_rgba" of "ColorConverter" has incompatible type "Union[str, Sequence[float]]"; expected "Union[Union[Tuple[float, float, float], str], Union[str, Tuple[float, float, float, float], Tuple[Union[Tuple[float, float, float], str], float], Tuple[Tuple[float, float, float, float], float]]]"  [arg-type]
  • We have mpl.typing.ColorType which specifies all the things that we actually mean when we say Color, and is more precise than Sequence[float] and also includes the new tuple[Color, alpha] form

Pandas dynamically adding things to MPL objects that aren't there by default:

- pandas/plotting/_matplotlib/timeseries.py:127: error: "Axes" has no attribute "_plot_data"; maybe "plot_date"?  [attr-defined]
- pandas/plotting/_matplotlib/timeseries.py:139: error: "Axes" has no attribute "_plot_data"; maybe "plot_date"?  [attr-defined]
- pandas/plotting/_matplotlib/timeseries.py:156: error: "Axes" has no attribute "_plot_data"; maybe "plot_date"?  [attr-defined]
- pandas/plotting/_matplotlib/timeseries.py:158: error: "Axes" has no attribute "freq"  [attr-defined]
- pandas/plotting/_matplotlib/timeseries.py:160: error: "XAxis" has no attribute "freq"  [attr-defined]
- pandas/plotting/_matplotlib/timeseries.py:162: error: "Axes" has no attribute "legendlabels"  [attr-defined]
- pandas/plotting/_matplotlib/timeseries.py:166: error: "Axes" has no attribute "date_axis_info"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:487: error: "Axes" has no attribute "right_ax"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:487: error: "_AxesBase" has no attribute "left_ax"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1438: error: "Axes" has no attribute "_plot_data"; maybe "plot_date"?  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1442: error: "Axes" has no attribute "freq"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1456: error: "Axes" has no attribute "_stacker_pos_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1458: error: "Axes" has no attribute "_stacker_neg_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1459: error: "Axes" has no attribute "_stacker_pos_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1460: error: "Axes" has no attribute "_stacker_neg_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1471: error: "Axes" has no attribute "_stacker_pos_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1473: error: "Axes" has no attribute "_stacker_neg_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1486: error: "Axes" has no attribute "_stacker_pos_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1488: error: "Axes" has no attribute "_stacker_neg_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1586: error: "Axes" has no attribute "_stacker_pos_prior"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:1588: error: "Axes" has no attribute "_stacker_neg_prior"  [attr-defined]
  • As these are purely pandas, up to you how you work around these, but I'm personally a little skeptical of attaching things to upstream objects...
- pandas/plotting/_matplotlib/timeseries.py:165: error: "Axes" has no attribute "view_interval"  [attr-defined]
  • This one there is some reference to in mpl, though never as an instance attribute currently, there are a get_view_interval and set_view_interval pair though
  • there is a deprecation note on MPL 3.6 for v_interval as an attribute on Axes3D, with replacment being the getter/setter pair

Pandas using private (and thus untyped) functionality:

- pandas/plotting/_matplotlib/core.py:485: error: "_AxesBase" has no attribute "_get_lines"; maybe "get_lines"?  [attr-defined]
- pandas/plotting/_matplotlib/core.py:485: error: "Axes" has no attribute "_get_lines"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:486: error: "_AxesBase" has no attribute "_get_patches_for_fill"  [attr-defined]
- pandas/plotting/_matplotlib/core.py:486: error: "Axes" has no attribute "_get_patches_for_fill"  [attr-defined]

May need more thought on mpl side...:

- pandas/plotting/_matplotlib/converter.py:353: error: Argument 1 to "set_axis" of "TickHelper" has incompatible type "Union[None, Axis, _DummyAxis, _AxisWrapper]"; expected "Union[Axis, _DummyAxis, None]"  [arg-type]
- pandas/plotting/_matplotlib/converter.py:355: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "set_view_interval"  [union-attr]
- pandas/plotting/_matplotlib/converter.py:355: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "get_view_interval"  [union-attr]
- pandas/plotting/_matplotlib/converter.py:356: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "set_data_interval"  [union-attr]
- pandas/plotting/_matplotlib/converter.py:356: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "get_data_interval"  [union-attr]
- pandas/plotting/_matplotlib/converter.py:970: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "get_view_interval"  [union-attr]
- pandas/plotting/_matplotlib/converter.py:992: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "get_data_interval"  [union-attr]
- pandas/plotting/_matplotlib/converter.py:1060: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "get_view_interval"  [union-attr]
- pandas/plotting/_matplotlib/converter.py:1103: error: Item "None" of "Union[None, Axis, _DummyAxis, _AxisWrapper]" has no attribute "get_view_interval"  [union-attr]
  • MPL is leaking private behavior
  • the "May be None" might be legitimate, though also could be "yes, technically, this field is set to None, but only during instantiation, so maybe we should report what it ends up as"
  • We are leaking private classes as type hints here, perhaps making a Protocol is the way to go? (i.e. actually encode the methods we expect to be available and let the private shims ducktype as working)
- pandas/tests/plotting/common.py:127: error: Item "Artist" of "Union[Any, Artist]" has no attribute "fill"  [union-attr]
  • We did not make ArtistList generic in the first go around, so it lacks the type narrowing that it probably should have
  • Can probably make this better on our end
  • ArtistList is pseudo temporary to transition from a mutable list-like to an immutable tuple-like, and may just go away, not sure yet/what timescale
  • short term, can probably cast/ignore
- pandas/plotting/_matplotlib/core.py:468: error: "Axes" has no attribute "containers"  [attr-defined]
  • Actually does seem to exist, but not set directly in __init__ (actually set in __clear) so it was missed, may be others as well
- pandas/plotting/_matplotlib/core.py:1118: error: Module "matplotlib.axes" does not explicitly export attribute "Subplot"  [attr-defined]
  • Subplot is now just an alias for Axes
  • runtime uses import Axes as Subplot, but that is insufficient for mypy to see it as exported
  • probably the play is to add __all__ to axes/__init__.py
  • also have a star import in there, which could potentially mess some things up?

Unclear where to land on what needs fixing:

- pandas/plotting/_matplotlib/core.py:1172: error: Argument 1 to "set_xlabel" of "_AxesBase" has incompatible type "Hashable"; expected "str"  [arg-type]
- pandas/plotting/_matplotlib/core.py:1173: error: Argument 1 to "set_ylabel" of "_AxesBase" has incompatible type "Hashable"; expected "str"  [arg-type]
- pandas/plotting/_matplotlib/core.py:1820: error: Argument 1 to "set_xlabel" of "_AxesBase" has incompatible type "Optional[Hashable]"; expected "str"  [arg-type]
- pandas/plotting/_matplotlib/hist.py:184: error: Argument 1 to "set_xlabel" of "_AxesBase" has incompatible type "Hashable"; expected "str"  [arg-type]
- pandas/plotting/_matplotlib/hist.py:185: error: Argument 1 to "set_ylabel" of "_AxesBase" has incompatible type "Optional[Hashable]"; expected "str"  [arg-type]
- pandas/plotting/_matplotlib/hist.py:187: error: Argument 1 to "set_xlabel" of "_AxesBase" has incompatible type "Optional[Hashable]"; expected "str"  [arg-type]
- pandas/plotting/_matplotlib/hist.py:188: error: Argument 1 to "set_ylabel" of "_AxesBase" has incompatible type "Hashable"; expected "str"  [arg-type]
  • Docstring supports current mpl type hints, but could arguably be expanded (the underlying implementation doesn't actually rely on str specifically, and in fact ends up at Any, even Hashable is not actually a requirement)
- pandas/plotting/_matplotlib/core.py:1507: error: Argument 1 to "FixedLocator" has incompatible type "ndarray[Any, Any]"; expected "Sequence[float]"  [arg-type]
- pandas/plotting/_matplotlib/hist.py:434: error: Argument 1 to "set_ticks_props" has incompatible type "ndarray[Any, dtype[Any]]"; expected "Union[Axes, Sequence[Axes]]"  [arg-type]
  • NDArray doesn't qualify as Sequence
    • NDArray doesn't have an index or count method, which is expected by collections.ABC.Sequence
  • but Sequence is the most accurate expectation we have
  • there are probably a few more of these, because I was trying an experiment when I sat down to run these (that's actually why I started...)
    • My experiment is to use a Protocol class for Sequence that omits index and count, and thus is a strict subset of collections.ABC.Sequence but includes NDArray
    • It seems to work really well, but still not convinced its a good idea, at the very least may want to actually change the name...

Most of the fixes are fairly straightforward (especially to pandas itself), with the most sticky things probably being on our end (mpl). May require a few extra ignores in the interim, for those.

Sorry to have missed these when I did run against downstreams, I think I ran against the stubs repo thinking that mypy would report inconsistencies there, but it didn't find anything, so I moved on.

@mroeschke
Copy link
Member

I will look into it (I can't promise how soon). Is the only reason to pin it for typing reasons?

I believe code checks were the only reason why this was pinned. @lithomas1 might be able to confirm

@lithomas1
Copy link
Member Author

No, there were some other failing tests too IIRC.

I don't remember, though. Sorry about that, I think I was rushing to get CI green for the release and forgot to open an issue for this.

@twoertwein
Copy link
Member

I opened #55253 to take care of most of the typing errors - thank you very much @ksunden! There is still more to do, see comments in the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants