Enable many tests for complex numbers #54441

MichaelTiemannOSC · 2023-08-06T17:52:38Z

Add complex to test_numpy.py and add complex128 to test_numeric.py. Then fix the many small problems revealed by the test suite.

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

MichaelTiemannOSC · 2023-08-06T19:37:54Z

One reason I made these changes was to better understand how the function _get_expected_exception in tests/base/ops.py should be written to handle complex numbers. In this first draft I didn't really use it; I was put off by the comment that "the self.obj_bar pattern isn't great in part because it can depend on op_name or dtypes, but we use it here for backward-compatibility". I could rewrite the changes so that BaseArithmeticOpsTests uses _get_expected_exception instead of skipping the tests, but that would also mean a lot more work creating test cases that notice an exception being raised. I didn't want to do that work speculatively.

pandas/tests/extension/base/ops.py

pandas/core/nanops.py

mroeschke · 2023-08-15T19:14:51Z

pandas/tests/extension/base/dim2.py

@@ -213,6 +213,12 @@ def test_reductions_2d_axis0(self, data, method, min_count):

        kwargs = {}
        if method in ["std", "var"]:
+            if data.dtype.kind == "c":
+                pytest.skip(


Is this supposed to work? If so could you mark this with request.node.add_marker(pytest.mark.xfail(reason=...))?

Better yet, I've fixed the test for the complex case (check_dtype=False in call to assert_extension_array_equal). Will commit when power/Internet restored.
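One plausible reason a dtype check trips up std/var on complex data (an assumption; the thread does not spell it out) is that the variance of complex values is a real number, so the reduction result cannot carry the input's complex dtype. The pure-Python sketch below illustrates only that arithmetic fact. The `variance` helper is hypothetical and is not the pandas nanops implementation.

```python
def variance(values, ddof=1):
    """Sample variance of complex values: sum(|z - mean|^2) / (n - ddof).

    Hypothetical helper for illustration; abs() of a complex number is a
    float, so the result is always real.
    """
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) ** 2 for v in values) / (n - ddof)


data = [1 + 2j, 3 - 1j, -2 + 0.5j]
result = variance(data)

# The inputs are complex, but the reduction result is a plain float.
print(type(data[0]).__name__, "->", type(result).__name__)
```

With a strict dtype comparison, an expected value built from the complex input and a real-valued result would disagree on dtype even when the numbers match, which would be consistent with relaxing the check via check_dtype=False.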

mroeschke · 2023-08-15T19:15:14Z

pandas/tests/extension/base/io.py

@@ -11,6 +11,8 @@
 class BaseParsingTests(BaseExtensionTests):
    @pytest.mark.parametrize("engine", ["c", "python"])
    def test_EA_types(self, engine, data):
+        if data.dtype.kind == "c":


Done. This led to lots of interface changes to get request everywhere it needed to be.

mroeschke · 2023-08-15T19:15:20Z

pandas/tests/extension/base/missing.py

@@ -88,6 +88,11 @@ def test_fillna_limit_backfill(self, data_missing):
        tm.assert_series_equal(result, expected)

    def test_fillna_no_op_returns_copy(self, data):
+        if data.dtype.kind == "c":


mroeschke · 2023-08-15T19:16:04Z

pandas/tests/extension/test_numpy.py

+        ]:
+            for arg in [obj, other]:
+                if isinstance(arg, complex):
+                    pytest.skip(f"{type(arg).__name__} does not support {op_name}")


jbrockmendel · 2023-08-15T22:08:02Z

pandas/tests/extension/test_numpy.py

@@ -54,7 +56,7 @@ def _assert_attr_equal(attr: str, left, right, obj: str = "Attributes"):
    orig_assert_attr_equal(attr, left, right, obj)


-@pytest.fixture(params=["float", "object"])
+@pytest.fixture(params=["complex", "float", "object"])


good idea fleshing these out!

jbrockmendel · 2023-08-15T22:09:13Z

pandas/tests/extension/test_numpy.py

-        super().test_arith_series_with_scalar(data, all_arithmetic_operators)
+        opname = all_arithmetic_operators
+        df = pd.DataFrame({"A": data})
+        self.check_opname(df, opname, data[0])


how is this different from the status quo?

jbrockmendel · 2023-08-15T22:10:05Z

pandas/tests/extension/base/setitem.py

@@ -338,7 +338,8 @@ def test_setitem_slice_array(self, data):

    def test_setitem_scalar_key_sequence_raise(self, data):
        arr = data[:5].copy()
-        with pytest.raises(ValueError):
+        # complex128 data raises TypeError; other numeric types raise ValueError


any idea why? would it make sense to catch and re-raise as a ValueError for consistency?
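The catch-and-re-raise idea suggested above can be sketched in plain Python. Everything here is an assumption made for illustration: `set_scalar` is a hypothetical setter, a list stands in for the array, and the built-in `complex()` constructor stands in for whatever raises TypeError inside the real complex128 path.

```python
def set_scalar(store, key, value):
    """Assign one complex scalar, normalizing the error type.

    Hypothetical helper: complex() raises TypeError when handed a sequence,
    and this wrapper converts that into ValueError so callers see the same
    exception type regardless of the element type.
    """
    try:
        store[key] = complex(value)
    except TypeError as err:
        # Chain the original error so the root cause stays visible.
        raise ValueError("cannot set a scalar position with a sequence") from err


store = [0j, 0j]
set_scalar(store, 1, 3)  # fine: store[1] becomes (3+0j)

try:
    set_scalar(store, 0, [1, 2])
except ValueError as err:
    print("ValueError caused by", type(err.__cause__).__name__)
```

Chaining with `from err` keeps the original TypeError in the traceback, so the test can assert on ValueError without losing the information about where the failure came from.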

jbrockmendel · 2023-08-15T22:10:31Z

pandas/tests/extension/base/missing.py

+        if data.dtype.kind == "c":
+            pytest.skip(
+                f"no cython implementation of backfill(ndarray[{data.dtype.name}_t],"
+                f"ndarray[{data.dtype.name}_t], int64_t) in libs/algos.pxd"


if you must skip/xfail, the place to do it is in test_numpy, not in the base tests
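The placement rule above (keep the shared base tests free of dtype-specific skips, and override in the dtype-specific module) can be sketched as follows. To stay runnable without pandas or pytest, this uses the standard library's unittest as a stand-in; in the pandas suite the override would live in test_numpy.py and call pytest.skip or add an xfail marker. All class names and the skip reason are hypothetical.

```python
import unittest


class BaseMissingTests:
    """Stand-in for a shared base test class: no dtype-specific branches."""

    data = [1.0, 2.0]

    def test_fillna_no_op_returns_copy(self):
        result = list(self.data)  # stand-in for a no-op fillna
        assert result == self.data
        assert result is not self.data


class TestFloatArray(BaseMissingTests, unittest.TestCase):
    data = [1.0, 2.0]


class TestComplexArray(BaseMissingTests, unittest.TestCase):
    data = [1 + 2j, 3 - 1j]

    # The skip lives in the dtype-specific subclass, not in the base test.
    def test_fillna_no_op_returns_copy(self):
        self.skipTest("hypothetical: backend has no complex implementation")


def run_sketch():
    """Run both test classes and return the collected result object."""
    suite = unittest.TestSuite()
    for case in (TestFloatArray, TestComplexArray):
        suite.addTests(unittest.defaultTestLoader.loadTestsFromTestCase(case))
    result = unittest.TestResult()
    suite.run(result)
    return result


outcome = run_sketch()
```

The base class stays generic, so other extension arrays that inherit it are unaffected by a limitation that only applies to one dtype.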

MichaelTiemannOSC · 2023-08-17T17:10:39Z

I don't understand the mypy error messages (which I've never seen before).

mroeschke · 2023-08-18T17:04:56Z

pandas/core/arrays/_mixins.py

@@ -517,7 +517,9 @@ def _cast_quantile_result(self, res_values: np.ndarray) -> np.ndarray:
    # numpy-like methods

    @classmethod
-    def _empty(cls, shape: Shape, dtype: ExtensionDtype) -> Self:
+    def _empty(
+        cls, shape: Shape, dtype: ExtensionDtype, fill_value: object = None


I don't think adding a fill_value argument everywhere is a desirable solution

I understand. What's tricky is that we have no good way to ask scalars (defined elsewhere) what sort of array they want. For complex128, it's obviously an ndarray of complex numbers. For Pint quantities, it needs to be an array of whatever type can hold the magnitude values. But there's no way to ask that question from within pandas right now. It is even harder for other EA types, which could have multiple independent arrays backing whatever object type is nominally presented to _empty.

That said, if the ExtensionDtype had an interface that could either return an obj_dtype (which would be the dtype of the underlying Array) or a callable (which could wrap whatever complexity it wants to into something that _empty could process), maybe that would be cleaner.

Should I take a crack at making ExtensionDtype a little more introspectable? A groovy side-effect would be the efficient support of int8 and float16 EA datatypes.
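The proposal above, where a dtype exposes either a storage type or a callable that an allocator such as _empty can consume, might look roughly like this. It is a sketch under stated assumptions and not pandas API: `SketchDtype`, its `storage` attribute, and the `empty` function are all hypothetical, and the standard library's `array` typecodes stand in for NumPy dtypes.

```python
from array import array
from typing import Callable, Union

# Either an array typecode (simple storage) or a factory taking a length.
StorageSpec = Union[str, Callable[[int], object]]


class SketchDtype:
    """Hypothetical dtype that can describe its own backing storage."""

    def __init__(self, name: str, storage: StorageSpec) -> None:
        self.name = name
        self.storage = storage


def empty(length: int, dtype: SketchDtype):
    """Allocate backing storage by asking the dtype, with no fill_value."""
    spec = dtype.storage
    if callable(spec):
        # The dtype wraps its own complexity (e.g. several backing arrays).
        return spec(length)
    # Simple case: the dtype names a primitive storage type.
    return array(spec, [0] * length)


float_dtype = SketchDtype("float64", "d")
complex_dtype = SketchDtype("complex128", lambda n: [0j] * n)

print(empty(3, float_dtype))
print(empty(2, complex_dtype))
```

The point of the design is that the allocator never needs a sample scalar or an extra fill_value argument; the dtype alone answers the question of what storage to create.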

Would #53089 help?

Looks like it nibbles around the same edges. I'll see what I can do. If there's an obvious git/github sequence that puts me in the right spot as far as "add this PR to that PR and let me hack" please tell me. Otherwise, I'll construct things manually (which might make merging harder later).

Answering my own question, I've found this article on stacked PRs: https://zx77.medium.com/stacked-pull-requests-with-github-f407b5d371ea . Is this the right way to think about the git part of making these changes on the PR #53089?

Maybe a pull request to @jbrockmendel 's branch would be a good way to propose changes

I could not figure out the easy way to do that (I'm the opposite of a git expert). If you can tell me the git command for that, I can adjust. I didn't need to modify anything they did (other than fix an assertion failure in boolean arrays that didn't like seeing "bool" types coming up from the ObjDtype).

MichaelTiemannOSC · 2023-08-22T17:41:04Z

pandas/core/internals/managers.py

-                n, dtype=object if immutable_ea else dtype  # type: ignore[arg-type]
+            cls = None
+
+        if hasattr(cls, "_from_scalars"):


Just realized that this hasattr check is redundant for EAs if we have _from_scalars in the base class. Will clean up next commit.
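A small sketch of why the check becomes redundant once the method is defined on the base class, which is the premise of the comment above. The classes are hypothetical stand-ins, not the pandas ones. Since the diff also assigns `cls = None` on one path, some guard for that case would presumably still be needed.

```python
class BaseArray:
    """Hypothetical stand-in for the extension-array base class."""

    def __init__(self, values):
        self.values = values

    @classmethod
    def _from_scalars(cls, scalars):
        return cls(list(scalars))


class ComplexArray(BaseArray):
    """Subclass that inherits _from_scalars without defining it."""


def build(cls, scalars):
    # hasattr(cls, "_from_scalars") is True for every BaseArray subclass,
    # so the only case left to distinguish is "no array class at all".
    if cls is not None:
        return cls._from_scalars(scalars)
    return list(scalars)


print(hasattr(ComplexArray, "_from_scalars"))  # inherited from the base class
print(hasattr(None, "_from_scalars"))
```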

MichaelTiemannOSC · 2023-08-24T09:35:49Z

This is not what I meant to do. GitHub fooled me again!!

MichaelTiemannOSC mentioned this pull request Aug 7, 2023

ENH: enable setitem dim2 test to work for EA with complex128 dtype #54445

Open

3 tasks

jbrockmendel reviewed Aug 7, 2023 (pandas/tests/extension/base/ops.py, outdated)

jbrockmendel reviewed Aug 7, 2023 (pandas/tests/extension/base/ops.py, outdated)

mroeschke reviewed Aug 7, 2023 (pandas/core/nanops.py, outdated)

mroeschke added the "Testing" (pandas testing functions or related to the test suite) and "Complex" (Complex Numbers) labels Aug 7, 2023

mroeschke reviewed Aug 15, 2023

jbrockmendel reviewed Aug 15, 2023

mroeschke requested changes Aug 18, 2023

MichaelTiemannOSC commented Aug 22, 2023

MichaelTiemannOSC closed this Aug 24, 2023

MichaelTiemannOSC force-pushed the test_numpy_complex branch from 7e7cecc to 867cae3 Compare August 24, 2023 09:33
