Enable many tests for complex numbers #54441
One of the reasons I made these changes was to get a better understanding of how, in |
pandas/tests/extension/base/dim2.py
Outdated
@@ -213,6 +213,12 @@ def test_reductions_2d_axis0(self, data, method, min_count):

        kwargs = {}
        if method in ["std", "var"]:
            if data.dtype.kind == "c":
                pytest.skip(
Is this supposed to work? If so, could you mark this with request.node.add_marker(pytest.mark.xfail(reason=...))?
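For context, a minimal sketch of the pattern being requested. The test name, the stand-in `data` fixture, and the reason string are illustrative assumptions, not code from this PR. Unlike `pytest.skip`, a runtime xfail marker still runs the test body, so the suite reports an unexpected pass once the underlying behavior starts working.

```python
import pytest


@pytest.fixture
def data():
    # Stand-in for the extension-array fixture: a plain list of complex values.
    return [1 + 2j, 3 - 1j]


def is_complex(values):
    """Return True if every value is a Python complex number."""
    return all(isinstance(v, complex) for v in values)


def test_max(data, request):
    if is_complex(data):
        # Mark at runtime instead of skipping; the body below still executes.
        request.node.add_marker(
            pytest.mark.xfail(reason="complex numbers are not orderable")
        )
    # Comparing complex numbers raises TypeError, which xfail absorbs.
    assert max(data) == data[-1]
```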
Better yet, I've fixed the test for the complex case (check_dtype=False in call to assert_extension_array_equal). Will commit when power/Internet restored.
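One possible reason `check_dtype=False` is needed here, offered as an assumption rather than a statement about this PR's code: the variance of complex data is a real number, so a `std`/`var` reduction over complex input would not come back with the input's complex dtype. The pure-Python sketch below illustrates this; `complex_var` is a hypothetical helper.

```python
def complex_var(values, ddof=0):
    """Variance of complex values: mean of squared absolute deviations."""
    n = len(values)
    mean = sum(values) / n
    # abs() of a complex number is a float, so the result is real.
    return sum(abs(v - mean) ** 2 for v in values) / (n - ddof)


data = [1 + 1j, 1 - 1j, 3 + 1j, 3 - 1j]
result = complex_var(data)
print(type(result).__name__, result)
```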
pandas/tests/extension/base/io.py
Outdated
@@ -11,6 +11,8 @@
class BaseParsingTests(BaseExtensionTests):
    @pytest.mark.parametrize("engine", ["c", "python"])
    def test_EA_types(self, engine, data):
        if data.dtype.kind == "c":
Same here
Done. This led to lots of interface changes to get request everywhere it needed to be.
@@ -88,6 +88,11 @@ def test_fillna_limit_backfill(self, data_missing):
        tm.assert_series_equal(result, expected)

    def test_fillna_no_op_returns_copy(self, data):
        if data.dtype.kind == "c":
Same here
Done
pandas/tests/extension/test_numpy.py
Outdated
        ]:
            for arg in [obj, other]:
                if isinstance(arg, complex):
                    pytest.skip(f"{type(arg).__name__} does not support {op_name}")
Same here?
Done.
pandas/tests/extension/test_numpy.py
Outdated
@@ -54,7 +56,7 @@ def _assert_attr_equal(attr: str, left, right, obj: str = "Attributes"):
     orig_assert_attr_equal(attr, left, right, obj)


-@pytest.fixture(params=["float", "object"])
+@pytest.fixture(params=["complex", "float", "object"])
good idea fleshing these out!
pandas/tests/extension/test_numpy.py
Outdated
        super().test_arith_series_with_scalar(data, all_arithmetic_operators)
        opname = all_arithmetic_operators
        df = pd.DataFrame({"A": data})
        self.check_opname(df, opname, data[0])
how is this different from the status quo?
@@ -338,7 +338,8 @@ def test_setitem_slice_array(self, data):

    def test_setitem_scalar_key_sequence_raise(self, data):
        arr = data[:5].copy()
        with pytest.raises(ValueError):
        # complex128 data raises TypeError; other numeric types raise ValueError
any idea why? would it make sense to catch and re-raise as a ValueError for consistency?
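The re-raise suggested here could look roughly like the following. This is a sketch under assumptions: `set_scalar` is a hypothetical helper, not a pandas function, and Python's built-in `complex()` stands in for whatever conversion raises the `TypeError` in the complex128 path.

```python
def set_scalar(values, key, value):
    """Assign one complex scalar; sequences are rejected with ValueError."""
    try:
        values[key] = complex(value)
    except TypeError as err:
        # complex() raises TypeError for a list; normalize the exception type
        # so callers see the same error class as for other numeric dtypes.
        raise ValueError("setting an array element with a sequence") from err


arr = [0j, 0j, 0j]
set_scalar(arr, 0, 1 + 2j)
```

Chaining with `from err` keeps the original `TypeError` available as `__cause__`, so the more specific failure is not lost.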
        if data.dtype.kind == "c":
            pytest.skip(
                f"no cython implementation of backfill(ndarray[{data.dtype.name}_t],"
                f"ndarray[{data.dtype.name}_t], int64_t) in libs/algos.pxd"
if you must skip/xfail, the place to do it is in test_numpy, not in the base tests
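A sketch of what that placement could look like. The class and test names are hypothetical stand-ins, not the real pandas test classes: the shared base test stays free of dtype-specific branches, and the dtype-specific subclass overrides it, adds the marker, and defers to the base implementation.

```python
import pytest


class BaseSortTests:
    """Stand-in for a shared base test class; no dtype-specific branches."""

    def test_sorted(self, data):
        assert sorted(data) == list(data)


class TestComplexSort(BaseSortTests):
    """Stand-in for a dtype-specific subclass, as in test_numpy.py."""

    @pytest.fixture
    def data(self):
        return [1 + 2j, 3 - 1j]

    def test_sorted(self, data, request):
        if any(isinstance(v, complex) for v in data):
            # The dtype-specific expectation lives here, not in the base class.
            request.node.add_marker(
                pytest.mark.xfail(reason="not implemented for complex data")
            )
        super().test_sorted(data)
```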
Done.
I don't understand the mypy error messages (which I've never seen before).
pandas/core/arrays/_mixins.py
Outdated
@@ -517,7 +517,9 @@ def _cast_quantile_result(self, res_values: np.ndarray) -> np.ndarray:
     # numpy-like methods

     @classmethod
-    def _empty(cls, shape: Shape, dtype: ExtensionDtype) -> Self:
+    def _empty(
+        cls, shape: Shape, dtype: ExtensionDtype, fill_value: object = None
I don't think adding a fill_value argument everywhere is a desirable solution
I understand. What's tricky is that we have no good way to query SCALARS (defined elsewhere) as to what sort of Array they want. In the case of complex128, it's obviously an ndarray of complex numbers. In the case of Pint quantities, it needs to be an array of whatever types can hold the magnitude value. But there's no way to ask that question from within Pandas right now. And for the other EA types I've read about, which could have multiple independent arrays actually backing whatever object type is facially presented to _empty, the question is even more difficult.
That said, if the ExtensionDtype had an interface that could either return an obj_dtype (which would be the dtype of the underlying Array) or a callable (which could wrap whatever complexity it wants to into something that _empty could process), maybe that would be cleaner.
Should I take a crack at making ExtensionDtype a little more introspectable? A groovy side-effect would be the efficient support of int8 and float16 EA datatypes.
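To make the idea concrete, here is a rough pure-Python sketch of a dtype that can either name its backing element type or supply a callable that builds the storage. Everything in it (`SketchDtype`, `obj_dtype`, `empty_factory`, `empty`) is hypothetical and only illustrates the shape of the proposed interface; it is not the pandas `ExtensionDtype` API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class SketchDtype:
    """Hypothetical dtype that can describe its own backing storage."""

    name: str
    # Either a type for the backing elements ...
    obj_dtype: Optional[type] = None
    # ... or a callable that builds the backing storage for a given length.
    empty_factory: Optional[Callable[[int], Sequence]] = None


def empty(n: int, dtype: SketchDtype) -> Sequence:
    """Build placeholder storage of length n by asking the dtype."""
    if dtype.empty_factory is not None:
        return dtype.empty_factory(n)
    if dtype.obj_dtype is not None:
        return [dtype.obj_dtype() for _ in range(n)]
    # Fall back to object-like storage when the dtype says nothing.
    return [None] * n


complex_dtype = SketchDtype("complex128", obj_dtype=complex)
quantity_dtype = SketchDtype("quantity", empty_factory=lambda n: [0.0] * n)
```

With a hook like this, a constructor such as `_empty` would not need an extra `fill_value` parameter, because the dtype itself answers the storage question.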
Would #53089 help?
Looks like it nibbles around the same edges. I'll see what I can do. If there's an obvious git/github sequence that puts me in the right spot as far as "add this PR to that PR and let me hack" please tell me. Otherwise, I'll construct things manually (which might make merging harder later).
Answering my own question, I've found this article on stacked PRs: https://zx77.medium.com/stacked-pull-requests-with-github-f407b5d371ea. Is this the right way to think about the git part of making these changes on top of PR #53089?
Maybe a pull request to @jbrockmendel's branch would be a good way to propose changes
I could not figure out the easy way to do that (I'm the opposite of a git expert). If you can tell me the git command for that, I can adjust. I didn't need to modify anything they did (other than fix an assertion failure in boolean arrays that didn't like seeing "bool" types coming up from the ObjDtype).
pandas/core/internals/managers.py
Outdated
n, dtype=object if immutable_ea else dtype  # type: ignore[arg-type]
cls = None

if hasattr(cls, "_from_scalars"):
Just realized that this hasattr check is redundant for EAs if we have _from_scalars in the base class. Will clean up next commit.
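A small illustration of why the guard adds nothing once the method lives on the base class. The class names are hypothetical stand-ins for the extension-array hierarchy.

```python
class BaseArray:
    def __init__(self, values):
        self.values = values

    @classmethod
    def _from_scalars(cls, scalars):
        """Default constructor that every subclass inherits."""
        return cls(list(scalars))


class ComplexArray(BaseArray):
    pass


# Attribute lookup walks the class hierarchy, so this is true for every
# subclass of BaseArray and the check cannot distinguish between them.
always_true = hasattr(ComplexArray, "_from_scalars")
arr = ComplexArray._from_scalars([1 + 2j, 3j])
```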
7e7cecc to 867cae3
This is not what I meant to do. GitHub fooled me again!!
Add complex to test_numpy.py and add complex128 to test_numeric.py. Then fix the many small problems revealed by the test suite.
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.