Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Raise TypeError when subracting DateTimeArray and other date types #59901

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

KevsterAmp
Copy link
Contributor

@KevsterAmp KevsterAmp commented Sep 26, 2024

@KevsterAmp KevsterAmp marked this pull request as draft September 26, 2024 11:40
@KevsterAmp
Copy link
Contributor Author

@rhshadrach - can you verify if my implementation here is correct? I'm not sure of this. A bit of guidance could be useful. Thanks

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is looking good, CI failures look unrelated. They'll be fixed outside of this PR.

@KevsterAmp KevsterAmp closed this Oct 3, 2024
@KevsterAmp KevsterAmp reopened this Nov 11, 2024
@KevsterAmp KevsterAmp marked this pull request as ready for review November 12, 2024 11:37
@KevsterAmp
Copy link
Contributor Author

        from pandas.core.arrays import DatetimeArray

        datetime_result = self - other
        if isinstance(datetime_result, DatetimeArray):
            raise TypeError(
                "TypeError: unsupported operand type(s) for -: "
                f"'{type(self).__name__}' and '{type(other).__name__}'"
            )
        return -(datetime_result)

I added an outside top-level import because I am experiencing Unbound errors with Datetimearray in this line exactly:

        if isinstance(datetime_result, DatetimeArray):

Also, I replaced the messages with the variable type as any wildcard .*. Because I had trouble matching the exact dtype of the variable vs the error output. Seems like the dtype gets converted. For example, there are instances where timedelta variable error message contains timedelta64 but there are cases where it returns timedelta as normal.

            r"TypeError: unsupported operand type\(s\) for -: 'DatetimeArray' and '.*'"

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@rhshadrach rhshadrach added Datetime Datetime data dtype Error Reporting Incorrect or improved errors from pandas labels Nov 12, 2024
@rhshadrach rhshadrach added this to the 3.0 milestone Nov 12, 2024
@rhshadrach rhshadrach added the Timedelta Timedelta data type label Nov 12, 2024
@KevsterAmp
Copy link
Contributor Author

@mroeschke - could you review this. Thanks

Comment on lines 1500 to 1501
datetime_result = self - other
if isinstance(datetime_result, DatetimeArray):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little nervous enforcing a particular type out of a rsub operation.

Could we just catch the TypeError and reraise it with a better message?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, I agree that it would be a better implementation

@KevsterAmp
Copy link
Contributor Author

KevsterAmp commented Nov 16, 2024

My implementation raises re-raises new message in TypeError on DateTimeArrays only.

Since there are already other error messages for other dtypes like

Cannot subtract tz-naive and tz-aware datetime-like objects.

If I implement it as

try:
    return -(self - other)
except TypeError as e:
    raise TypeError(
        "Unsupported operand type(s) for -: "
        f"'{type(self).__name__}' and '{type(other).__name__}'"
    ) from e

I would have to fix all tests related to raising tz-naive/tz-aware datetime-like objects. Thoughts?

@KevsterAmp KevsterAmp requested a review from mroeschke November 16, 2024 23:53
@KevsterAmp
Copy link
Contributor Author

Looks like my previous commit affects other datetime related operations.

Could you give me ideas on catching the error? From my understanding, when a datetime objects go into this block

return -(self - other)

It will always raise a TypeError. So my previous implementation catches it before doing the operation.

        # We get here with e.g. datetime objects
        from pandas.core.arrays import DatetimeArray

        datetime_result = self - other
        if isinstance(datetime_result, DatetimeArray):
            raise TypeError(
                "Unsupported operand type(s) for -: "
                f"'{type(self).__name__}' and '{type(other).__name__}'"
            )
        return -(datetime_result)

Though from checking the CI failures, I can't seem to trace/ I'm not familiar how my latest commit affected them.

Guidance would be helpful. Thank you!

try:
return -(datetime_result)
except TypeError as e:
if isinstance(datetime_result, DatetimeArray):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this if statement here. We'd always want to raise the exception

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Error Reporting Incorrect or improved errors from pandas Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: provide better error message for pd.Timedelta - pd.Series[Timestamp]
3 participants