-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPR: passing mixed offsets with utc=False into to_datetime #54014
DEPR: passing mixed offsets with utc=False into to_datetime #54014
Conversation
…rray_to_datetime, fix tests
…st for FutureWarning, fix tests
…ix errors in docs
… to_datetime, correct an example in whatsnew/v1.1.0.rst
@MarcoGorelli, could you please take a look at this pr? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for updating! I haven't finished reviewing all the tests, but here's some comments in the meantime
…ename test functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really close. I just have a minor comment and then am ready to approve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, looks good to me on green
Going to leave this open a bit in case others have comments, but I think we can probably get this in for 2.1
".*parsing datetimes with mixed time zones will raise a warning", | ||
category=FutureWarning, | ||
) | ||
result = tools.to_datetime( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a couple more usages of to_datetime
in this function, I think we should ideally silence them too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I addedcatch_warnings()
in two other places where we call to_datetime
in this function.
I get failures in ci and I’m not sure if they are related to my changes. Locally all tests pass.
e.g the tests below failed in ci, but passed locally:
pandas/tests/arrays/floating/test_construction.py
pandas/tests/arrays/integer/test_repr.py
pandas/tests/arrays/floating/test_repr.py
we don't even call this function (converter()
) while running these tests.
…://github.com/natmokval/pandas into DEPR-to_datetime-mixed-offsets-with-utc=False
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thanks @natmokval
This is going to be a fun one to enforce, it will simplify so much
Leaving open a bit in case others have objections
doc/source/whatsnew/v2.1.0.rst
Outdated
To create a ``Series`` with mixed offsets and ``object`` dtype, please use ``apply`` | ||
and ``datetime.datetime.strptime``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add this sentence to the warning message itself, and here show an example?
like
In [2]: pd.Series(data).apply(lambda x: dt.datetime.strptime(x, '%Y-%m-%d %H:%M:%S%z'))
Out[2]:
0 2020-01-01 00:00:00+06:00
1 2020-01-01 00:00:00+01:00
dtype: object
(formtted appropriately)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I corrected the warning message and added the example how users can get the old behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for updating, looks like this addresses the request!
over to Matt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (though I am still lukewarm on the change)
Sure, thanks - I don't want to force this onto the project, I really want us all to be happy with what we're doing Suggestion: we get this into 2.1, and then if there's complaints or issues we hadn't foreseen (or use cases for which the workaround suggested isn't enough), then we revert for 2.2 / 3.0? Getting it in for 2.1 would give more time to hear people's potential complaints. Happy to admit I've been wrong about this, and to revert course, if that happens |
Great work @natmokval I do like that this will not catch users by surprise with |
thank you. I hope users will find this change reasonable. |
Thank you @MarcoGorelli for helping me with this PR. |
"In a future version of pandas, parsing datetimes with mixed time " | ||
"zones will raise a warning unless `utc=True`. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message here says that this will raise a warning in the future. But is that indeed the intent, or should that be "error" instead?
(my understanding of the discussion was that it would error in the future)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right, thanks - @natmokval fancy addressing this in a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I replaced "warning" with "error" in warning message and made a new PR
@@ -348,8 +349,20 @@ def _return_parsed_timezone_results( | |||
dta = dta.tz_localize("utc") | |||
else: | |||
dta = dta.tz_convert("utc") | |||
else: | |||
if not dta.isna().all(): | |||
non_na_timezones.add(zone) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this might break if we ever had a tzaware datetime object with a dateutil/pytz tzinfo in the input array bc those tzinfos are not hashable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks fine:
In [2]: import pytz
In [3]: result = pd.to_datetime(
...: [
...: "2000-01-03 12:34:56.123456+01:00",
...: datetime(2020, 1, 1, tzinfo=pytz.timezone('Asia/Kathmandu'))
...: ],
...: exact=False,
...: )
<ipython-input-3-36f1f20a96cd>:1: FutureWarning: In a future version of pandas, parsing datetimes with mixed time zones will raise an error unless `utc=True`. Please specify `utc=True` to opt in to the new behaviour and silence this warning. To create a `Series` with mixed offsets and `object` dtype, please use `apply` and `datetime.datetime.strptime`
result = pd.to_datetime(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
woops, its dateutil tzs that aren't hashable. though your example still seems to work with one, so never mind
Why are we filtering warnings internally here? My understanding is that this will raise in the future, so we should give users the time to address things |
it's in a place where the user can't do anything about it, and where in the future it just won't get converted (the input will be preserved) as we'll no longer end up with an |
xref #50887
parsing datetimes with mixed time zones will raise a warning if
utc=False
. We advice users to useutc=True
to silence this warning.