Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: disallow parsing datetimes with mixed time zones unless utc=True #57275

Merged

Conversation

natmokval
Copy link
Contributor

@natmokval natmokval commented Feb 6, 2024

xref #54014

enforced deprecation disallowing parsing datetimes with mixed time zones unless user passes utc=True to to_datetime

@simonjayhawkins simonjayhawkins added Datetime Datetime data dtype Deprecate Functionality to remove in pandas labels Feb 7, 2024
@natmokval natmokval marked this pull request as ready for review February 8, 2024 23:21
@natmokval
Copy link
Contributor Author

@MarcoGorelli, could you please take a look at this PR?

pandas/_libs/tslib.pyx Outdated Show resolved Hide resolved
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done! This is exactly the kind of simplification / optimisation I was hoping to see! Feels so good to see those object fallbacks go

pandas/_libs/tslib.pyx Show resolved Hide resolved
pandas/_libs/tslibs/strptime.pyx Show resolved Hide resolved
pandas/_libs/tslibs/strptime.pyx Show resolved Hide resolved
Comment on lines 808 to 809
with mixed time zones unless ``utc=True``. Parsing datetimes with mixed
time zones, please specify ``utc=True`` to opt in to the new behaviour.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a word missing in this sentence? Like, "If parsing"? Also, no need to say "to opt in to the new behaviour" - this is the new behaviour :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comment. I corrected the wording as you suggested.

pandas/tests/io/parser/common/test_read_errors.py Outdated Show resolved Hide resolved
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

almost there

Comment on lines 3589 to 3592
msg = "cannot parse datetimes with mixed time zones unless `utc=True`"
with pytest.raises(ValueError, match=msg):
to_datetime(vec, format="mixed")
with tm.assert_produces_warning(FutureWarning, match=depr_msg):
msg = "DatetimeIndex has mixed timezones"
with pytest.raises(TypeError, match=msg):
DatetimeIndex(vec)
DatetimeIndex(vec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if they're both meant to raise, can we do this in two context managers please? else maybe only one of them could raise the error and the test would erroneously pass

instead of

with pytest.raises(ValueError, match=msg):
    to_datetime(vec, format="mixed")
    DatetimeIndex(vec)

let's do

with pytest.raises(ValueError, match=msg):
    to_datetime(vec, format="mixed")
with pytest.raises(ValueError, match=msg):
    DatetimeIndex(vec)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, I replaced one context manager with two.

I checked, both to_datetime(vec, format="mixed") and DatetimeIndex(vec) raise ValueError: cannot parse datetimes with mixed time zones unless 'utc=True'

@@ -598,7 +597,9 @@ cpdef array_to_datetime(
# (with individual dateutil.tzoffsets) are returned
is_same_offsets = len(out_tzoffset_vals) == 1
if not is_same_offsets:
return _array_to_datetime_object(values, errors, dayfirst, yearfirst)
raise ValueError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comments above should also be revised cc @MarcoGorelli

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I missed this comment. I replaced it with

# 2) If the offsets are different, then do not force the parsing
#    and raise a ValueError: "cannot parse datetimes with
#    mixed time zones unless `utc=True`" instead

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice one!

@MarcoGorelli MarcoGorelli added this to the 3.0 milestone Feb 19, 2024
@mroeschke
Copy link
Member

Thanks @natmokval

@mroeschke mroeschke merged commit 997e1b8 into pandas-dev:main Feb 19, 2024
47 checks passed
pmhatre1 pushed a commit to pmhatre1/pandas-pmhatre1 that referenced this pull request May 7, 2024
…ue` (pandas-dev#57275)

* correct def _array_to_datetime_object, _array_strptime_object_fallback, fix tests

* fix tests

* correct to_datetime docs, add a note to v3.0.0

* correct to_datetime docs

* fix failures in benchmarks/inference.py

* fix pre-commit error

* correct examples in to_datetime docs

* correct to_datetime docs

* delete time_different_offset from benchmarks/inference.py as redundant

* correct v3.0.0

* removed _array_to_datetime_object and _array_strptime_object_fallback

* correct to_datetime docstring, roll back changes in test_suppress_error_output

* fix pre-commit error

* correct test_to_datetime_mixed_awareness_mixed_types, and a comment in array_to_datetime
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Datetime Datetime data dtype Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants