Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rename "Y" freq string to "YE" (pandas parity) #8629

Merged
merged 10 commits into from
Jan 22, 2024

Conversation

mathause
Copy link
Collaborator

This renames the frequency string "Y" (formerly "A") to "YE" to achieve pandas parity. It could be better to wait for the conclusion of pandas-dev/pandas#56840 before doing this (but fixing the related failure in #8623 seemed a good reason as any to do it know).

Let me know what you think @spencerkclark @aulemahal

Copy link
Member

@spencerkclark spencerkclark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mathause for keeping up on this. This makes sense to me!

@@ -1009,29 +1013,29 @@ def cftime_range(
+------------+--------------------------------------------------------------------+
| Alias | Description |
+============+====================================================================+
| Y(S)-JAN | Annual frequency, anchored at the end (or beginning) of January |
| Y(E,S)-JAN | Annual frequency, anchored at the (end, beginning) of January |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may need to realign the edge of the table for the doc build to pass:

Suggested change
| Y(E,S)-JAN | Annual frequency, anchored at the (end, beginning) of January |
| Y(E,S)-JAN | Annual frequency, anchored at the (end, beginning) of January |

xarray/coding/cftime_offsets.py Outdated Show resolved Hide resolved
@max-sixty
Copy link
Collaborator

This does fix most of the tests. Are we OK to merge in its current state and fix the remaining breakages in another PR?

@spencerkclark
Copy link
Member

I tentatively prefer to wait until @mathause has a chance to fully wrap this up (I think it should be quick).

@max-sixty
Copy link
Collaborator

I tentatively prefer to wait until @mathause has a chance to fully wrap this up (I think it should be quick).

OK. I do think we should really try and not having things in a failing state. I know it's not easy.

More generally — we did have deprecation warnings on the cftime code for months now — I understand that everyone is a volunteer and it's difficult to ask specific people to do the work to prevent these sorts of outcomes. But possibly this sort of outcome — where all the test jobs are failing for multiple days — should update us to doing less in the main xarray library...


I think I've fixed all the non-cftime issues (mostly through workarounds)

@spencerkclark
Copy link
Member

@max-sixty I hear your frustration / concern, and I am sorry that this happened. I must admit though that the first time I saw this failure was in #8623, which was opened only a couple days before this breaking version of pandas was released. This was tricky to catch ahead of time, because the return value for infer_freq in pandas changed abruptly without a deprecation cycle (and the change was made after my work in #8415):

pandas 2.1.4

>>> pd.__version__
'2.1.4'
>>> pd.infer_freq(pd.date_range("2000", periods=10, freq="A-FEB"))
'A-FEB'

pandas 2.2.0

>>> pd.__version__
'2.2.0'
>>> pd.infer_freq(pd.date_range("2000", periods=10, freq="A-FEB"))
<stdin>:1: FutureWarning: 'A-FEB' is deprecated and will be removed in a future version, please use 'YE-FEB' instead.
'YE-FEB'

Deprecation warnings in the main test suite

The deprecation warnings in our main test suite in the preceding months were not relevant to this particular case. I was aware of those warnings in #8415, but did not address them at the time for the following reason from the PR description:

I also took the liberty to transition to using "Y", "YS", "h", "min", "s", "ms", "us", and "ns" within our code, tests, and documentation to reduce the amount of warnings emitted. I have held off on switching to "QE", "ME", and anchored offsets involving "Y" or "YS" in pandas-related code since those are not supported in older versions of pandas.

We can discuss whether I should have done something differently there—I agree the deprecation warnings look annoying / concerning—but that is a separate issue (since we test against old versions of pandas it was awkward to address those deprecation warnings at the time).

I hope that extra context gives you some room for empathy 🙏. I'm trying to stay on top of these issues, but this one was particularly difficult to be out ahead of.

@max-sixty
Copy link
Collaborator

I hope that extra context gives you some room for empathy 🙏. I'm trying to stay on top of these issues, but this one was particularly difficult to be out ahead of.

Sorry, of course @spencerkclark . I didn't mean to sound too harsh. The impact is only on developers, not on users, so it's bounded.

FWIW I really wish we had a way of upgrading dependencies deliberately — e.g. dependabot supported conda — then we could be much more relaxed about this — when things broke we'd have a single new PR that failed, rather than every single PR failing...


I rechecked and see that the deprecation warnings from cftime we've been getting (e.g. ~20 in https://github.com/pydata/xarray/actions/runs/7410777266/job/20163815985) are indeed unrelated to this break! I had presumed they were the same.


Do you mind if I ask about how we managed the change? These days I'm a minor contributor, and you know this much better — I'm really not trying to give anyone a hard time.

The .infer_freq change is indeed quite unfriendly. But is it correct to think that if we'd responded to the deprecation warnings by allowing YE-FEB to work, then once .infer_freq returned YE-FEB, xarray would have handled this OK?

(I think I commented in a previous issue — could we even just pass anything we don't recognize through to pandas and then avoid maintaining a full list? Or we do need to intercept everything and so that's not viable...)

@spencerkclark
Copy link
Member

All good @max-sixty, no worries!

But is it correct to think that if we'd responded to the deprecation warnings by allowing YE-FEB to work, then once .infer_freq returned YE-FEB, xarray would have handled this OK?

The main issue is that I, and I'm guessing @mathause, was not aware of this deprecation / change until three days ago. Looking back, this was discussed in pandas-dev/pandas#54275, but it had only been partially completed at the time of #8415 (the "A" was changed to "Y" but not all the way to "YE" yet). I had only looked at pandas-dev/pandas#55252, which was the initial PR going from "A" to "Y", and regrettably not the issue, and didn't realize there was to be a second PR changing "Y" to "YE" coming later on, pandas-dev/pandas#55792.

The deprecation warnings only showed up in the development version of pandas (i.e. what is now released as 2.2.0), so they were unfortunately not very visible.


FWIW I really wish we had a way of upgrading dependencies deliberately — e.g. dependabot supported conda — then we could be much more relaxed about this — when things broke we'd have a single new PR that failed, rather than every single PR failing...

Indeed this would be nice!

@max-sixty
Copy link
Collaborator

The main issue is that I, and I'm guessing @\mathause, was not aware of this deprecation / change until three days ago.

Yes, that makes sense. I hadn't realized that the warnings I'd been seeing over the past few months weren't directly related to the break.

Thanks a lot @spencerkclark

@mathause
Copy link
Collaborator Author

Thanks everyone and sorry about the delay - I was offline over the weekend. The failing tests should be covered elsewhere, so I'll merge.

@mathause mathause merged commit 5a92d48 into pydata:main Jan 22, 2024
26 of 29 checks passed
@mathause mathause deleted the freq_string_Y_YE branch January 22, 2024 08:01
@spencerkclark
Copy link
Member

Awesome, thanks for the quick updates and merge @mathause!

@max-sixty
Copy link
Collaborator

Thanks @mathause !

mathause added a commit to mathause/xarray that referenced this pull request Jan 22, 2024
spencerkclark pushed a commit that referenced this pull request Jan 23, 2024
* infer_freq: return 'YE' (#8629 follow-up)

* fix whats new

---------

Co-authored-by: Maximilian Roos <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants