DEPR offsets: rename 'M' to 'ME' (#52064)
* Frequency: raise warnings when using ‘M’ frequency

* Frequency: raise warnings when using ‘M’ frequency II

* remove is_period and change str representation for freq in Period [skip ci]

* remove is_period and fix some tests [skip ci]

* fix some tests

* fix some tests II

* fix tests in pandas/tests/indexes/period/ [skip ci]

* fix tests in pandas/tests/indexes/period/ and correct timedeltas.pyx

* update frequencies.py, resample.py, and fix some tests

* modify pandas/tseries/frequencies.py

* fix tests

* fix tests II

* fix tests III

* rename 'M' to 'ME' in docs

* rename 'M' to 'ME' in docs II

* rename 'M' to 'ME' in docs III

* rename 'M' to 'ME' in docs IV

* rename 'M' to 'ME' in docs V

* add is_period to to_offset I

* add is_period to to_offset II

* correct the definition of period_array(…) and fix 19 tests

* add is_period to _parse_dtype_strict() and fix tests

* add constant OFFSET_TO_PERIOD_FREQSTR to period.pyx and fix tests

* correct definitions of extract_ordinals() and _round(), fix tests

* add replacement ME to M in _require_matching_freq, _parsed_string_to_bounds, and fix tests

* add the constant PERIOD_TO_OFFSET_FREQSTR to period.pyx, correct definition of _resolution_obj and fix tests

* fix tests

* add the conversion ME to M to _from_datetime64, period_index, raise_on_incompatible and fix tests

* fix some tests with resample

* correct definitions of to_period, freqstr and get_period_alias, fix tests for plotting

* correct pre-commit failures

* add key from Grouper to the constructor of TimeGrouper and fix tests

* add to asfreq() from resampler the conversion ME to M, fix tests

* fix tests for PeriodIndex and base tests for resample

* correct the constructor of TimeGrouper and fix tests for resample and plotting

* correct the definition of use_dynamic_x() and fix tests for plotting

* correct the definition of the method use_dynamic_x, fix tests

* correct the definition of the asfreq for PeriodArray, _get_period_alias, and fix tests

* correct documentation, fix tests

* correct docs: rename ME to M for periods

* add pytest.mark.xfail to test_to_timestamp_quarterly_bug

* correct mypy error attr-defined

* correct the definition of variables which convert M/ME to ME/M in dtypes.pyx, declare to_offset in offsets.pyi, fix mypy errors

* created the c version for dicts which convert M/ME to ME/M and fix mypy errors

* fix doc build error in 09_timeseries.rst and mypy error

* correct the constructor of Period, fix mypy errors

* replace in _attrname_to_abbrevs ME with M and correct the constructor of Period

* add conversion ME/M to Period constructor, add conversion M/ME to maybe_resample and reverse changes in _attrname_to_abbrevs

* correct dict “time rules”, correct the definition of _parsed_string_to_bounds, remove is_period from definition _parse_weekly_str and _parse_dtype_strict

* remove the argument is_period from _parse_dtype_strict

* add to is_subperiod, is_superperiod and _is_monthly both M and ME, correct definitions of _downsample and _maybe_cast_slice_bound

* add dict ME to M to the definition of freqstr, constructor of Period and remove pytest.mark.xfail from test_round_trip_current

* refactor freqstr, extract_ordinals, and _require_matching_freq for Period, asfreq for resample and _parsed_string_to_bounds for datetimes

* refactor _resolution_obj in dtypes.pyx and freqstr in /indexes/datetimelike.py

* define a new function freq_to_period_freqstr in dtypes to convert ME to M

* refactor use_dynamic_x for plotting and to_period in arrays/datetimes.py

* refactor def _check_plot_works in plotting and test_to_period in class TestDatetimeArray

* refactor name method of PeriodDtype, refactor __arrow_array__ and add test for ValueError in test_period.py

* in PeriodArray refactor _from_datetime64 and remove redundant if in asfreq, add test for ValueError in test_period_index.py and ignore mypy error

* correct def _resolution_obj in DatetimeLikeArrayMixin, refactor def freqstr in PeriodArray and add tests ValueError for ME

* correct def _resolution_obj in DatetimeLikeArrayMixin and def to_offset, refactor def freqstr in PeriodArray and add tests for ‘ValueError’ and 'UserWarning'

* add tests for 'UserWarning'

* refactor methods to_period in DatetimeArray, _from_datetime64 in PeriodArray, fix test in plotting

* add freq_to_offset_freqstr to convert M to ME, refactor _resolution_obj, add tests for ‘ValueError’ and 'UserWarning'

* fix pre-commit failures

* correct the definition of to_period in DatetimeArray, refactor _check_plot_works, fix test_asfreq_2M

* correct definitions of _resolution_obj in dtypes.pyx and in DatetimeLikeArrayMixin, _attrname_to_abbrevs and fix test_get_attrname_from_abbrev

* correct def asfreq in PeriodArray, remove unused function freq_to_offset_freqstr, fix tests

* roll back in test_fillna_period dtype Period[M] with capital P

* refactor the function raise_on_incompatible

* fix mypy error in pandas/core/arrays/period.py

* fix ruff error in pandas/tests/arrays/period/test_constructors.py

* remove ME from definitions of is_monthly, is_subperiod, correct _maybe_coerce_freq and test_period_ordinal_start_values

* fix test_dti_to_period_2monthish

* update whatsnew/v2.1.0.rst

* add an example for old/new behavior in whatsnew/v2.1.0.rst

* corrected typo

* replace name of section Deprecations with Other Deprecations

* remove ME from is_superperiod, refactor tests

* correct a test

* move some tests to a new place

* correct def asfreq for resampling, refactor asfreq for Period, fix tests

* correct tests

* correct def _shift_with_freq and fix test for shift

* correct docs for asfreq in PeriodArray

* correct def _shift_with_freq

* add ‘me’ to _dont_uppercase, correct _require_matching_freq, fix tests

* minor corrections

* correct whatsnew

* correct an example in user_guide/reshaping.rst

* fix tests for plotting

* correct tests for plotting

* remove from OFFSET_TO_PERIOD_FREQSTR deprecated freqstr, fix tests
natmokval authored Sep 20, 2023
1 parent 61a6335 commit a98be06
Showing 96 changed files with 726 additions and 397 deletions.
@@ -295,7 +295,7 @@ Aggregate the current hourly time series values to the monthly maximum value in

.. ipython:: python
-    monthly_max = no_2.resample("M").max()
+    monthly_max = no_2.resample("ME").max()
monthly_max
A very powerful method on time series data with a datetime index, is the
2 changes: 1 addition & 1 deletion doc/source/user_guide/cookbook.rst
@@ -771,7 +771,7 @@ To create year and month cross tabulation:
df = pd.DataFrame(
{"value": np.random.randn(36)},
-    index=pd.date_range("2011-01-01", freq="M", periods=36),
+    index=pd.date_range("2011-01-01", freq="ME", periods=36),
)
pd.pivot_table(
6 changes: 3 additions & 3 deletions doc/source/user_guide/groupby.rst
@@ -1416,7 +1416,7 @@ Groupby a specific column with the desired frequency. This is like resampling.

.. ipython:: python
-    df.groupby([pd.Grouper(freq="1M", key="Date"), "Buyer"])[["Quantity"]].sum()
+    df.groupby([pd.Grouper(freq="1ME", key="Date"), "Buyer"])[["Quantity"]].sum()
When ``freq`` is specified, the object returned by ``pd.Grouper`` will be an
instance of ``pandas.api.typing.TimeGrouper``. You have an ambiguous specification
@@ -1426,9 +1426,9 @@ in that you have a named index and a column that could be potential groupers.
df = df.set_index("Date")
df["Date"] = df.index + pd.offsets.MonthEnd(2)
-    df.groupby([pd.Grouper(freq="6M", key="Date"), "Buyer"])[["Quantity"]].sum()
+    df.groupby([pd.Grouper(freq="6ME", key="Date"), "Buyer"])[["Quantity"]].sum()
-    df.groupby([pd.Grouper(freq="6M", level="Date"), "Buyer"])[["Quantity"]].sum()
+    df.groupby([pd.Grouper(freq="6ME", level="Date"), "Buyer"])[["Quantity"]].sum()
Taking the first rows of each group
2 changes: 1 addition & 1 deletion doc/source/user_guide/reshaping.rst
@@ -136,7 +136,7 @@ Also, you can use :class:`Grouper` for ``index`` and ``columns`` keywords. For d

.. ipython:: python
-    pd.pivot_table(df, values="D", index=pd.Grouper(freq="M", key="F"), columns="C")
+    pd.pivot_table(df, values="D", index=pd.Grouper(freq="ME", key="F"), columns="C")
.. _reshaping.pivot.margins:

18 changes: 9 additions & 9 deletions doc/source/user_guide/timeseries.rst
@@ -107,7 +107,7 @@ data however will be stored as ``object`` data.
pd.Series(pd.period_range("1/1/2011", freq="M", periods=3))
pd.Series([pd.DateOffset(1), pd.DateOffset(2)])
-    pd.Series(pd.date_range("1/1/2011", freq="M", periods=3))
+    pd.Series(pd.date_range("1/1/2011", freq="ME", periods=3))
Lastly, pandas represents null date times, time deltas, and time spans as ``NaT`` which
is useful for representing missing or null date like values and behaves similar
@@ -450,7 +450,7 @@ variety of :ref:`frequency aliases <timeseries.offset_aliases>`:

.. ipython:: python
-    pd.date_range(start, periods=1000, freq="M")
+    pd.date_range(start, periods=1000, freq="ME")
pd.bdate_range(start, periods=250, freq="BQS")
@@ -882,7 +882,7 @@ into ``freq`` keyword arguments. The available date offsets and associated frequ
:class:`~pandas.tseries.offsets.Week`, ``'W'``, "one week, optionally anchored on a day of the week"
:class:`~pandas.tseries.offsets.WeekOfMonth`, ``'WOM'``, "the x-th day of the y-th week of each month"
:class:`~pandas.tseries.offsets.LastWeekOfMonth`, ``'LWOM'``, "the x-th day of the last week of each month"
-    :class:`~pandas.tseries.offsets.MonthEnd`, ``'M'``, "calendar month end"
+    :class:`~pandas.tseries.offsets.MonthEnd`, ``'ME'``, "calendar month end"
:class:`~pandas.tseries.offsets.MonthBegin`, ``'MS'``, "calendar month begin"
:class:`~pandas.tseries.offsets.BMonthEnd` or :class:`~pandas.tseries.offsets.BusinessMonthEnd`, ``'BM'``, "business month end"
:class:`~pandas.tseries.offsets.BMonthBegin` or :class:`~pandas.tseries.offsets.BusinessMonthBegin`, ``'BMS'``, "business month begin"
@@ -1246,7 +1246,7 @@ frequencies. We will refer to these aliases as *offset aliases*.
"C", "custom business day frequency"
"D", "calendar day frequency"
"W", "weekly frequency"
-    "M", "month end frequency"
+    "ME", "month end frequency"
"SM", "semi-month end frequency (15th and end of month)"
"BM", "business month end frequency"
"CBM", "custom business month end frequency"
@@ -1690,7 +1690,7 @@ the end of the interval.
.. warning::

The default values for ``label`` and ``closed`` is '**left**' for all
-    frequency offsets except for 'M', 'A', 'Q', 'BM', 'BA', 'BQ', and 'W'
+    frequency offsets except for 'ME', 'A', 'Q', 'BM', 'BA', 'BQ', and 'W'
which all have a default of 'right'.

This might unintendedly lead to looking ahead, where the value for a later
@@ -1856,15 +1856,15 @@ to resample based on datetimelike column in the frame, it can passed to the
),
)
df
-    df.resample("M", on="date")[["a"]].sum()
+    df.resample("ME", on="date")[["a"]].sum()
Similarly, if you instead want to resample by a datetimelike
level of ``MultiIndex``, its name or location can be passed to the
``level`` keyword.

.. ipython:: python
-    df.resample("M", level="d")[["a"]].sum()
+    df.resample("ME", level="d")[["a"]].sum()
.. _timeseries.iterating-label:

@@ -2137,7 +2137,7 @@ The ``period`` dtype can be used in ``.astype(...)``. It allows one to change th
pi.astype("datetime64[ns]")
# convert to PeriodIndex
-    dti = pd.date_range("2011-01-01", freq="M", periods=3)
+    dti = pd.date_range("2011-01-01", freq="ME", periods=3)
dti
dti.astype("period[M]")
@@ -2256,7 +2256,7 @@ and vice-versa using ``to_timestamp``:

.. ipython:: python
-    rng = pd.date_range("1/1/2012", periods=5, freq="M")
+    rng = pd.date_range("1/1/2012", periods=5, freq="ME")
ts = pd.Series(np.random.randn(len(rng)), index=rng)
18 changes: 14 additions & 4 deletions doc/source/whatsnew/v0.14.0.rst
@@ -860,10 +860,20 @@ Enhancements
datetime.datetime(2013, 9, 5, 10, 0)]})
df
-    df.pivot_table(values='Quantity',
-                   index=pd.Grouper(freq='M', key='Date'),
-                   columns=pd.Grouper(freq='M', key='PayDay'),
-                   aggfunc="sum")
+.. code-block:: ipython
+    In [75]: df.pivot_table(values='Quantity',
+       ....: index=pd.Grouper(freq='M', key='Date'),
+       ....: columns=pd.Grouper(freq='M', key='PayDay'),
+       ....: aggfunc="sum")
+    Out[75]:
+    PayDay      2013-09-30  2013-10-31  2013-11-30
+    Date
+    2013-09-30         NaN         3.0         NaN
+    2013-10-31         6.0         NaN         1.0
+    2013-11-30         NaN         9.0         NaN
+    [3 rows x 3 columns]
- Arrays of strings can be wrapped to a specified width (``str.wrap``) (:issue:`6999`)
- Add :meth:`~Series.nsmallest` and :meth:`Series.nlargest` methods to Series, See :ref:`the docs <basics.nsorted>` (:issue:`3960`)
19 changes: 17 additions & 2 deletions doc/source/whatsnew/v0.18.0.rst
@@ -837,9 +837,24 @@ Previously
New API

-.. ipython:: python
+.. code-block:: ipython
-    s.resample('M').ffill()
+    In [91]: s.resample('M').ffill()
+    Out[91]:
+    2010-03-31    0
+    2010-04-30    0
+    2010-05-31    0
+    2010-06-30    1
+    2010-07-31    1
+    2010-08-31    1
+    2010-09-30    2
+    2010-10-31    2
+    2010-11-30    2
+    2010-12-31    3
+    2011-01-31    3
+    2011-02-28    3
+    2011-03-31    4
+    Freq: M, Length: 13, dtype: int64
.. note::

22 changes: 20 additions & 2 deletions doc/source/whatsnew/v0.19.0.rst
@@ -498,8 +498,26 @@ Other enhancements
),
)
df
-    df.resample("M", on="date")[["a"]].sum()
-    df.resample("M", level="d")[["a"]].sum()
+.. code-block:: ipython
+    In [74]: df.resample("M", on="date")[["a"]].sum()
+    Out[74]:
+                a
+    date
+    2015-01-31  6
+    2015-02-28  4
+    [2 rows x 1 columns]
+    In [75]: df.resample("M", level="d")[["a"]].sum()
+    Out[75]:
+                a
+    d
+    2015-01-31  6
+    2015-02-28  4
+    [2 rows x 1 columns]
- The ``.get_credentials()`` method of ``GbqConnector`` can now first try to fetch `the application default credentials <https://developers.google.com/identity/protocols/application-default-credentials>`__. See the docs for more details (:issue:`13577`).
- The ``.tz_localize()`` method of ``DatetimeIndex`` and ``Timestamp`` has gained the ``errors`` keyword, so you can potentially coerce nonexistent timestamps to ``NaT``. The default behavior remains to raising a ``NonExistentTimeError`` (:issue:`13057`)
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v2.0.0.rst
@@ -76,7 +76,7 @@ Below is a possibly non-exhaustive list of changes:

.. ipython:: python
-    idx = pd.date_range(start='1/1/2018', periods=3, freq='M')
+    idx = pd.date_range(start='1/1/2018', periods=3, freq='ME')
idx.array.year
idx.year
25 changes: 25 additions & 0 deletions doc/source/whatsnew/v2.2.0.rst
@@ -177,6 +177,31 @@ Other API changes

Deprecations
~~~~~~~~~~~~

Deprecate alias ``M`` in favour of ``ME`` for offsets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The alias ``M`` is deprecated in favour of ``ME`` for offsets, please use ``ME`` for "month end" instead of ``M`` (:issue:`9586`)

For example:

*Previous behavior*:

.. code-block:: ipython
In [7]: pd.date_range('2020-01-01', periods=3, freq='M')
Out [7]:
DatetimeIndex(['2020-01-31', '2020-02-29', '2020-03-31'],
dtype='datetime64[ns]', freq='M')
*Future behavior*:

.. ipython:: python
pd.date_range('2020-01-01', periods=3, freq='ME')
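The deprecation mechanics can be illustrated with a small self-contained sketch. This is hypothetical code, not pandas' actual implementation; the name ``resolve_offset_alias`` and the exact message are invented for illustration:

```python
import warnings

# Hypothetical mapping of deprecated offset aliases to their replacements,
# mirroring the 'M' -> 'ME' rename this PR introduces.
DEPRECATED_OFFSET_ALIASES = {"M": "ME"}


def resolve_offset_alias(freq: str) -> str:
    """Return the canonical offset alias, warning if a deprecated one is used."""
    if freq in DEPRECATED_OFFSET_ALIASES:
        replacement = DEPRECATED_OFFSET_ALIASES[freq]
        warnings.warn(
            f"'{freq}' is deprecated, please use '{replacement}' instead.",
            FutureWarning,
            stacklevel=2,
        )
        return replacement
    return freq
```

Calling ``resolve_offset_alias("M")`` returns ``"ME"`` after emitting a ``FutureWarning``, while ``"ME"`` passes through silently.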
Other Deprecations
^^^^^^^^^^^^^^^^^^
- Changed :meth:`Timedelta.resolution_string` to return ``min``, ``s``, ``ms``, ``us``, and ``ns`` instead of ``T``, ``S``, ``L``, ``U``, and ``N``, for compatibility with respective deprecations in frequency aliases (:issue:`52536`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_clipboard`. (:issue:`54229`)
- Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_csv` except ``path_or_buf``. (:issue:`54229`)
2 changes: 2 additions & 0 deletions pandas/_libs/tslibs/dtypes.pxd
@@ -11,6 +11,8 @@ cpdef int64_t periods_per_second(NPY_DATETIMEUNIT reso) except? -1
cpdef NPY_DATETIMEUNIT get_supported_reso(NPY_DATETIMEUNIT reso)
cpdef bint is_supported_unit(NPY_DATETIMEUNIT reso)

cpdef freq_to_period_freqstr(freq_n, freq_name)
cdef dict c_OFFSET_TO_PERIOD_FREQSTR
cdef dict c_DEPR_ABBREVS
cdef dict attrname_to_abbrevs
cdef dict npy_unit_to_attrname
2 changes: 2 additions & 0 deletions pandas/_libs/tslibs/dtypes.pyi
@@ -6,6 +6,7 @@ from pandas._libs.tslibs.timedeltas import UnitChoices
# are imported in tests.
_attrname_to_abbrevs: dict[str, str]
_period_code_map: dict[str, int]
OFFSET_TO_PERIOD_FREQSTR: dict[str, str]
DEPR_ABBREVS: dict[str, UnitChoices]

def periods_per_day(reso: int) -> int: ...
@@ -14,6 +15,7 @@ def is_supported_unit(reso: int) -> bool: ...
def npy_unit_to_abbrev(reso: int) -> str: ...
def get_supported_reso(reso: int) -> int: ...
def abbrev_to_npy_unit(abbrev: str) -> int: ...
def freq_to_period_freqstr(freq_n: int, freq_name: str) -> str: ...

class PeriodDtypeBase:
_dtype_code: int # PeriodDtypeCode
40 changes: 39 additions & 1 deletion pandas/_libs/tslibs/dtypes.pyx
@@ -13,7 +13,6 @@ from pandas._libs.tslibs.np_datetime cimport (

import_pandas_datetime()


cdef class PeriodDtypeBase:
"""
Similar to an actual dtype, this contains all of the information
@@ -186,6 +185,45 @@ _attrname_to_abbrevs = {
cdef dict attrname_to_abbrevs = _attrname_to_abbrevs
cdef dict _abbrev_to_attrnames = {v: k for k, v in attrname_to_abbrevs.items()}

OFFSET_TO_PERIOD_FREQSTR: dict = {
"WEEKDAY": "D",
"EOM": "M",
"BM": "M",
"BQS": "Q",
"QS": "Q",
"BQ": "Q",
"BA": "A",
"AS": "A",
"BAS": "A",
"MS": "M",
"D": "D",
"B": "B",
"min": "min",
"s": "s",
"ms": "ms",
"us": "us",
"ns": "ns",
"H": "H",
"Q": "Q",
"A": "A",
"W": "W",
"ME": "M",
"Y": "A",
"BY": "A",
"YS": "A",
"BYS": "A",
}
cdef dict c_OFFSET_TO_PERIOD_FREQSTR = OFFSET_TO_PERIOD_FREQSTR

cpdef freq_to_period_freqstr(freq_n, freq_name):
if freq_n == 1:
freqstr = f"""{c_OFFSET_TO_PERIOD_FREQSTR.get(
freq_name, freq_name)}"""
else:
freqstr = f"""{freq_n}{c_OFFSET_TO_PERIOD_FREQSTR.get(
freq_name, freq_name)}"""
return freqstr
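The new ``cpdef`` function behaves like this pure-Python sketch (illustrative only; it uses a hand-picked subset of the ``OFFSET_TO_PERIOD_FREQSTR`` mapping above):

```python
# Subset of OFFSET_TO_PERIOD_FREQSTR, for illustration: offset alias -> period alias.
OFFSET_TO_PERIOD = {"ME": "M", "BM": "M", "MS": "M", "Q": "Q", "Y": "A", "A": "A"}


def freq_to_period_freqstr(freq_n: int, freq_name: str) -> str:
    """Build a period frequency string from an offset's multiplier and name.

    Unknown names fall through unchanged; the multiplier is prepended
    only when it is not 1, matching the Cython version above.
    """
    name = OFFSET_TO_PERIOD.get(freq_name, freq_name)
    return name if freq_n == 1 else f"{freq_n}{name}"
```

For example, the offset alias ``"ME"`` maps to the period alias ``"M"``, and a multiplier of 3 yields ``"3M"``.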

# Map deprecated resolution abbreviations to correct resolution abbreviations
DEPR_ABBREVS: dict[str, str]= {
"T": "min",
2 changes: 1 addition & 1 deletion pandas/_libs/tslibs/offsets.pxd
@@ -1,7 +1,7 @@
from numpy cimport int64_t


-cpdef to_offset(object obj)
+cpdef to_offset(object obj, bint is_period=*)
cdef bint is_offset_object(object obj)
cdef bint is_tick_object(object obj)

6 changes: 3 additions & 3 deletions pandas/_libs/tslibs/offsets.pyi
@@ -103,11 +103,11 @@ class SingleConstructorOffset(BaseOffset):
def __reduce__(self): ...

@overload
-def to_offset(freq: None) -> None: ...
+def to_offset(freq: None, is_period: bool = ...) -> None: ...
@overload
-def to_offset(freq: _BaseOffsetT) -> _BaseOffsetT: ...
+def to_offset(freq: _BaseOffsetT, is_period: bool = ...) -> _BaseOffsetT: ...
@overload
-def to_offset(freq: timedelta | str) -> BaseOffset: ...
+def to_offset(freq: timedelta | str, is_period: bool = ...) -> BaseOffset: ...

class Tick(SingleConstructorOffset):
_creso: int