DEPR: resample with PeriodIndex #55968

Merged 16 commits on Dec 18, 2023. Diff shows changes from 11 commits.
17 changes: 11 additions & 6 deletions doc/source/whatsnew/v0.21.0.rst
@@ -635,17 +635,22 @@ Previous behavior:

 New behavior:

-.. ipython:: python
+.. code-block:: ipython

-   pi = pd.period_range('2017-01', periods=12, freq='M')
+   In [1]: pi = pd.period_range('2017-01', periods=12, freq='M')

-   s = pd.Series(np.arange(12), index=pi)
+   In [2]: s = pd.Series(np.arange(12), index=pi)

-   resampled = s.resample('2Q').mean()
+   In [3]: resampled = s.resample('2Q').mean()

-   resampled
+   In [4]: resampled
+   Out[4]:
+   2017Q1    2.5
+   2017Q3    8.5
+   Freq: 2Q-DEC, dtype: float64

-   resampled.index
+   In [5]: resampled.index
+   Out[5]: PeriodIndex(['2017Q1', '2017Q3'], dtype='period[2Q-DEC]')

 Upsampling and calling ``.ohlc()`` previously returned a ``Series``, basically identical to calling ``.asfreq()``. OHLC upsampling now returns a DataFrame with columns ``open``, ``high``, ``low`` and ``close`` (:issue:`13083`). This is consistent with downsampling and ``DatetimeIndex`` behavior.

1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.0.rst
@@ -363,6 +363,7 @@ Other Deprecations
 - Deprecated :meth:`.DataFrameGroupBy.fillna` and :meth:`.SeriesGroupBy.fillna`; use :meth:`.DataFrameGroupBy.ffill`, :meth:`.DataFrameGroupBy.bfill` for forward and backward filling or :meth:`.DataFrame.fillna` to fill with a single value (or the Series equivalents) (:issue:`55718`)
 - Deprecated :meth:`Index.format`, use ``index.astype(str)`` or ``index.map(formatter)`` instead (:issue:`55413`)
 - Deprecated :meth:`Series.ravel`, the underlying array is already 1D, so ravel is not necessary (:issue:`52511`)
+- Deprecated :meth:`Series.resample` and :meth:`DataFrame.resample` with a :class:`PeriodIndex`, convert to :class:`DatetimeIndex` (with ``.to_timestamp()``) before resampling instead (:issue:`53481`)
 - Deprecated :meth:`Series.view`, use ``astype`` instead to change the dtype (:issue:`20251`)
 - Deprecated ``core.internals`` members ``Block``, ``ExtensionBlock``, and ``DatetimeTZBlock``, use public APIs instead (:issue:`55139`)
 - Deprecated ``year``, ``month``, ``quarter``, ``day``, ``hour``, ``minute``, and ``second`` keywords in the :class:`PeriodIndex` constructor, use :meth:`PeriodIndex.from_fields` instead (:issue:`55960`)
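
The migration suggested by the new ``Series.resample`` entry above (convert with ``.to_timestamp()`` before resampling) might look roughly like the following. This is a minimal sketch, not code from the PR: the variable names, the ``"QS"`` resampling rule, and the final ``.to_period("Q")`` round trip are illustrative assumptions for callers who still want a ``PeriodIndex`` afterwards.

```python
import numpy as np
import pandas as pd

# Monthly periods, mirroring the v0.21.0 whatsnew example in this diff.
pi = pd.period_range("2017-01", periods=12, freq="M")
s = pd.Series(np.arange(12), index=pi)

# Step 1: convert the PeriodIndex to a DatetimeIndex (start of each period).
ts = s.to_timestamp()

# Step 2: resample on the DatetimeIndex. "QS" (quarter start) is an
# illustrative choice so that the bins line up with the period starts.
quarterly = ts.resample("QS").mean()

# Step 3 (optional): convert back to periods if downstream code expects them.
resampled = quarterly.to_period("Q")
print(resampled)
```

Because the resample call runs on a ``DatetimeIndex``, it does not go through the ``PeriodIndex`` code path that this PR marks as deprecated.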
49 changes: 0 additions & 49 deletions pandas/core/generic.py
@@ -9443,55 +9443,6 @@ def resample(
         2000-01-01 00:06:00    26
         Freq: 3min, dtype: int64

-        For a Series with a PeriodIndex, the keyword `convention` can be
-        used to control whether to use the start or end of `rule`.
-
-        Resample a year by quarter using 'start' `convention`. Values are
-        assigned to the first quarter of the period.
-
-        >>> s = pd.Series([1, 2], index=pd.period_range('2012-01-01',
-        ...                                             freq='Y',
-        ...                                             periods=2))
-        >>> s
-        2012    1
-        2013    2
-        Freq: Y-DEC, dtype: int64
-        >>> s.resample('Q', convention='start').asfreq()
-        2012Q1    1.0
-        2012Q2    NaN
-        2012Q3    NaN
-        2012Q4    NaN
-        2013Q1    2.0
-        2013Q2    NaN
-        2013Q3    NaN
-        2013Q4    NaN
-        Freq: Q-DEC, dtype: float64
-
-        Resample quarters by month using 'end' `convention`. Values are
-        assigned to the last month of the period.
-
-        >>> q = pd.Series([1, 2, 3, 4], index=pd.period_range('2018-01-01',
-        ...                                                   freq='Q',
-        ...                                                   periods=4))
-        >>> q
-        2018Q1    1
-        2018Q2    2
-        2018Q3    3
-        2018Q4    4
-        Freq: Q-DEC, dtype: int64
-        >>> q.resample('M', convention='end').asfreq()
-        2018-03    1.0
-        2018-04    NaN
-        2018-05    NaN
-        2018-06    2.0
-        2018-07    NaN
-        2018-08    NaN
-        2018-09    3.0
-        2018-10    NaN
-        2018-11    NaN
-        2018-12    4.0
-        Freq: M, dtype: float64
-
         For DataFrame objects, the keyword `on` can be used to specify the
         column instead of the index for resampling.

21 changes: 21 additions & 0 deletions pandas/core/resample.py
@@ -1876,6 +1876,12 @@ class PeriodIndexResampler(DatetimeIndexResampler):

     @property
     def _resampler_for_grouping(self):
+        warnings.warn(
+            "Resampling a groupby with a PeriodIndex is deprecated. "
+            "Cast to DatetimeIndex before resampling instead.",
+            FutureWarning,
+            stacklevel=find_stack_level(),
+        )
         return PeriodIndexResamplerGroupby

     def _get_binner_for_time(self):
@@ -2224,6 +2230,21 @@ def _get_resampler(self, obj: NDFrame, kind=None) -> Resampler:
                 gpr_index=ax,
             )
         elif isinstance(ax, PeriodIndex) or kind == "period":
+            if isinstance(ax, PeriodIndex):
+                # GH#53481
+                warnings.warn(
+                    "Resampling with a PeriodIndex is deprecated. "
+                    "Cast index to DatetimeIndex before resampling instead.",
+                    FutureWarning,
+                    stacklevel=find_stack_level(),
+                )
+            else:
+                warnings.warn(
+                    "Resampling with kind='period' is deprecated. "
+                    "Use datetime paths instead.",
+                    FutureWarning,
+                    stacklevel=find_stack_level(),
+                )
             return PeriodIndexResampler(
                 obj,
                 timegrouper=self,
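
The branch added to ``_get_resampler`` above picks one of two deprecation messages depending on whether the axis is a ``PeriodIndex`` or the caller passed ``kind="period"``. The following standard-library sketch shows that control flow in isolation. ``FakePeriodIndex``, ``pick_resampler`` and ``collect_messages`` are hypothetical names invented for illustration, and a fixed ``stacklevel=2`` stands in for the pandas-internal ``find_stack_level()`` helper used in the diff.

```python
import warnings


class FakePeriodIndex:
    """Hypothetical stand-in for pandas.PeriodIndex (not the real class)."""


def pick_resampler(ax, kind=None):
    """Hypothetical dispatcher mirroring the branch added in this hunk."""
    if isinstance(ax, FakePeriodIndex) or kind == "period":
        if isinstance(ax, FakePeriodIndex):
            message = (
                "Resampling with a PeriodIndex is deprecated. "
                "Cast index to DatetimeIndex before resampling instead."
            )
        else:
            message = (
                "Resampling with kind='period' is deprecated. "
                "Use datetime paths instead."
            )
        # stacklevel=2 attributes the warning to the caller; the diff uses
        # pandas' internal find_stack_level() for that purpose instead.
        warnings.warn(message, FutureWarning, stacklevel=2)
        return "period"
    return "datetime"


def collect_messages(ax, kind=None):
    """Call pick_resampler and return (result, list of warning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = pick_resampler(ax, kind=kind)
    return result, [str(w.message) for w in caught]


print(collect_messages(FakePeriodIndex()))
print(collect_messages(object(), kind="period"))
print(collect_messages(object()))
```

Only one warning is emitted per call, and the ``PeriodIndex`` message takes precedence when both conditions hold, which matches the ``if``/``else`` ordering in the hunk.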
9 changes: 6 additions & 3 deletions pandas/plotting/_matplotlib/timeseries.py
@@ -86,9 +86,12 @@ def maybe_resample(series: Series, ax: Axes, kwargs: dict[str, Any]):
             )
             freq = ax_freq
         elif _is_sup(freq, ax_freq):  # one is weekly
-            how = "last"
-            series = getattr(series.resample("D"), how)().dropna()
-            series = getattr(series.resample(ax_freq), how)().dropna()
+            # Resampling with PeriodDtype is deprecated, so we convert to
+            # DatetimeIndex, resample, then convert back.
+            ser_ts = series.to_timestamp()
+            ser_d = ser_ts.resample("D").last().dropna()
+            ser_freq = ser_d.resample(ax_freq).last().dropna()
+            series = ser_freq.to_period(ax_freq)
             freq = ax_freq
         elif is_subperiod(freq, ax_freq) or _is_sub(freq, ax_freq):
             _upsample_others(ax, freq, kwargs)
103 changes: 88 additions & 15 deletions pandas/tests/resample/test_base.py
@@ -85,8 +85,13 @@ def test_asfreq_fill_value(series, create_index):
 def test_resample_interpolate(frame):
     # GH#12925
     df = frame
-    result = df.resample("1min").asfreq().interpolate()
-    expected = df.resample("1min").interpolate()
+    warn = None
+    if isinstance(df.index, PeriodIndex):
+        warn = FutureWarning
+    msg = "Resampling with a PeriodIndex is deprecated"
+    with tm.assert_produces_warning(warn, match=msg):
+        result = df.resample("1min").asfreq().interpolate()
+        expected = df.resample("1min").interpolate()
     tm.assert_frame_equal(result, expected)


@@ -118,7 +123,13 @@ def test_resample_empty_series(freq, empty_series_dti, resample_method):
     elif freq == "ME" and isinstance(ser.index, PeriodIndex):
         # index is PeriodIndex, so convert to corresponding Period freq
         freq = "M"
-    rs = ser.resample(freq)
+
+    warn = None
+    if isinstance(ser.index, PeriodIndex):
+        warn = FutureWarning
+    msg = "Resampling with a PeriodIndex is deprecated"
+    with tm.assert_produces_warning(warn, match=msg):
+        rs = ser.resample(freq)
     result = getattr(rs, resample_method)()

     if resample_method == "ohlc":
@@ -150,7 +161,10 @@ def test_resample_nat_index_series(freq, series, resample_method):

     ser = series.copy()
     ser.index = PeriodIndex([NaT] * len(ser), freq=freq)
-    rs = ser.resample(freq)
+
+    msg = "Resampling with a PeriodIndex is deprecated"
+    with tm.assert_produces_warning(FutureWarning, match=msg):
+        rs = ser.resample(freq)
     result = getattr(rs, resample_method)()

     if resample_method == "ohlc":
@@ -182,7 +196,13 @@ def test_resample_count_empty_series(freq, empty_series_dti, resample_method):
     elif freq == "ME" and isinstance(ser.index, PeriodIndex):
         # index is PeriodIndex, so convert to corresponding Period freq
         freq = "M"
-    rs = ser.resample(freq)
+
+    warn = None
+    if isinstance(ser.index, PeriodIndex):
+        warn = FutureWarning
+    msg = "Resampling with a PeriodIndex is deprecated"
+    with tm.assert_produces_warning(warn, match=msg):
+        rs = ser.resample(freq)

     result = getattr(rs, resample_method)()

@@ -210,7 +230,13 @@ def test_resample_empty_dataframe(empty_frame_dti, freq, resample_method):
     elif freq == "ME" and isinstance(df.index, PeriodIndex):
         # index is PeriodIndex, so convert to corresponding Period freq
         freq = "M"
-    rs = df.resample(freq, group_keys=False)
+
+    warn = None
+    if isinstance(df.index, PeriodIndex):
+        warn = FutureWarning
+    msg = "Resampling with a PeriodIndex is deprecated"
+    with tm.assert_produces_warning(warn, match=msg):
+        rs = df.resample(freq, group_keys=False)
     result = getattr(rs, resample_method)()
     if resample_method == "ohlc":
         # TODO: no tests with len(df.columns) > 0
@@ -253,7 +279,14 @@ def test_resample_count_empty_dataframe(freq, empty_frame_dti):
     elif freq == "ME" and isinstance(empty_frame_dti.index, PeriodIndex):
         # index is PeriodIndex, so convert to corresponding Period freq
         freq = "M"
-    result = empty_frame_dti.resample(freq).count()
+
+    warn = None
+    if isinstance(empty_frame_dti.index, PeriodIndex):
+        warn = FutureWarning
+    msg = "Resampling with a PeriodIndex is deprecated"
+    with tm.assert_produces_warning(warn, match=msg):
+        rs = empty_frame_dti.resample(freq)
+    result = rs.count()

     index = _asfreq_compat(empty_frame_dti.index, freq)

@@ -280,7 +313,14 @@ def test_resample_size_empty_dataframe(freq, empty_frame_dti):
     elif freq == "ME" and isinstance(empty_frame_dti.index, PeriodIndex):
         # index is PeriodIndex, so convert to corresponding Period freq
         freq = "M"
-    result = empty_frame_dti.resample(freq).size()
+
+    msg = "Resampling with a PeriodIndex"
+    warn = None
+    if isinstance(empty_frame_dti.index, PeriodIndex):
+        warn = FutureWarning
+    with tm.assert_produces_warning(warn, match=msg):
+        rs = empty_frame_dti.resample(freq)
+    result = rs.size()

     index = _asfreq_compat(empty_frame_dti.index, freq)

@@ -298,12 +338,21 @@ def test_resample_size_empty_dataframe(freq, empty_frame_dti):
     ],
 )
 @pytest.mark.parametrize("dtype", [float, int, object, "datetime64[ns]"])
+@pytest.mark.filterwarnings(r"ignore:PeriodDtype\[B\] is deprecated:FutureWarning")
 def test_resample_empty_dtypes(index, dtype, resample_method):
     # Empty series were sometimes causing a segfault (for the functions
     # with Cython bounds-checking disabled) or an IndexError. We just run
     # them to ensure they no longer do. (GH #10228)
+    warn = None
+    if isinstance(index, PeriodIndex):
+        # GH#53511
+        index = PeriodIndex([], freq="B", name=index.name)
+        warn = FutureWarning
+    msg = "Resampling with a PeriodIndex is deprecated"
+
     empty_series_dti = Series([], index, dtype)
-    rs = empty_series_dti.resample("d", group_keys=False)
+    with tm.assert_produces_warning(warn, match=msg):
+        rs = empty_series_dti.resample("d", group_keys=False)
     try:
         getattr(rs, resample_method)()
     except DataError:
@@ -329,8 +378,18 @@ def test_apply_to_empty_series(empty_series_dti, freq):
     elif freq == "ME" and isinstance(empty_series_dti.index, PeriodIndex):
         # index is PeriodIndex, so convert to corresponding Period freq
         freq = "M"
-    result = ser.resample(freq, group_keys=False).apply(lambda x: 1)
-    expected = ser.resample(freq).apply("sum")
+
+    msg = "Resampling with a PeriodIndex"
+    warn = None
+    if isinstance(empty_series_dti.index, PeriodIndex):
+        warn = FutureWarning
+
+    with tm.assert_produces_warning(warn, match=msg):
+        rs = ser.resample(freq, group_keys=False)
+
+    result = rs.apply(lambda x: 1)
+    with tm.assert_produces_warning(warn, match=msg):
+        expected = ser.resample(freq).apply("sum")

     tm.assert_series_equal(result, expected, check_dtype=False)

@@ -340,8 +399,16 @@ def test_resampler_is_iterable(series):
     # GH 15314
     freq = "h"
     tg = Grouper(freq=freq, convention="start")
-    grouped = series.groupby(tg)
-    resampled = series.resample(freq)
+    msg = "Resampling with a PeriodIndex"
+    warn = None
+    if isinstance(series.index, PeriodIndex):
+        warn = FutureWarning
+
+    with tm.assert_produces_warning(warn, match=msg):
+        grouped = series.groupby(tg)
+
+    with tm.assert_produces_warning(warn, match=msg):
+        resampled = series.resample(freq)
     for (rk, rv), (gk, gv) in zip(resampled, grouped):
         assert rk == gk
         tm.assert_series_equal(rv, gv)
@@ -353,6 +420,12 @@ def test_resample_quantile(series):
     ser = series
     q = 0.75
     freq = "h"
-    result = ser.resample(freq).quantile(q)
-    expected = ser.resample(freq).agg(lambda x: x.quantile(q)).rename(ser.name)
+
+    msg = "Resampling with a PeriodIndex"
+    warn = None
+    if isinstance(series.index, PeriodIndex):
+        warn = FutureWarning
+    with tm.assert_produces_warning(warn, match=msg):
+        result = ser.resample(freq).quantile(q)
+        expected = ser.resample(freq).agg(lambda x: x.quantile(q)).rename(ser.name)
     tm.assert_frame_equal(result, expected) if False else tm.assert_series_equal(result, expected)
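
The test changes above all follow one pattern: start with ``warn = None``, switch to ``FutureWarning`` only when the fixture carries a ``PeriodIndex``, and wrap the ``resample`` call in ``tm.assert_produces_warning(warn, match=msg)`` so a single test body covers both index types. A rough standard-library sketch of that pattern follows. ``expect_warning`` is a hypothetical, simplified stand-in for ``tm.assert_produces_warning`` (it does not replicate the stacklevel checks of the real helper), and ``resample_stub`` and ``check`` are invented for illustration.

```python
import re
import warnings
from contextlib import contextmanager


@contextmanager
def expect_warning(expected, match):
    """Hypothetical, simplified stand-in for tm.assert_produces_warning.

    If ``expected`` is None, assert that no warning is emitted; otherwise
    assert that a warning of that class whose message matches ``match`` is.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        yield
    if expected is None:
        assert not caught, f"unexpected warnings: {[str(w.message) for w in caught]}"
    else:
        assert any(
            issubclass(w.category, expected) and re.search(match, str(w.message))
            for w in caught
        ), f"no {expected.__name__} matching {match!r}"


def resample_stub(index_kind):
    """Hypothetical function that warns only for a 'period' index."""
    if index_kind == "period":
        warnings.warn(
            "Resampling with a PeriodIndex is deprecated", FutureWarning, stacklevel=2
        )
    return index_kind


def check(index_kind):
    # Same shape as the tests in this diff: default to no warning, then
    # switch to FutureWarning only for the deprecated index type.
    msg = "Resampling with a PeriodIndex"
    warn = None
    if index_kind == "period":
        warn = FutureWarning
    with expect_warning(warn, match=msg):
        result = resample_stub(index_kind)
    return result


print(check("period"), check("datetime"))
```

Keeping only the warning-producing call inside the ``with`` block, as several of the hunks do by splitting ``rs = ...resample(freq)`` from the later aggregation, narrows the assertion to the line that is expected to warn.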
5 changes: 4 additions & 1 deletion pandas/tests/resample/test_datetime_index.py
@@ -1040,7 +1040,10 @@ def test_period_with_agg():
     )

     expected = s2.to_timestamp().resample("D").mean().to_period()
-    result = s2.resample("D").agg(lambda x: x.mean())
+    msg = "Resampling with a PeriodIndex is deprecated"
+    with tm.assert_produces_warning(FutureWarning, match=msg):
+        rs = s2.resample("D")
+        result = rs.agg(lambda x: x.mean())
     tm.assert_series_equal(result, expected)