Commit

Merge branch 'main' into fix_cow_warning
phofl authored Nov 26, 2023
2 parents 3b7f73c + 6d1b07f commit c601d40
Showing 63 changed files with 847 additions and 493 deletions.
2 changes: 1 addition & 1 deletion doc/source/development/contributing_environment.rst
@@ -86,7 +86,7 @@ Before we begin, please:
Option 1: using mamba (recommended)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-* Install `mamba <https://mamba.readthedocs.io/en/latest/installation.html>`_
+* Install `mamba <https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html>`_
* Make sure your mamba is up to date (``mamba update mamba``)

.. code-block:: none
4 changes: 4 additions & 0 deletions doc/source/whatsnew/v2.2.0.rst
@@ -393,9 +393,12 @@ Other Deprecations
- Deprecated strings ``H``, ``S``, ``U``, and ``N`` denoting units in :func:`to_timedelta` (:issue:`52536`)
- Deprecated strings ``H``, ``T``, ``S``, ``L``, ``U``, and ``N`` denoting units in :class:`Timedelta` (:issue:`52536`)
- Deprecated strings ``T``, ``S``, ``L``, ``U``, and ``N`` denoting frequencies in :class:`Minute`, :class:`Second`, :class:`Milli`, :class:`Micro`, :class:`Nano` (:issue:`52536`)
+- Deprecated the :class:`.BaseGrouper` attributes ``group_keys_seq`` and ``reconstructed_codes``; these will be removed in a future version of pandas (:issue:`56148`)
+- Deprecated the :class:`.Grouping` attributes ``group_index``, ``result_index``, and ``group_arraylike``; these will be removed in a future version of pandas (:issue:`56148`)
- Deprecated the ``errors="ignore"`` option in :func:`to_datetime`, :func:`to_timedelta`, and :func:`to_numeric`; explicitly catch exceptions instead (:issue:`54467`)
- Deprecated the ``fastpath`` keyword in the :class:`Series` constructor (:issue:`20110`)
- Deprecated the ``ordinal`` keyword in :class:`PeriodIndex`, use :meth:`PeriodIndex.from_ordinals` instead (:issue:`55960`)
+- Deprecated the behavior of :meth:`Series.value_counts` and :meth:`Index.value_counts` with object dtype; in a future version these will not perform dtype inference on the resulting :class:`Index`, do ``result.index = result.index.infer_objects()`` to retain the old behavior (:issue:`56161`)
- Deprecated the extension test classes ``BaseNoReduceTests``, ``BaseBooleanReduceTests``, and ``BaseNumericReduceTests``, use ``BaseReduceTests`` instead (:issue:`54663`)
- Deprecated the option ``mode.data_manager`` and the ``ArrayManager``; only the ``BlockManager`` will be available in future versions (:issue:`55043`)
- Deprecated the previous implementation of :class:`DataFrame.stack`; specify ``future_stack=True`` to adopt the future version (:issue:`53515`)
@@ -452,6 +455,7 @@ Datetimelike
- Bug in :meth:`DatetimeIndex.union` returning object dtype for tz-aware indexes with the same timezone but different units (:issue:`55238`)
- Bug in :meth:`Index.is_monotonic_increasing` and :meth:`Index.is_monotonic_decreasing` always caching :meth:`Index.is_unique` as ``True`` when first value in index is ``NaT`` (:issue:`55755`)
- Bug in :meth:`Index.view` to a datetime64 dtype with non-supported resolution incorrectly raising (:issue:`55710`)
+- Bug in :meth:`Series.dt.round` with non-nanosecond resolution and ``NaT`` entries incorrectly raising ``OverflowError`` (:issue:`56158`)
- Bug in :meth:`Tick.delta` with very large ticks raising ``OverflowError`` instead of ``OutOfBoundsTimedelta`` (:issue:`55503`)
- Bug in ``.astype`` converting from a higher-resolution ``datetime64`` dtype to a lower-resolution ``datetime64`` dtype (e.g. ``datetime64[us]->datetime64[ms]``) silently overflowing with values near the lower implementation bound (:issue:`55979`)
- Bug in adding or subtracting a :class:`Week` offset to a ``datetime64`` :class:`Series`, :class:`Index`, or :class:`DataFrame` column with non-nanosecond resolution returning incorrect results (:issue:`55583`)
17 changes: 3 additions & 14 deletions pandas/_libs/src/vendored/ujson/python/objToJSON.c
@@ -1280,13 +1280,7 @@ char **NpyArr_encodeLabels(PyArrayObject *labels, PyObjectEncoder *enc,
NPY_DATETIMEUNIT dateUnit = NPY_FR_ns;
if (PyTypeNum_ISDATETIME(type_num)) {
is_datetimelike = 1;
-PyArray_VectorUnaryFunc *castfunc =
-PyArray_GetCastFunc(PyArray_DescrFromType(type_num), NPY_INT64);
-if (!castfunc) {
-PyErr_Format(PyExc_ValueError, "Cannot cast numpy dtype %d to long",
-enc->npyType);
-}
-castfunc(dataptr, &i8date, 1, NULL, NULL);
+i8date = *(npy_int64 *)dataptr;
dateUnit = get_datetime_metadata_from_dtype(dtype).base;
} else if (PyDate_Check(item) || PyDelta_Check(item)) {
is_datetimelike = 1;
@@ -1425,13 +1419,8 @@ void Object_beginTypeContext(JSOBJ _obj, JSONTypeContext *tc) {

if (PyTypeNum_ISDATETIME(enc->npyType)) {
int64_t longVal;
-PyArray_VectorUnaryFunc *castfunc =
-PyArray_GetCastFunc(PyArray_DescrFromType(enc->npyType), NPY_INT64);
-if (!castfunc) {
-PyErr_Format(PyExc_ValueError, "Cannot cast numpy dtype %d to long",
-enc->npyType);
-}
-castfunc(enc->npyValue, &longVal, 1, NULL, NULL);
+
+longVal = *(npy_int64 *)enc->npyValue;
if (longVal == get_nat()) {
tc->type = JT_NULL;
} else {
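The two hunks above replace a NumPy cast-function lookup with a direct read of the 8 bytes behind the pointer. A minimal Python sketch of that idea follows, assuming (as the `npy_int64` reinterpretation implies) that each datetime value is stored as a signed 64-bit integer and that the `NaT` sentinel is the smallest such integer. The names `read_i8` and `NAT` are hypothetical and exist only for this illustration; they are not part of the pandas or NumPy API.

```python
import struct

# Assumed sentinel: the minimum signed 64-bit integer stands for NaT.
NAT = -(2**63)


def read_i8(buffer, index=0):
    """Reinterpret 8 raw bytes as a native int64 (hypothetical helper).

    This mirrors the spirit of ``*(npy_int64 *)dataptr``: no conversion
    is performed, the stored bits are simply read as an integer.
    """
    return struct.unpack_from("=q", buffer, index * 8)[0]


# Two stored values: an arbitrary tick count and the NaT sentinel.
raw = struct.pack("=2q", 1_700_956_800_000_000_000, NAT)

ticks = read_i8(raw, 0)
maybe_nat = read_i8(raw, 1)
print(ticks, maybe_nat == NAT)
```

Because the read is a pure reinterpretation, there is no failure path to check, which is consistent with the removed `if (!castfunc)` branch having no replacement.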
26 changes: 25 additions & 1 deletion pandas/_libs/tslibs/fields.pyx
@@ -746,7 +746,31 @@ cdef ndarray[int64_t] _ceil_int64(const int64_t[:] values, int64_t unit):


cdef ndarray[int64_t] _rounddown_int64(values, int64_t unit):
-    return _ceil_int64(values - unit // 2, unit)
+    cdef:
+        Py_ssize_t i, n = len(values)
+        ndarray[int64_t] result = np.empty(n, dtype="i8")
+        int64_t res, value, remainder, half
+
+    half = unit // 2
+
+    with cython.overflowcheck(True):
+        for i in range(n):
+            value = values[i]
+
+            if value == NPY_NAT:
+                res = NPY_NAT
+            else:
+                # This adjustment is the only difference between rounddown_int64
+                # and _ceil_int64
+                value = value - half
+                remainder = value % unit
+                if remainder == 0:
+                    res = value
+                else:
+                    res = value + (unit - remainder)
+
+            result[i] = res
+    return result


cdef ndarray[int64_t] _roundup_int64(values, int64_t unit):
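The rewritten `_rounddown_int64` above checks for `NaT` before subtracting half a unit, which appears to be what avoids the overflow described in the `Series.dt.round` whatsnew entry. The same logic can be sketched in plain Python. This is an illustration only, not the Cython implementation: it uses Python lists and unbounded integers, so the `cython.overflowcheck` behavior is not reproduced, and `NPY_NAT` is assumed here to be the minimum signed 64-bit integer.

```python
# Assumed sentinel value for NaT (minimum signed 64-bit integer).
NPY_NAT = -(2**63)


def rounddown_int64(values, unit):
    """Round each value to a multiple of ``unit``, with ties going down.

    NaT entries are passed through untouched instead of being shifted.
    """
    half = unit // 2
    result = []
    for value in values:
        if value == NPY_NAT:
            # Skip the arithmetic entirely for the sentinel.
            result.append(NPY_NAT)
            continue
        # Shift down by half a unit, then take the ceiling multiple.
        value = value - half
        remainder = value % unit
        if remainder == 0:
            result.append(value)
        else:
            result.append(value + (unit - remainder))
    return result


print(rounddown_int64([NPY_NAT, 15, 14, 16, 10], 10))
```

With `unit=10`, the value 15 sits exactly halfway and goes down to 10, while 16 goes up to 20, and the sentinel comes back unchanged.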
3 changes: 2 additions & 1 deletion pandas/_libs/tslibs/offsets.pyi
@@ -33,6 +33,7 @@ class ApplyTypeError(TypeError): ...

class BaseOffset:
n: int
+normalize: bool
def __init__(self, n: int = ..., normalize: bool = ...) -> None: ...
def __eq__(self, other) -> bool: ...
def __ne__(self, other) -> bool: ...
@@ -85,7 +86,7 @@ class BaseOffset:
@property
def freqstr(self) -> str: ...
def _apply(self, other): ...
-def _apply_array(self, dtarr) -> None: ...
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray: ...
def rollback(self, dt: datetime) -> datetime: ...
def rollforward(self, dt: datetime) -> datetime: ...
def is_on_offset(self, dt: datetime) -> bool: ...
53 changes: 10 additions & 43 deletions pandas/_libs/tslibs/offsets.pyx
@@ -110,33 +110,6 @@ cdef bint _is_normalized(datetime dt):
return True


-def apply_wrapper_core(func, self, other) -> ndarray:
-    result = func(self, other)
-    result = np.asarray(result)
-
-    if self.normalize:
-        # TODO: Avoid circular/runtime import
-        from .vectorized import normalize_i8_timestamps
-        reso = get_unit_from_dtype(other.dtype)
-        result = normalize_i8_timestamps(result.view("i8"), None, reso=reso)
-
-    return result
-
-
-def apply_array_wraps(func):
-    # Note: normally we would use `@functools.wraps(func)`, but this does
-    # not play nicely with cython class methods
-    def wrapper(self, other) -> np.ndarray:
-        # other is a DatetimeArray
-        result = apply_wrapper_core(func, self, other)
-        return result
-
-    # do @functools.wraps(func) manually since it doesn't work on cdef funcs
-    wrapper.__name__ = func.__name__
-    wrapper.__doc__ = func.__doc__
-    return wrapper
-
-
def apply_wraps(func):
# Note: normally we would use `@functools.wraps(func)`, but this does
# not play nicely with cython class methods
@@ -644,8 +617,9 @@ cdef class BaseOffset:
def _apply(self, other):
raise NotImplementedError("implemented by subclasses")

-@apply_array_wraps
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
+# NB: _apply_array does not handle respecting `self.normalize`, the
+# caller (DatetimeArray) handles that in post-processing.
raise NotImplementedError(
f"DateOffset subclass {type(self).__name__} "
"does not have a vectorized implementation"
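The comment added in this hunk moves responsibility for `normalize` out of `_apply_array` and into the caller. A minimal sketch of that division of labor follows, using plain `datetime` objects rather than pandas arrays. The class `DayShift` and the function `add_offset` are hypothetical stand-ins invented for this illustration; they are not pandas code and do not show how `DatetimeArray` actually performs the step.

```python
from datetime import datetime, timedelta


class DayShift:
    """Hypothetical stand-in for a DateOffset subclass (not pandas code)."""

    def __init__(self, n=1, normalize=False):
        self.n = n
        self.normalize = normalize

    def _apply_array(self, values):
        # Only shifts the values; deliberately ignores self.normalize.
        return [value + timedelta(days=self.n) for value in values]


def add_offset(values, offset):
    """Hypothetical caller, playing the role the comment gives DatetimeArray."""
    result = offset._apply_array(values)
    if offset.normalize:
        # Post-processing step: truncate each timestamp to midnight.
        result = [
            value.replace(hour=0, minute=0, second=0, microsecond=0)
            for value in result
        ]
    return result


stamps = [datetime(2023, 11, 26, 13, 30)]
print(add_offset(stamps, DayShift(1, normalize=True)))
print(add_offset(stamps, DayShift(1, normalize=False)))
```

Keeping normalization in one caller means each vectorized `_apply_array` only has to implement the shift, which matches the removal of the `apply_array_wraps` decorator from every subclass in the hunks that follow.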
@@ -1399,8 +1373,7 @@ cdef class RelativeDeltaOffset(BaseOffset):
"applied vectorized"
)

-@apply_array_wraps
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
reso = get_unit_from_dtype(dtarr.dtype)
dt64other = np.asarray(dtarr)

@@ -1814,8 +1787,7 @@ cdef class BusinessDay(BusinessMixin):
days = n + 2
return days

-@apply_array_wraps
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
i8other = dtarr.view("i8")
reso = get_unit_from_dtype(dtarr.dtype)
res = self._shift_bdays(i8other, reso=reso)
@@ -2361,8 +2333,7 @@ cdef class YearOffset(SingleConstructorOffset):
months = years * 12 + (self.month - other.month)
return shift_month(other, months, self._day_opt)

-@apply_array_wraps
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
reso = get_unit_from_dtype(dtarr.dtype)
shifted = shift_quarters(
dtarr.view("i8"), self.n, self.month, self._day_opt, modby=12, reso=reso
@@ -2613,8 +2584,7 @@ cdef class QuarterOffset(SingleConstructorOffset):
months = qtrs * 3 - months_since
return shift_month(other, months, self._day_opt)

-@apply_array_wraps
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
reso = get_unit_from_dtype(dtarr.dtype)
shifted = shift_quarters(
dtarr.view("i8"),
@@ -2798,8 +2768,7 @@ cdef class MonthOffset(SingleConstructorOffset):
n = roll_convention(other.day, self.n, compare_day)
return shift_month(other, n, self._day_opt)

-@apply_array_wraps
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
reso = get_unit_from_dtype(dtarr.dtype)
shifted = shift_months(dtarr.view("i8"), self.n, self._day_opt, reso=reso)
return shifted
@@ -3029,10 +2998,9 @@ cdef class SemiMonthOffset(SingleConstructorOffset):

return shift_month(other, months, to_day)

-@apply_array_wraps
@cython.wraparound(False)
@cython.boundscheck(False)
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
cdef:
ndarray i8other = dtarr.view("i8")
Py_ssize_t i, count = dtarr.size
@@ -3254,8 +3222,7 @@ cdef class Week(SingleConstructorOffset):

return other + timedelta(weeks=k)

-@apply_array_wraps
-def _apply_array(self, dtarr):
+def _apply_array(self, dtarr: np.ndarray) -> np.ndarray:
if self.weekday is None:
td = timedelta(days=7 * self.n)
unit = np.datetime_data(dtarr.dtype)[0]