BUG: is_string_dtype(pd.Index([], dtype='O')) returns False #54997

Merged (11 commits) on Oct 4, 2023
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.2.0.rst
@@ -253,6 +253,7 @@ Bug fixes
 ~~~~~~~~~
 - Bug in :class:`AbstractHolidayCalendar` where timezone data was not propagated when computing holiday observances (:issue:`54580`)
 - Bug in :class:`pandas.core.window.Rolling` where duplicate datetimelike indexes are treated as consecutive rather than equal with ``closed='left'`` and ``closed='neither'`` (:issue:`20712`)
+- Bug in :func:`pandas.api.types.is_string_dtype` while checking object array with no elements is of the string dtype (:issue:`54661`)
 - Bug in :meth:`DataFrame.apply` where passing ``raw=True`` ignored ``args`` passed to the applied function (:issue:`55009`)
 - Bug in :meth:`pandas.read_excel` with a ODS file without cached formatted cell for float values (:issue:`55219`)

9 changes: 6 additions & 3 deletions pandas/core/dtypes/common.py
@@ -1664,9 +1664,12 @@ def is_all_strings(value: ArrayLike) -> bool:
     dtype = value.dtype

     if isinstance(dtype, np.dtype):
-        return dtype == np.dtype("object") and lib.is_string_array(
-            np.asarray(value), skipna=False
-        )
+        if len(value) == 0:
+            return dtype == np.dtype("object")
+        else:
+            return dtype == np.dtype("object") and lib.is_string_array(
+                np.asarray(value), skipna=False
+            )
     elif isinstance(dtype, CategoricalDtype):
         return dtype.categories.inferred_type == "string"
     return dtype == "string"
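
The effect of the `is_all_strings` change can be checked from the public API. The snippet below is a minimal sketch, not part of the PR: the variable names (`empty_obj`, `strings_obj`, `results`) are illustrative assumptions, and only `pandas.api.types.is_string_dtype` and `pd.Index` are real pandas calls. On a build that includes this change, the empty object-dtype index is expected to report `True`, matching the new test case `(pd.Index([], dtype="O"), True)`; older builds may report `False`, as the PR title describes.

```python
import pandas as pd
from pandas.api.types import is_string_dtype

# Empty object-dtype index: the case named in the PR title.
empty_obj = pd.Index([], dtype=object)
# Non-empty object-dtype index that holds only strings.
strings_obj = pd.Index(["a", "b"], dtype=object)

# Collect results so they can be inspected or asserted on.
results = {
    "empty object index": is_string_dtype(empty_obj),
    "string object index": is_string_dtype(strings_obj),
    "int dtype": is_string_dtype(int),
    "str dtype": is_string_dtype(str),
}

for label, value in results.items():
    print(f"{label}: {value}")
```

The result for the empty index depends on the installed pandas version, so the sketch prints it rather than asserting a fixed value.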
25 changes: 17 additions & 8 deletions pandas/tests/dtypes/test_common.py
@@ -301,14 +301,23 @@ def test_is_categorical_dtype():
     assert com.is_categorical_dtype(pd.CategoricalIndex([1, 2, 3]))


-def test_is_string_dtype():
-    assert not com.is_string_dtype(int)
-    assert not com.is_string_dtype(pd.Series([1, 2]))
-
-    assert com.is_string_dtype(str)
-    assert com.is_string_dtype(object)
-    assert com.is_string_dtype(np.array(["a", "b"]))
-    assert com.is_string_dtype(pd.StringDtype())
+@pytest.mark.parametrize(
+    "dtype, expected",
+    [
+        (int, False),
+        (pd.Series([1, 2]), False),
+        (str, True),
+        (object, True),
+        (np.array(["a", "b"]), True),
+        (pd.StringDtype(), True),
+        (pd.Index([], dtype="O"), True),
+    ],
+)
+def test_is_string_dtype(dtype, expected):
+    # GH#54661
+
+    result = com.is_string_dtype(dtype)
+    assert result is expected


 @pytest.mark.parametrize(