BUG: is_string_dtype(pd.Index([], dtype='O')) returns False #54997

Merged
merged 11 commits into from
Oct 4, 2023
1 change: 1 addition & 0 deletions doc/source/whatsnew/v2.1.1.rst
@@ -34,6 +34,7 @@ Fixed regressions
 Bug fixes
 ~~~~~~~~~
 - Fixed bug for :class:`ArrowDtype` raising ``NotImplementedError`` for fixed-size list (:issue:`55000`)
+- Fixed bug in :func:`is_string_dtype` while checking object array with no elements is of the string dtype (:issue:`54661`)
Member

I think you'll need this to be

:func:`pandas.api.types.is_string_dtype`

and to move this to 2.2

Contributor Author

Thanks, I corrected the whatsnew note and moved it to 2.2.0.

 - Fixed bug in :meth:`DataFrame.stack` with ``future_stack=True`` and columns a non-:class:`MultiIndex` consisting of tuples (:issue:`54948`)
 - Fixed bug in :meth:`Series.pct_change` and :meth:`DataFrame.pct_change` showing unnecessary ``FutureWarning`` (:issue:`54981`)

9 changes: 6 additions & 3 deletions pandas/core/dtypes/common.py
@@ -1664,9 +1664,12 @@ def is_all_strings(value: ArrayLike) -> bool:
     dtype = value.dtype
 
     if isinstance(dtype, np.dtype):
-        return dtype == np.dtype("object") and lib.is_string_array(
-            np.asarray(value), skipna=False
-        )
+        if len(value) == 0:
+            return dtype == np.dtype("object")
+        else:
+            return dtype == np.dtype("object") and lib.is_string_array(
+                np.asarray(value), skipna=False
+            )
     elif isinstance(dtype, CategoricalDtype):
         return dtype.categories.inferred_type == "string"
     return dtype == "string"
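
For readers following the change to `is_all_strings`, here is a minimal standalone sketch of the empty-array branch. It is not the pandas implementation: `all_elements_are_str` and `is_all_strings_sketch` are hypothetical names, and the stand-in helper is assumed to report `False` for an empty array, which is what the pre-fix result in the issue title implies for the element-wise check.

```python
import numpy as np


def all_elements_are_str(values: np.ndarray) -> bool:
    """Hypothetical stand-in for the element-wise string check.

    Assumption: it reports False for an empty array, mirroring the
    behaviour that made the pre-fix code return False.
    """
    if len(values) == 0:
        return False
    return all(isinstance(v, str) for v in values)


def is_all_strings_sketch(values: np.ndarray) -> bool:
    """Sketch of the NumPy-dtype branch after the fix."""
    if values.dtype != np.dtype("object"):
        return False
    if len(values) == 0:
        # Nothing to inspect: the object dtype alone decides the answer.
        return True
    return all_elements_are_str(values)


empty = np.array([], dtype=object)
strings = np.array(["a", "b"], dtype=object)
mixed = np.array(["a", 1], dtype=object)

print(is_all_strings_sketch(empty))    # object dtype, no elements
print(is_all_strings_sketch(strings))  # object dtype, all str
print(is_all_strings_sketch(mixed))    # object dtype, one non-str
```

Without the `len(values) == 0` branch, the empty object array would fall through to the element-wise check and come back `False`, which is the behaviour reported in GH#54661.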
25 changes: 17 additions & 8 deletions pandas/tests/dtypes/test_common.py
@@ -301,14 +301,23 @@ def test_is_categorical_dtype():
     assert com.is_categorical_dtype(pd.CategoricalIndex([1, 2, 3]))
 
 
-def test_is_string_dtype():
-    assert not com.is_string_dtype(int)
-    assert not com.is_string_dtype(pd.Series([1, 2]))
-
-    assert com.is_string_dtype(str)
-    assert com.is_string_dtype(object)
-    assert com.is_string_dtype(np.array(["a", "b"]))
-    assert com.is_string_dtype(pd.StringDtype())
+@pytest.mark.parametrize(
+    "dtype, expected",
+    [
+        (int, False),
+        (pd.Series([1, 2]), False),
+        (str, True),
+        (object, True),
+        (np.array(["a", "b"]), True),
+        (pd.StringDtype(), True),
+        (pd.Index([], dtype="O"), True),
+    ],
+)
+def test_is_string_dtype(dtype, expected):
+    # GH#54661
+
+    result = com.is_string_dtype(dtype)
+    assert result is expected
 
 
 @pytest.mark.parametrize(