BUG: AssertionError: Did not expect new dtype float64 to equal self.dtype float64. Please report a bug #56329

adrienpacifico · 2023-12-04T22:27:32Z

Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

data = pd.DataFrame({
    'categ': ['B', 'T', 'A'],
    'Int': [np.nan,1,2,]
    }
).astype({"categ":"category"})

data['Int'].fillna(data['categ'].apply(lambda x: 
                                        {'B': 42, 'T': 51, 'A': 666}.get(x))
                   )

---------------------------------------------------------------------------
LossySetitemError                         Traceback (most recent call last)
File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/internals/blocks.py:1302, in Block.where(self, other, cond, _downcast, using_cow)
   1298 try:
   1299     # try/except here is equivalent to a self._can_hold_element check,
   1300     #  but this gets us back 'casted' which we will re-use below;
   1301     #  without using 'casted', expressions.where may do unwanted upcasts.
-> 1302     casted = np_can_hold_element(values.dtype, other)
   1303 except (ValueError, TypeError, LossySetitemError):
   1304     # we cannot coerce, return a compat dtype

File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/dtypes/cast.py:1816, in np_can_hold_element(dtype, element)
   1814 if tipo.kind not in "iuf":
   1815     # Anything other than float/integer we cannot hold
-> 1816     raise LossySetitemError
   1817 if not isinstance(tipo, np.dtype):
   1818     # i.e. nullable IntegerDtype or FloatingDtype;
   1819     #  we can put this into an ndarray losslessly iff it has no NAs

LossySetitemError: 

During handling of the above exception, another exception occurred:

AssertionError                            Traceback (most recent call last)
Cell In[9], line 10
      3 num_rows = 5
      4 data = pd.DataFrame({
      5     'categ': ['B', 'T', 'A'],
      6     'Int': [np.nan,1,2,]
      7     }
      8 ).astype({"categ":"category"})
---> 10 data['Int'].fillna(data['categ'].apply(lambda x: 
     11                                         {'B': 42, 'T': 51, 'A': 666}.get(x))
     12                    )

File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/generic.py:7212, in NDFrame.fillna(self, value, method, axis, inplace, limit, downcast)
   7205     else:
   7206         raise TypeError(
   7207             '"value" parameter must be a scalar, dict '
   7208             "or Series, but you passed a "
   7209             f'"{type(value).__name__}"'
   7210         )
-> 7212     new_data = self._mgr.fillna(
   7213         value=value, limit=limit, inplace=inplace, downcast=downcast
   7214     )
   7216 elif isinstance(value, (dict, ABCSeries)):
   7217     if axis == 1:

File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/internals/base.py:173, in DataManager.fillna(self, value, limit, inplace, downcast)
    169 if limit is not None:
    170     # Do this validation even if we go through one of the no-op paths
    171     limit = libalgos.validate_limit(None, limit=limit)
--> 173 return self.apply_with_block(
    174     "fillna",
    175     value=value,
    176     limit=limit,
    177     inplace=inplace,
    178     downcast=downcast,
    179     using_cow=using_copy_on_write(),
    180 )

File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/internals/managers.py:354, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    352         applied = b.apply(f, **kwargs)
    353     else:
--> 354         applied = getattr(b, f)(**kwargs)
    355     result_blocks = extend_blocks(applied, result_blocks)
    357 out = type(self).from_blocks(result_blocks, self.axes)

File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/internals/blocks.py:1419, in Block.fillna(self, value, limit, inplace, downcast, using_cow)
   1415     nbs = self.putmask(mask.T, value, using_cow=using_cow)
   1416 else:
   1417     # without _downcast, we would break
   1418     #  test_fillna_dtype_conversion_equiv_replace
-> 1419     nbs = self.where(value, ~mask.T, _downcast=False)
   1421 # Note: blk._maybe_downcast vs self._maybe_downcast(nbs)
   1422 #  makes a difference bc blk may have object dtype, which has
   1423 #  different behavior in _maybe_downcast.
   1424 return extend_blocks(
   1425     [
   1426         blk._maybe_downcast([blk], downcast=downcast, using_cow=using_cow)
   1427         for blk in nbs
   1428     ]
   1429 )

File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/internals/blocks.py:1309, in Block.where(self, other, cond, _downcast, using_cow)
   1303 except (ValueError, TypeError, LossySetitemError):
   1304     # we cannot coerce, return a compat dtype
   1306     if self.ndim == 1 or self.shape[0] == 1:
   1307         # no need to split columns
-> 1309         block = self.coerce_to_target_dtype(other)
   1310         blocks = block.where(orig_other, cond, using_cow=using_cow)
   1311         return self._maybe_downcast(
   1312             blocks, downcast=_downcast, using_cow=using_cow
   1313         )

File ~/opt/anaconda3/envs/python_3_12/lib/python3.12/site-packages/pandas/core/internals/blocks.py:490, in Block.coerce_to_target_dtype(self, other, warn_on_upcast)
    481     warnings.warn(
    482         f"Setting an item of incompatible dtype is deprecated "
    483         "and will raise in a future error of pandas. "
   (...)
    487         stacklevel=find_stack_level(),
    488     )
    489 if self.values.dtype == new_dtype:
--> 490     raise AssertionError(
    491         f"Did not expect new dtype {new_dtype} to equal self.dtype "
    492         f"{self.values.dtype}. Please report a bug at "
    493         "https://github.com/pandas-dev/pandas/issues."
    494     )
    495 return self.astype(new_dtype, copy=False)

AssertionError: Did not expect new dtype float64 to equal self.dtype float64. Please report a bug at https://github.com/pandas-dev/pandas/issues.

Issue Description

When the categ column has categorical dtype, the fillna call fails with the AssertionError shown above.

Expected Behavior

I would expect the code to behave the same whether the categ column has object or categorical dtype.
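
A possible workaround in the meantime, as a minimal sketch that assumes the fill values can be represented as floats: cast the mapped values to float64 before passing them to fillna, so the fill value is no longer categorical. The names mapping, fill_values and result are illustrative only and are not part of the original report.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {"categ": ["B", "T", "A"], "Int": [np.nan, 1, 2]}
).astype({"categ": "category"})

# Hypothetical name for the lookup table used in the reproducer.
mapping = {"B": 42, "T": 51, "A": 666}

# apply() on a categorical column may return categorical fill values;
# casting to float64 hands fillna a plain numeric Series instead.
fill_values = data["categ"].apply(lambda x: mapping.get(x)).astype("float64")

# fillna aligns on the index and only fills the missing first row.
result = data["Int"].fillna(fill_values)
print(result.tolist())
```

This only sidesteps the error for this reproducer; it does not address the underlying dtype handling.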

Installed Versions

Tested with multiple versions of Python (3.8, 3.10, 3.12). The bug did not exist in older versions of pandas (1.2.1 is OK).

INSTALLED VERSIONS

commit : 2a953cf
python : 3.12.0.final.0
python-bits : 64
OS : Darwin
OS-release : 21.1.0
Version : Darwin Kernel Version 21.1.0: Sat Sep 11 12:27:45 PDT 2021; root:xnu-8019.40.67.171.4~1/RELEASE_ARM64_T8101
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : fr_FR.UTF-8
LOCALE : fr_FR.UTF-8

pandas : 2.1.3
numpy : 1.26.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.0.0
pip : 23.2.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.18.0
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.8.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.11.4
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None


lithomas1 · 2023-12-05T15:00:09Z

cc @MarcoGorelli.

MarcoGorelli · 2023-12-05T15:01:10Z

thanks for the report! will take a look

MarcoGorelli · 2023-12-06T10:08:56Z

this came from https://github.com/pandas-dev/pandas/pull/55201/files

so, not sure it's pdep6-related - going to ping @jbrockmendel on this one

lithomas1 · 2023-12-06T13:49:20Z

Huh, the bisecting doesn't look right.

That PR is merged for 2.2, but this bug is already there in 2.1.

Can you try bisecting further back?

adrienpacifico · 2023-12-06T14:58:43Z

I had the bug in pandas 1.5.x, and I did not have it in 1.2.1.
The issue was likely introduced between these two versions.

jbrockmendel · 2023-12-06T18:49:56Z

Looks like in np_can_hold_element we should be casting the Categorical to float64
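
To illustrate the diagnosis above, here is a small sketch that only inspects the dtypes involved in the reproducer and does not call fillna. The variable name fill is illustrative. On the versions discussed in this thread, the result of apply on a categorical column is itself categorical, which is presumably the value np_can_hold_element rejects even though its categories can be cast to float64 without loss.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    {"categ": ["B", "T", "A"], "Int": [np.nan, 1, 2]}
).astype({"categ": "category"})

# Same fill value as in the reproducer, kept in a variable for inspection.
fill = data["categ"].apply(lambda x: {"B": 42, "T": 51, "A": 666}.get(x))

print(data["Int"].dtype)           # dtype of the column being filled
print(fill.dtype)                  # dtype of the fill value
print(fill.cat.categories.dtype)   # dtype of the underlying categories
print(fill.astype("float64").tolist())  # the categories cast cleanly to float
```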

adrienpacifico added the Bug and Needs Triage labels on Dec 4, 2023

rhshadrach added the Missing-data and Categorical labels on Dec 5, 2023

lithomas1 added the Dtype Conversions and PDEP6-related labels, then removed the Dtype Conversions label, on Dec 5, 2023

lithomas1 removed the Needs Triage label on Dec 18, 2023

lopof mentioned this issue on Feb 26, 2024 in "second approach to fix setitem parametrizations" #57629 (closed)
