BUG: wrong future Warning on string assignment in certain condition #56503

xmatthias · 2023-12-14T16:29:43Z

Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0, 100, size=(5, 2)), columns=list('AB'))

# Assigning string directly works
df.loc[df.index < 2, 'E'] = 'abc'

# Assigning a string and a float in the same step raises a Future warning
df.loc[df.index <= 2, ['F', 'G']] = (1, 'abc', )

# DOing the same assignment unconditionally does NOT raise the future warning
df.loc[:, ['I', 'J']] = (1, 'abc', )
df.head()

Issue Description

String assignment future warnings are quite flaky and seem to change and break in different ways in every recent version.

While most issues i've seen have been fixed - i just encountered a new issue, where behavior is completely non-uniform and surprising

I can reproduce this in anything after 2.1.x - including 2.1.4 - which is the latest current release
2.0.3 doesn't issue this - but i think these flaky FutureWarnings were only introduced in 2.0.3.

Expected Behavior

No future warning when creating a new string column - independant of the precence of another column in the assignment (and independent of a filter).

Installed Versions

INSTALLED VERSIONS

commit : a671b5a
python : 3.11.4.final.0
python-bits : 64
OS : Linux
OS-release : 6.6.3-arch1-1
Version : #1 SMP PREEMPT_DYNAMIC Wed, 29 Nov 2023 00:37:40 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.1.4
numpy : 1.26.2
pytz : 2023.3
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : 0.29.35
pytest : 7.4.3
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.14.0
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : 2023.12.1
gcsfs : None
matplotlib : 3.7.1
numba : None
numexpr : 2.8.4
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 14.0.1
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.11.4
sqlalchemy : 2.0.23
tables : 3.9.1
tabulate : 0.9.0
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

The text was updated successfully, but these errors were encountered:

phofl · 2023-12-14T23:02:23Z

cc @MarcoGorelli

I think the middle case is raising erroneously?

Otherwise, this should raise as well:

df.loc[df.index < 2, 'E'] = 1

which isn't showing a warning at the moment

xmatthias · 2023-12-15T05:28:37Z

That's correct - the warning is only showing on the middle case, so it requires

multi column assignment (it also happens if both columns are a string)
with a filter

to actually trigger

david-bx-zhang · 2023-12-20T21:53:44Z

Hi, I'm a new contributor, Is it okay If I take this to look at as a first issue?

david-bx-zhang · 2023-12-20T21:57:07Z

take

xmatthias · 2024-01-14T06:52:28Z

@david-bx-zhang any progress / update on this?

It's a quite nagging problem (considering it's the 3rd or 4th similar problem in the last few versions) - which is affecting our community quite heavily - so we'd apreciate some timeline on a fix ...

MarcoGorelli · 2024-01-14T08:45:40Z

When the code reaches

pandas/pandas/core/internals/managers.py

Line 1340 in 486b440

new_mgr = col_mgr.setitem((idx,), value)

then in the single-column assignment case we have

(Pdb) p col_mgr, idx, value
(SingleBlockManager
Items: RangeIndex(start=0, stop=5, step=1)
NumpyBlock: 5 dtype: object, array([ True,  True, False, False, False]), 'abc')

whereas as in the multi-column assignment one, when setting the 'abc' value, we have

(Pdb) p col_mgr, idx, value
(SingleBlockManager
Items: RangeIndex(start=0, stop=5, step=1)
NumpyBlock: 5 dtype: float64, array([ True,  True,  True, False, False]), 'abc')

This block probably needs initialising as object too

MarcoGorelli · 2024-01-14T09:29:36Z

A list-like column indexer is all you need to raise the warning, even if there's just a single key, as that'll go down _ensure_listlike_indexer, which initialises the new columns as 'float64'

@xmatthias I'd suggest updating your docs to assign to each column in turn:

def populate_entry_trend(self, dataframe: DataFrame, metadata: dict) -> DataFrame:
    row_indexer = (dataframe['rsi'] < 35) & (dataframe['volume'] > 0)
    dataframe.loc[row_indexer, 'enter_long'] = 1
    dataframe.loc[row_indexer, 'enter_tag'] = 'buy_signal_rsi'

    return dataframe

, the code is very "spaghettified" at this point and it's going to be very tricky to not warn here without causing other behaviour changes

xmatthias · 2024-01-14T09:32:42Z

@MarcoGorelli well that doesn't change the fact that it's a pandas bug though.

I'm aware that there are workarounds, but isn't the point of an issue to point out false behavior?

Respecting the warning text, this will be an error in future versions, making it an even more severe bug, not just a wrong warning.

If the warnings are not reliable, they should be disabled instead of warning falsely.

MarcoGorelli · 2024-01-14T09:36:17Z

I'll see what can be done, but "row-wise column-wise expansion with multiple items" strikes me as uncommon-enough that if it can't be fixed simply, then it'd be better to just disallow it

The workaround on the users' side is very simple here

MarcoGorelli · 2024-01-15T18:29:42Z

Going to try addressing this by 2.2.1, maybe it's still salvageable

david-bx-zhang · 2024-01-16T14:53:16Z

Hi sorry, I haven't too on top of this but my first findings were that the warning seems to be only triggered in this function def coerce_to_target_dtype and it seems it it was only applicable to the conditional multi assignment, for the unconditional assignment it seems to be a different logic where it goes through another path.

Will continue investigating in this and provide a timeline but as a question, should this error also occur in the single column?

phofl · 2024-01-16T16:14:01Z

Investigations are welcome, but this is probably not a good issue for new contributors

xmatthias added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 14, 2023

rhshadrach added Dtype Conversions Unexpected or buggy dtype conversions Warnings Warnings that appear or should be added to pandas and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 15, 2023

MarcoGorelli added the PDEP6-related related to PDEP6 (not upcasting during setitem-like Series operations) label Dec 15, 2023

github-actions bot assigned david-bx-zhang Dec 20, 2023

xmatthias mentioned this issue Jan 14, 2024

Deprecation warning of using enter_tag, will raise in future versions of pandas freqtrade/freqtrade#9679

Closed

MarcoGorelli added this to the 2.2.1 milestone Jan 15, 2024

MarcoGorelli mentioned this issue Jan 19, 2024

[READY] perf improvements for strftime #51298

Open

5 tasks

MarcoGorelli mentioned this issue Feb 13, 2024

BUG: wrong future Warning on string assignment in certain condition #57402

Merged

5 tasks

mroeschke closed this as completed in #57402 Feb 16, 2024

xmatthias mentioned this issue Feb 21, 2024

Setting an item of incompatible dtype is deprecated freqtrade/freqtrade#9852

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: wrong future Warning on string assignment in certain condition #56503

BUG: wrong future Warning on string assignment in certain condition #56503

xmatthias commented Dec 14, 2023 •

edited

Loading

INSTALLED VERSIONS

phofl commented Dec 14, 2023

xmatthias commented Dec 15, 2023

david-bx-zhang commented Dec 20, 2023

david-bx-zhang commented Dec 20, 2023

xmatthias commented Jan 14, 2024

MarcoGorelli commented Jan 14, 2024

MarcoGorelli commented Jan 14, 2024

xmatthias commented Jan 14, 2024 •

edited

Loading

MarcoGorelli commented Jan 14, 2024

MarcoGorelli commented Jan 15, 2024 •

edited

Loading

david-bx-zhang commented Jan 16, 2024

phofl commented Jan 16, 2024

BUG: wrong future Warning on string assignment in certain condition #56503

BUG: wrong future Warning on string assignment in certain condition #56503

Comments

xmatthias commented Dec 14, 2023 • edited Loading

Pandas version checks

Reproducible Example

Issue Description

Expected Behavior

Installed Versions

INSTALLED VERSIONS

phofl commented Dec 14, 2023

xmatthias commented Dec 15, 2023

david-bx-zhang commented Dec 20, 2023

david-bx-zhang commented Dec 20, 2023

xmatthias commented Jan 14, 2024

MarcoGorelli commented Jan 14, 2024

MarcoGorelli commented Jan 14, 2024

xmatthias commented Jan 14, 2024 • edited Loading

MarcoGorelli commented Jan 14, 2024

MarcoGorelli commented Jan 15, 2024 • edited Loading

david-bx-zhang commented Jan 16, 2024

phofl commented Jan 16, 2024

xmatthias commented Dec 14, 2023 •

edited

Loading

xmatthias commented Jan 14, 2024 •

edited

Loading

MarcoGorelli commented Jan 15, 2024 •

edited

Loading