BUG: manipulating or adding columns under a MultiIndex header yields no changes in the DataFrame whatsoever. #55769

Closed
2 of 3 tasks
davidedigrande opened this issue Oct 30, 2023 · 3 comments
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

Comments

davidedigrande commented Oct 30, 2023

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

arrays = [
    ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    ['C', 'C', 'D', 'D', 'E', 'E', 'F', 'F'],
    ['G', 'H', 'E', 'L', 'M', 'N', 'O', 'P'],
]

columns = pd.MultiIndex.from_tuples(list(zip(*arrays)))

data = np.random.randn(3, 8)

df = pd.DataFrame(data, columns=columns)

df['A']['C']['G'] = 1

df['A']['C']['New Column'] = df['A']['C']['G']

Issue Description

Hi,

I encountered an issue while trying to assign values and add a new column to a multi-indexed DataFrame.
The expected changes are not being reflected in the DataFrame.

Steps to Reproduce:

  • Create a multi-index using pd.MultiIndex.from_tuples.
  • Create a DataFrame df with random data and the created multi-index:
          A                                       B                              
          C                   D                   E                   F          
          G         H         E         L         M         N         O         P
0 -0.859066 -2.100128  0.357982  0.415481  0.787294  0.279474 -0.465880  0.217855
1  0.057084 -1.639392  1.308054 -0.095697 -0.609562  0.028083  0.052342  1.030347
2  0.421698  0.303234 -0.107760 -0.965608 -0.105634 -0.273073  1.210474 -1.197179
  • Attempt to set a value in the DataFrame using multi-indexing:
df['A']['C']['G'] = 1

Printing out the column shows no changes:

df['A']['C']['G'] 

0   -0.859066
1    0.057084
2    0.421698
Name: G, dtype: float64

Here I'd expect to see either the column with all values set to 1, or an exception to be raised if it is not doable for any reason.


  • Attempt to add a new column under a nested column:
df['A']['C']['New Column'] = df['A']['C']['G']

df['A']['C']['New Column']

Outputs:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/davide/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 3761, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/davide/.local/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3655, in get_loc
    raise KeyError(key) from err
KeyError: 'New Column'

Expected Behavior

I'd expect to be able to manipulate columns nested under different levels of a MultiIndex on axis 1, but pandas silently ignores the changes.

Failing that, it should at least raise an exception if this is not possible for some reason.

Installed Versions

INSTALLED VERSIONS

commit : a60ad39
python : 3.10.8.final.0
python-bits : 64
OS : Linux
OS-release : 6.2.0-1015-azure
Version : #15~22.04.1-Ubuntu SMP Fri Oct 6 13:20:44 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.1.2
numpy : 1.26.1
pytz : 2023.3.post1

@davidedigrande davidedigrande added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 30, 2023
@davidedigrande davidedigrande changed the title BUG: manipulating or adding columns under a MultiIndex on axis 1 yields no changes in the DataFrame whatsover. BUG: manipulating or adding columns under a MultiIndex header yields no changes in the DataFrame whatsover. Oct 30, 2023
@mroeschke
Member

Thanks for the report, but for now this is the intended behavior since you're assigning via chained assignment. Please refer to https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-view-versus-copy for how to correctly assign a new column in this situation.
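
For reference, here is a minimal sketch of the single-call form that the linked guide recommends, reusing the frame from the report. Each assignment passes the full column tuple to one indexing call, so pandas operates on `df` itself rather than on the temporary objects returned by `df['A']` and `df['A']['C']`. The final `sort_index` step is an optional addition that is not part of the original report; it only groups the new column with its siblings for display.

```python
import numpy as np
import pandas as pd

arrays = [
    ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    ['C', 'C', 'D', 'D', 'E', 'E', 'F', 'F'],
    ['G', 'H', 'E', 'L', 'M', 'N', 'O', 'P'],
]
columns = pd.MultiIndex.from_tuples(list(zip(*arrays)))
df = pd.DataFrame(np.random.randn(3, 8), columns=columns)

# Overwrite an existing column: one .loc call with the full tuple key.
df.loc[:, ('A', 'C', 'G')] = 1.0

# Add a new column under A -> C: one __setitem__ call on df itself.
df[('A', 'C', 'New Column')] = df[('A', 'C', 'G')]

# Optional (an assumption about the desired layout): new columns are
# appended at the end, so sort the column index to group them under A -> C.
df = df.sort_index(axis=1)

print(df['A']['C'])  # chained *reads* are fine; only chained writes are lost
```

Chained indexing remains fine for reading values; it is only assignment through a chain that ends up on a temporary object.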

@davidedigrande
Author

davidedigrande commented Oct 31, 2023

> Thanks for the report, but for now this is the intended behavior since you're assigning via chained assignment. Please refer to https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-view-versus-copy for how to correctly assign a new column in this situation.

Thank you for your answer.
I understand that the rule applies to both single-level and multi-level indexes. I had been searching for this problem specifically in the multi-level case, but this piece of documentation never came up.
Furthermore, GPT insisted I was using the right approach, so I was convinced something was off with pandas.

Still, shouldn't this raise an exception or at least show a warning message?

@mroeschke
Member

Ideally this should raise a SettingWithCopyWarning, but chained assignment is hard to detect in all cases.
