BUG: groupby, as_index=False still returning group variable as index #13217

nickeubank · 2016-05-18T15:25:39Z

Code Sample, a copy-pastable example if possible

import pandas as pd
a = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [1, 1, 2, 2], 'c': [1, 1, 1, 1]})
a.groupby(['a', 'b'], as_index=False).apply(lambda x: 1)

Out[4]: 
a  b
1  1    1
2  2    1
dtype: int64

Expected Output

Out[4]:
0    1
1    1
dtype: int64

This is what you get when grouping by a single column:

a.groupby(['a'], as_index=False).apply(lambda x: 1)
Out[8]: 
0    1
1    1
dtype: int64

output of `pd.show_versions()`

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: None
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.1
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

jreback · 2016-05-18T15:35:30Z

I guess. This is a very odd thing to do.

jreback · 2016-05-18T15:36:02Z

To be honest, we should just remove as_index entirely. It's a simple .reset_index() if someone wants it.
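For a plain aggregation, the equivalence mentioned above can be checked directly. This is a minimal sketch that reuses the frame from the report; the names `via_flag` and `via_reset` are illustrative only, and it covers `sum`, not the `apply` case this issue is about.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [1, 1, 2, 2], "c": [1, 1, 1, 1]})

# Grouping keys kept as columns via the flag.
via_flag = df.groupby(["a", "b"], as_index=False)["c"].sum()

# Grouping keys moved into the index, then moved back out as columns.
via_reset = df.groupby(["a", "b"])["c"].sum().reset_index()

print(via_flag.equals(via_reset))
```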

pfrcks · 2016-05-18T15:42:06Z

@jreback I want to look into this.
Can you clarify what you mean by a simple reset_index()?
Do we call reset_index when someone passes the 'as_index' param?

jreback · 2016-05-18T15:45:49Z

No, that's a different (API) issue.

This can be solved by stepping through the code and seeing where it doesn't properly handle the as_index flag.

nickeubank · 2016-05-18T15:46:53Z

Don't have a strong preference on how it behaves (or if we keep as_index), just want to get rid of inconsistent behavior.

pfrcks · 2016-05-18T16:36:17Z

@jreback After looking through the code, core/groupby.py seems to be responsible in some way.
In that file I came across the _index_with_as_index function, which is not called anywhere.
The function is supposed to 'Take boolean mask of index to be returned from apply, if as_index=True', but since it is not called from anywhere, I don't understand what its purpose is.

jreback · 2016-05-18T16:38:17Z

@pfrcks Write the test and step through it. Identify where you think a change is needed, then test it.

jrbrodie77 · 2017-12-15T23:58:01Z

I ran into this same bug today in version 0.21:

import numpy as np
import pandas as pd
a = pd.DataFrame([np.zeros(3), np.ones(3), 2 * np.ones(3)], columns="A B C".split())
a.groupby(['A', 'B'], as_index=False).apply(np.mean)

If my groupby key is a pair of column names, as_index is ignored.

If I get a chance in the next couple of weeks I may try to find/fix it.

rhshadrach · 2020-11-07T20:28:26Z

In this case, the function is an aggregator; with as_index=False the grouping labels should appear in the result as columns. Right now on master:

import pandas as pd
a = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [1, 1, 2, 2], 'c': [1, 1, 1, 1]})
print(a.groupby(['a', 'b'], as_index=False).apply(lambda x: 1))
print(a.groupby(['a', 'b'], as_index=True).apply(lambda x: 1))

gives:

   a  b  NaN
0  1  1    1
1  2  2    1

a  b
1  1    1
2  2    1
dtype: int64

which looks right to me.
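To see which behavior a given install has, a small check along these lines may help. It is only a sketch: `keys_as_columns` is a hypothetical helper, not a pandas function, and what the loop prints depends on the pandas version (recent versions may also emit a deprecation warning from `apply`).

```python
import pandas as pd


def keys_as_columns(result, keys):
    """Return True if `result` is a DataFrame that has every grouping key as a column."""
    return isinstance(result, pd.DataFrame) and all(k in result.columns for k in keys)


df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [1, 1, 2, 2], "c": [1, 1, 1, 1]})
keys = ["a", "b"]

for as_index in (False, True):
    result = df.groupby(keys, as_index=as_index).apply(lambda x: 1)
    # Report the result type and whether the keys came back as columns.
    print(as_index, type(result).__name__, keys_as_columns(result, keys))
```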

vroomzel · 2022-01-20T21:17:30Z

Still doesn't work in version '1.3.4'

rhshadrach · 2022-01-20T22:25:31Z

@vroomzel - can you post the input / output you're seeing, as well as the result of pd.show_versions()

openSourcerer9000 · 2023-10-09T22:41:05Z

We're still suffering here in v1.5

rhshadrach · 2023-10-10T20:59:00Z

@openSourcerer9000 - when sharing reproducible examples, please do so in plain text rather than screenshots. Plain text is more convenient for maintainers.

Though perhaps not well documented, I believe as_index is not intended to have an impact when iterating over a groupby object. Can you open a new issue?
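A quick way to check the iteration point is to compare what the two settings yield. This is a sketch on the frame from the original report, with illustrative variable names; it shows what one install does rather than any documented guarantee.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [1, 1, 2, 2], "c": [1, 1, 1, 1]})
keys = ["a", "b"]

# Iterating a groupby object yields (group key, sub-DataFrame) pairs.
groups_false = list(df.groupby(keys, as_index=False))
groups_true = list(df.groupby(keys, as_index=True))

same_keys = [k for k, _ in groups_false] == [k for k, _ in groups_true]
same_frames = all(f.equals(t) for (_, f), (_, t) in zip(groups_false, groups_true))
print(same_keys, same_frames)
```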

jreback added Bug Groupby Difficulty Intermediate labels May 18, 2016

jreback added this to the Next Major Release milestone May 18, 2016

nickeubank mentioned this issue May 19, 2016

Dissolve tweaks geopandas/geopandas#323

Merged

chris-b1 mentioned this issue Oct 31, 2016

Group-by/apply unexpected output with some operations when as_index=False #14547

Closed

h-vetinari mentioned this issue Aug 30, 2018

API: groupby aggregation with apply does not drop groupby-column #22542

Closed

jbrockmendel removed Difficulty Intermediate labels Oct 21, 2019

simonjayhawkins mentioned this issue Apr 24, 2020

Using agg with groupy, as_index=False still returning group variable as index #25011

Closed

simonjayhawkins changed the title ~~BUG: as_index issues with groupby~~ BUG: as_index=False issues with groupby Apr 24, 2020

simonjayhawkins changed the title ~~BUG: as_index=False issues with groupby~~ BUG: groupby, as_index=False still returning group variable as index Apr 24, 2020

rhshadrach added the Needs Tests Unit test(s) needed to prevent regressions label Nov 7, 2020

mroeschke added the good first issue label Apr 30, 2021

mroeschke mentioned this issue May 12, 2021

TST: Add test for old issues #41431

Merged

jreback modified the milestones: Contributions Welcome, 1.3 May 12, 2021

jreback closed this as completed in #41431 May 12, 2021
