BUG: The count() API does not consider inf as NA (with pandas.options.mode.use_inf_as_na = True) when dtype is "double[pyarrow]" #52501

Closed
3 tasks done
tinadu0806 opened this issue Apr 6, 2023 · 2 comments
Labels
Arrow pyarrow functionality Bug ExtensionArray Extending pandas with custom dtypes or arrays. Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate

Comments

@tinadu0806
Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

>>> import pandas as pd
>>> pd.options.mode.use_inf_as_na = True
>>> s = pd.Series([1.1, float("inf"), float("-inf"), 2.3], dtype="double[pyarrow]")
>>> s
0    1.1
1    NaN
2    NaN
3    2.3
dtype: double[pyarrow]
>>> s.count()
4
>>> type(s.count())
<class 'numpy.int64'>

Issue Description

When a pandas Series is created with the pyarrow dtype "double[pyarrow]" after setting pd.options.mode.use_inf_as_na = True, count() still counts the inf values, even though they are displayed as NaN. It also returns a numpy int64 scalar instead of a pyarrow double scalar.

Expected Behavior

After setting pd.options.mode.use_inf_as_na = True, the s.count() call above should return 2.

Installed Versions

Replace this line with the output of pd.show_versions()

@tinadu0806 tinadu0806 added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 6, 2023
@mroeschke mroeschke added Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate Arrow pyarrow functionality and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Apr 6, 2023
@mroeschke
Member

Thanks for the report. Yeah, generally no extension array types will respect this option.

We are also considering deprecating the use_inf_as_na option (#51684) in favor of replacing inf with NA first.
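
For readers landing here, a minimal sketch of the "replace inf with NA first" approach. The helper name inf_to_na is hypothetical and not part of pandas. The demo uses a default float64 Series so that it runs without pyarrow. The same helper is intended to work for dtype="double[pyarrow]", but that is an assumption and worth verifying on your pandas version.

```python
import numpy as np
import pandas as pd


def inf_to_na(s: pd.Series) -> pd.Series:
    """Return a copy of `s` with +inf/-inf replaced by the dtype's missing value.

    Hypothetical helper for illustration; it does not rely on the
    use_inf_as_na option at all.
    """
    # Convert to a plain float64 ndarray so np.isinf works regardless of
    # the backing array type; existing missing values become NaN.
    values = s.to_numpy(dtype="float64", na_value=np.nan)
    # mask() replaces entries where the condition is True with NA/NaN.
    return s.mask(np.isinf(values))


s = pd.Series([1.1, float("inf"), float("-inf"), 2.3])
cleaned = inf_to_na(s)

# count() now skips the former inf entries because they are missing values.
n_valid = int(cleaned.count())
print(n_valid)
```

Doing the replacement explicitly keeps the behaviour independent of global options, so count(), dropna() and similar methods see the same missing values.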

@mroeschke mroeschke added the ExtensionArray Extending pandas with custom dtypes or arrays. label Apr 6, 2023
@mroeschke
Member

Given that this option was deprecated in 2.1, closing as a won't fix.
