BUG: Float64Dtype and isnull() functions unexpected behavior #53887
Comments
xref #32265 |
I just had an issue related to this while using On
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [5, 0], 'b': [np.nan, 12]})
df = df.astype('float64')
df['c'] = df['a'] * np.inf
print(df.isna())
Results:
Simply changing types:
df = df.astype('Float64')
df['c'] = df['a'] * np.inf
print(df.isna())
changes the result in column c, row 1:
We solved this issue using
I know little about the internals, but would be happy to help if this is not the desired behavior! If it is desired, it would be nice to document it and guide users :) |
You can fix this in your own codebase at the moment by making the following change:
import numpy as np
from pandas.core.arrays.floating import FloatingArray
# Keep a reference to the original method before replacing it.
FloatingArray.oldisna = FloatingArray.isna
def newisna(self: FloatingArray) -> np.ndarray:
    # Treat NaN stored in the data as missing, in addition to masked (pd.NA) entries.
    return np.isnan(self._data) | self._mask.copy()
FloatingArray.isna = newisna
For us this is part of our default pandas monkey patch module that is loaded in all code. |
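If patching pandas internals is undesirable, a similar check can be done from the outside. The following is a minimal sketch, not part of pandas: the helper name `isna_or_nan` is hypothetical, and it assumes a Series with a nullable float dtype such as `Float64`. It converts the Series to a NumPy array with `na_value=np.nan`, so that both `pd.NA` and a NaN stored in the data come out as NaN, and then applies `np.isnan`.

```python
import numpy as np
import pandas as pd


def isna_or_nan(s: pd.Series) -> pd.Series:
    """Hypothetical helper: True where the value is pd.NA or a float NaN."""
    # to_numpy maps masked (pd.NA) entries to np.nan; NaN already in the data stays NaN.
    values = s.to_numpy(dtype="float64", na_value=np.nan)
    return pd.Series(np.isnan(values), index=s.index, name=s.name)


# Three cases: 5 * inf (a regular value), 0 * inf (NaN), and a missing entry.
s = pd.Series([5.0, 0.0, None], dtype="Float64") * np.inf
flags = isna_or_nan(s)
print(flags.tolist())
```

Unlike the monkey patch, this leaves `isna()` itself unchanged, so it has to be called explicitly wherever NaN and `pd.NA` should be treated alike.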
For anyone else seeing this, it is also worth patching fillna:
from pandas.core.arrays.masked import BaseMaskedArrayT, validate_fillna_kwargs, is_array_like, missing
# Keep a reference to the original method before replacing it.
FloatingArray.oldfillna = FloatingArray.fillna
def newfillna(
    self: BaseMaskedArrayT, value=None, method=None, limit=None
) -> BaseMaskedArrayT:
    value, method = validate_fillna_kwargs(value, method)
    # Treat NaN stored in the data as missing, in addition to masked entries.
    mask = (self._mask | np.isnan(self._data)).copy()
    if is_array_like(value):
        if len(value) != len(self):
            raise ValueError(
                f"Length of 'value' does not match. Got ({len(value)}) "
                f"expected {len(self)}"
            )
        value = value[mask]
    if mask.any():
        if method is not None:
            # fill with a method such as pad or backfill
            func = missing.get_fill_func(method, ndim=self.ndim)
            npvalues = self._data.copy().T
            new_mask = mask.T
            func(npvalues, limit=limit, mask=new_mask)
            return type(self)(npvalues.T, new_mask.T)
        else:
            # fill with value
            new_values = self.copy()
            new_values[mask] = value
    else:
        new_values = self.copy()
    return new_values
FloatingArray.fillna = newfillna |
Hi, I have the same problem when filling NaNs generated by a 0/0 division:
When filling the NaNs, the one produced by 0/0 is not filled. Casting to
|
Some more info on this behavior: #32265. The v2.0 changelog mentions this: https://pandas.pydata.org/docs/whatsnew/v2.0.0.html#
To unify the NaN values, I've found another solution:
This replaces all np.nan values with pd.NA. |
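The snippet from this comment is not preserved here, so the following is only one possible way to do what it describes, and not necessarily what the commenter used. It is a sketch that assumes a `Float64` Series and uses `Series.mask` to overwrite NaN entries with `pd.NA`; the helper name `nan_to_na` is hypothetical.

```python
import numpy as np
import pandas as pd


def nan_to_na(s: pd.Series) -> pd.Series:
    """Hypothetical helper: replace float NaN in a nullable float Series with pd.NA."""
    # NaN in the data and existing pd.NA both become np.nan here, so isnan flags both.
    is_nan = np.isnan(s.to_numpy(dtype="float64", na_value=np.nan))
    # Overwrite every flagged position with pd.NA; other values are left untouched.
    return s.mask(is_nan, pd.NA)


# Three cases: 5 * inf (a regular value), 0 * inf (NaN), and a missing entry.
s = pd.Series([5.0, 0.0, None], dtype="Float64") * np.inf
unified = nan_to_na(s)
filled = unified.fillna(1.0)
print(unified.isna().tolist(), filled.tolist())
```

After this step `isna()` and `fillna()` see a single kind of missing value, so no patching of pandas internals is needed for those two calls.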
This has tripped me up, and I consider it currently broken with the pyarrow backend. My current workarounds are to use
Simple reproduction:
import numpy as np
import pandas as pd
# display() is available in IPython/Jupyter sessions
a = pd.DataFrame({'a': [1.0, 0.0, np.nan]})
b = a.convert_dtypes(dtype_backend='pyarrow')
# inconsistent behaviour for 0/0 but consistent for nan/0
display((a / 0).isna())
display((b / 0).isna())
# consistent behaviour: 0/0 and nan/0 both yield True
display(np.isnan(a / 0))
display(np.isnan(b / 0))
|
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
[Float64Dtype(), Float64Dtype(), Float64Dtype(), dtype('float64'), dtype('float64')]
0 1
0 1
0 0
0 0
5 5
5 5
[Float64Dtype(), Float64Dtype(), Float64Dtype(), Float64Dtype(), Float64Dtype()]
0 1
0 1
0 0
0 0
5 5
5 5
[dtype('float64'), dtype('float64'), dtype('float64'), dtype('float64'), dtype('float64')]
1 1
1 1
0 0
0 0
5 5
5 5
[dtype('float64'), dtype('float64'), dtype('float64'), dtype('float64'), dtype('float64')]
1 1
1 1
0 0
0 0
5 5
5 5
[Float64Dtype(), Float64Dtype(), Float64Dtype(), Float64Dtype(), Float64Dtype()]
1 1
1 1
0 0
0 0
5 5
5 5
[dtype('float64'), dtype('float64'), dtype('float64'), dtype('float64'), dtype('float64')]
1 1
1 1
0 0
0 0
5 5
5 5
1 1
1 1
0 0
0 0
5 5
5 5