BUG: Inconsistent data types of Series.min() return values by python interpreters
#55566
Comments
Are you using the same numpy version in both cases? |
Based on the details given, it seems they are using the same NumPy version (1.26.1) in both cases. |
@twoertwein Yes, I’m using the same NumPy version in both cases. |
The difference in dtype depends on whether you have |
What's the expected result here? I was thinking we always returned Python objects for numerical scalars, but apparently not. |
@rhshadrach Thank you for your comment! You're right. As you mentioned, I looked at how the returned data type differs depending on whether the bottleneck package is installed or not. If the bottleneck package is installed, it is returned as
With the bottleneck package installed:
After I uninstall the bottleneck package:
However, it is difficult to understand why pandas' behavior varies depending on whether the other package 'bottleneck' is installed or not. Should I call this a bug, or not a bug? In my case, I am using the return values for JSON serialization, but sometimes the NumPy data type (e.g. |
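For the JSON serialization case mentioned above, one possible workaround is to normalize the result before serializing. This is only a minimal sketch: `to_builtin` is a hypothetical helper name, not part of pandas or NumPy, and the `np.int64(3)` value is a stand-in for whatever `Series.min()` happens to return in a given environment.

```python
import json

import numpy as np


def to_builtin(value):
    """Return a plain Python scalar for NumPy scalars; pass other values through.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, np.generic):
        # .item() converts a NumPy scalar to the closest built-in Python type.
        return value.item()
    return value


# Stand-in for a reduction result that came back as a NumPy scalar.
smallest = np.int64(3)

# json.dumps rejects numpy.int64 directly, so convert first.
payload = json.dumps({"min": to_builtin(smallest)})
```

Because the helper passes built-in values through unchanged, it can be applied whether or not bottleneck is installed.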
I think this is a bug. |
I think there are many places where pandas might return a numpy type instead of a builtin type. For
|
take |
Just noting there appears to be a fair amount of inconsistency across reduction operations in terms of whether a python scalar or a numpy scalar is returned. It might be nice to make this a bit more consistent: (these are all without bottleneck installed)
|
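A rough way to survey the inconsistency described above is to print the type each reduction returns. This is a sketch rather than the commenter's original snippet: `reduction_types` is a hypothetical helper, the choice of reductions is arbitrary, and the types reported will depend on the pandas version and on whether bottleneck is installed.

```python
import pandas as pd


def reduction_types(series, names=("min", "max", "sum", "mean", "median")):
    """Map each reduction name to the type of the value it returns.

    Hypothetical helper for illustration only.
    """
    return {name: type(getattr(series, name)()) for name in names}


s = pd.Series([1, 2, 3])
for name, kind in reduction_types(s).items():
    # Show the module as well, so numpy.int64 and builtins.int are easy to tell apart.
    print(f"{name:>6}: {kind.__module__}.{kind.__name__}")
```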
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
As I expected, the Series.mean() function returns a consistent data type, numpy.float64. However, I found that the Series.min() function does not guarantee the same data type. Even though I use the same Python (3.10.12) and pandas (2.1.1) versions, the data type of the return value of the Series.min() or Series.max() function is returned differently depending on the Python interpreters. In some cases, it returns numpy.int64 or python's built-in integer type int. Is there anything I missed in the environment settings?
Expected Behavior
I expect that it returns the value as a consistent data type like numpy.int64.
Installed Versions