ENH: Parameter keep_equal in .compare(...) method should not rely on the keep_shape parameter #49510
Comments
Hi @it176131
I think this expectation is a bit off, as the function is supposed to show differences. By default equal values are shown as If having both |
Thanks for looking at this, @vamsi-verma-s. I created this example to hopefully show more of what I think is an issue.

import numpy as np # 1.23.3
import pandas as pd # 1.5.1
np.random.seed(0)
s0 = pd.Series(np.random.random(size=(5)))
s1 = s0.copy()
# change the 0th element in `s1` to something else
s1.iloc[0] = "a different value"
print(s0.compare(s1, keep_equal=True))

With
So I convert to a

import numpy as np # 1.23.3
import pandas as pd # 1.5.1
np.random.seed(0)
s0 = pd.Series(np.random.random(size=(5)))
s1 = s0.copy()
# change the 0th element in `s1` to something else
s1.iloc[0] = "a different value"
f0 = s0.to_frame("col0")
f1 = s1.to_frame("col0")
print(f0.compare(f1, keep_equal=True))

But this doesn't do anything either.
To keep the method working as intended, should the docstring for |
I think it's more useful for comparing DataFrames that have multiple columns:

In [41]: df1
Out[41]:
A B C
0 1 5 9
1 2 6 10
2 3 7 11
3 4 8 12
In [42]: df2
Out[42]:
A B C
0 1 5 9
1 2 -2 10
2 -1 7 11
3 4 8 12
In [43]: df1.compare(df2, keep_equal=True)
Out[43]:
     A          B
  self other self other
1    2     2    6    -2
2    3    -1    7     7
In [44]: df1.compare(df2)
Out[44]:
     A          B
  self other self other
1  NaN   NaN  6.0  -2.0
2  3.0  -1.0  NaN   NaN
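The session above can be reproduced with a short script. This is a minimal sketch: the frames are rebuilt from the printed values, and the names with_equal and default are chosen here for illustration only.

```python
import pandas as pd

# Rebuild df1 and df2 from the values printed in the session above.
df1 = pd.DataFrame({"A": [1, 2, 3, 4], "B": [5, 6, 7, 8], "C": [9, 10, 11, 12]})
df2 = df1.copy()
df2.loc[1, "B"] = -2  # differs from df1 in column B, row 1
df2.loc[2, "A"] = -1  # differs from df1 in column A, row 2

# keep_equal=True: rows and columns without any difference are still dropped,
# but equal cells inside the remaining block keep their values.
with_equal = df1.compare(df2, keep_equal=True)

# Default: equal cells inside the remaining block are shown as NaN.
default = df1.compare(df2)

print(with_equal)
print(default)
```

In both calls column C and rows 0 and 3 are absent, because keep_shape defaults to False; keep_equal only changes how equal cells inside the remaining rows and columns are displayed.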
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
The docs for the keep_equal parameter say:
I would expect the following code to return rows where the values in each Series are equal, but instead it is empty.
Changing the keep_shape argument to True produces the expected results.
I would like it if the latter output would be returned without having to change the keep_shape argument.
Feature Description
Add logic to the .compare(...) method so keep_equal can swap the underlying mask.
NOTE -- this has not been tested!
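One way to picture the mask swap is a standalone helper outside pandas. This is only a sketch of the idea, not the code proposed in the issue or the pandas implementation: the function name compare_swapped_mask is hypothetical, it handles only Series, and it assumes both Series share the same index.

```python
import pandas as pd


def compare_swapped_mask(
    left: pd.Series, right: pd.Series, keep_equal: bool = False
) -> pd.DataFrame:
    """Hypothetical helper: select differing rows, or equal rows if keep_equal."""
    # True where the two Series differ; two NaN values count as equal.
    differs = ~((left == right) | (left.isna() & right.isna()))
    # The "swap": with keep_equal=True the mask is inverted so equal rows are kept.
    mask = ~differs if keep_equal else differs
    return pd.concat([left[mask], right[mask]], axis=1, keys=["self", "other"])


s0 = pd.Series([1.0, 2.0, 3.0])
s1 = pd.Series([1.0, 5.0, 3.0])

equal_rows = compare_swapped_mask(s0, s1, keep_equal=True)
diff_rows = compare_swapped_mask(s0, s1)

print(equal_rows)
print(diff_rows)
```

Under this reading, keep_equal=True returns only the rows where both Series agree, which is a different meaning from the current one, where keep_equal controls whether equal values are displayed or replaced with NaN.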
Alternative Solutions
Use the .compare(...) method as is, but remember to change the keep_shape argument from False to True.
Additional Context
I could not find any existing issues related to the keep_equal parameter in .compare. However, I did find this issue related to tolerance parameters.