Problem Description
I would like to have df.compare accept "tolerance" thresholds to allow for approximate comparisons. This feature already exists in the assert_frame_equal utility, and it would be beneficial in compare to help users identify the rows and columns that are causing their assertion to fail. It would be helpful in many cases to allow users to filter out differences that are sufficiently small.
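To illustrate the gap described above, here is a minimal sketch with made-up data (the frames `left` and `right` are assumptions for illustration only). It relies on the public `pd.testing.assert_frame_equal`, which accepts `rtol` and `atol`, and on the current `DataFrame.compare`, which does not.

```python
import pandas as pd

# Illustrative data: row 1 differs by a negligible amount, row 2 by a real one.
left = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": ["x", "y", "z"]})
right = pd.DataFrame({"a": [1.0, 2.0 + 1e-9, 3.5], "b": ["x", "y", "z"]})

# The assertion tolerates the tiny difference in row 1 but still fails on row 2.
try:
    pd.testing.assert_frame_equal(left, right, rtol=1e-6, atol=1e-6)
    frames_match = True
except AssertionError:
    frames_match = False

# compare() is exact, so it reports row 1 as well as row 2.
diff = left.compare(right)
diff_rows = list(diff.index)
print(frames_match, diff_rows)
```

With a tolerance option, `compare` could report only row 2, which is the row that actually makes the assertion fail.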
Feature Description
def compare(
    self,
    other: DataFrame,
    align_axis: Axis = 1,
    keep_shape: bool = False,
    keep_equal: bool = False,
    rtol: float | None = None,
    atol: float | None = None,
) -> DataFrame:
    """
    ...
    rtol : float | None, default None
        Relative tolerance. Numeric differences below this value will not be
        considered differences for the purposes of "keep_shape" and will be
        shown as NaN if "keep_equal" is False.
    atol : float | None, default None
        Absolute tolerance. Numeric differences below this value will not be
        considered differences for the purposes of "keep_shape" and will be
        shown as NaN if "keep_equal" is False.
    """
    ...
For implementation, the current comparison is essentially the following check: mask = ~((self == other) | (self.isna() & other.isna())). From a quick glance at _testing.assert_almost_equal, it appears we could implement the tolerance by calling that function on each item of the DataFrame in turn, though I'm not sure whether it's okay to reference the _testing module outside of testing functions.
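As a rough sketch of how the mask above might be relaxed without touching the _testing module, the following uses `numpy.isclose` instead. The helper name `tolerant_mask` and its default tolerances are hypothetical, and the sketch assumes both frames are entirely numeric and identically labelled. Note that `numpy.isclose` scales `rtol` by the second argument, so the check is not symmetric.

```python
import numpy as np
import pandas as pd


def tolerant_mask(left, right, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: True where two all-numeric, identically
    labelled frames differ by more than the given tolerances."""
    close = np.isclose(
        left.to_numpy(dtype=float),
        right.to_numpy(dtype=float),
        rtol=rtol,
        atol=atol,
        equal_nan=True,  # treat NaN == NaN as equal, like the existing mask
    )
    return pd.DataFrame(~close, index=left.index, columns=left.columns)


left = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": [10.0, 20.0, 30.0]})
right = pd.DataFrame({"a": [1.0, 2.0 + 1e-9, np.nan], "b": [10.0, 20.5, 30.0]})

# The exact check quoted above flags both differences in row 1.
exact_mask = ~((left == right) | (left.isna() & right.isna()))

# The tolerant check keeps only the difference in column "b".
approx_mask = tolerant_mask(left, right)
print(exact_mask)
print(approx_mask)
```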
Alternative Solutions
This could also be implemented more directly with math.isclose calls, but they would need to be applied only to numeric columns.
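A minimal sketch of this alternative, assuming identically labelled frames with unique column names: the exact mask is computed first, and `math.isclose` then clears entries only in columns where both sides are numeric. The helper name `isclose_mask` is hypothetical, and the per-element Python loop is for illustration rather than performance.

```python
import math

import pandas as pd
from pandas.api.types import is_numeric_dtype


def isclose_mask(left, right, rel_tol=1e-9, abs_tol=0.0):
    """Hypothetical helper: exact difference mask, relaxed with
    math.isclose on columns that are numeric in both frames."""
    mask = ~((left == right) | (left.isna() & right.isna()))
    for col in left.columns:
        if is_numeric_dtype(left[col]) and is_numeric_dtype(right[col]):
            close = [
                math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
                for x, y in zip(left[col], right[col])
            ]
            # Keep a difference only where the values are not close.
            mask[col] = mask[col] & ~pd.Series(close, index=left.index)
    return mask


left = pd.DataFrame({"price": [1.0, 2.0, 3.0], "label": ["x", "y", "z"]})
right = pd.DataFrame({"price": [1.0, 2.0 + 1e-12, 3.1], "label": ["x", "Y", "z"]})

mask = isclose_mask(left, right)
print(mask)
```

The non-numeric "label" column is still compared exactly, so the "y" versus "Y" mismatch is reported while the negligible price difference is not.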
Additional Context
No response
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas