BUG: replace na_rep for pd.NA values in to_string
#54959
@mroeschke Need some help here. This test will fail
@mroeschke The documentation says, that the default value for
    def _format(x):
        if self.na_rep is not None and is_scalar(x) and isna(x):
            try:
                # try block for np.isnat specifically
                # determine na_rep if x is None or NaT-like
                if x is None:
                    return "None"
                elif x is NA:
                    return str(NA)
                elif x is NaT or np.isnat(x):
                    return "NaT"
            except (TypeError, ValueError):
                # np.isnat only handles datetime or timedelta objects
                pass
            return self.na_rep
and if I replace
Which tests need fixing?
@mroeschke Some tests that are failing are below. This is not an exhaustive list. The
@mroeschke pinging for visibility. Please let me know if the above message was clear or not.
@mroeschke need some inputs here :)
Yeah I think the tests should be updated for this change
@mroeschke can you please review this PR?
@mroeschke can you please review this? I have updated the test cases.
@mroeschke pinging on green.
@mroeschke @MarcoGorelli can one of you please review this?
Thanks for your PR!
Just as a heads up, I'm not familiar with strings - I can take a look, but I'm afraid I'll need to prioritise other things first
Sorry for the delay on my end
pandas/core/frame.py
Outdated
@@ -7340,8 +7340,8 @@ def value_counts(
>>> df
  first_name middle_name
0       John       Smith
1       Anne        <NA>
All the usages that were changed should still use <NA> rather than NaN. This should only change if na_rep was specified as something else
The expected value in these tests is <NA> but according to docs it should be NaN since that's the default value for na_rep. Somehow this behaviour is not in place.
I think since the current behavior has been the case for a few releases now, the default of na_rep="NaN" should be changed to na_rep=lib.no_default so that NaN, NA, and None can continue to use their reprs here
@mroeschke the problem is that if I set na_rep=lib.no_default then the string representation will not contain any of NaN, NA or None. I have tried setting the default to <NA> and reverted most of the test cases; however, there's an issue in a couple of them.
Using na_rep=lib.no_default would mean making NaN, NA and None return their string defaults like
    if self.na_rep is lib.no_default:
        ...existing logic
    else:
        ...use na_rep
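The sentinel-default idea suggested here can be sketched in plain Python. This is only an illustration under stated assumptions, not the pandas implementation: `no_default`, `NA`, `is_missing` and `format_missing` are hypothetical stand-ins defined in the block itself, chosen so that it runs without pandas installed.

```python
import math


class _NoDefault:
    """Sentinel type: distinguishes 'argument not passed' from any real value."""

    def __repr__(self) -> str:
        return "<no_default>"


# Hypothetical stand-in for the lib.no_default sentinel mentioned in the review.
no_default = _NoDefault()


class _NAType:
    """Hypothetical stand-in for pd.NA; only its repr matters for this sketch."""

    def __repr__(self) -> str:
        return "<NA>"


NA = _NAType()


def is_missing(x) -> bool:
    """Treat None, the NA stand-in and float NaN as missing."""
    return x is None or x is NA or (isinstance(x, float) and math.isnan(x))


def format_missing(x, na_rep=no_default) -> str:
    """Format a scalar, substituting na_rep only when the caller passed one."""
    if not is_missing(x):
        return str(x)
    if na_rep is no_default:
        # No na_rep given: each missing value keeps its own repr.
        if x is None:
            return "None"
        if x is NA:
            return str(NA)
        return "NaN"
    # An explicit na_rep overrides every kind of missing value.
    return na_rep
```

With these stand-ins, `format_missing(NA)` gives `"<NA>"` and `format_missing(None)` gives `"None"`, while `format_missing(NA, na_rep="-")` and `format_missing(None, na_rep="-")` both give `"-"`. The point of the sentinel is that an explicitly passed `na_rep="NaN"` can be told apart from an omitted argument, which a string default cannot do.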
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.
na_rep not respected in to_* methods with pd.NA #54872
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.