DOC: na_values defaults for read_csv don't match STR_NA_VALUES correctly #55803
Comments
Or indeed rewrite the joined strings to include the quotes so there is less fiddling to do to get the start and end quotes right.
Also, although it doesn't show with the default font used, the closing quote in the " " is wrong. The HTML source appears as: : “ “, “#N/A”, “#N/A N/A”, “#NA”, “-1.#IND”, “-1.#QNAN”, “-NaN”, “-nan”, which again doesn't show in the GitHub font, but if you copy and paste it, you can see that the fifth character is the wrong quote character, as is the last but one. This is probably an issue with whatever is generating the smart quotes. Hopefully this will go away once the spurious spaces are fixed.
In reference to the previous comment, something like: fill(', '.join(f'"{value}"' for value in sorted(STR_NA_VALUES)), 70, subsequent_indent=" ") should quote the values correctly.
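To make the difference concrete, here is a minimal sketch comparing the two approaches. SAMPLE_NA_VALUES is a hand-picked subset of the values quoted in this issue, not the real STR_NA_VALUES, and render_current is only my reading of how the docstring is assembled today (outer quotes supplied by the surrounding text, with a spacer on each side), so treat both as assumptions.

```python
from textwrap import fill

# Illustrative subset of the default NA strings quoted in this issue;
# an assumption standing in for pandas' STR_NA_VALUES.
SAMPLE_NA_VALUES = {"", "#N/A", "#N/A N/A", "#NA", "-NaN", "-nan"}


def render_current(values):
    """Approximate the current docstring: the surrounding text adds the
    first and last quote, each separated from the list by a space."""
    joined = fill('", "'.join(sorted(values)), 70, subsequent_indent="    ")
    return '" ' + joined + ' "'


def render_quoted(values):
    """Quote every value individually, as suggested in the comment above."""
    quoted = ", ".join(f'"{value}"' for value in sorted(values))
    return fill(quoted, 70, subsequent_indent="    ")


print(render_current(SAMPLE_NA_VALUES))  # the empty string shows up as " "
print(render_quoted(SAMPLE_NA_VALUES))   # the empty string shows up as ""
```

With per-value quoting there is no spacer to get wrong, so the empty string and the last value both come out with matching quotes.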
And a note that while this is a fairly trivial bug, STR_NA_VALUES isn't documented as far as I can tell, so if you want to remove a value from the default na_values list, the only documented way to do this is to copy the default values from the documentation. We've run into this, as we're implementing a workaround for the non-backwards-compatible change in pandas 2.0 that added 'None' to the na_values default.
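For anyone in the same position, a rough sketch of the kind of workaround I mean is below. DOCUMENTED_DEFAULTS is a shortened, hand-copied list used only for illustration (the real list in the documentation is longer), and na_options_without is a hypothetical helper name. The keep_default_na and na_values keywords are the documented read_csv parameters.

```python
# Illustrative subset of the defaults copied by hand from the read_csv
# documentation; an assumption, the full list is longer.
DOCUMENTED_DEFAULTS = ["", "#N/A", "#N/A N/A", "#NA", "-NaN", "-nan", "None"]


def na_options_without(defaults, unwanted):
    """Build read_csv keyword arguments that keep every default NA string
    except the ones listed in `unwanted`."""
    unwanted = set(unwanted)
    kept = [value for value in defaults if value not in unwanted]
    # keep_default_na=False turns off the built-in list, so only `kept` applies.
    return {"keep_default_na": False, "na_values": kept}


options = na_options_without(DOCUMENTED_DEFAULTS, {"None"})
# Intended use (not executed here): pandas.read_csv(path, **options)
```

This only works if the copied list is accurate, which is why the rendering of the list in the docs matters.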
Pandas version checks
main
Location of the documentation
https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
Documentation problem
In https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html the list of na_values is supposed to match the list in STR_NA_VALUES.
However, in d4889bc the change to double quotes broke the first one, which is now incorrectly displayed as " ", i.e. a single space, rather than the correct "".
Curiously, read_excel still shows the first value as ‘’, so it hasn't had the change to double quotes.
Suggested fix for documentation
I think this bug got introduced due to adding a space so the triple double-quote could correctly end the string.
One fix could be to change the line:
NaN: " """
to something like
NaN: """ + '"'