BUG: ExcelWriter set_column with num_format datetime doesn't work (but it works for xlsxwriter) #55196
Comments
set_column does not change the format; it changes the default format:
Since pandas is setting the format for date cells, it has no effect. As far as I can tell there is no easy way to overwrite an existing format with xlsxwriter. The best resolution I see for this within pandas is to use datetime_format as you mentioned. |
So if we want to change the format of a date column, we could use the strftime function to reformat the date values. |
@Pavanmahaveer7 - I think you are saying to modify the format prior to calling |
@rhshadrach Okay thanks for that information, I didn't know it only changed the default format. In that case we will keep using datetime_format 👍 |
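For readers landing on this thread, here is a minimal sketch of the datetime_format approach mentioned above. The DataFrame contents, column names, and sheet name are made up for illustration, and an in-memory buffer stands in for a file path.

```python
import datetime as dt
import io

import pandas as pd

# Illustrative data (not the reporter's): one float column, one datetime column.
df = pd.DataFrame({"share": [0.25], "day": [dt.datetime(2020, 1, 1)]})

buffer = io.BytesIO()
# datetime_format becomes the cell format pandas applies to every datetime cell.
with pd.ExcelWriter(buffer, engine="xlsxwriter", datetime_format="dd.mm.yyyy") as writer:
    df.to_excel(writer, sheet_name="Sheet1", index=False)
    workbook = writer.book
    worksheet = writer.sheets["Sheet1"]
    # Float cells are written without a cell format, so a column format applies to them.
    percent = workbook.add_format({"num_format": "0.00%"})
    worksheet.set_column("A:A", 12, percent)
    # Only widen the date column; its number format comes from datetime_format.
    worksheet.set_column("B:B", 12)
```

The date column needs no set_column format here, because the cell format chosen through datetime_format is what Excel displays.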
Currently the default value of I still think a better resolution is to have a way to override the existing format via xlsxwriter. |
@rhshadrach I'm sorry, I don't really understand your last message. |
Depending on interpretation, I think this is not correct - in particular pandas is not overwriting. pandas sets the format as If users could tell pandas to not set any format for datetime (my proposal above - which I do not like but it is an option), then changing the default format would have an impact and so your call to It seems better to me if xlsxwriter gave you the ability to modify the format on an entire column, and not just the default format. I believe xlsxwriter has no such function. |
Perhaps my use of "default format" is confusing and I should be saying "no format" instead. |
Looking into this again, I'm now understanding that Excel has cell, row, and column formats. The Also, |
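The precedence between cell and column formats can be sketched with xlsxwriter alone, without pandas. This is an illustration under assumed values (the cell positions and the two format strings are arbitrary), not code taken from the thread.

```python
import datetime as dt
import io

import xlsxwriter

buffer = io.BytesIO()
workbook = xlsxwriter.Workbook(buffer, {"in_memory": True})
worksheet = workbook.add_worksheet()

column_fmt = workbook.add_format({"num_format": "dd.mm.yyyy"})
cell_fmt = workbook.add_format({"num_format": "yyyy-mm-dd hh:mm:ss"})

# Column format for column B; it only shows on cells that have no format of their own.
worksheet.set_column("B:B", 20, column_fmt)

# B2: written without a cell format, so the column format is displayed.
worksheet.write_datetime(1, 1, dt.datetime(2020, 1, 1))
# B3: written with an explicit cell format, which takes precedence over the column format.
worksheet.write_datetime(2, 1, dt.datetime(2020, 1, 1), cell_fmt)

workbook.close()
```

Opening the result should show B2 as a short date and B3 as a full timestamp, which mirrors what happens when pandas attaches its own cell format to datetime cells.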
@jmcnamara - I was wondering if you could confirm my understanding here is correct. I believe this comment is self-contained, so hopefully you don't have to wade through the above comments 😄. pandas writes dates and datetimes by applying a cell format to particular cells. This makes xlsxwriter's I was wondering if pandas should do something different here - but it appears there is no "default date format" in Excel. XlsxWriter does have a So users wanting to change the format of date and datetime columns that already have a format applied just need to loop over the cells in a column. |
Yes that is correct. To reiterate: Pandas applies a cell format when writing a datetime and that cell format can't/won't be overwritten by a row or column format. @rhshadrach I think the proposal to have change the behaviour of For reference here are the XlsxWriter docs where I try to point people in the right direction on Formatting of the Dataframe output. |
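One way to loop over the cells in a column, as described above, is to rewrite each datetime cell with the desired format after pandas has written the sheet. The sketch below assumes the xlsxwriter engine, index=False (so DataFrame column positions match sheet columns), a header in row 0, and no missing values in the date column. The data and names are illustrative.

```python
import datetime as dt
import io

import pandas as pd

df = pd.DataFrame(
    {
        "share": [0.25, 0.5],
        "day": [dt.datetime(2020, 1, 1), dt.datetime(2020, 2, 1)],
    }
)

buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="xlsxwriter") as writer:
    df.to_excel(writer, sheet_name="Sheet1", index=False)
    workbook = writer.book
    worksheet = writer.sheets["Sheet1"]

    date_fmt = workbook.add_format({"num_format": "dd.mm.yyyy"})
    col = df.columns.get_loc("day")

    # Rewrite every data cell in the column; row 0 holds the header.
    for row, value in enumerate(df["day"], start=1):
        worksheet.write_datetime(row, col, value.to_pydatetime(), date_fmt)

    # Width only; the number format now lives on each cell.
    worksheet.set_column(col, col, 12)
```

Because xlsxwriter cannot read back cells it has already written, the values are taken from the DataFrame again rather than from the worksheet.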
Thanks @jmcnamara!
Makes sense - but I would think just not as the default. I'll put up a PR for this and improve some of the docs. |
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
I created an Excel sheet using pd.to_excel(). The data contains one column with a float number and one column with a datetime.datetime object.
Then, I want to set the column type using the num_format parameter in set_column.
This works for the float value, which I want in the percentage format.
However, it does not work for the datetime value. The value is in the wrong format "01.01.2020 00:00:00" (German date). The value in the Excel file is in the same wrong format, whether I run set_column() for the date column or not.
I have tried to narrow down the problem by trying other variations, which turn out to work. However, those would only be workarounds for our use case.
Using the parameter datetime_format='dd.mm.yyyy' in the pd.ExcelWriter constructor puts the date into the right format.
Instead of writing the sheet from a DataFrame, using the xlsxwriter functions works fine:
Expected Behavior
The value is in the format "01.01.2020" when using set_column with pd.ExcelWriter.
Installed Versions
pandas : 2.1.0
numpy : 1.26.0
pytz : 2023.3
dateutil : 2.8.2
setuptools : 65.7.0
pip : 22.0.4
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 1.3.3
lxml.etree : None
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.5.0
pandas_datareader : None
bs4 : 4.12.2
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.1
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 3.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : None
tables : None
tabulate : 0.8.7
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None