ENH: Include df.attrs in to_csv output #53577
Comments
Can this be done by writing the |
I believe adding this metadata and storing it in It sounds like using |
Great. IMO it would make sense to have functionality that can read EDIT: I've changed the issue title to reflect that this issue has been changed to be about writing attrs metadata to csv. |
Looks like this was pretty much completed but never merged? What needs to be done to make it happen? I would love to have this feature!
I would be happy to update and re-open my PR if there is interest in having it merged in.
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
There are many use cases (especially in the scientific community) where the best/only course of action is to embed configuration parameters and/or other metadata at the beginning of the CSV file itself. These lines are typically prefixed with a comment marker such as #. This maintains human readability while attaching the metadata to the generated file itself.
Pandas' `read_csv` method already implements a feature to read such files and ignore these lines when parsing the data into a DataFrame. This new feature would implement the complement: it would allow users to write these metadata and/or comment lines in their CSV outputs as well. This could be accomplished with file handles (thanks @twoertwein).
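A minimal sketch of the file-handle approach, assuming the comment lines are written to the handle before `to_csv` is called on it. The metadata keys and the use of an in-memory `io.StringIO` buffer in place of a real file are illustrative assumptions, not taken from the issue:

```python
import io

import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [0.5, 1.5, 2.5]})

# Hypothetical metadata to embed at the top of the file.
metadata = {"experiment": "demo", "sample_rate_hz": 100}

# StringIO stands in for a real handle such as open("out.csv", "w", newline="").
buffer = io.StringIO()
for key, value in metadata.items():
    buffer.write(f"# {key}: {value}\n")  # one comment line per metadata entry
df.to_csv(buffer, index=False)  # the data follows the comment lines

csv_text = buffer.getvalue()

# Reading back: comment="#" tells read_csv to skip the metadata lines.
round_tripped = pd.read_csv(io.StringIO(csv_text), comment="#")
```

With this approach the metadata survives in the file for human readers, while `read_csv(..., comment="#")` recovers only the tabular data.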
However, adding a `comment` parameter to `to_csv` would better match the `read_csv` method.
Feature Description
A new function would be implemented to write comment lines using the csv writer.
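As a rough illustration only, such a function might look like the sketch below. The name `write_comment_lines`, its signature, and the default prefix are hypothetical and are not the code from the linked PR; it uses only the standard-library `csv` module:

```python
import csv
import io


def write_comment_lines(writer, comments, prefix="#"):
    """Hypothetical helper: write each comment as a single-field row."""
    for comment in comments:
        writer.writerow([f"{prefix} {comment}"])


buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

# Comment lines go first, then the header and data rows.
write_comment_lines(writer, ["source: sensor A", "units: metres"])
writer.writerow(["x", "y"])
writer.writerow([1, 2])

output = buffer.getvalue()
```

One caveat with this sketch: because the comment passes through the csv writer as a field, a comment containing the delimiter or quote character would be quoted in the output, so a real implementation would need to decide how to handle that case.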
This could then be called in the `_save` method.
Alternative Solutions
Technically, the file-handle approach mentioned above would satisfy this feature request. However, the feature would be easier for users to discover if it mirrored the `read_csv` API.
An alternative, more complex, but perhaps more flexible solution would be to store the comment lines in the DataFrame object itself, with a flag to write them automatically when `to_csv` is called. This way, the comments would be guaranteed to be written whenever `to_csv` is called, even in systems where the mechanism that writes the DataFrame to disk is abstracted away from the user's code, for example when the pandas/Python code is run by a job submission/scheduling system.
Additional Context
I was a little excited and already created a PR for this feature: #53569
Apologies! I should have started here first. I am happy to close or modify it as needed.