ENH: read parquet files in chunks using read_parquet and chunksize #55973
Comments
Hi @match-gabeflores, I would like to take a look at this issue. Can you please assign it to me?
Thanks, go for it! Unfortunately, I don't have access to assign it.
Hello @match-gabeflores, my project team is looking for pandas enhancement features for our semester-long grad school project. We saw this task and would like to contribute if possible. We also noticed that @RahulDubey391 mentioned a few months ago that he wanted to work on this feature; however, if no one is currently working on it, we would like to pick it up.
Go for it, @Meadiocre! I don't have access to assign, but I think that's just a formality anyway. @lithomas1
take |
Hello @match-gabeflores, |
take |
Hello @match-gabeflores, thanks!
Feature Type
- [x] Adding new functionality to pandas
- [ ] Changing existing functionality in pandas
- [ ] Removing existing functionality in pandas
Problem Description
Similar to how `read_csv` has a `chunksize` parameter, can the `read_parquet` function have `chunksize`? It seems possible using pyarrow via `iter_batches`: https://stackoverflow.com/questions/59098785/is-it-possible-to-read-parquet-files-in-chunks
Is this something feasible within pandas?
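For reference, this is the existing chunked interface on `read_csv` that the request mirrors; `data.csv` is a placeholder path, not from the original issue:

```python
import pandas as pd

# Existing pandas behavior: read_csv with chunksize returns an iterator
# of DataFrames rather than loading the whole file at once.
with pd.read_csv("data.csv", chunksize=100_000) as reader:
    for chunk in reader:
        print(chunk.shape)  # per-chunk work goes here
```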
Feature Description
Add a new `chunksize` parameter to `read_parquet`.
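A sketch of what the proposed usage could look like; note that this `chunksize` parameter is hypothetical and does not exist on `read_parquet` today, it simply mirrors the `read_csv` interface:

```python
import pandas as pd

# Hypothetical usage of the proposed parameter; chunksize does NOT
# currently exist on read_parquet. "data.parquet" is a placeholder path.
for chunk in pd.read_parquet("data.parquet", chunksize=100_000):
    print(chunk.shape)  # per-chunk work goes here
```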
Alternative Solutions
Use pyarrow's `iter_batches`.
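A minimal sketch of the workaround available today, assuming a local file at the placeholder path `data.parquet`:

```python
import pyarrow.parquet as pq

# Workaround available today: stream record batches with pyarrow and
# convert each batch to a pandas DataFrame.
parquet_file = pq.ParquetFile("data.parquet")
for batch in parquet_file.iter_batches(batch_size=100_000):
    chunk = batch.to_pandas()  # pandas DataFrame for this batch
    print(chunk.shape)         # per-chunk work goes here
```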
Additional Context
No response