This repository has been archived by the owner on Jan 11, 2021. It is now read-only.

HDFS Support? #93

Open
andygrove opened this issue Apr 17, 2018 · 1 comment

@andygrove
Contributor

I'm curious if there are plans to support HDFS within this crate?

The Java Parquet library can read Parquet files either locally or from HDFS. In both cases it can push down the projection and retrieve only the columns needed, which can make a huge difference in performance.

@sunchao
Owner

sunchao commented Apr 18, 2018

Yes, that's in the plan. I took a brief look at the hdfs crate, and I think it should be relatively easy to have HdfsFile implement the Read trait. With that in place, it seems only a small code change would be required.
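As a rough illustration of the idea, here is a minimal sketch of wrapping a file handle so that it implements `std::io::Read`. Everything here is hypothetical: `MockHdfsFile`, its `read_at` method, and the `HdfsReader` adapter are stand-ins invented for this example, and the real hdfs crate's `HdfsFile` may expose a different API.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for an HDFS file handle. It only models a
/// positioned read; the real `HdfsFile` type may look quite different.
struct MockHdfsFile {
    data: Vec<u8>,
}

impl MockHdfsFile {
    /// Copies bytes starting at `offset` into `buf` and returns how many
    /// bytes were copied (0 at or past end of file).
    fn read_at(&self, offset: u64, buf: &mut [u8]) -> io::Result<usize> {
        let start = (offset as usize).min(self.data.len());
        let n = buf.len().min(self.data.len() - start);
        buf[..n].copy_from_slice(&self.data[start..start + n]);
        Ok(n)
    }
}

/// Hypothetical adapter that tracks a cursor so the handle can be used
/// anywhere a `Read` is expected.
struct HdfsReader {
    file: MockHdfsFile,
    pos: u64,
}

impl Read for HdfsReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.file.read_at(self.pos, buf)?;
        self.pos += n as u64; // advance the cursor by the bytes consumed
        Ok(n)
    }
}

/// Drains the whole file through the `Read` implementation.
fn read_all(bytes: &[u8]) -> io::Result<Vec<u8>> {
    let mut reader = HdfsReader {
        file: MockHdfsFile { data: bytes.to_vec() },
        pos: 0,
    };
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    Ok(out)
}
```

Once a type implements `Read`, generic helpers such as `read_to_end` and `read_exact` come for free, which is presumably why only a small change would be needed on the reader side.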

I think projection pushdown is orthogonal to the HDFS issue. The current record reader already supports it.
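To show why projection pushdown is independent of where the bytes come from, here is a simplified sketch that reads only the requested columns from any source. It is not this crate's record reader: `ColumnChunk` and `read_projected` are hypothetical names, the metadata is reduced to an offset and a length, and the sketch assumes the source supports `Seek` as well as `Read`, which the thread does not state.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical, simplified column-chunk metadata: where one column's
/// bytes live inside the file.
struct ColumnChunk {
    offset: u64,
    len: usize,
}

/// Reads only the columns named in `projection`, seeking past all others.
/// Works for any source that is `Read + Seek`, local or remote.
fn read_projected<R: Read + Seek>(
    src: &mut R,
    chunks: &HashMap<&str, ColumnChunk>,
    projection: &[&str],
) -> io::Result<HashMap<String, Vec<u8>>> {
    let mut out = HashMap::new();
    for name in projection {
        let chunk = chunks.get(name).ok_or_else(|| {
            io::Error::new(io::ErrorKind::NotFound, format!("no column {}", name))
        })?;
        // Jump straight to the column's bytes; unselected columns are never read.
        src.seek(SeekFrom::Start(chunk.offset))?;
        let mut buf = vec![0u8; chunk.len];
        src.read_exact(&mut buf)?;
        out.insert(name.to_string(), buf);
    }
    Ok(out)
}
```

Because the function is generic over the source, it can be exercised with an in-memory `std::io::Cursor`, and the same code path would apply to a file handle backed by HDFS.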

@sunchao sunchao self-assigned this May 10, 2018