Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a to_spatial method on ParquetSource #11

Closed
wants to merge 7 commits into from

Conversation

jsignell
Copy link
Member

Fixes #10

@@ -106,5 +106,32 @@ def _to_dask(self):
self._load_metadata()
return self._df

def to_spatial(self):
"""
Create a datashader spatial object from the parquet data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs a lot more description, and certainly a link to wherever the general idea is described (e.g., how do you make these special parquet datasets?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

# Load metadata
props = json.loads(pf.key_value_metadata['SpatialPointsFrame'])
else:
props = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if no?

props = None

# Call DataFrame constructor with the internals of frame
return SpatialPointsFrame(frame.dask, frame._name, frame._meta,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this can fail in various way?
I suppose the point is to pass this immediately to graphics, would it make sense to (optionally) construct a datashader/holoviews object right here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should have made this clearer, I copied all of this straight from: https://github.com/pyviz/datashader/blob/master/datashader/spatial/points.py#L297-L312

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so we just defer everything to whatever that interface does... I guess then the docstring should be caveated to say that whatever goes wrong, it's not Intake's fault!

conda/meta.yaml Outdated
@@ -30,6 +30,7 @@ test:
- pytest
- pytest-cov
- coverage
- datashader==0.7.0
Copy link
Member

@martindurant martindurant May 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right. The test skips without datashader, so it's not required; the things here are those needed to be able to run the tests. You did this because there is no separate "testing environment" and tests run via conda-build?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I was trying to run the tests on CI. We should probably set up this repo properly so that there is separate testing env.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can that fit in here? Could be as simple as

conda instal ...
pytest ...

in travis file, instead of the conda build line. Since we don't need to auto-upload, makes sense to me.


frame = self._df or self.to_dask()
urlpath = self._get_cache(self._urlpath)[0]
pf = fp.ParquetFile(urlpath)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this probably doesn't work for remote. Is there a good way to get key_value_metadata off the dask version?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No there isn't. You need slightly different things to work with FP directly, but it can be done.

@martindurant
Copy link
Member

I didn't notice the new commits, sorry. Any idea why this isn't testing?

@jsignell
Copy link
Member Author

I suppose just travis flakiness.

@jsignell
Copy link
Member Author

I was doing a little spring cleaning and came across this PR. Closing since it's 4 years old, although it looks like another option might be to merge it?

@jsignell jsignell closed this Jun 27, 2023
@jsignell jsignell deleted the jsignell/spatial branch June 27, 2023 11:53
@martindurant
Copy link
Member

Thanks @jsignell . I think "spatial parquet" is being replaced by geopandas related things and/or awkward array, so there is no need for this any more. Plus, you MUST check out intake 2.0, which will make all these to_* methods explicit.

@jsignell
Copy link
Member Author

I can't wait! Intake-stac is dying to be rewritten once intake 2.0 is ready :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add datashader.SpatialDataFrame reader
2 participants