Support SQL-Style Joins between Xarray datasets and Dask/Pandas dataframes #5
Comments
This should be possible to demo once #8 is complete. If we figure this out, we should document it in the README.
I’ve been reading more about how this is done today. The best example I can find of joining rasters with point data (and vectors) uses a hierarchical spatial index like h3 or s2 (https://github.com/uber/h3-py-notebooks/blob/master/notebooks/unified_data_layers.ipynb). I wonder if this is the technique that underpins Fused.io.
For non-geospatial data, could we use a kdtree to create a hierarchical index? 🤔
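To make the kdtree idea concrete, here is a minimal pure-Python sketch of a k-d tree that matches table rows to the nearest array cell. Everything in it (the `build_kdtree` and `nearest` helpers, the cell coordinates, the row ids) is hypothetical and only illustrates the lookup; it is not how this library does or will do it. In practice something like `scipy.spatial.KDTree` would likely be a better fit than a hand-rolled tree.

```python
from math import dist


def build_kdtree(points, depth=0):
    """Recursively build a k-d tree from (coords, payload) pairs."""
    if not points:
        return None
    axis = depth % len(points[0][0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, target, best=None):
    """Return the (coords, payload) pair closest to `target`."""
    if node is None:
        return best
    coords, _ = node["point"]
    if best is None or dist(coords, target) < dist(best[0], target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - coords[axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, target, best)
    # Only search the far side if the splitting plane is closer than the best match.
    if abs(diff) < dist(best[0], target):
        best = nearest(far, target, best)
    return best


# Hypothetical array cells: (coordinates, value) pairs on arbitrary, non-geospatial axes.
cells = [((0.0, 0.0), 1.5), ((0.0, 1.0), 2.5), ((1.0, 0.0), 3.5), ((1.0, 1.0), 4.5)]
# Hypothetical table rows: (row id, coordinates).
rows = [("r1", (0.1, 0.9)), ("r2", (0.8, 0.2))]

tree = build_kdtree(cells)
# "Join" each row to the value of its nearest cell.
joined = [(row_id, nearest(tree, coords)[1]) for row_id, coords in rows]
print(joined)
```

The tree is built once over the array coordinates, so each table row costs roughly a logarithmic lookup instead of a scan over every cell.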
This podcast episode is incredibly validating of the use case that this library (and issue) solves.
https://github.com/DahnJ/H3-Pandas gives me more confidence that an index system (geospatial via s2 and h3, or pre-computed via kdtrees) is a good integration point. To me, it is proof of demand for such features.
Here's an example workflow that I'd like to support once this feature exists. This is from Jake Wall of the Mara Elephant Project. Here, he would make use of raster and table data from Earth Engine.
I imagine this would look like a left join from a Dask DataFrame holding the elephant coordinates to an EE ImageCollection opened with Xee via Qarray. Some details are fuzzy, like how we'd inject a nearest-neighbor (NN) lookup (maybe this could be done via a SQL aggregation?).
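As a rough illustration of the shape of that query, here is a sketch using Python's built-in `sqlite3` as a stand-in for the real stack (no Dask, Xee, Earth Engine, or Qarray involved). It assumes the raster sits on a regular grid, so the NN lookup reduces to rounding each point's coordinates to the nearest cell index inside the join condition. The table names, grid origin, cell size, and sample values are all made up.

```python
import sqlite3

# Hypothetical 3x3 raster on a regular grid: origin (lat0, lon0), cell size `step`.
lat0, lon0, step = 0.0, 30.0, 0.5
raster = [
    [10.0, 11.0, 12.0],
    [20.0, 21.0, 22.0],
    [30.0, 31.0, 32.0],
]

# Hypothetical point observations (e.g. GPS collar fixes).
points = [
    ("a", 0.10, 30.05),  # nearest cell is (0, 0)
    ("b", 0.55, 30.90),  # nearest cell is (1, 2)
    ("c", 5.00, 40.00),  # falls outside the raster
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE raster (i INTEGER, j INTEGER, value REAL)")
con.execute("CREATE TABLE points (id TEXT, lat REAL, lon REAL)")
# Flatten the raster into one row per cell, keyed by its grid indices.
con.executemany(
    "INSERT INTO raster VALUES (?, ?, ?)",
    [(i, j, v) for i, row in enumerate(raster) for j, v in enumerate(row)],
)
con.executemany("INSERT INTO points VALUES (?, ?, ?)", points)

# Left join: every point is kept; points with no matching cell get NULL.
rows = con.execute(
    """
    SELECT p.id, r.value
    FROM points AS p
    LEFT JOIN raster AS r
      ON r.i = CAST(ROUND((p.lat - ?) / ?) AS INTEGER)
     AND r.j = CAST(ROUND((p.lon - ?) / ?) AS INTEGER)
    ORDER BY p.id
    """,
    (lat0, step, lon0, step),
).fetchall()
print(rows)
```

Snapping to the grid only works for regularly spaced coordinates; irregular grids would need a real NN step, such as a tree index or a distance-based aggregation.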
In general, I think there is broad demand for being able to join raster and tabular data with each other. Further down the line, I bet we could implement geo-aware joins that make use of geometry.