Use Tile Matrix Set to describe multiscales #44
Do you have a link to the OGC Tile Matrix Set standard? I am currently working on https://github.com/JuliaDataCubes/PyramidScheme.jl, a Julia package for generating and working with pyramid datasets, mainly for plotting, and I aim to be compliant with geozarr in reading and writing these datasets. I will have a more in-depth look in the next few days and try to implement this standard in Julia.
Co-authored-by: Felix Cremer <[email protected]>
Here is the link: https://docs.ogc.org/is/17-083r4/17-083r4.html
Great! I've added a few suggestions and inline questions.
Within the Tile Matrix Set
* the Tile Matrix identifier for each zoom level MUST be the relative path to the Zarr group which holds the DataArray variable
* zoom levels MUST be provided from lowest to highest resolutions
* the `supportedCRS` attribute of the Tile Matrix Set MUST match the crs information defined under **grid_mapping**.
Do we need the duplication?
There is no `supportedCRS` attribute in 2DTMS 2.0.
See also the Release Notes explaining how `supportedCRS` was changed to `crs`, as well as 6.2.1.1. TileMatrixSet CRS Compatibility explaining that a 2DTMS is compatible with multiple CRSs (e.g., CRSs that only change axis order, as that does not affect the tiling at all; CRSs that add an additional dimension, e.g. CRS84 vs CRS84h; as well as realizations of a datum ensemble).
* MAY list the min and max rows and columns for each zoom level. If omitted, it is assumed that the entire spatial extent is covered (resulting in higher chunk count of the DataArray).
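To make the effect of the optional limits concrete, here is a minimal sketch of how the chunk (tile) count for one zoom level could be computed with and without limits. The helper `tile_count` is hypothetical, and the key names `minTileRow`, `maxTileRow`, `minTileCol`, and `maxTileCol` are assumed to follow the OGC TileMatrixSetLimits naming; the diff above does not spell them out.

```python
from typing import Mapping, Optional


def tile_count(matrix_width: int, matrix_height: int,
               limits: Optional[Mapping[str, int]] = None) -> int:
    """Number of tiles (chunks) at one zoom level (hypothetical helper).

    Without limits the whole tile matrix is assumed to be covered.
    With limits only the inclusive row/column range is counted.
    """
    if limits is None:
        return matrix_width * matrix_height
    rows = limits["maxTileRow"] - limits["minTileRow"] + 1
    cols = limits["maxTileCol"] - limits["minTileCol"] + 1
    return rows * cols


# An 8 x 8 tile matrix, first without and then with limits.
full = tile_count(8, 8)
limited = tile_count(8, 8, {"minTileRow": 2, "maxTileRow": 3,
                            "minTileCol": 4, "maxTileCol": 6})
```

Here `full` covers all 64 tiles, while `limited` covers only the 2 rows by 3 columns inside the limits.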
#### Resampling Method
Resampling Method specifies which resampling method is used for generating multiscales. It MUST be one of the following string values. Resampling method MUST be the same across all zoom levels:
Are the options constrained by xarray or tms? It would be good to link to a source that details the options (e.g. could I use the 20th percentile (though not sure why)).
I took the list from rasterio. But looking at this again, I think this should probably be an implementation detail. It will be enough to say that it must be of type string and the same across all zoom levels
Is the same syntax in the Tile Matrix Set spec? I searched and couldn't find anything. Is this also a requirement for COGs?
No, resampling is not part of TMS or COGs.
This property was already part of the current specs. It is useful for assuring consistency when progressively adding data to the same zarr store. Otherwise you might end up with overview chunks that were resampled using different methods.
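As a rough illustration of that consistency check, the sketch below validates a `multiscales` attribute and refuses an append that uses a different resampling method. The function names `check_multiscales` and `check_append` are hypothetical and not part of any library; the field names come from the proposed spec text, and the check only assumes that `resampling_method` is a string, as suggested earlier in this thread.

```python
from typing import Any, Mapping

# Fields the proposal lists as mandatory; tile_matrix_set_limits is optional.
REQUIRED_FIELDS = ("tile_matrix_set", "resampling_method")


def check_multiscales(attrs: Mapping[str, Any]) -> str:
    """Validate the 'multiscales' attribute and return its resampling method."""
    multiscales = attrs.get("multiscales")
    if not isinstance(multiscales, Mapping):
        raise ValueError("missing 'multiscales' attribute")
    for field in REQUIRED_FIELDS:
        if field not in multiscales:
            raise ValueError(f"missing required field: {field}")
    method = multiscales["resampling_method"]
    if not isinstance(method, str):
        raise TypeError("resampling_method must be a string")
    return method


def check_append(existing_attrs: Mapping[str, Any], new_method: str) -> None:
    """Reject new overview data resampled differently from the stored data."""
    existing = check_multiscales(existing_attrs)
    if existing != new_method:
        raise ValueError(
            f"store uses '{existing}' but new data uses '{new_method}'"
        )


attrs = {
    "multiscales": {
        "tile_matrix_set": "https://schemas.opengis.net/tms/2.0/json/examples/tilematrixset/WebMercatorQuad.json",
        "resampling_method": "nearest",
    }
}
check_append(attrs, "nearest")  # passes silently
```

Calling `check_append(attrs, "average")` on the same store would raise a `ValueError`, which is the situation the property is meant to prevent.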
+ ]
+ "multiscales":
- {
- "tile_matrix_set": "https://schemas.opengis.net/tms/2.0/json/examples/tilematrixset/WebMercatorQuad.json",
I like how concise this becomes!
I assume this supports any TMS, e.g. one based on an equal area projection such as epsg:6933.
Yes, this should work with most CRS as long as you can reference it via URL (such as EPSG), represent it as WKT, or in the ISO 19115 standard.
Well-known Tile Matrix Sets are listed here: https://schemas.opengis.net/tms/2.0/json/examples/tilematrixset/
But you can always define your own.
That examples directory is just some examples -- not the official registry.
2DTMS can be registered in any registry, but the OGC registry of 2DTMS is at:
http://www.opengis.net/def/tms
which is sourced from
https://github.com/opengeospatial/2D-Tile-Matrix-Set/tree/master/registry
(the OGC API - Tiles Standard Working Group will review and consider submissions to register new 2DTMS in that repository).
Co-authored-by: Wietze <[email protected]>
@thomas-maschler we discussed this in the SWG meeting today: before approving PRs like this, it would be helpful to have an example Zarr store to test interoperability. Can you provide one? A few of us are available for testing.
- } | ||
+} | ||
``` | ||
#### Using a URI
I wonder if we should lean towards recommending well-known or explicit identifiers instead of URIs, especially to maintain a 'self-describing' format
The official way to reference well-known 2DTMS for 2DTMS 2.0 is using a URI.
@briannapagan, initially I agreed with @maxrjones that he would give it a first try; he was planning to add some extra functionality to ndpyramids. However, if he hasn't managed to find the time for this, I should be able to do that and create some example Zarr stores with different overview layouts/TMS.
My apologies, I haven't found time for this yet.
If implemented, each DataArray MUST define the 'multiscales' metadata attribute which includes the following fields:
* `tile_matrix_set`
* `tile_matrix_set_limits` (optional)
* `resampling_method`
Why is this called `resampling_method`?
I think of this as an aggregation method, because the high resolution data is aggregated to coarser resolutions to make visualisation easier.
From my point of view, when data is made less detailed, the process isn't just about combining numbers. Resampling specifically refers to the techniques used to interpolate or approximate pixel values as data is transformed to a different spatial resolution (nearest neighbour, bilinear, cubic). The choice of resampling method can greatly influence the quality and interpretive value of the final imagery. While the word "aggregation" might make you think of just adding or averaging numbers, "resampling" points to a broader set of actions and shows that there's more complexity in working with spatial data (e.g. weighted average of the four nearest pixels).
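The difference between the two views can be illustrated with a small, dependency-free sketch that halves the resolution of a grid in two ways. The function `downsample_2x` is hypothetical and written only for this example; the "nearest" branch simply takes the top-left pixel of each 2 x 2 block, which is one of several possible conventions, and real libraries offer more methods (bilinear, cubic, and so on).

```python
from typing import List, Sequence


def downsample_2x(grid: Sequence[Sequence[float]], method: str) -> List[List[float]]:
    """Halve the resolution of a 2D grid with even dimensions.

    "nearest" keeps one existing pixel per 2 x 2 block (top-left here, by assumption).
    "average" replaces each 2 x 2 block by the mean of its four pixels.
    """
    out = []
    for r in range(0, len(grid) - 1, 2):
        row = []
        for c in range(0, len(grid[0]) - 1, 2):
            block = [grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1]]
            if method == "nearest":
                row.append(block[0])
            elif method == "average":
                row.append(sum(block) / 4)
            else:
                raise ValueError(f"unsupported method: {method}")
        out.append(row)
    return out


grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
nearest = downsample_2x(grid, "nearest")   # [[1, 3], [9, 11]]
average = downsample_2x(grid, "average")   # [[3.5, 5.5], [11.5, 13.5]]
```

Both results describe the same area at the coarser resolution, but only the "average" variant combines values; "nearest" selects existing pixels, which is why a single `resampling_method` field has to cover more than aggregation.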
We have some folks interested in having a dedicated discussion about this PR and understanding some of its implications. Can @maxrjones @thomas-maschler @felixcremer @wietzesuijker join our next bi-weekly zarr call?
I most likely won't be able to attend this week's geozarr call, since Wednesday is a public holiday in Germany. I worked on implementing the multiscale functionality in PyramidScheme.jl and I am more and more convinced that the multiscale specification should be independent from the geozarr specification. Building pyramids of a dataset is not restricted to geospatial data but is also
As a side note, a source of recurrent confusion when implementing TMS for GeoZarr was that in TMS the concept of "Tiles" is a central part of the specification. In contrast, the Zarr specs present n-dimensional arrays to the user which can be seen as one entity and where the chunking structure is rather an (important) implementation detail.
In practice this means that when users query a subset of a zarr array
Hi Felix, in not being restricted to geospatial data, this is similar to many aspects covered by GeoZarr, which aims to reuse existing standards (such as OGC Tile Matrix Set) and indicate which location/placeholder must be used in the encoding. However, it is important to note that the pyramid structure is a key aspect for GeoZarr, as it aims to offer functions equivalent to alternative formats such as COG within the Zarr format. Additionally, pyramid structures for Earth Observation (EO) data have their own particularities, such as resolution, compared to geospatially agnostic pyramids.
Feel free to check the playlist below for a demonstration of the pyramids encoded in Zarr datasets: https://www.youtube.com/watch?v=NYhh66EstnY&list=PLzPGC4s5HQOPdeLoK1MXK6gEa1x2Az8Dn
@briannapagan is it still worth joining tomorrow's call? I will only be able to join during the second half. If this can wait another two weeks, I might be able to prepare a POC.
I'm following this, but sadly it's unlikely I can make the call.
I received a calendar notice that the meeting was moved to next week; unfortunately I am unavailable on May 8th at 11 ET but could join in two weeks.
Seems accurate enough to me
I must say that I am very sad to see this PR merged despite the concerns and discussions initiated, especially by @felixcremer. The post contained some concrete technical discussion points, which were replied to with links to very general PR-like YouTube material, without trying to listen to the community feedback of people actually trying to implement the new portions of the specs proposed here. I don't view this process as healthy community integration. In the future our group will continue to use the multiscale conventions proposed here https://forum.image.sc/t/multiscale-arrays-v0-1/37930 and not implement the spec proposed here.
Thank you for sharing your concerns. We appreciate the insights and the technical feedback raised by the community, especially from those actively working to implement these specifications. I want to clarify that this PR doesn’t modify the correct file, which itself is due for restructuring. The discussion remains open, and I'd encourage you and your group to attend the meetings, where we address these points directly.
This PR implements the changes discussed in #30 and during the Zarr Sprint on Feb. 8, 2024 (participants: @maxrjones and @thomas-maschler).
It refactors the current `multiscales` metadata attribute and replaces the current dataset definition with the OGC Two Dimensional Tile Matrix Set standard. This change will allow for more flexibility when defining the layout of multiscales and embrace already existing standards instead of reinventing the wheel.
The Tile Matrix Set standard includes all information currently covered by the dataset definition and includes additional information on chunk layout, pixel size, and origin of the matrix.
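To give a feel for the extra information a Tile Matrix Set carries, the sketch below derives the per-level entries of a WebMercatorQuad-style pyramid. It assumes the usual WebMercatorQuad parameters (256-pixel tiles, a square extent of half-width pi * 6378137 m, one tile at level 0) and uses the 2DTMS 2.0 JSON key names `id`, `cellSize`, `pointOfOrigin`, `tileWidth`, `tileHeight`, `matrixWidth`, and `matrixHeight`. The helper `tile_matrix` is hypothetical, and using the zoom level as the Zarr group path is an assumption made for illustration.

```python
import math

EARTH_RADIUS_M = 6378137.0               # WGS 84 semi-major axis
TILE_SIZE = 256                          # pixels per tile side (chunk shape)
HALF_EXTENT = math.pi * EARTH_RADIUS_M   # half-width of the Web Mercator square


def tile_matrix(zoom: int) -> dict:
    """Describe one zoom level of a WebMercatorQuad-style tile matrix set."""
    tiles_per_side = 2 ** zoom
    return {
        "id": str(zoom),                              # assumed Zarr group path
        "cellSize": 2 * HALF_EXTENT / (TILE_SIZE * tiles_per_side),  # pixel size in m
        "pointOfOrigin": (-HALF_EXTENT, HALF_EXTENT),  # top-left corner of the matrix
        "tileWidth": TILE_SIZE,
        "tileHeight": TILE_SIZE,
        "matrixWidth": tiles_per_side,
        "matrixHeight": tiles_per_side,
    }


# Zoom levels listed from lowest to highest resolution.
levels = [tile_matrix(z) for z in range(4)]
```

Each entry gives the chunk layout (`tileWidth`, `tileHeight`, `matrixWidth`, `matrixHeight`), the pixel size (`cellSize`), and the origin (`pointOfOrigin`), which is the additional information the previous dataset definition did not carry.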