Use Tile Matrix Set to describe multiscales #44

Merged
merged 8 commits into from
Nov 4, 2024
Changes from 4 commits
173 changes: 148 additions & 25 deletions geozarr-spec.md
@@ -73,7 +73,7 @@ The following standard names are recommended to describe coordinates variables f

### Coordinate Reference System

The **grid_mapping** CF variable defined by DataArray variable defines the coordinate reference system (CRS) used for the horizontal spatial coordinate values. The grid_mapping value indicates the Auxiliary variable that holds all the CF attributes describing the CRS.
The **grid_mapping** CF variable defined by DataArray variable defines the coordinate reference system (CRS) used for the horizontal spatial coordinate values. The grid_mapping value indicates the Auxiliary variable that holds all the CF attributes describing the CRS.

### Other CF Properties

@@ -85,44 +85,167 @@ All other CF conventions are recommended, in particular the attributes below:

## Multiscales

A GeoZarr Dataset variable might includes multiscales for a set of DataArray variables. Also known as overviews, multiscales provides download-scaled versions of the original image and represent zoomed out version of the image for fast visualisation purposes. A zoomed out version of the original image thus holds much less detail.
A GeoZarr Dataset variable might include multiscales for a set of DataArray variables. Also known as "overviews", multiscales provide resampled copies of the original data at coarser resolutions. Multiscales thus always hold less detail than the original data. Common use cases for multiscales are fast rendering for visualization purposes and analyzing data at multiple resolutions.

### Multiscales Encoding

Multiscales MUST be encoded in children groups.
Multiscales MUST be encoded in child groups. Data at all scales MUST use the same coordinate reference system and MUST follow one common zoom level strategy. The zoom level strategy is modelled closely on the [OGC Two Dimensional Tile Matrix Set](https://docs.ogc.org/is/17-083r4/17-083r4.html) version 2 and the [Tiled Asset STAC extension](https://github.com/stac-extensions/tiled-assets). Each zoom level is described by a Tile Matrix defining the number, layout, origin and pixel size of the included tiles. These tiles MUST correspond to the chunk layout along the `lat` and `lon` dimensions of the DataArray within a given group.

* Multiscale group name is the zoom level (e.g. '0').
* Multiscale group contains all DataArrays generated for this specific zoom level.
* Zoom level strategy is based on defacto standard level 0 as 256x256 pixels covering the entire world, and scale doubled on each level as per https://wiki.openstreetmap.org/wiki/Zoom_levels.
* Multiscale chunking is RECOMMENDED to be 256 pixels or 512 pixels for the latittude and longitude dimensions.
* Multiscale group name is the zoom level identifier (e.g. '0').
* Multiscale group contains all DataArrays generated for this specific zoom level.
* Multiscale chunking is RECOMMENDED to be 256 pixels or 512 pixels for the latitude and longitude dimensions.
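
To illustrate how the per-zoom-level groups relate to each other, the sketch below builds coarser copies of a small grid by averaging 2x2 blocks and keys each copy by its zoom level identifier. It assumes a factor of two between adjacent levels (as in WebMercatorQuad) and uses plain nested lists instead of Zarr arrays. `coarsen_2x2_average` and `build_multiscales` are hypothetical helper names, not part of GeoZarr or any library.

```python
def coarsen_2x2_average(grid):
    """Average non-overlapping 2x2 blocks of a 2D list (even dimensions assumed)."""
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("grid dimensions must be even")
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def build_multiscales(full_resolution, max_zoom):
    """Return {zoom level identifier: grid}; the highest zoom holds the original data."""
    levels = {str(max_zoom): full_resolution}
    grid = full_resolution
    # Walk from the finest level down to zoom level 0, halving the resolution each time.
    for zoom in range(max_zoom - 1, -1, -1):
        grid = coarsen_2x2_average(grid)
        levels[str(zoom)] = grid
    return levels


data = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pyramid = build_multiscales(data, max_zoom=2)
```

Here `pyramid["2"]` is the original 4x4 grid, `pyramid["1"]` a 2x2 grid and `pyramid["0"]` a single averaged value. In a real store each entry would be a DataArray inside the group named after the key.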

### Multiscales Metadata

Each DataArray MUST define the 'multiscales' metadata attribute that provides the multiscales group path :
If implemented, each DataArray MUST define the 'multiscales' metadata attribute which includes the following fields:
* `tile_matrix_set`
* `tile_matrix_set_limits` (optional)
* `resampling_method`

Why is this called resampling_method?
I think of this as an aggregation method, because the high resolution data is aggregated to coarser resolutions to make visualisation easier.

From my point of view, when data is made less detailed, the process isn't just about combining numbers. Resampling specifically refers to the techniques used to interpolate or approximate pixel values as data is transformed to a different spatial resolution (nearest neighbour, bilinear, cubic). The choice of resampling method can greatly influence the quality and interpretive value of the final imagery. While the word "aggregation" might make you think of just adding or averaging numbers, "resampling" points to a broader set of actions and shows that there's more complexity in working with spatial data (e.g. weighted average of the four nearest pixels).



#### Tile Matrix Set
Tile Matrix Set can be:
* the name of a well-known Tile Matrix Set. Well-known Tile Matrix Sets are listed [here](https://schemas.opengis.net/tms/2.0/json/examples/tilematrixset/).
* the URI of a JSON document describing the Tile Matrix Set following the OGC standard.
* a JSON object describing the Tile Matrix Set following the OGC standard (keys use camelCase!).

Within the Tile Matrix Set:
* the Tile Matrix identifier for each zoom level MUST be the relative path to the Zarr group which holds the DataArray variable
* zoom levels MUST be provided from lowest to highest resolutions
* the `supportedCRS` attribute of the Tile Matrix Set MUST match the crs information defined under **grid_mapping**.

Do we need the duplication?

There is no supportedCRS attribute in 2DTMS 2.0.

See also the Release Notes explaining how supportedCRS was changed to crs, as well as 6.2.1.1. TileMatrixSet CRS Compatibility explaining that a 2DTMS is compatible with multiple CRSs (e.g., CRSs that only change axis order, as that does not affect the tiling at all; CRSs that add an additional dimension, e.g. CRS84 vs CRS84h; as well as realizations of a datum ensemble).

* the tile layout for each Matrix MUST correspond to the chunk layout along the `lat` and `lon` dimensions of the corresponding group.
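
The layout and ordering rules above could be checked along these lines. This is a minimal sketch, not a reference implementation: `check_tile_matrix` and `check_resolution_order` are hypothetical helpers, the array and chunk shapes are passed in as plain `(rows, cols)` tuples rather than read from a Zarr store, and the chunk count is only required not to exceed the matrix size because a dataset may cover part of a Tile Matrix.

```python
import math


def check_tile_matrix(tile_matrix, array_shape, chunk_shape):
    """Return a list of problems for one zoom level (an empty list means consistent).

    tile_matrix: dict with the OGC keys tileWidth, tileHeight, matrixWidth, matrixHeight.
    array_shape / chunk_shape: (rows, cols) along the lat and lon dimensions.
    """
    problems = []
    rows, cols = array_shape
    chunk_rows, chunk_cols = chunk_shape
    if chunk_rows != tile_matrix["tileHeight"] or chunk_cols != tile_matrix["tileWidth"]:
        problems.append("chunk shape differs from tileHeight/tileWidth")
    if math.ceil(rows / chunk_rows) > tile_matrix["matrixHeight"]:
        problems.append("more chunk rows than matrixHeight")
    if math.ceil(cols / chunk_cols) > tile_matrix["matrixWidth"]:
        problems.append("more chunk columns than matrixWidth")
    return problems


def check_resolution_order(tile_matrices):
    """True if cellSize strictly decreases, i.e. lowest to highest resolution."""
    sizes = [tm["cellSize"] for tm in tile_matrices]
    return all(a > b for a, b in zip(sizes, sizes[1:]))


level_1 = {"tileWidth": 256, "tileHeight": 256, "matrixWidth": 2, "matrixHeight": 2}
consistent = check_tile_matrix(level_1, array_shape=(512, 512), chunk_shape=(256, 256))
mismatch = check_tile_matrix(level_1, array_shape=(512, 512), chunk_shape=(512, 512))
ordered = check_resolution_order([{"cellSize": 156543.033928041}, {"cellSize": 78271.5169640204}])
```

`consistent` is an empty list, while `mismatch` reports that a 512-pixel chunk does not match the 256-pixel tile size.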


#### Tile Matrix Set Limits
Tile Matrix Sets may describe a larger spatial extent and more resolutions than are used in the given dataset.
In that case, users MAY specify [Tile Matrix Set Limits](https://docs.ogc.org/is/17-083r4/17-083r4.html#toc21) as described in the OGC standard to define the minimum and maximum tile indices for each TileMatrix that contains actual data. However, the notation for Tile Matrix Set Limits does not follow the JSON encoding described in the OGC standard; it follows the STAC Tiled Assets encoding for better readability.

If used, Tile Matrix Set Limits:
* MUST list all included zoom levels
* MAY list the min and max rows and columns for each zoom level. If omitted, it is assumed that the entire spatial extent is covered (resulting in higher chunk count of the DataArray).
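
As a rough illustration of how limits affect array size, the sketch below derives the pixel shape of one zoom level from its limits, falling back to the full Tile Matrix when the limits are omitted. It assumes the `min_tile_row`/`max_tile_row`/`min_tile_col`/`max_tile_col` keys of the STAC-style notation with inclusive bounds; `shape_from_limits` is a hypothetical helper.

```python
def shape_from_limits(limits, tile_matrix):
    """Return (rows, cols) in pixels covered by one zoom level.

    limits: dict with optional min/max_tile_row and min/max_tile_col (inclusive).
    tile_matrix: dict with tileWidth, tileHeight, matrixWidth, matrixHeight.
    Missing limits default to the full extent of the Tile Matrix.
    """
    min_row = limits.get("min_tile_row", 0)
    max_row = limits.get("max_tile_row", tile_matrix["matrixHeight"] - 1)
    min_col = limits.get("min_tile_col", 0)
    max_col = limits.get("max_tile_col", tile_matrix["matrixWidth"] - 1)
    tile_rows = max_row - min_row + 1
    tile_cols = max_col - min_col + 1
    return tile_rows * tile_matrix["tileHeight"], tile_cols * tile_matrix["tileWidth"]


# Zoom level 2 of a WebMercatorQuad-like set: 4x4 tiles of 256x256 pixels.
level_2 = {"tileWidth": 256, "tileHeight": 256, "matrixWidth": 4, "matrixHeight": 4}
full = shape_from_limits({}, level_2)
partial = shape_from_limits(
    {"min_tile_col": 1, "max_tile_col": 1, "min_tile_row": 2, "max_tile_row": 2}, level_2
)
```

With empty limits the level spans 1024x1024 pixels (16 chunks); restricting it to a single tile reduces it to 256x256 pixels (1 chunk).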

#### Resampling Method
Resampling Method specifies which resampling method is used to generate multiscales. It MUST be the same across all zoom levels and MUST be one of the following string values:

Are the options constrained by xarray or tms? It would be good to link to a source that details the options (e.g. could I use the 20th percentile (though not sure why)).

Author

I took the list from rasterio. But looking at this again, I think this should probably be an implementation detail. It will be enough to say that it must be of type string and the same across all zoom levels

Is the same syntax in the Tile Matrix Set spec? I searched and couldn't find anything. Is this also a requirement for COGs?

Author

No, resampling is not part of TMS or COGs.
This property was already part of the current specs. It is useful for assuring consistency when progressively adding data to the same zarr store. Otherwise you might end up with overview chunks that were resampled using different methods.

* nearest
* bilinear
* cubic
* cubic_spline
* lanczos
* average
* mode
* gauss
* max
* min
* med
* q1
* q3
* sum
* rms
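
A reader could check a `multiscales` attribute against the fields and resampling values listed in this section roughly as follows. This is a sketch under the assumption that the attribute has already been loaded into a Python dict; `validate_multiscales` is a hypothetical helper, and the fixed value list reflects this draft only.

```python
RESAMPLING_METHODS = frozenset({
    "nearest", "bilinear", "cubic", "cubic_spline", "lanczos", "average",
    "mode", "gauss", "max", "min", "med", "q1", "q3", "sum", "rms",
})


def validate_multiscales(multiscales):
    """Return a list of problems; an empty list means the attribute looks valid."""
    problems = []
    for field in ("tile_matrix_set", "resampling_method"):
        if field not in multiscales:
            problems.append(f"missing required field: {field}")
    # tile_matrix_set is a well-known name or URI (str) or an inline JSON object (dict).
    tms = multiscales.get("tile_matrix_set")
    if tms is not None and not isinstance(tms, (str, dict)):
        problems.append("tile_matrix_set must be a name, a URI or a JSON object")
    limits = multiscales.get("tile_matrix_set_limits")
    if limits is not None and not isinstance(limits, dict):
        problems.append("tile_matrix_set_limits must be a JSON object")
    method = multiscales.get("resampling_method")
    if method is not None and (not isinstance(method, str) or method not in RESAMPLING_METHODS):
        problems.append(f"unknown resampling_method: {method!r}")
    return problems


ok = validate_multiscales({"tile_matrix_set": "WebMercatorQuad", "resampling_method": "nearest"})
bad = validate_multiscales({"tile_matrix_set": "WebMercatorQuad", "resampling_method": "p20"})
```

Because the check runs on the single attribute shared by all zoom levels, it also covers the requirement that one resampling method applies to the whole pyramid.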

### Multiscale examples
#### Using Well Known Name reference

* Path MUST be the relative path to the Zarr group which holds the DataArray variable
* Zoom levels MUST be provided from lowest to highest resolutions
* First level path MUST reference to itself or can be omitted.
* If the optional 'crs' attribute is missing, then the downscaled version is assumed to be non-projected (and can be displayed using a "pseudo plate-carree" projection).
```diff
(mandatory items in red, optional items in green)
+{
+ "multiscales":
- {
- "tile_matrix_set": "WebMercatorQuad",
- "resampling_method": "nearest"
- }
+}
```
#### Using a URI

I wonder if we should lean towards recommending well-known or explicit identifiers instead of URIs, especially to maintain a 'self-describing' format

The official way to reference well-known 2DTMS for 2DTMS 2.0 is using a URI.


```diff
(mandatory items in red, optional items in green)
+{
+ "multiscales": [
- {
- "name": "example",
- "datasets": [
- {"path": "0", "level": "0",
+ "crs": "EPSG:3857"},
- {"path": "1", "level": "1",
- {"path": "2", "level": "2"},
- {"path": ".", "level": "3"}
- ],
+ "type": "gaussian",
- }
+ ]
+ "multiscales":
- {
- "tile_matrix_set": "https://schemas.opengis.net/tms/2.0/json/examples/tilematrixset/WebMercatorQuad.json",
- "resampling_method": "nearest"
- }
+}
```

#### Using a JSON object

```diff
(mandatory items in red, optional items in green)
+{
+ "multiscales":
- {
- "tile_matrix_set": {
- "id": "WebMercatorQuad",
- "title": "Google Maps Compatible for the World",
- "uri": "http://www.opengis.net/def/tilematrixset/OGC/1.0/WebMercatorQuad",
- "crs": "http://www.opengis.net/def/crs/EPSG/0/3857",
- "orderedAxes": [
- "X",
- "Y"
- ],
- "wellKnownScaleSet": "http://www.opengis.net/def/wkss/OGC/1.0/GoogleMapsCompatible",
- "tileMatrices": [
- {
- "id": "0",
- "scaleDenominator": 559082264.028717,
- "cellSize": 156543.033928041,
- "pointOfOrigin": [
- -20037508.3427892,
- 20037508.3427892
- ],
- "tileWidth": 256,
- "tileHeight": 256,
- "matrixWidth": 1,
- "matrixHeight": 1
- },
- {
- "id": "1",
- "scaleDenominator": 279541132.014358,
- "cellSize": 78271.5169640204,
- "pointOfOrigin": [
- -20037508.3427892,
- 20037508.3427892
- ],
- "tileWidth": 256,
- "tileHeight": 256,
- "matrixWidth": 2,
- "matrixHeight": 2
- }
- ]
- },
- "resampling_method": "nearest"
- }
+}
```
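
The numbers in the JSON object above follow a regular pattern, which the sketch below reproduces. It assumes the level 0 `cellSize` from the example, a halving of `cellSize` per zoom level, and the OGC standardized rendering pixel size of 0.28 mm for deriving `scaleDenominator` in a metre-based CRS. `web_mercator_quad_level` is a hypothetical helper and only returns a subset of the Tile Matrix fields.

```python
STANDARD_PIXEL_SIZE_M = 0.00028      # OGC standardized rendering pixel size (0.28 mm)
LEVEL_0_CELL_SIZE = 156543.033928041  # metres per pixel at zoom level 0, from the example


def web_mercator_quad_level(zoom):
    """Return selected Tile Matrix fields for one WebMercatorQuad zoom level."""
    cell_size = LEVEL_0_CELL_SIZE / 2 ** zoom
    return {
        "id": str(zoom),
        "cellSize": cell_size,
        "scaleDenominator": cell_size / STANDARD_PIXEL_SIZE_M,
        "tileWidth": 256,
        "tileHeight": 256,
        "matrixWidth": 2 ** zoom,
        "matrixHeight": 2 ** zoom,
    }


level_0 = web_mercator_quad_level(0)
level_1 = web_mercator_quad_level(1)
```

Up to floating-point rounding, `level_0` and `level_1` match the `cellSize`, `scaleDenominator`, `matrixWidth` and `matrixHeight` values of the two Tile Matrices listed above.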
#### Setting limits

```diff
(mandatory items in red, optional items in green)
+{
+ "multiscales":
- {
- "tile_matrix_set": "WebMercatorQuad",
+ "tile_matrix_set_limits": {
- "0": {},
- "1": {
+ "min_tile_col": 0,
+ "max_tile_col": 0,
+ "min_tile_row": 0,
+ "max_tile_row": 0
- },
- "2": {
+ "min_tile_col": 1,
+ "max_tile_col": 1,
+ "min_tile_row": 2,
+ "max_tile_row": 2
- }
- },
- "resampling_method": "nearest"
- }
+}
```


## Portrayals and Symbology

A GeoZarr Dataset variable might define a set of visual portrayals of the geospatial data and define an adequate symbology. The symbology model is a simplified schema based on the OGC Symbology Encoding Implementation Specification https://www.ogc.org/standards/symbol.