Dimension names from cube:dimensions #136

Open
clausmichele opened this issue Dec 18, 2023 · 1 comment
clausmichele commented Dec 18, 2023

It would be nice if, when the datacube extension is present at Collection or Item level, the dimension names it declares were reflected in the returned xarray object. Currently, the dimension names are always the default ones.

Sample STAC Collection with datacube extension:

import json
import pystac
import pystac_client

url = "https://stac.eurac.edu/collections/SENTINEL2_L2A_SAMPLE"

stac_api = pystac_client.stac_api_io.StacApiIO()
stac_dict = json.loads(stac_api.read_text(url))
b_dim = None
t_dim = None
x_dim = None
y_dim = None
z_dim = None
if "cube:dimensions" in stac_dict:
    for dim in stac_dict["cube:dimensions"]:
        if stac_dict["cube:dimensions"][dim]["type"] == "bands":
            b_dim = dim
        if stac_dict["cube:dimensions"][dim]["type"] == "temporal":
            t_dim = dim
        if stac_dict["cube:dimensions"][dim]["type"] == "spatial":
            if stac_dict["cube:dimensions"][dim]["axis"] == "x":
                x_dim = dim
            if stac_dict["cube:dimensions"][dim]["axis"] == "y":
                y_dim = dim
            if stac_dict["cube:dimensions"][dim]["axis"] == "z":
                z_dim = dim
print(b_dim,t_dim,x_dim,y_dim,z_dim)

>>> bands t x y None

Result from odc-stac:

import pystac_client
import odc.stac

catalog_url = "https://stac.eurac.edu/"
collection = "SENTINEL2_L2A_SAMPLE"

catalog = pystac_client.Client.open(catalog_url)
query_params = {"collections": [collection]}

items = catalog.search(**query_params).item_collection()
data = odc.stac.load(items,chunks={})
print(data.dims)

>>> FrozenMappingWarningOnValuesAccess({'y': 86, 'x': 98, 'time': 12})

I understand that in the above example I'm passing STAC Items that do not contain the cube:dimensions field, which is provided only at Collection level.
Would it make sense to give the option for using the naming convention from the STAC itself?
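
One way to get this behaviour on the caller side today is to derive a rename mapping from the Collection's cube:dimensions and apply it to the loaded object. The sketch below is only an illustration, not an odc-stac feature: the helper names `cube_dimension_names` and `rename_map` are hypothetical, and `sample` is a hand-written stand-in that mirrors the dimension names printed above (bands, t, x, y) rather than the actual Collection document.

```python
def cube_dimension_names(cube_dims):
    """Map each role ('bands', 'time', 'x', 'y', 'z') to the name used in cube:dimensions."""
    roles = {}
    for name, spec in cube_dims.items():
        kind = spec.get("type")
        if kind == "bands":
            roles["bands"] = name
        elif kind == "temporal":
            roles["time"] = name
        elif kind == "spatial":
            axis = spec.get("axis")
            if axis in ("x", "y", "z"):
                roles[axis] = name
    return roles


def rename_map(cube_dims, current_dims):
    """Build {current name: name from cube:dimensions} for dimensions that differ."""
    # Assumption: the loaded object uses time/x/y, or longitude/latitude
    # when the data is in geographic coordinates.
    aliases = {"time": "time", "x": "x", "y": "y", "longitude": "x", "latitude": "y"}
    roles = cube_dimension_names(cube_dims)
    mapping = {}
    for dim in current_dims:
        target = roles.get(aliases.get(dim))
        if target is not None and target != dim:
            mapping[dim] = target
    return mapping


# Hand-written stand-in for stac_dict["cube:dimensions"] (hypothetical content).
sample = {
    "x": {"type": "spatial", "axis": "x"},
    "y": {"type": "spatial", "axis": "y"},
    "t": {"type": "temporal"},
    "bands": {"type": "bands"},
}

print(rename_map(sample, ["y", "x", "time"]))  # {'time': 't'}
```

With the objects from the two snippets above, something like `data.rename(rename_map(stac_dict["cube:dimensions"], data.dims))` should then return a dataset whose time dimension is called t.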

@Kirill888
Member

Support for the datacube extension would be cool, but it's not just about spatial dimension names. It's also about dimensions other than time, x, y; about multiple variables present in the same hdf/zarr/netcdf-like asset; about units (duplicating the raster extension); about the "data variable type", which seems to extend the STAC "role" to components of the asset being described; and about other metadata, like the valid data range or the set of allowed values a particular data variable can hold, which one would expect to be exposed as attributes, I guess.

Not to mention that those "hdf-like" data sources often have hard-to-support geo-registration strategies like arrays of pixel locations, as opposed to CRS + Linear Transform.

And as far as spatial dimension names go, having custom names can be more of a pain than an advantage. I'm still annoyed that odc-stac uses longitude/latitude dimension names when data is in geographic coordinates and x, y when using projections. That's because of the opendatacube/datacube legacy; I should at least add an option to force x, y names regardless of the CRS being used.
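
Until such an option exists, the same effect can be approximated after loading. This is a minimal caller-side sketch, not an odc-stac option; `force_xy` is a hypothetical helper name, and it assumes the only names to normalise are longitude and latitude.

```python
def force_xy(dims):
    """Return a rename mapping that turns longitude/latitude into x/y."""
    geographic = {"longitude": "x", "latitude": "y"}
    return {dim: geographic[dim] for dim in dims if dim in geographic}


print(force_xy(["latitude", "longitude", "time"]))  # {'latitude': 'y', 'longitude': 'x'}
print(force_xy(["y", "x", "time"]))                 # {}
```

Applied as `data.rename(force_xy(data.dims))`, this should leave projected data untouched, since the mapping is empty when the dimensions are already x and y.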
