**What happened**: Adding a string array as a dimension coordinate breaks serving the data through a web service.
**Minimal Complete Verifiable Example**:
```python
import urllib.parse

import numpy as np
import opendap_protocol as dap
from flask import Flask, Response, request

x = dap.Array(name='x', data=np.array([90, 91, 92]), dtype=dap.Int16)
y = dap.Array(name='y', data=np.array(['a', 'b', 'c']), dtype=dap.String)

data_array = dap.Grid(name='data',
                      data=np.random.rand(3, 3),
                      dtype=dap.Float64,
                      dimensions=[x, y])

dataset = dap.Dataset(name='Example')
dataset.append(x, y, data_array)

app = Flask(__name__)

@app.route('/dataset.dds', methods=['GET'])
def dds_response():
    # Retrieve constraints from the request to handle slicing, etc.
    constraint = urllib.parse.urlsplit(request.url)[3]
    return Response(dataset.dds(constraint=constraint),
                    mimetype='text/plain')

@app.route('/dataset.das', methods=['GET'])
def das_response():
    constraint = urllib.parse.urlsplit(request.url)[3]
    return Response(dataset.das(constraint=constraint),
                    mimetype='text/plain')

@app.route('/dataset.dods', methods=['GET'])
def dods_response():
    constraint = urllib.parse.urlsplit(request.url)[3]
    return Response(dataset.dods(constraint=constraint),
                    mimetype='application/octet-stream')

app.run(debug=True)
```
And from the terminal:
```python
>>> import netCDF4 as nc
>>> data = nc.Dataset('http://localhost:5000/dataset')
>>> data
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format DAP2):
    dimensions(sizes): maxStrlen64(64), x(3), y(3)
    variables(dimensions): int16 x(x), |S1 y(y,maxStrlen64), float64 data(x,y)
    groups:
>>> data.variables
OrderedDict([('x', <class 'netCDF4._netCDF4.Variable'>
int16 x(x)
unlimited dimensions:
current shape = (3,)
filling on, default _FillValue of -32767 used
), ('y', <class 'netCDF4._netCDF4.Variable'>
|S1 y(y, maxStrlen64)
unlimited dimensions:
current shape = (3, 64)
filling on, default _FillValue of  used
), ('data', <class 'netCDF4._netCDF4.Variable'>
float64 data(x, y)
unlimited dimensions:
current shape = (3, 3)
filling on, default _FillValue of 9.969209968386869e+36 used
)])
```
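Note the extra `maxStrlen64(64)` dimension on `y`: the NETCDF3_CLASSIC data model used for DAP2 has no variable-length string type, so the client exposes the string coordinate as a fixed-width char array. A rough numpy sketch of that mapping (illustrative only, not how netCDF4 does it internally):

```python
import numpy as np

# Three 1-character strings become a (3, 64) array of single bytes,
# right-padded with NULs -- hence shape (3, 64) and dtype |S1 above.
y = np.array(['a', 'b', 'c'])
y_chars = y.astype('S64').view('S1').reshape(3, 64)
```

So `y` is no longer a simple 1-D coordinate on the wire, which may be where the index bookkeeping goes wrong.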
So far so good, but when trying to fetch any of the actual data, an IndexError is raised:
```python
>>> data.variables["x"][...]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "netCDF4/_netCDF4.pyx", line 4351, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 5291, in netCDF4._netCDF4.Variable._get
IndexError: index exceeds dimension bounds
>>> data.variables["y"][...]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "netCDF4/_netCDF4.pyx", line 4351, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 5291, in netCDF4._netCDF4.Variable._get
IndexError: index exceeds dimension bounds
>>> data.variables["data"][...]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "netCDF4/_netCDF4.pyx", line 4351, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 5291, in netCDF4._netCDF4.Variable._get
IndexError: index exceeds dimension bounds
```
As a consequence, xarray also fails to read the dataset:
```python
>>> import xarray as xr
>>> xr.open_dataset('http://localhost:5000/dataset')
Traceback (most recent call last):
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/backends/netCDF4_.py", line 104, in _getitem
    array = getitem(original_array, key)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/backends/common.py", line 65, in robust_getitem
    return array[key]
  File "netCDF4/_netCDF4.pyx", line 4351, in netCDF4._netCDF4.Variable.__getitem__
  File "netCDF4/_netCDF4.pyx", line 5291, in netCDF4._netCDF4.Variable._get
IndexError: index exceeds dimension bounds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/backends/api.py", line 496, in open_dataset
    backend_ds = backend.open_dataset(
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/backends/netCDF4_.py", line 563, in open_dataset
    ds = store_entrypoint.open_dataset(
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/backends/store.py", line 37, in open_dataset
    ds = Dataset(vars, attrs=attrs)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/dataset.py", line 739, in __init__
    variables, coord_names, dims, indexes, _ = merge_data_and_coords(
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/merge.py", line 477, in merge_data_and_coords
    return merge_core(
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/merge.py", line 623, in merge_core
    collected = collect_variables_and_indexes(aligned)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/merge.py", line 287, in collect_variables_and_indexes
    variable = as_variable(variable, name=name)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/variable.py", line 166, in as_variable
    obj = obj.to_index_variable()
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/variable.py", line 536, in to_index_variable
    return IndexVariable(
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/variable.py", line 2540, in __init__
    self._data = PandasIndex(self._data)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/indexes.py", line 76, in __init__
    self.array = utils.safe_cast_to_index(array)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/utils.py", line 114, in safe_cast_to_index
    index = pd.Index(np.asarray(array), **kwargs)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/numpy/core/_asarray.py", line 102, in asarray
    return array(a, dtype, copy=False, order=order)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/indexing.py", line 572, in __array__
    return np.asarray(array[self.key], dtype=None)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/backends/netCDF4_.py", line 91, in __getitem__
    return indexing.explicit_indexing_adapter(
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/core/indexing.py", line 863, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/prod/zue/fc_development/users/ned/local_share/virtualenvs/preproc-pipelines-HVmgC6OK/lib/python3.8/site-packages/xarray/backends/netCDF4_.py", line 114, in _getitem
    raise IndexError(msg)
IndexError: The indexing operation you are attempting to perform is not valid on netCDF4.Variable object. Try loading your data into memory first by calling .load().
```
**What you expected to happen**:
Through netCDF4:
```python
>>> data = nc.Dataset('http://localhost:5000/dataset')
>>> data.variables["x"][...]
masked_array(data=[90, 91, 92],
             mask=False,
             fill_value=999999,
             dtype=int16)
>>> data.variables["y"][...]
masked_array(data=['a', 'b', 'c'],
             mask=False,
             fill_value='N/A',
             dtype='<U1')
```
and through xarray:
```python
>>> xr.open_dataset('http://localhost:5000/dataset')
<xarray.Dataset>
Dimensions:  (x: 3, y: 3)
Coordinates:
  * x        (x) int16 90 91 92
  * y        (y) <U1 'a' 'b' 'c'
Data variables:
    data     (x, y) float64 ...
```
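For comparison, a minimal sketch of the equivalent dataset built directly in xarray, with no DAP server in between: there the string array works as an ordinary dimension coordinate, including label-based selection, which is the behaviour expected from `open_dataset` above.

```python
import numpy as np
import xarray as xr

# Equivalent in-memory dataset: int16 x, string y, float64 data(x, y).
ds = xr.Dataset(
    {'data': (('x', 'y'), np.random.rand(3, 3))},
    coords={'x': np.array([90, 91, 92], dtype='int16'),
            'y': np.array(['a', 'b', 'c'])},
)

# Label-based indexing on the string coordinate works locally.
value = ds['data'].sel(x=91, y='b')
```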
**Environment**:
```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.4 (default, Aug 21 2020, 11:28:17) [GCC 5.4.0 20160609]
python-bits: 64
OS: Linux
OS-release: 4.4.0-210-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.8.16
libnetcdf: 4.4.0

xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: 1.5.0.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.8.3
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.05.1
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 56.0.0
pip: 21.0.1
conda: None
pytest: None
IPython: None
sphinx: None
```