-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
alignment load hint documentationi is incorrect #1382
Comments
I feel like I've seen something like this before, but can't remember the details. I suspect there's a mismatch in assumptions about cell coordinates (edge vs centre) somewhere. Indeed, the offset in the attached image looks like it is off by half-a-pixel rather than a whole pixel. |
@clausmichele it's most likely doing the right thing, but documentation on the topic is lacking, please read through this issue first: |
I know that the coordinates of the data I'm seeing are the pixel centres and not the boundaries, and I don't care if there's a padding but the problem is that the data is not aligned with the original one. Could you please try with the data I provided? |
but that's the point, it won't be aligned with the original data. It's simply not possible in the general case when there are more than one dataset, or more than one band even... To use exactly the grid you need you have to use |
also you should remove "storage" section from you product spec. |
Thanks @Kirill888, using import datacube
from datacube.utils.geometry import intersects, GeoBox
import affine
ref_geobox = GeoBox(4332,2000, affine.Affine(0.0030000000000001137, 0.0, 4.003, 0.0, -0.0030000000000001137, 49.0), 'EPSG:4326')
dc = datacube.Datacube()
data = dc.load(product='SAMPLE_DATACUBE',like=ref_geobox)
print(data)
<xarray.Dataset>
Dimensions: (time: 1, latitude: 2000, longitude: 4332)
Coordinates:
* time (time) datetime64[ns] 2022-12-16T10:09:22
* latitude (latitude) float64 49.0 49.0 48.99 48.99 ... 43.01 43.0 43.0
* longitude (longitude) float64 4.005 4.008 4.011 ... 16.99 16.99 17.0
spatial_ref int32 4326
Data variables:
band_0 (time, latitude, longitude) float64 0.0 0.0 0.0 ... 0.0 0.0 0.0
Attributes:
crs: EPSG:4326
grid_mapping: spatial_ref For storage, why should I remove it? Just in this case since I'll use like=GeoBox or in general? |
@clausmichele In general, this definition is incomplete, it's missing datacube-core/datacube/model/__init__.py Lines 516 to 521 in 7f587ef
You want to use "load hints" instead to supply default projection and resolution to use during errors in docs@SpacemanPaul @pindge docs for product https://github.com/opendatacube/datacube-core/blob/develop/docs/installation/product-definitions.rst |
@Kirill888 I did some tests using load instead of storage, but I can't understand how it is supposed to work, since the values does not seem to follow a logic: So, starting point, using like + GeoBox gives the right coordinates: from datacube.utils.geometry import GeoBox
import affine
import rioxarray
ref_geobox = GeoBox(4332,2000, affine.Affine(0.0030000000000001137, 0.0, 4.003, 0.0, -0.0030000000000001137, 49.0), 'EPSG:4326')
dc = datacube.Datacube()
query = {}
query['product'] = 'SAMPLE_DATACUBE'
datasets = dc.find_datasets(**query)
data = dc.load(datasets=datasets,like=ref_geobox,**query)
print(data.latitude.values)
print(data.longitude.values)
[48.9985 48.9955 48.9925 ... 43.0075 43.0045 43.0015]
[ 4.0045 4.0075 4.0105 ... 16.9915 16.9945 16.9975] Then, here a the results using different align parameters: import datacube
# load:
# crs: EPSG:4326
# resolution:
# longitude: 0.003
# latitude: -0.003
# align:
# longitude: 0.0
# latitude: 0.0
dc = datacube.Datacube()
query = {}
query['product'] = 'SAMPLE_DATACUBE'
datasets = dc.find_datasets(**query)
data = dc.load(datasets=datasets,**query)
print(data.latitude.values)
print(data.longitude.values)
[49.0005 48.9975 48.9945 ... 43.0065 43.0035 43.0005]
[ 4.0035 4.0065 4.0095 ... 16.9935 16.9965 16.9995] -> wrong import datacube
# load:
# crs: EPSG:4326
# resolution:
# longitude: 0.003
# latitude: -0.003
# align:
# longitude: 0.0015
# latitude: 0.0015
dc = datacube.Datacube()
query = {}
query['product'] = 'SAMPLE_DATACUBE'
datasets = dc.find_datasets(**query)
data = dc.load(datasets=datasets,**query)
print(data.latitude.values)
print(data.longitude.values)
[48.999 48.996 48.993 ... 43.005 43.002 42.999]
[ 4.002 4.005 4.008 ... 16.992 16.995 16.998] -> wrong import datacube
# load:
# crs: EPSG:4326
# resolution:
# longitude: 0.003
# latitude: -0.003
# align:
# longitude: 0.003
# latitude: 0.003
dc = datacube.Datacube()
query = {}
query['product'] = 'SAMPLE_DATACUBE'
datasets = dc.find_datasets(**query)
data = dc.load(datasets=datasets,**query)
print(data.latitude.values)
print(data.longitude.values)
[49.0005 48.9975 48.9945 ... 43.0065 43.0035 43.0005]
[ 4.0035 4.0065 4.0095 ... 16.9935 16.9965 16.9995] -> wrong import datacube
# load:
# crs: EPSG:4326
# resolution:
# longitude: 0.003
# latitude: -0.003
# align:
# longitude: 0.001
# latitude: 0.001
dc = datacube.Datacube()
query = {}
query['product'] = 'SAMPLE_DATACUBE'
datasets = dc.find_datasets(**query)
data = dc.load(datasets=datasets,**query)
print(data.latitude.values)
print(data.longitude.values)
[48.9985 48.9955 48.9925 ... 43.0075 43.0045 43.0015]
[ 4.0045 4.0075 4.0105 ... 16.9915 16.9945 16.9975] -> correct! But before finding this values, I also tried 0.002, 0.0005 and many more. So what's the logic behind? The right align values should be 0.0015 in my opinion, looking at the docs (half the resolution, which here is 0.003)
|
This issue shouldn't be closed yet. It is not yet clear why the load + align parameters in the product definition give the showed outputs. |
a val in coord list = The +0.5 comes from converting from grid-edge coordinates to pixel-centre coordinates. E.g. for:
49.0005 = 0.0 + (n+0.5) * 0.003 (n=16333, an integer) I think all the numbers you see as "wrong" work out with that. |
@clausmichele for a coordinate of a pixel EDGE
So every pixel spans from |
So, if Now that I know
Maybe all of this should be documented better in the ODC docs? This rounding to integer (when computing Edit: is the rounding to get |
There's a lot that could be better documented in the ODC docs. Documentation PRs are most welcome! |
Re: load v storage in product metadata. Storage should be deprecated - use load. The exact behaviour of storage at the moment is complicated, but it almost definitely doesn't do what you want. Yes, the documentation is wildly incorrect. |
Expected behaviour
Data loaded from ODC should be aligned with the original one.
Actual behaviour
I am working with data projected in EPSG:4326:
Unfortunately, when loading the data with ODC, the data has one additional row and column, leading to a shift in the data. I attach a zip with a sample file and the product and dataset definitions.
This screenshot shows the misalignment between the resulting data loaded from ODC vs the original one (layer below in dark gray is ODC):
I also tried using the
align
parameter but nothing changed, I am not sure how to properly use it. I also tried to use the load hints, but again no difference.Steps to reproduce the behaviour
Environment information
Datacube version = 1.8.9
Python version = 3.9.13
sample_data.zip
The text was updated successfully, but these errors were encountered: