Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/locations too strict in checking station coordinates #105

Closed
Teddy-1000 opened this issue Apr 4, 2024 · 8 comments · May be fixed by #108
Closed

/locations too strict in checking station coordinates #105

Teddy-1000 opened this issue Apr 4, 2024 · 8 comments · May be fixed by #108
Assignees
Labels
API:EDR Label for API issue and PR related to EDR API For issues and PR related to the APIs

Comments

@Teddy-1000
Copy link
Collaborator

Stations can move or have their position updated to be more precise. As in the example below. I think we should disable this check for now, and find a better way to make sure the position does not diverge too much between time series. In this case the position have moved 1.2 meters, but breaks the entire /locations endpoint of the API. I think such a check should be made somewhere else. As we do not want data with errors accepted in to the system at all, and we should treat the datastore as a source of truth.

{
    "detail": {
        "coordinates": "Station with id `0-578-0-91695 has multiple coordinates: {(20.914, 69.9367), (20.914, 69.93667)}"
    }
}
@Teddy-1000 Teddy-1000 added API For issues and PR related to the APIs API:EDR Label for API issue and PR related to EDR labels Apr 4, 2024
@lukas-phaf
Copy link
Contributor

I assume the check you are referring to is the one where we check that platform_coordinates has only one entry per station.
If we remove it, and there are multiple ones, which one would you use? We have to make sure we don't ping-pong between multiple values on different requests.

In some sense I think this check failing is a good indication that the ingested data is "wrong".

@Teddy-1000
Copy link
Collaborator Author

Teddy-1000 commented Apr 4, 2024

I know @jo-asplin-met-no is currently working on a new gRPC method for use in /locations and I think we should solve this in the same function.
How important is it that the coordinates in /location match with all the time series? If this is not very important I suggest that the new method get the latest position provided for that station. It is not perfect, but is in line with how we treat other parameters where the latest provided is considered the most correct.

@Teddy-1000
Copy link
Collaborator Author

But I think having two divergent time series bringing down the entire /locations endpoint is bad.

@Teddy-1000
Copy link
Collaborator Author

Ok, temporary solution. We check that the diverging points are not too far apart, if within tolerance we sort the coordinates and choose the first one?

@jo-asplin-met-no
Copy link
Contributor

jo-asplin-met-no commented Apr 4, 2024

At MET Norway it is not unusual that the lat,lon location of a station can change from time to time (though I'm not sure about exactly how often it happens - I could probably dig up some stats). There are essentially two reasons for this: 1) a station is physically moved (for various reasons - in 1940 the station at Blindern (our Oslo office) was moved 300m), and 2) the lat or lon component is slightly updated, typically to comply with some (internal) standard wrt. number of decimal places. It was 2) that triggered this morning's test failure as a result of a GUI tool insisting to update the lon component of the station from four to five decimal places (from 19.4737 to 19.47367) even if the position was not even the metadata that my colleague wanted to update (he wanted to update the "short name" of the station, but had to "fix" the lon component too!). This just proves how vulnerable these things can be in practice whenever human editing is involved. Anyway, we need to support these cases, and I agree with Amund in preferring the most recent known position for a station, since that is most likely to "stabilize the system" quickly (at least wrt. both 1) and 2) at MET Norway I guess). The challenge then IMO is how to present/document deviations to the end user. How much of a problem will this be for an end user anyway? (Note: in MET Norway's case it is possible to distinguish between 1) and 2) in case that is relevant for the end user.)

@lukas-phaf
Copy link
Contributor

There is no concept of station in the store, only timeseries. The current /locations endpoint gets the latest location for each timeseries, so the blowup is not due directly to a timeseries position change, but due to different timeseries (actually the last observation of that timeseries) "belonging" to the same station reporting a different position. The underlying problem behind this might still be a station move, however.

But taking the last position in itself doesn't solve this problem, as there could be different positions with the same timestamp, from different timeseries/parameters.

Anyway, I agree with proceeding if the positions are close enough. Any approach that gives a consistent output is fine with me, although it feels strange to use the "smallest" coordinate.

@Teddy-1000 Teddy-1000 self-assigned this Apr 4, 2024
@Teddy-1000 Teddy-1000 linked a pull request Apr 4, 2024 that will close this issue
@Teddy-1000
Copy link
Collaborator Author

I opened a PR with a draft for getting coordinates in a consistent manner.

@jo-asplin-met-no jo-asplin-met-no changed the title /locations to strict in checking station coordiantes /locations too strict in checking station coordinates Apr 5, 2024
@jo-asplin-met-no
Copy link
Contributor

jo-asplin-met-no commented Apr 8, 2024

I updated Issue #107 with a detailed sketch of a possible solution where the data store takes responsibility. @Teddy-1000 and @lukas-phaf: please comment! (the temporary solution proposed by Amund is probably good to have for now anyway, though!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API:EDR Label for API issue and PR related to EDR API For issues and PR related to the APIs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants