
fix/remove outlier predictions #1

Open
jmlondon opened this issue Oct 8, 2024 · 3 comments
jmlondon commented Oct 8, 2024

For both ribbon and spotted seals, there are 'outlier' predictions showing up in the final movement dataset that need to be addressed.

For example, the spotted seal predictions (pl_predict_pts) show points well into the southern hemisphere.

[image: map of spotted seal predicted locations, with outlier points extending into the southern hemisphere]

And the ribbon seals, while remaining in the northern hemisphere, have some relatively extreme smoothed predictions.

[image: map of ribbon seal smoothed predictions showing extreme excursions]

The likely culprits worth investigating:

  1. observations in the raw data that occur outside the deployment start date or the specified end date
  2. erroneous observations/location estimates
  3. very long time gaps between observed locations -- this is the most likely scenario, and it's worth looking at some recent code from Devin Johnson that removes these time gaps by splitting the track into separate segments
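Culprit 1 is straightforward to screen for. As a minimal sketch (illustrative Python, not project code; the function name and the (timestamp, lat, lon) field layout are assumptions, not the actual dataset schema), observations could be clipped to the deployment window like this:

```python
from datetime import datetime

def filter_to_deployment(obs, deploy_start, deploy_end):
    """Keep only observations whose timestamp falls inside the deployment window.

    `obs` is a list of (timestamp, lat, lon) tuples; this layout is
    illustrative, not the project's actual schema.
    """
    return [o for o in obs if deploy_start <= o[0] <= deploy_end]
```

Anything transmitted before deployment or after the specified end date would be dropped before model fitting.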
jmlondon self-assigned this Oct 8, 2024

jmlondon commented Oct 12, 2024

@emchuron and I discussed two approaches for handling long time gaps between observed locations:

  1. Split the sequence of observed locations into separate segments a priori. This would be done based on a specified maximum gap (e.g., 7 days) between observed locations, after which a new segment is designated. Each segment would be fit and predicted (and pseudo-tracks generated) independently before merging back as needed.
  2. Fit the complete track and rely on post hoc identification of time gaps. After fitting to the complete set of observations, we can identify gaps as before; predictions and pseudo-tracks are generated only for the periods outside the identified gaps.
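The a priori splitting rule in approach 1 could be sketched as follows (illustrative Python, not project code; `split_on_gaps` is hypothetical, and the 7-day default simply mirrors the example threshold above):

```python
from datetime import datetime, timedelta

def split_on_gaps(times, max_gap=timedelta(days=7)):
    """Assign a segment id to each observation timestamp.

    A new segment starts whenever the gap since the previous observation
    exceeds max_gap -- the a priori splitting rule of approach 1.
    """
    seg_ids, seg = [], 0
    for i, t in enumerate(times):
        if i > 0 and (t - times[i - 1]) > max_gap:
            seg += 1
        seg_ids.append(seg)
    return seg_ids
```

Each resulting segment id would then be fit and predicted independently.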

The initial consideration was to focus on the first approach because it seemed easier to implement and might result in better predictions/pseudo-tracks, since the gap periods wouldn't influence the model fit. After some experimentation, though, this approach leads to short segments that may not converge during the model fit. Imagine a stretch of 8 days with no observations, followed by 7-10 locations, and then another 8-day gap: fitting a model to just those 7-10 locations can be unreliable.

In most cases, the large time gaps are not resulting in poor model fits or convergence issues. Instead, the problem arises on the prediction side, when large, unrealistic correlated loops are generated across the gaps.

So, I think the second approach is the path worth pursuing, and here's what's needed to accomplish that.
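The post hoc approach could be sketched as follows (again illustrative Python, not project code; both function names are hypothetical): identify the gaps after fitting the complete track, then drop any predictions that fall inside them.

```python
from datetime import datetime, timedelta

def gap_intervals(obs_times, max_gap=timedelta(days=7)):
    """Find (start, end) pairs bounding observation gaps longer than max_gap."""
    return [(a, b) for a, b in zip(obs_times, obs_times[1:]) if (b - a) > max_gap]

def mask_predictions(pred_times, gaps):
    """Drop prediction timestamps that fall strictly inside any identified gap."""
    return [t for t in pred_times if not any(a < t < b for a, b in gaps)]
```

The full track still informs the fit, but the unrealistic correlated loops inside the gaps never reach the final prediction set.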

@emchuron

Sounds good to me! Let me know if I can help at all.

@jmlondon

There are still some 'outlier' data within the spotted seal deployments; some additional investigation is needed on this.
