-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
seaexplorer.raw_to_timeseries() outputs questionable navigation data #188
Comments
That's certainly "interesting". Can you share the raw files that made those, and maybe your deployment.yaml file? |
Ping @callumrollo too in case this twigs anything for him. At some point we started using polars for the sea explorer and I didn't follow all the changes super closely. |
Definitely not the behavior we would want! I remember some small changes in interpolated values when we switched from pandas to polars, but they were of the order 0.001 in the tests, nothing of this magnitude. I'll investigate this afternoon. The deployment.yaml file would be really useful, particularly the timebase |
Just sent both of you a subset of the raw files, as well as the deployment.yaml in an email. Thanks again for having a look. |
Hi @clayton33 thanks for sending the raw files. I think I've found the solution. As you described in your email, this mission did not sample science sensors on every row, so some data were discarded during processing. In the deployment yaml, you have:
so pyglider only keeps lines that have ctd or oxygen data. If you want it to keep data from the dives where no science was sampled, you can add one of the glider attitude parameters like roll
When I made this change and reprocessed, it fixed the gaps in pitch/roll/heading, increading the timeseries file size by ~ 10 % Does this solution work for you? PS thanks for providing a fully reproducible example! It really sped up the troubleshooting process. |
Using the following code to compare raw vs pyglider data: import xarray as xr
import pandas as pd
import matplotlib.pyplot as plt
ds = xr.open_dataset('L0-timeseries/dfo-Mersey022-20210531_delayed.nc')
df = pd.read_csv('raw/sea022.66.gli.sub.445', sep=';')
df['datetime'] = pd.to_datetime(df['Timestamp'], format='%d/%m/%Y %H:%M:%S')
fig, ax = plt.subplots(3, 1, figsize=(8, 12), sharex='col')
ax = ax.ravel()
ax[0].plot(df.datetime, df.Heading, lw=5, label='Raw')
ax[0].plot(ds.time, ds.GLIDER_HEADING,label='Processed')
ax[0].legend()
ax[0].set(title='Heading', xlim=[df.datetime.min(), df.datetime.max()])
ax[1].plot(df.datetime, df.Roll, lw=5, label='Raw')
ax[1].plot(ds.time, ds.GLIDER_ROLL,label='Processed')
ax[1].legend()
ax[1].set(title='Roll', xlim=[df.datetime.min(), df.datetime.max()])
ax[2].plot(df.datetime, df.Pitch, lw=5, label='Raw')
ax[2].plot(ds.time, ds.GLIDER_PITCH,label='Processed')
ax[2].legend()
ax[2].set(title='Pitch', xlim=[df.datetime.min(), df.datetime.max()]) With original deplyment.yaml With |
@callumrollo Can this be better documented? For instance, I don't see any mention of |
Yes it's not the most intuitive control. |
I now realize in my initial issue I should have stated the expected behavior (sorry ! I should know better). I was expecting that the |
@clayton33 Your intituition about how it is supposed to work was the original intention - I'm not entirely sure why that has changed. We had a similar discussion about the slocum gliders, and my recommendation to folks is rather than keeping two timebases, for instance if they want all the heading data even for times when science is off, is that they make two separate netcdf files. @callumrollo I'd call this a bug the way it is behaving now. I'm happy to try and fix it, or wait until you are back fro vacation. |
@callumrollo I tried your suggestion (adding one of the nav variables to |
Hi folks -- just jumping in here that I tried running @clayton33's files through the latest version of pyglider, installed by following the instructions at https://pyglider.readthedocs.io/en/latest/Install.html, i.e. I did:
Followed by:
and then:
When I tried running her
and after running @callumrollo's plot script I get the expected result: So, I think the issue here is not with pyglider but with @clayton33's environment or installed version of pyglider. I'll open another issue related to the install of polars (I thought that it would have been taken care of by the |
@richardsc was right, it was an issue with my environment, disregard my comment above. It's working perfectly fine for me now. |
Ok sorry for the packaging issues. I think the folks who use pyglider mostly use from source rather than relying on package managers because changes are still happening fast. I should update the docs and maybe try to make automated releases easier |
No worries Jody -- @clayton33 and I are both learning as we go, as we're not primarily "python people". We definitely are using the source versions of pyglider, but also relying on conda to install all the other dependencies, and sometimes things can get weird. The lesson we learned from this is "don't try and do everything in your base environment" 😄 . I won't close the issue, because the last request for work was to update the docs and @callumrollo is currently on vacation. Or we could close this and open a new issue that better reflects what still needs to be done (as the original issue is completed). |
We're hoping to include heading, pitch, and roll in our data files. Below is an example figure comparing the processed data of these variables with the raw data.
I've deduced that the navigation data gets interpolated to the chosen timestamp using
_interp_gli_to_pld
. Is it attributed to the more coarse sampled navigation data being interpolated to the finer science data time ?Also, to be honest, I'm not much of a python person, so I might be a bit useless to help if this is indeed an issue.
The text was updated successfully, but these errors were encountered: